[Bug sanitizer/84508] Load of misaligned address using _mm_load_sd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84508 --- Comment #23 from Jeffrey Walton --- (In reply to Peter Cordes from comment #22) > [...] > That instruction is useless and should never be used in asm except for > code-alignment reasons (1 byte longer than MOVLPS, same length as MOVSD, all > three doing the same thing for the memory-destination form). But easy to > imagine some code using that intrinsic to store an unaligned double into a > byte buffer. Reading from and writing to an [unaligned] byte stream in 4 or 8 byte chunks is our use case. Eventually, we need to perform traditional SIMD processing. But the loads and stores have to occur using these old intrinsics due to the word types, data stream format and supported ISAs. I believe the other option is to memcpy the byte stream into a properly aligned intermediate buffer. But that could incur a performance hit if the optimizer misses the opportunity (and fails to elide the memcpy).
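For illustration only, the memcpy alternative mentioned above might look roughly like the sketch below. The helper names load_u64_unaligned and store_u64_unaligned are hypothetical (they are not from the bug report or from any library). The sketch copies one 8-byte word in and out of a byte stream at an arbitrary offset. Whether the compiler folds the memcpy into a single move depends on the target and optimization level.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read an 8-byte word from a possibly unaligned
// position in a byte stream. memcpy has no alignment requirement, so
// this does not trip the sanitizer's misaligned-load check.
inline std::uint64_t load_u64_unaligned(const unsigned char* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Hypothetical helper: write an 8-byte word to a possibly unaligned
// position in a byte stream.
inline void store_u64_unaligned(unsigned char* p, std::uint64_t v) {
    std::memcpy(p, &v, sizeof v);
}
```

The value is kept in native byte order; any endian conversion required by the data stream format would be a separate step.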
[Bug driver/81358] libatomic not automatically linked with C11 code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81358 --- Comment #13 from Jeffrey Walton --- Add a me too. When using sanitizers, like -fsanitize=undefined, the compiler driver is not adding the necessary libraries to link the program. Ugh... https://github.com/weidai11/cryptopp/issues/1141#issuecomment-1224069820
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #20 from Jeffrey Walton ---
Hi Andrew.

I went through the list of options that are enabled at -O2 from [1]. I built the library with each option separately at "-DNDEBUG -g2 -O2 -fno-xxx".

Here is the list of suspects. I seem to recall having trouble with -fdevirtualize in the past:

-fno-devirtualize
-fno-indirect-inlining
-fno-devirtualize

Here are the full results:

Failed to execute with -fno-align-functions
Failed to execute with -fno-align-jumps
Failed to execute with -fno-align-labels
Failed to execute with -fno-align-loops
Failed to execute with -fno-caller-saves
Failed to execute with -fno-code-hoisting
Failed to execute with -fno-crossjumping
Failed to execute with -fno-cse-follow-jumps
Failed to execute with -fno-cse-skip-blocks
Failed to execute with -fno-delete-null-pointer-checks
Ok! Ok! Ok! Ok! Ok! -fno-devirtualize
Failed to execute with -fno-devirtualize-speculatively
Failed to execute with -fno-expensive-optimizations
Failed to execute with -fno-finite-loops
Failed to execute with -fno-gcse
Failed to execute with -fno-gcse-lm
Failed to execute with -fno-hoist-adjacent-loads
Failed to execute with -fno-inline-functions
Failed to execute with -fno-inline-small-functions
Ok! Ok! Ok! Ok! Ok! -fno-indirect-inlining
Failed to execute with -fno-ipa-bit-cp
Failed to execute with -fno-ipa-cp
Failed to execute with -fno-ipa-icf
Failed to execute with -fno-ipa-ra
Failed to execute with -fno-ipa-sra
Failed to execute with -fno-ipa-vrp
Failed to execute with -fno-isolate-erroneous-paths-dereference
Failed to execute with -fno-lra-remat
Failed to execute with -fno-optimize-sibling-calls
Failed to execute with -fno-optimize-strlen
Failed to execute with -fno-partial-inlining
Failed to execute with -fno-peephole2
Failed to build with -fno-reorder-blocks-algorithm=stc
Failed to execute with -fno-reorder-blocks-and-partition
Failed to execute with -fno-reorder-functions
Failed to execute with -fno-rerun-cse-after-loop
Failed to execute with -fno-schedule-insns
Failed to execute with -fno-schedule-insns2
Failed to execute with -fno-sched-interblock
Failed to execute with -fno-sched-spec
Failed to execute with -fno-store-merging
Ok! Ok! Ok! Ok! Ok! -fno-strict-aliasing
Failed to execute with -fno-thread-jumps
Failed to execute with -fno-tree-builtin-call-dce
Failed to execute with -fno-tree-loop-vectorize
Failed to execute with -fno-tree-pre
Ok! Ok! Ok! Ok! Ok! -fno-tree-slp-vectorize
Failed to execute with -fno-tree-switch-conversion
Failed to execute with -fno-tree-tail-merge
Failed to execute with -fno-tree-vrp
Failed to build with -fno-vect-cost-model=very-cheap

Attached is the script I used to repeatedly build the library.
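As a rough sketch of the "-fno-xxx" step in that procedure (this is not the attached script, and negate_flag is a hypothetical helper name), turning each -O2 option into its negated form could look like this. The two "Failed to build" entries in the results are the ones carrying "=value", so the sketch assumes such options cannot be negated by prefixing and skips them.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: turn "-ffoo" into "-fno-foo".
// Returns std::nullopt when the flag is not a plain "-f" option:
// it takes a value ("-ffoo=bar") or is already in "-fno-" form.
inline std::optional<std::string> negate_flag(const std::string& flag) {
    const std::string prefix = "-f";
    if (flag.compare(0, prefix.size(), prefix) != 0)
        return std::nullopt;               // not an -f option at all
    if (flag.find('=') != std::string::npos)
        return std::nullopt;               // valued option, e.g. =stc
    if (flag.compare(0, 5, "-fno-") == 0)
        return std::nullopt;               // already negated
    return "-fno-" + flag.substr(prefix.size());
}
```

A build loop would then append each returned string to CXXFLAGS, one option per build, and record whether the test program runs.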
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #19 from Jeffrey Walton --- Created attachment 53427 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53427=edit Test script to build library at -O2 with -fno-xxx Test script to build library at -O2 with -fno-xxx
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #18 from Jeffrey Walton --- (In reply to Andrew Pinski from comment #17) > The other thing to try is -fstack-reuse=none. No joy with -fstack-reuse=none. The crash is still present.
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #15 from Jeffrey Walton --- It looks like -fno-strict-aliasing cleared the crash. This is bad because I thought we did not violate aliasing rules. Let me try to find it.
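For readers following along, a minimal sketch of the kind of construct that -fno-strict-aliasing tends to mask is shown below. It is a generic illustration, not code from Crypto++, and float_bits is a hypothetical name. The pointer-cast form appears only in a comment because it is undefined behavior; the memcpy form is the well-defined replacement. The sketch assumes float and std::uint32_t have the same size.

```cpp
#include <cstdint>
#include <cstring>

// Well-defined type punning: copy the object representation instead of
// reading a float object through a pointer to an unrelated type.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// The pattern the optimizer may miscompile under -fstrict-aliasing
// (undefined behavior, shown for comparison only):
//   std::uint32_t u = *reinterpret_cast<const std::uint32_t*>(&f);
```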
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #14 from Jeffrey Walton ---
(In reply to Jeffrey Walton from comment #13)
> (In reply to Andrew Pinski from comment #12)
> > Can you try -fno-reorder-blocks-and-partition adding to the options?
> > This would not be the first time this option caused issues with EH.
>
> No joy with -fno-reorder-blocks-and-partition . We still saw the crash with
> CXXFLAGS="-DNDEBUG -g2 -O3 -fno-reorder-blocks-and-partition".

I did notice that using -fno-reorder-functions results in "terminate called without an active exception". That's unusual because we have exception handlers in place.

$ ./cryptest.exe vv 51
Using seed: 1660007252

ECGDSA validation suite running...

passed    brainpoolP192r1 using SHA-1
passed    signature key validation
passed    signature and verification
passed    checking invalid signature
passed    brainpoolP320r1 using SHA-224
passed    signature key validation
passed    signature and verification
passed    checking invalid signature
passed    brainpoolP320r1 using SHA-256
passed    signature key validation
passed    signature and verification
passed    checking invalid signature
passed    brainpoolP512r1 using SHA-384
passed    signature key validation
passed    signature and verification
passed    checking invalid signature
passed    brainpoolP512r1 using SHA-512
passed    signature key validation
passed    signature and verification
passed    checking invalid signature
terminate called without an active exception
Aborted (core dumped)
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #13 from Jeffrey Walton --- (In reply to Andrew Pinski from comment #12) > Can you try -fno-reorder-blocks-and-partition adding to the options? > This would not be the first time this option caused issues with EH. No joy with -fno-reorder-blocks-and-partition . We still saw the crash with CXXFLAGS="-DNDEBUG -g2 -O3 -fno-reorder-blocks-and-partition".
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #10 from Jeffrey Walton --- I'm not sure if this is helpful, but Valgrind is showing invalid reads in the unwind gear: $ valgrind ./cryptest.exe vv 51 ==27339== Memcheck, a memory error detector ==27339== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. ==27339== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info ==27339== Command: ./cryptest.exe vv 51 ==27339== Using seed: 166617 ECGDSA validation suite running... ==27339== Invalid read of size 8 ==27339==at 0x4B8D673: _Unwind_Resume (unwind.inc:241) ==27339==by 0x433E55: ~vector (stl_vector.h:733) ==27339==by 0x433E55: ~OID (asn.h:267) ==27339==by 0x433E55: CryptoPP::Test::ValidateECGDSAStandard() [clone .cold] (validat9.cpp:370) ==27339==by 0x5409C3: CryptoPP::Test::ValidateECGDSA(bool) (validat9.cpp:663) ==27339==by 0x452476: CryptoPP::Test::Validate(int, bool) (test.cpp:995) ==27339==by 0x45ACA2: CryptoPP::Test::scoped_main(int, char**) (test.cpp:401) ==27339==by 0x4BB954F: (below main) (libc_start_call_main.h:58) ==27339== Address 0x4db3c00 is 0 bytes after a block of size 16 alloc'd ==27339==at 0x4847A83: memalign (vg_replace_malloc.c:1517) ==27339==by 0x5E4630: CryptoPP::AlignedAllocate(unsigned long) (allocate.cpp:49) ==27339==by 0x5D723C: allocate (secblock.h:215) ==27339==by 0x5D723C: SecBlock (secblock.h:767) ==27339==by 0x5D723C: CryptoPP::Integer::Integer() (integer.cpp:2967) ==27339==by 0x49AB74: DL_FixedBasePrecomputationImpl (eprecomp.h:133) ==27339==by 0x49AB74: CryptoPP::DL_PublicKeyImpl >::DL_PublicKeyImpl() (pubkey.h:1335) ==27339==by 0x540971: DL_PublicKey_ECGDSA (eccrypto.h:500) ==27339==by 0x540971: DL_ObjectImplBase (pubkey.h:1956) ==27339==by 0x540971: DL_ObjectImpl (pubkey.h:1996) ==27339==by 0x540971: DL_VerifierImpl (pubkey.h:2035) ==27339==by 0x540971: PK_FinalTemplate (pubkey.h:2209) ==27339==by 0x540971: CryptoPP::Test::ValidateECGDSAStandard() (validat9.cpp:340) ==27339==by 0x5409C3: 
CryptoPP::Test::ValidateECGDSA(bool) (validat9.cpp:663) ==27339==by 0x452476: CryptoPP::Test::Validate(int, bool) (test.cpp:995) ==27339==by 0x45ACA2: CryptoPP::Test::scoped_main(int, char**) (test.cpp:401) ==27339==by 0x4BB954F: (below main) (libc_start_call_main.h:58) ==27339== ==27339== Invalid read of size 8 ==27339==at 0x4B8CD14: _Unwind_RaiseException_Phase2 (unwind.inc:54) ==27339==by 0x4B8D6CC: _Unwind_Resume (unwind.inc:242) ==27339==by 0x433E55: ~vector (stl_vector.h:733) ==27339==by 0x433E55: ~OID (asn.h:267) ==27339==by 0x433E55: CryptoPP::Test::ValidateECGDSAStandard() [clone .cold] (validat9.cpp:370) ==27339==by 0x5409C3: CryptoPP::Test::ValidateECGDSA(bool) (validat9.cpp:663) ==27339==by 0x452476: CryptoPP::Test::Validate(int, bool) (test.cpp:995) ==27339==by 0x45ACA2: CryptoPP::Test::scoped_main(int, char**) (test.cpp:401) ==27339==by 0x4BB954F: (below main) (libc_start_call_main.h:58) ==27339== Address 0x4db3c08 is 8 bytes after a block of size 16 alloc'd ==27339==at 0x4847A83: memalign (vg_replace_malloc.c:1517) ==27339==by 0x5E4630: CryptoPP::AlignedAllocate(unsigned long) (allocate.cpp:49) ==27339==by 0x5D723C: allocate (secblock.h:215) ==27339==by 0x5D723C: SecBlock (secblock.h:767) ==27339==by 0x5D723C: CryptoPP::Integer::Integer() (integer.cpp:2967) ==27339==by 0x49AB74: DL_FixedBasePrecomputationImpl (eprecomp.h:133) ==27339==by 0x49AB74: CryptoPP::DL_PublicKeyImpl >::DL_PublicKeyImpl() (pubkey.h:1335) ==27339==by 0x540971: DL_PublicKey_ECGDSA (eccrypto.h:500) ==27339==by 0x540971: DL_ObjectImplBase (pubkey.h:1956) ==27339==by 0x540971: DL_ObjectImpl (pubkey.h:1996) ==27339==by 0x540971: DL_VerifierImpl (pubkey.h:2035) ==27339==by 0x540971: PK_FinalTemplate (pubkey.h:2209) ==27339==by 0x540971: CryptoPP::Test::ValidateECGDSAStandard() (validat9.cpp:340) ==27339==by 0x5409C3: CryptoPP::Test::ValidateECGDSA(bool) (validat9.cpp:663) ==27339==by 0x452476: CryptoPP::Test::Validate(int, bool) (test.cpp:995) ==27339==by 0x45ACA2: 
CryptoPP::Test::scoped_main(int, char**) (test.cpp:401) ==27339==by 0x4BB954F: (below main) (libc_start_call_main.h:58) ==27339== ==27339== Invalid read of size 8 ==27339==at 0x4B8D673: _Unwind_Resume (unwind.inc:241) ==27339==by 0x425BAB: CryptoPP::Test::scoped_main(int, char**) [clone .cold] (test.cpp:442) ==27339==by 0x4BB954F: (below main) (libc_start_call_main.h:58) ==27339== Address 0x4db3c00 is 0 bytes after a block of size 16 alloc'd ==27339==at 0x4847A83: memalign (vg_replace_malloc.c:1517) ==27339==by 0x5E4630: CryptoPP::AlignedAllocate(unsigned long) (allocate.cpp:49) ==27339==by 0x5D723C: allocate (secblock.h:215) ==27339==by 0x5D723C: SecBlock (secblock.h:767) ==27339==by 0x5D723C: CryptoPP::Integer::Integer() (integer.cpp:2967) ==27339==by
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #9 from Jeffrey Walton ---
(In reply to Andrew Pinski from comment #8)
> (In reply to Jeffrey Walton from comment #7)
>
> Try putting a breakpoint on the following functions:
> _Unwind_RaiseException
> _Unwind_ForcedUnwind
> _Unwind_Resume
> _Unwind_Resume_or_Rethrow
> _Unwind_DeleteException
>
> besides _Unwind_Resume which will be hit at least once since the backtrace
> shows it was hit, what is the backtrace for these breakpoints?

The only breakpoint that hits is _Unwind_Resume. The backtrace for _Unwind_Resume is:

ECGDSA validation suite running...

Breakpoint 3, _Unwind_Resume (exc=exc@entry=0x978460) at ../../../libgcc/unwind.inc:231
231 {
(gdb) n
236 uw_init_context (&this_context);
(gdb)
237 cur_context = this_context;
(gdb) p this_context
$2 = {reg = {0x7fffce88, 0x7fffce90, 0x0, 0x7fffce98, 0x0, 0x0, 0x7fffcec0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffcea0, 0x7fffcea8, 0x7fffceb0, 0x7fffceb8, 0x7fffcec8, 0x0}, cfa = 0x7fffced0, ra = 0x433e56 , lsda = 0x0, bases = {tbase = 0x0, dbase = 0x0, func = 0x77c965a0 <_Unwind_Resume>}, flags = 4611686018427387904, version = 0, args_size = 0, by_value = '\000' }

4611686018427387904 is 0x4000000000000000.

(gdb) bt full #0 _Unwind_Resume (exc=exc@entry=0x978460) at ../../../libgcc/unwind.inc:246 this_context = {reg = {0x7fffce88, 0x7fffce90, 0x0, 0x7fffce98, 0x0, 0x0, 0x7fffcec0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffcea0, 0x7fffcea8, 0x7fffceb0, 0x7fffceb8, 0x7fffcec8, 0x0}, cfa = 0x7fffced0, ra = 0x433e56 , lsda = 0x0, bases = {tbase = 0x0, dbase = 0x0, func = 0x77c965a0 <_Unwind_Resume>}, flags = 4611686018427387904, version = 0, args_size = 0, by_value = '\000' } cur_context = {reg = {0x7fffce88, 0x7fffce90, 0x0, 0x7fffd608, 0x0, 0x0, 0x7fffd610, 0x0, 0x0, 0x0, 0x0, 0x0, 0x7fffd618, 0x7fffd620, 0x7fffd628, 0x7fffd630, 0x7fffd638, 0x0}, cfa = 0x7fffd640, ra = 0x45af7c , lsda = 0x8929a8, bases = {tbase = 0x0, dbase = 0x0, func = 0x4597f0 }, flags = 4611686018427387904, version = 0, args_size = 0, by_value = '\000' } code = _URC_INSTALL_CONTEXT frames = 4 #1 0x00433e56 in std::vector >::~vector (this=, __in_chrg=) --Type for more, q to quit, c to continue without paging--c at /usr/include/c++/12/bits/stl_vector.h:733 No locals. #2 CryptoPP::OID::~OID (this=, __in_chrg=) at /home/jwalton/cryptopp/asn.h:267 No locals. 
#3 CryptoPP::Test::ValidateECGDSAStandard () at validat9.cpp:370 e = msg = len = oid = {_vptr.OID = 0x8c4590 , m_values = std::vector of length 10, capacity 10 = {2421, 0, 2890278822, 2134504544, 3, 2, 8, 1, 1, 3}} r = maxLength = params = {, CryptoPP::DL_FixedBasePrecomputationImpl, CryptoPP::DL_GroupParameters >> = {> = { = { = { = { = {_vptr.NameValuePairs = 0x926060 +112>}, }, }, }, m_validationLevel = 0}, m_groupPrecomputation = {> = {_vptr.DL_GroupPrecomputation = 0x8c5850 +16>}, m_ec = {> = {m_p = 0x977750}, }, m_ecOriginal = {> = {m_p = 0x977d00}, }}, m_gpc = {> = {_vptr.DL_FixedBasePrecomputation = 0x925a60 +16>}, m_base = {_vptr.ECPPoint = 0x8c5810 , x = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 4, m_ptr = 0x977ca0}, sign = CryptoPP::Integer::POSITIVE}, y = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 4, m_ptr = 0x977cd0}, sign = CryptoPP::Integer::POSITIVE}, identity = false}, m_windowSize = 0, m_exponentBase = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 2, m_ptr = 0x9752b0}, sign = CryptoPP::Integer::POSITIVE}, m_bases = std::vector of length 1, capacity 1 = {{_vptr.ECPPoint = 0x8c5810 , x = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 4, m_ptr = 0x977f60}, sign = CryptoPP::Integer::POSITIVE}, y = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 4, m_ptr = 0x977f90}, sign = CryptoPP::Integer::POSITIVE}, identity = false, m_oid = {_vptr.OID = 0x8c4590 , m_values = std::vector of length 10, capacity 10 = {1, 3, 36, 3, 3, 2, 8, 1, 1, 3}}, m_n = { = {}, = {_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 4, m_ptr = 0x978070}, sign = CryptoPP::Integer::POSITIVE}, m_k = { = {}, = 
{_vptr.ASN1Object = 0x9170f8 }, reg = {m_alloc = {> = {}, }, m_mark = 2305843009213693951, m_size = 2, m_ptr = 0x975840}, sign = CryptoPP::Integer::POSITIVE}, m_compress = false, m_encodeAsOID = true}
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #7 from Jeffrey Walton ---
(In reply to Andrew Pinski from comment #6)
> >And the program does not take the exception path. Instead it segfaults.
>
> If I read the backtrace correctly, it is trying to resume an unwind because
> it didn't find a catch that would hit in main but the following code hits
> the assert while unwinding:
>   /* Choose between continuing to process _Unwind_RaiseException
>      or _Unwind_ForcedUnwind. */
>   if (exc->private_1 == 0)
>     code = _Unwind_RaiseException_Phase2 (exc, &cur_context, &frames);
>   else
>     code = _Unwind_ForcedUnwind_Phase2 (exc, &cur_context, &frames);
>
>   gcc_assert (code == _URC_INSTALL_CONTEXT);
>
> And then abort calls raise which then segfaults.

Thanks again Andrew.

We have exception handlers for both CryptoPP::Exception& and std::exception& starting for main() around https://github.com/weidai11/cryptopp/blob/master/test.cpp#L442 . However, we should not hit either of them. When they trigger there's a problem that needs to be fixed.

The code in question tests for good and bad digital signatures. It should catch a SignatureVerificationFailed exception on occasion and this is expected. But it should catch closer to the actual test (and not in main or Test::scoped_main). Not to mention 'catch throw' is not catching anything under gdb.

Is there something we should be doing differently?
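To make the intended control flow concrete, here is a small sketch of "catch closer to the actual test". All names are hypothetical stand-ins: VerificationFailed plays the role of the library's SignatureVerificationFailed, and verify_or_throw stands in for the real verifier. It is not the code from validat9.cpp. The point is that a bad signature is handled by the local handler, so nothing should propagate up to the handlers in main.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the library's SignatureVerificationFailed.
struct VerificationFailed : std::runtime_error {
    VerificationFailed()
        : std::runtime_error("signature verification failed") {}
};

// Hypothetical verifier: throws when the signature does not verify.
inline void verify_or_throw(bool signature_ok) {
    if (!signature_ok)
        throw VerificationFailed();
}

// The test itself catches the expected exception. It returns true when
// the bad signature was rejected, false if verification wrongly passed.
inline bool rejects_bad_signature() {
    try {
        verify_or_throw(false);   // deliberately bad signature
    } catch (const VerificationFailed&) {
        return true;              // expected path, handled locally
    }
    return false;                 // no exception: the test fails
}
```

With this shape, an exception reaching an outer handler would indicate a different exception type or a problem in unwinding, rather than the expected verification failure.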
[Bug rtl-optimization/106568] -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568 --- Comment #5 from Jeffrey Walton ---
(In reply to Andrew Pinski from comment #3)
> Though there might be an EH issue but there has not been an EH issue for a
> long time .

This is an interesting observation. The stack trace shows frame #0 is in pthread_kill_thread (or similar). But up in our program, around frame #4 or #5, gdb is identifying the line with a catch (CryptoPP::Exception& ). CryptoPP::Exception is the library's base class exception, so it should catch everything the library throws.

This is the line where gdb reports the fault (https://github.com/weidai11/cryptopp/blob/master/test.cpp#L442) :

catch(const Exception &e) // 442
{
    std::cout << "\nException caught: " << e.what() << std::endl;
    return -1;
}

which makes no sense to me. And the program does not take the exception path. Instead it segfaults.
[Bug c++/106568] New: -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106568

Bug ID: 106568
Summary: -freorder-blocks-algorithm appears to causes a crash in stable code, no way to disable it
Product: gcc
Version: 12.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: noloader at gmail dot com
Target Milestone: ---

Hi Everyone,

This is going to be a shitty bug report because we don't have a reproducer. We believe we have it narrowed down to a particular optimization, however.

Debian Unstable, Fedora 37 and Gentoo 17.1 are reporting crashes in the Crypto++ test program.[1,2] The distros use GCC 12. We found a particular function crashes without explanation (and a garbage backtrace) at -O2 and -O3. The function is Ok at -O0, -O1 and -Os.

(The code has been fairly stable for years. It is -Wall, -Wextra, Asan, UBsan and Valgrind clean. We would be surprised to learn we have undetected UB. But we don't rule it out).

According to the GCC Optimization docs, the differences between -Os (no crash) and -O2 (crash) are:[3]

-falign-functions
-falign-jumps
-falign-labels
-falign-loops
-fprefetch-loop-arrays
-freorder-blocks-algorithm=stc

We used CFLAGS and CXXFLAGS with -Os plus the listed opts less -freorder-blocks-algorithm=stc. The crash went away. We are fairly certain the problem is with the -freorder-blocks-algorithm optimization.

The problem we are now having is, we don't know how to disable it. The following fail to compile:

-fno-reorder-blocks-algorithm
-freorder-blocks-algorithm=none
-freorder-blocks-algorithm=

So, we believe we have a bad option in -freorder-blocks-algorithm, but we can't disable it for typical opt settings used by distros. The typical opt setting is -O2 or -O3.

I sincerely apologize for not having a reproducer. I'm not sure where to begin when it comes to -freorder-blocks-algorithm. Please advise.

[1] https://github.com/weidai11/cryptopp/issues/1134 [2] https://github.com/weidai11/cryptopp/issues/1141 [3] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
[Bug target/104455] Cannot select -march=armv7-a using GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104455

Jeffrey Walton changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |FIXED

--- Comment #4 from Jeffrey Walton ---
(In reply to Richard Earnshaw from comment #3)
> Your compiler is configured to pick the fp architecture up from the -march
> (or -mcpu) option (it's using an 'auto' fpu). Your ABI requires an FPU, so
> you need to specify that as part of the -march command. Use
> -march=armv7-a+fp or something like -march=armv7-a+simd. The various
> options are described in the manual.

Thanks again Richard.

So stepping back to 10,000 feet, we now need to specify options and ISAs we are not using. That seems like a bug to me. I'm not sure I would consider this fixed.

Where is it going to stop? How many unused options that I am not aware of will I need to specify?

I think either GCC or Debian needs to fix this. This could be a GCC bug because GCC apparently knows there's an FP unit but it chooses to ignore it. Instead it wants me to say it again. This could be a Debian bug because they need to completely (not partially) configure things.
[Bug target/104455] Cannot select -march=armv7-a using GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104455 --- Comment #2 from Jeffrey Walton --- (In reply to Richard Earnshaw from comment #1) > What's the configuration of the compiler? Eg, the output of gcc -v Thanks Richard. I set-up a Debian Qemu/Chroot for armhf. I can now duplicate the problem. # cat test.S .globl return_magic .type return_magic,%function return_magic: movw r0, #0x1234 movt r0, #0x5678 bl 0 # g++ -g2 -O3 -Wa,--noexecstack -march=armv7-a test.S -c cc1: error: ‘-mfloat-abi=hard’: selected architecture lacks an FPU # gcc --version -v Using built-in specs. COLLECT_AS_OPTIONS='--version' COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/lib/gcc/arm-linux-gnueabihf/11/lto-wrapper Target: arm-linux-gnueabihf Configured with: ../src/configure -v --with-pkgversion='Debian 11.2.0-16' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=arm-linux-gnueabihf- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-libitm --disable-libquadmath --disable-libquadmath-support --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-sjlj-exceptions --with-arch=armv7-a+fp --with-float=hard --with-mode=thumb --disable-werror --enable-checking=release --build=arm-linux-gnueabihf --host=arm-linux-gnueabihf --target=arm-linux-gnueabihf Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 11.2.0 (Debian 11.2.0-16) gcc (Debian 11.2.0-16) 11.2.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. 
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. COLLECT_GCC_OPTIONS='--version' '-v' '-mfloat-abi=hard' '-mtls-dialect=gnu' '-mthumb' '-mlibarch=armv7-a+fp' '-march=armv7-a+fp' '-dumpdir' 'a-' /usr/lib/gcc/arm-linux-gnueabihf/11/cc1 -quiet -v -imultilib . -imultiarch arm-linux-gnueabihf help-dummy -quiet -dumpdir a- -dumpbase help-dummy -mfloat-abi=hard -mtls-dialect=gnu -mthumb -mlibarch=armv7-a+fp -march=armv7-a+fp -version --version -o /tmp/ccdbsVZx.s GNU C17 (Debian 11.2.0-16) version 11.2.0 (arm-linux-gnueabihf) compiled by GNU C version 11.2.0, GMP version 6.2.1, MPFR version 4.1.0, MPC version 1.2.1, isl version isl-0.24-GMP GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 COLLECT_GCC_OPTIONS='--version' '-v' '-mfloat-abi=hard' '-mtls-dialect=gnu' '-mthumb' '-mlibarch=armv7-a+fp' '-march=armv7-a+fp' '-dumpdir' 'a-' as -v -march=armv7-a -mfloat-abi=hard -meabi=5 --version -o /tmp/ccYLSQQS.o /tmp/ccdbsVZx.s GNU assembler version 2.37.90 (arm-linux-gnueabihf) using BFD version (GNU Binutils for Debian) 2.37.90.20220207 GNU assembler (GNU Binutils for Debian) 2.37.90.20220207 Copyright (C) 2022 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or later. This program has absolutely no warranty. This assembler was configured for a target of `arm-linux-gnueabihf'. 
COMPILER_PATH=/usr/lib/gcc/arm-linux-gnueabihf/11/:/usr/lib/gcc/arm-linux-gnueabihf/11/:/usr/lib/gcc/arm-linux-gnueabihf/:/usr/lib/gcc/arm-linux-gnueabihf/11/:/usr/lib/gcc/arm-linux-gnueabihf/ LIBRARY_PATH=/usr/lib/gcc/arm-linux-gnueabihf/11/:/usr/lib/gcc/arm-linux-gnueabihf/11/../../../arm-linux-gnueabihf/:/usr/lib/gcc/arm-linux-gnueabihf/11/../../../:/lib/arm-linux-gnueabihf/:/lib/:/usr/lib/arm-linux-gnueabihf/:/usr/lib/ COLLECT_GCC_OPTIONS='--version' '-v' '-mfloat-abi=hard' '-mtls-dialect=gnu' '-mthumb' '-mlibarch=armv7-a+fp' '-march=armv7-a+fp' '-dumpdir' 'a.' /usr/lib/gcc/arm-linux-gnueabihf/11/collect2 -plugin /usr/lib/gcc/arm-linux-gnueabihf/11/liblto_plugin.so -plugin-opt=/usr/lib/gcc/arm-linux-gnueabihf/11/lto-wrapper -plugin-opt=-fresolution=/tmp/cc4oCuRH.res -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lc -plugin-opt=-pass-through=-lgcc -plugin-opt=-pass-through=-lgcc_s --build-id --eh-frame-hdr -dynamic-linker /lib/ld-linux-armhf.so.3 -X --hash-style=gnu --as-needed -m armelf_linux_eabi -pie --version /usr/lib/gcc/arm-linux-gnueabihf/11/../../../arm-linux-gnueabihf/Scrt1.o /usr/lib/gcc/arm-linux-gnueabihf/11/../../../arm-linux-gnueabihf/crti.o /usr/lib/gcc/arm-linux-gnueabihf/11/crtbeginS.o -L/usr/lib/gcc/arm-linux-gnueabihf/11 -L/usr/lib/gcc/arm-linux-gnueabihf/11/../../../arm-linux-gnueabihf -L/usr/lib/gcc/arm-linux-gnueabihf/11/../../.. -L/lib/arm-linux-gnueabihf
[Bug c++/104455] New: Cannot select -march=armv7-a using GCC 11
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104455 Bug ID: 104455 Summary: Cannot select -march=armv7-a using GCC 11 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: noloader at gmail dot com Target Milestone: --- Hi Everyone, We are trying to fix a compile problem Debian encountered on Sid [1,2]. The machine is armhf, but I don't have access to it. Sid is using GCC 11, but I am not sure which version. The problem does not exist in GCC 10 and below. I think this is the test case. Below, test.S is mostly armv4 but has some armv7 instructions. I used movw and movt as an example since they are armv7. $ cat test.S .globl return_magic .type return_magic,%function return_magic: movw r0, #0x1234 movt r0, #0x5678 bl 0 $ g++ -g2 -O3 -Wa,--noexecstack -march=armv7-a test.S -c When Debian compiles with GCC 11, it results in: cc1: error: ‘-mfloat-abi=hard’: selected architecture lacks an FPU test.S does not use floating point or neon instructions. This source file does not need a fpu. We don't care what the platform default is because we don't use it. But in this case, the machine is armhf so GCC should know it should use hard floats. When I attempt to add -mfpu=auto per [3], it results in another error: cc1: sorry, unimplemented: -mfpu=auto not currently supported without an explicit CPU. cc1: error: -mfloat-abi=hard: selected processor lacks an FPU test.S does not use floating point or neon instructions. This source file does not need a fpu. I also tried -mfpu=none, but it results in: g++: error: unrecognized argument in option ‘-mfpu=none’ g++: note: valid arguments to ‘-mfpu=’ are: auto crypto-neon-fp-armv8 fp-armv8 fpv4-sp-d16 fpv5-d16 fpv5-sp-d16 neon neon-fp-armv8 neon-fp16 neon-vfpv3 neon-vfpv4 vfp vfp3 vfpv2 vfpv3 vfpv3-d16 vfpv3-d16-fp16 vfpv3-fp16 vfpv3xd vfpv3xd-fp16 vfpv4 vfpv4-d16; did you mean ‘neon’? 
In the past we avoided specifying a -mfpu option because the code [used to] work with GCC 4 and above, Clang 3 and above, armel, armhf and Android with soft floats. The compiler always knew what float abi it should use. I would like to drop -march=armv7-a, but GCC requires me to use an ISA option before I use instructions from the ISA. I've never liked the rule, but it is what it is. (Microsoft's C/C++ compiler got this right). [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1001995 [2] https://github.com/weidai11/cryptopp/issues/1094 [3] https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html
[Bug driver/103863] We need a warning for loss of no-exec stacks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103863 --- Comment #2 from Jeffrey Walton --- (In reply to Andrew Pinski from comment #1) > I think the warning needs to be implemented in the linker rather than in GCC > because the linker is what decides if there are executable stacks are needed > or not. Thanks Andrew. I thought about a linker warning, too. Do they have to be mutually exclusive (warning in compiler vs warning in linker)? I also asked the Binutils folks for some feedback: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103863.
[Bug c/103863] New: We need a warning for loss of no-exec stacks
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103863 Bug ID: 103863 Summary: We need a warning for loss of no-exec stacks Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: noloader at gmail dot com Target Milestone: --- Hello, This is a feature request. For targets that support no-exec stacks, we need a warning when GCC generates code or drives the linker with loss of no-exec stacks. The warning would be beneficial for most builds nowadays since no-exec stacks are part of most distro hardening. For example, Debian and Fedora both incorporate it into their build system; and special steps must be taken to avoid no-exec stacks out of the box. The warning would also be beneficial in cases like https://bugzilla.redhat.com/show_bug.cgi?id=2035802. In the 2035802 bug, an ARM machine failed to boot because libz contained executable stacks even though they were not needed. A specific warning for no-exec stacks is slightly different than -Wtrampolines. While trampolines resulted in executable stacks in the past, that may not hold in the future as lambdas are added to the language. And trampolines are not a necessary precondition to get in an insecure state like the 2035802 bug shows. It is most unfortunate that ASM files need special handling because the object files are marked with executable stacks by default. Maybe that should be another bug report to change default behavior since the strategy nowadays is: no-exec stacks by default, do something special for executable stacks. Thanks in advance.
[Bug target/96168] GCC support for Apple Silicon (Arm64) on macOS requested
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96168

Jeffrey Walton changed:

What |Removed |Added
CC | |noloader at gmail dot com

--- Comment #12 from Jeffrey Walton ---
This may be helpful if someone starts on a port of GCC to the M1: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms

The GCC Compile Farm has an M1. I also have an M1 for testing. My M1 has Command Line Tools (CLT), but lacks Xcode. (I don't have an Apple account anymore). My M1 has Autotools but that's about it. My M1 gives some projects some problems when they assume Xcode is present.

If you want an account on my box for testing, then send your authorized_keys to noloader, gmail account. If you break my M1 it is no big deal. I'll just reinstall the OS.
[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #6 from Jeffrey Walton --- Add 9.3 to the known-to-fail list: $ gcc --version gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 Copyright (C) 2019 Free Software Foundation, Inc.
[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 --- Comment #5 from Jeffrey Walton --- I think we are seeing this bug in the field. We are catching lots of failed self tests as we test on multiple platforms, including Ubuntu 14 ERS and Ubuntu 16 LTS. The problem makes GCC 4.8.4 through 7.5 practically useless for AVX and AVX2. I don't see the problem with GCC 9.3. Maybe the problem got fixed somewhere along the way?
[Bug c++/53431] C++ preprocessor ignores #pragma GCC diagnostic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431 --- Comment #40 from Jeffrey Walton --- Still a problem in 2021.
[Bug c++/98416] POWER8: SIGILL handler does not restart properly after signal using GCC 10.2.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98416 --- Comment #3 from Jeffrey Walton --- (In reply to Andrew Pinski from comment #2) > This is invalid. > The instruction which is failing is: > 9c: f0 00 02 d0 xxspltib vs0,0 > > Which is only valid in power9 and above. > You need to mark CPU_ProbePower9 not to be compiled with -mcpu=power9, by > using the target attribute. I have to use it because I am using the POWER9 ISA.
[Bug c++/98416] POWER8: SIGILL handler does not restart properly after signal using GCC 10.2.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98416 --- Comment #1 from Jeffrey Walton --- Created attachment 49831 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49831&action=edit Disassembly of ppc_power9.o Created with 'objdump --disassemble ppc_power9.o | c++filt > ppc_power9.disass'.
[Bug c++/98416] New: POWER8: SIGILL handler does not restart properly after signal using GCC 10.2.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98416

Bug ID: 98416
Summary: POWER8: SIGILL handler does not restart properly after signal using GCC 10.2.1
Product: gcc
Version: 10.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: noloader at gmail dot com
Target Milestone: ---

Created attachment 49827 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49827&action=edit
preprocessed ppc_power9.cpp

We are testing on GCC203 on the compile farm. GCC203 is a Debian POWER8 machine with GCC 10.2.1.

The following code executes 'darn r3, 0;', which is a POWER9 instruction. It causes a SIGILL, which is expected. However, when restarted from the setjmp the program SIGILLs again. The second SIGILL is not expected.

    // https://github.com/weidai11/cryptopp/blob/master/ppc_power9.cpp#L42
    // With the extra cruft removed...
    bool CPU_ProbePower9()
    {
        // longjmp and clobber warnings. Volatile is required.
        volatile int result = true;

        volatile SigHandler oldHandler = signal(SIGILL, SigIllHandler);
        if (oldHandler == SIG_ERR)
            return false;

        volatile sigset_t oldMask;
        if (sigprocmask(0, NULLPTR, (sigset_t*)&oldMask))
        {
            signal(SIGILL, oldHandler);
            return false;
        }

        if (setjmp(s_jmpSIGILL))
            result = false;    /* <= SIGILL here! */
        else
        {
            // This is "darn r3, 0". We had to move away from the intrinsic
            // because Clang and IBM XL C/C++ do not support the intrinsic.
    #if __BIG_ENDIAN__
            __asm__ __volatile__ (".byte 0x7c, 0x60, 0x05, 0xe6 \n" : : : "r3");
    #else
            __asm__ __volatile__ (".byte 0xe6, 0x05, 0x60, 0x7c \n" : : : "r3");
    #endif
            result = true;
        }

        sigprocmask(SIG_SETMASK, (sigset_t*)&oldMask, NULLPTR);
        signal(SIGILL, oldHandler);
        return result;
    }

Here's what it looks like under the debugger:

    (gdb) r v
    Starting program: /home/noloader/cryptopp/cryptest.exe v
    ...

    ### This one is expected. It is a feature probe. ###
    Program received signal SIGILL, Illegal instruction.
    CryptoPP::CPU_ProbePower9 () at ppc_power9.cpp:70
    70      __asm__ __volatile__ (".byte 0x7c, 0x60, 0x05, 0xe6 \n" : : : "r3");
    (gdb) n
    CryptoPP::SigIllHandler () at ppc_power9.cpp:35
    35      longjmp(s_jmpSIGILL, 1);
    (gdb) n

    ### This one is not expected. ###
    Program received signal SIGILL, Illegal instruction.
    CryptoPP::CPU_ProbePower9 () at ppc_power9.cpp:64
    64      result = false;
    (gdb) n

    Program terminated with signal SIGILL, Illegal instruction.

=====

> the complete command line that triggers the bug;
> the compiler output (error messages, warnings, etc.);

The compiler commands invoked by make immediately before and after are also shown. The program is clean with -Wall, UBsan, Asan, etc.

    g++ -DNDEBUG -g2 -O3 -fPIC -pthread -pipe -mcpu=power8 -c ppc_power8.cpp
    g++ -DNDEBUG -g2 -O3 -fPIC -pthread -pipe -mcpu=power9 -c ppc_power9.cpp
    g++ -DNDEBUG -g2 -O3 -fPIC -pthread -pipe -maltivec -c ppc_simd.cpp

=====

> the preprocessed file (*.i*) that triggers the bug, generated by adding
> -save-temps to the complete compilation command, or, in the case of a bug
> report for the GNAT front end, a complete set of source files (see below).

Attached. The command used was:

    g++ -save-temps -DNDEBUG -g2 -O3 -fPIC -pthread -pipe -mcpu=power9 -c ppc_power9.cpp

=====

> the exact version of GCC;
> the system type;
> the options given when GCC was configured/built;

    $ gcc --version
    gcc (Debian 10.2.1-1) 10.2.1 20201207

    $ lsb_release -a
    Distributor ID: Debian
    Description:    Debian GNU/Linux bullseye/sid
    Release:        unstable
    Codename:       sid

    $ gcc -v 2>&1 | fold -w 80
    Using built-in specs.
    COLLECT_GCC=gcc
    COLLECT_LTO_WRAPPER=/usr/lib/gcc/powerpc64-linux-gnu/10/lto-wrapper
    Target: powerpc64-linux-gnu
    Configured with: ../src/configure -v --with-pkgversion='Debian 10.2.1-1' --with-
    bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,
    d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suf
    fix=-10 --program-prefix=powerpc64-linux-gnu- --enable-shared --enable-linker-bu
    ild-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix -
    -libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-l
    ibstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --en
    able-gnu-unique-object --disable-libquadmath --disable-libquadmath-support --ena
    ble-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=r
    elease --with-target-system-zlib=auto --enable-objc-gc=auto --enable-secureplt -
    -disable-softfloat --enable-targets=powerpc64-linux,powerpc-linux --enable-multi
    arch --disable-werror --with-long-double-128 --enable-multilib --enable-checking
    =release
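The advice in comment #2 (keep the probe function itself out of `-mcpu=power9` code generation) might be sketched as follows. This is an illustration under stated assumptions, not the Crypto++ source: `PROBE_BASELINE`, `ProbeInstruction`, `FaultingStandIn`, and `HarmlessStandIn` are hypothetical names. It assumes GCC's PowerPC `target("cpu=...")` function attribute; on other compilers and architectures the macro expands to nothing. The stand-ins use `raise(SIGILL)` so the sketch runs on any POSIX system, where a real probe would emit the instruction bytes. It also swaps the report's `signal`/`setjmp`/`sigprocmask` sequence for `sigaction` with `sigsetjmp`/`siglongjmp`, which restores the signal mask as part of the jump.

```cpp
#include <setjmp.h>
#include <signal.h>

// Hypothetical macro: pin one function to the POWER8 ISA even when the
// translation unit is built with -mcpu=power9 (assumes GCC on PowerPC).
#if defined(__GNUC__) && !defined(__clang__) && defined(__powerpc64__)
# define PROBE_BASELINE __attribute__((target("cpu=power8")))
#else
# define PROBE_BASELINE
#endif

static sigjmp_buf s_probeJmp;

// SIGILL handler: jump back to the probe and restore the saved mask.
extern "C" void ProbeSigIll(int) { siglongjmp(s_probeJmp, 1); }

// Returns true if tryInstruction() ran without raising SIGILL.
// With PROBE_BASELINE in effect, the code on the recovery path
// (after the jump) is limited to baseline instructions.
PROBE_BASELINE bool ProbeInstruction(void (*tryInstruction)())
{
    struct sigaction sa = {};
    struct sigaction old;
    sa.sa_handler = ProbeSigIll;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return false;

    volatile bool result = false;
    if (sigsetjmp(s_probeJmp, 1) == 0) {   // 1: save the signal mask
        tryInstruction();                  // may raise SIGILL
        result = true;                     // reached only if it did not
    }

    sigaction(SIGILL, &old, nullptr);      // put the old handler back
    return result;
}

// Stand-ins for "instruction not supported" and "instruction supported".
void FaultingStandIn() { raise(SIGILL); }
void HarmlessStandIn() {}
```

Calling `ProbeInstruction(FaultingStandIn)` repeatedly should keep returning `false`, because the saved signal mask is restored on each jump. On POWER8 the design point is that nothing on the recovery path may depend on POWER9 instructions, which is what the second SIGILL at `result = false` in the report indicates.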
[Bug middle-end/93644] [10/11 Regression] spurious -Wreturn-local-addr with PHI of PHI
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93644 --- Comment #13 from Jeffrey Walton --- On Wed, Dec 16, 2020 at 9:05 PM eggert at cs dot ucla.edu wrote: > ... > (B) there's no way to shut off the false alarm, not even with '# pragma GCC > diagnostic ignored "-Wreturn-local-addr"'. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53431
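For reference, the suppression pattern that the quoted comment and bug 53431 refer to usually looks like the sketch below. `PickBuffer` is a hypothetical function that only gives the pragmas something to wrap, and the guard assumes GCC, since other compilers may not know this warning name. As the quoted report says, GCC may still emit the warning in the affected cases.

```cpp
// Save the current diagnostic state, then ask GCC to ignore one warning.
#if defined(__GNUC__) && !defined(__clang__)
# pragma GCC diagnostic push
# pragma GCC diagnostic ignored "-Wreturn-local-addr"
#endif

// Hypothetical example function: returns one of the caller's pointers,
// never the address of a local object.
const char* PickBuffer(const char* primary, const char* fallback)
{
    return primary != nullptr ? primary : fallback;
}

// Restore the diagnostic state saved by the push above.
#if defined(__GNUC__) && !defined(__clang__)
# pragma GCC diagnostic pop
#endif
```

The push and pop keep the suppression local to the wrapped code, so the rest of the translation unit still gets the warning.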