[PATCH] [i386] Optimize __builtin_shuffle when it's used to zero the upper bits of the dest. [PR target/94680]
Hi:
  If the second operand of __builtin_shuffle is a const vector 0 and the
mask has a specific form, it can be optimized to movq/vmovps.

i.e.

foo128:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vmovlhps        %xmm1, %xmm0, %xmm0
+       vmovq   %xmm0, %xmm0

foo256:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vshuff32x4      $0, %ymm1, %ymm0, %ymm0
+       vmovaps %xmm0, %xmm0

foo512:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vshuff32x4      $68, %zmm1, %zmm0, %zmm0
+       vmovaps %ymm0, %ymm0

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?

gcc/ChangeLog:

        PR target/94680
        * config/i386/sse.md (ssedoublevecmode): Add attribute for
        V64QI/V32HI/V16SI/V4DI.
        (ssehalfvecmode): Add attribute for V2DI/V2DF.
        (*vec_concatv4si_0): Extend to VI124_128.
        (*vec_concat_0): New pre-reload splitter.
        * config/i386/predicates.md (movq_parallel): New predicate.

gcc/testsuite/ChangeLog:

        PR target/94680
        * gcc.target/i386/avx-pr94680.c: New test.
        * gcc.target/i386/avx512f-pr94680.c: New test.
        * gcc.target/i386/sse2-pr94680.c: New test.

--
BR,
Hongtao

From eec5469cdeecf0e6650e9d2963dea4117919c5d2 Mon Sep 17 00:00:00 2001
From: liuhongt
Date: Thu, 22 Apr 2021 15:33:16 +0800
Subject: [PATCH] [i386] Optimize __builtin_shuffle when it's used to zero the
 upper bits of the dest. [PR target/94680]

If the second operand of __builtin_shuffle is a const vector 0 and the
mask has a specific form, it can be optimized to movq/vmovps.

i.e.

foo128:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vmovlhps        %xmm1, %xmm0, %xmm0
+       vmovq   %xmm0, %xmm0

foo256:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vshuff32x4      $0, %ymm1, %ymm0, %ymm0
+       vmovaps %xmm0, %xmm0

foo512:
-       vxorps  %xmm1, %xmm1, %xmm1
-       vshuff32x4      $68, %zmm1, %zmm0, %zmm0
+       vmovaps %ymm0, %ymm0

gcc/ChangeLog:

        PR target/94680
        * config/i386/sse.md (ssedoublevecmode): Add attribute for
        V64QI/V32HI/V16SI/V4DI.
        (ssehalfvecmode): Add attribute for V2DI/V2DF.
        (*vec_concatv4si_0): Extend to VI124_128.
        (*vec_concat_0): New pre-reload splitter.
        * config/i386/predicates.md (movq_parallel): New predicate.

gcc/testsuite/ChangeLog:

        PR target/94680
        * gcc.target/i386/avx-pr94680.c: New test.
        * gcc.target/i386/avx512f-pr94680.c: New test.
        * gcc.target/i386/sse2-pr94680.c: New test.
---
 gcc/config/i386/predicates.md                 | 33
 gcc/config/i386/sse.md                        | 37 +++--
 gcc/testsuite/gcc.target/i386/avx-pr94680.c   | 59 ++
 .../gcc.target/i386/avx512f-pr94680.c         | 78 +++
 gcc/testsuite/gcc.target/i386/sse2-pr94680.c  | 51
 5 files changed, 250 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx-pr94680.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-pr94680.c
 create mode 100644 gcc/testsuite/gcc.target/i386/sse2-pr94680.c

diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index b1df8548af6..4b706003ed8 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -1524,6 +1524,39 @@ (define_predicate "misaligned_operand"
   (and (match_code "mem")
        (match_test "MEM_ALIGN (op) < GET_MODE_BITSIZE (mode)")))
 
+;; Return true if OP is a parallel for an mov{d,q,dqa,ps,pd} vec_select,
+;; where one of the two operands of the vec_concat is const0_operand.
+(define_predicate "movq_parallel"
+  (match_code "parallel")
+{
+  unsigned nelt = XVECLEN (op, 0);
+  unsigned nelt2 = nelt >> 1;
+  unsigned i;
+
+  if (nelt < 2)
+    return false;
+
+  /* Validate that all of the elements are constants,
+     lower halves of permute are lower halves of the first operand,
+     upper halves of permute come from any of the second operand.  */
+  for (i = 0; i < nelt; ++i)
+    {
+      rtx er = XVECEXP (op, 0, i);
+      unsigned HOST_WIDE_INT ei;
+
+      if (!CONST_INT_P (er))
+        return 0;
+      ei = INTVAL (er);
+      if (i < nelt2 && ei != i)
+        return 0;
+      if (i >= nelt2
+          && (ei < nelt || ei >= nelt<<1))
+        return 0;
+    }
+
+  return 1;
+})
+
 ;; Return true if OP is a vzeroall operation, known to be a PARALLEL.
 (define_predicate "vzeroall_operation"
   (match_code "parallel")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9d3728d1cb0..b55636a3e12 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -812,19 +812,22 @@ (define_mode_attr sseintvecmodelower
 
 ;; Mapping of vector modes to a vector mode of double size
 (define_mode_attr ssedoublevecmode
-  [(V32QI "V64QI") (V16HI "V32HI") (V8SI "V16SI") (V4DI "V8DI")
+  [(V64QI "V128QI") (V32HI "V64HI") (V16SI "V32SI") (V8DI "V16DI")
+   (V32QI "V64QI") (V16HI "V32HI") (V8SI "V16SI") (V4DI "V8DI")
    (V16QI "V32QI") (V8HI "V16HI") (V4SI "V8SI") (V2DI "V4DI")
+   (V16SF "V32SF") (V8DF "V16DF")
    (V8SF "V16SF") (V4DF "V8DF")
    (V4SF "V8SF") (V2DF "V4DF")])
 
 ;;
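
The index test performed by the movq_parallel predicate above can be tried outside the compiler. The following is a minimal sketch, assuming the selector is given as a plain vector of integers; the function name is hypothetical and nothing here is GCC code. It accepts a selector only when the lower half is the identity on the first operand and every upper-half index points into the second operand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone mirror of the index test in the patch's movq_parallel
// predicate. `sel` stands in for the CONST_INT elements of the PARALLEL;
// the name and the std::vector interface are illustrative assumptions.
bool is_zero_upper_half_selector(const std::vector<std::uint64_t>& sel)
{
    const std::size_t nelt = sel.size();
    const std::size_t half = nelt / 2;

    if (nelt < 2)
        return false;

    for (std::size_t i = 0; i < nelt; ++i) {
        const std::uint64_t e = sel[i];
        if (i < half) {
            // Lower half must select element i of the first operand.
            if (e != i)
                return false;
        } else if (e < nelt || e >= 2 * nelt) {
            // Upper half must select some element of the second operand,
            // whose indices are nelt .. 2*nelt-1.
            return false;
        }
    }
    return true;
}
```

For a four-element vector, a selector such as {0, 1, 4, 5} passes this test, while {0, 1, 2, 3} does not because its upper half reads from the first operand.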
[Bug c++/100209] multiple inheritance with crtp pattern fails on sequentioal member access
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100209

Patrick Palka changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |ppalka at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
[Bug c++/100209] multiple inheritance with crtp pattern fails on sequentioal member access
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100209 --- Comment #3 from Patrick Palka --- During cxx_eval_indirect_ref for *(const struct C &) (const struct C *) (struct C *) ((struct B *) this + -4); constant evaluation of the pointee yields + -4 where D.2217 is a temporary of type C and D.2106 is the FIELD_DECL for the base B. The problem seems to be that cxx_fold_indirect_ref doesn't know how to fold this second expression to just
Re: State of AutoFDO in GCC
Hi, the create_gcov tool was probably removed with the assumption that it
was only used with the Google GCC branch, but it is actually used with GCC
trunk as well.

Given that, the tool will be restored in the github repo. It seems to build
and work fine with the regression test.

The tool may just work as it is right now, but there is no guarantee it
won't break in the future unless someone in the GCC community tries to
maintain it.

Thanks,

David

On Thu, Apr 22, 2021 at 3:29 PM Jan Hubicka wrote:

> > On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote:
> > > GCC documentation for AutoFDO points to create_gcov tool that converts
> > > perf.data file into gcov format that can be consumed by gcc with
> > > -fauto-profile (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html,
> > > https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
> > >
> > > I noticed that the source code for create_gcov has been deleted from
> > > https://github.com/google/autofdo on April 7. I asked about that change
> > > in that repo and got the following reply:
> > >
> > > https://github.com/google/autofdo/pull/107#issuecomment-819108738
> > >
> > > "Actually we didn't use create_gcov and havn't updated create_gcov for
> > > years, and we also didn't have enough tests to guarantee it works (It was
> > > gcc-4.8 when we used and verified create_gcov). If you need it, it is
> > > welcomed to update create_gcov and add it to the respository."
> > >
> > > Does this mean that AutoFDO is currently dead in gcc?
> >
> > Hello.
> >
> > Yes. I know that even basic test cases have been broken for years in the
> > GCC.
> > It's new to me that create_gcov was removed.
> >
> > I tend to send patch to GCC that will remove AutoFDO from GCC.
> > I known Bin spent some time working on AutoFDO, has he came up to
> > something?
>
> The GCC side of auto-FDO is not that hard. We have most of
> infrastructure in place, but stopping point for me was always difficulty
> to get gcov-tool working. If some maintainer steps up, I think I can
> fix GCC side.
>
> I am bit unsure how important feature it is - we have FDO that works
> quite well for most users but I know there are some users of the LLVM
> implementation and there is potential to tie this with other hardware
> events to asist i.e. if conversion (where one wants to know how well CPU
> predicts the jump rather than just the jump probability) which I always
> found potentially interesting.
>
> Honza
> >
> > Martin
> >
> > >
> > > Thanks,
> > >
> > > Eugene
> > >
[Bug c++/100224] New: incorrect result when doing double vectorized
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100224

            Bug ID: 100224
           Summary: incorrect result when doing double vectorized
           Product: gcc
           Version: 10.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhaoc at apache dot org
  Target Milestone: ---

When I compile the following code with -O3, I get an unexpected result.

test_murmur.cc

#include <cstdint>
#include <iostream>
#include <vector>

static const uint64_t MURMUR_PRIME = 0xc6a4a7935bd1e995ULL;
static const uint32_t MURMUR_SEED = 0xadc83b19ULL;

// Our hash function is MurmurHash2, 64 bit version.
// It was modified in order to provide the same result in
// big and little endian archs (endian neutral).
static uint64_t murmur_hash64A (const void* key, int32_t len, unsigned int seed) {
  const uint64_t m = MURMUR_PRIME;
  const int r = 47;
  uint64_t h = seed ^ (len * m);
  const uint8_t *data = (const uint8_t *)key;
  const uint8_t *end = data + (len-(len&7));

  while(data != end) {
    uint64_t k;
    k = *((uint64_t*)data);

    k *= m;
    k ^= k >> r;
    k *= m;
    h ^= k;
    h *= m;
    data += 8;
  }

  switch(len & 7) {
    case 7: h ^= (uint64_t)data[6] << 48;
    case 6: h ^= (uint64_t)data[5] << 40;
    case 5: h ^= (uint64_t)data[4] << 32;
    case 4: h ^= (uint64_t)data[3] << 24;
    case 3: h ^= (uint64_t)data[2] << 16;
    case 2: h ^= (uint64_t)data[1] << 8;
    case 1: h ^= (uint64_t)data[0];
            h *= m;
  };

  h ^= h >> r;
  h *= m;
  h ^= h >> r;
  return h;
}

void update_double(const std::vector<double>& values, std::vector<uint64_t>& hashes) {
  auto size = values.size();
  for (int i = 0; i < size; ++i) {
    auto v = values[i];
    uint64_t value = murmur_hash64A(&v, sizeof(v), MURMUR_SEED);
    hashes[i] = value;
  }
}

int main() {
  std::vector<double> values(3);
  std::vector<uint64_t> hashes(3);
  for (int i = 0; i < 3; ++i) {
    values[i] = i + 1;
  }
  update_double(values, hashes);
  for (auto hash : hashes) {
    std::cout << hash << std::endl;
  }
  return 0;
}

gcc-10.3.0/bin/g++ -std=gnu++17 -O3 -msse4.2 -mavx2 test_murmur.cc

output

4138674677912027985
4138674677912027985
4138674677912027985

When I compile this file with -O2 instead,

gcc-10.3.0/bin/g++ -std=gnu++17 -O2 -msse4.2 -mavx2 test_murmur.cc

the output is the following, which is what I expect:

17614482930881034518
9674455539515676295
16429943614018478328

It seems something goes wrong when GCC generates optimized code for 'double'.
[Bug libstdc++/100223] New: Missing early return in std::partial_sort
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100223

            Bug ID: 100223
           Summary: Missing early return in std::partial_sort
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hewillk at gmail dot com
  Target Milestone: ---

Hi, can we add an early return to std::partial_sort to avoid unnecessary O(n)
operations when __first is equal to __middle, just like std::rotate does, and
make the result more consistent with ranges::partial_sort?

https://godbolt.org/z/Kb5x8zfeh
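
The request can be illustrated from user code with a thin wrapper. This is only a sketch of the proposed behaviour, assuming the check is a plain iterator comparison; the wrapper name is hypothetical and this is not the libstdc++ implementation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical wrapper, not libstdc++ code: when the prefix to be sorted,
// [first, middle), is empty there is nothing to do, so return before
// touching any element of [first, last).
template <typename RandomIt>
void partial_sort_with_early_return(RandomIt first, RandomIt middle,
                                    RandomIt last)
{
    if (first == middle)
        return;
    std::partial_sort(first, middle, last);
}
```

Called with first equal to middle, the wrapper leaves the range exactly as it was; any other call behaves like std::partial_sort.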
[Bug c++/100209] multiple inheritance with crtp pattern fails on sequentioal member access
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100209

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppalka at gcc dot gnu.org
   Last reconfirmed|                            |2021-04-23
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Patrick Palka ---
Confirmed. A bit more reduced:

struct A {
  int a = 0;
};

template<class Derived>
struct B {
  int b = 0;
  constexpr Derived f(int n) {
    return *static_cast<Derived*>(this);
  }
};

struct C : A, B<C> { };

constexpr C c = C().f(10);

100209.C:14:22:   in ‘constexpr’ expansion of ‘C().C::<anonymous>.B<C>::f(10)’
100209.C:14:25: error: ‘*(const C*)((C*)(((B<C>*)this) + -4))’ is not a constant expression
   14 | constexpr C c = C().f(10);
      |                         ^
[Bug c++/94845] DWARF function name doesn't match demangled name in base type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94845 --- Comment #9 from robert at ocallahan dot org --- That makes sense ... well, except implementing a full C++ parser and reserializer is horrific.
[Bug tree-optimization/100222] New: Redundant mark_irreducible_loops () in predicate.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100222

            Bug ID: 100222
           Summary: Redundant mark_irreducible_loops () in predicate.c
           Product: gcc
           Version: tree-ssa
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: fxue at os dot amperecomputing.com
  Target Milestone: ---

Found two places where mark_irreducible_loops() is called redundantly after
loop_optimizer_init (LOOPS_NORMAL): "LOOPS_NORMAL" already includes
"LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS", so the second call repeats work
that has already been done.

The calls are in pass_profile::execute () and report_predictor_hitrates().
Re: State of AutoFDO in GCC
On Fri, Apr 23, 2021 at 4:16 AM Martin Liška wrote: > > On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote: > > GCC documentation for AutoFDO points to create_gcov tool that converts > > perf.data file into gcov format that can be consumed by gcc with > > -fauto-profile (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, > > https://gcc.gnu.org/wiki/AutoFDO/Tutorial). > > > > I noticed that the source code for create_gcov has been deleted from > > https://github.com/google/autofdo on April 7. I asked about that change in > > that repo and got the following reply: > > > > https://github.com/google/autofdo/pull/107#issuecomment-819108738 > > > > "Actually we didn't use create_gcov and havn't updated create_gcov for > > years, and we also didn't have enough tests to guarantee it works (It was > > gcc-4.8 when we used and verified create_gcov). If you need it, it is > > welcomed to update create_gcov and add it to the respository." > > > > Does this mean that AutoFDO is currently dead in gcc? > > Hello. > > Yes. I know that even basic test cases have been broken for years in the GCC. > It's new to me that create_gcov was removed. > > I tend to send patch to GCC that will remove AutoFDO from GCC. > I known Bin spent some time working on AutoFDO, has he came up to something? Hi Martin, I haven't touched this part for quite some time. I have no objection to removing it from GCC. However, I do have general concern that because of fewer users/developers, it's less likely and harder for new features to land in GCC. I have no idea if this is a real problem or how to fix it. OTOH, maybe removing rotten features, making GCC more(?) concise, and improving existing features that GCC is doing well is the right thing. Thanks, bin > > Martin > > > > > Thanks, > > > > Eugene > > >
Re: [PATCH] Use STATIC_ASSERT for OVL_OP_MAX.
On 4/22/21 4:55 AM, Martin Liška wrote:
> There's an updated version of the patch, Jonathan noticed correctly
> the comment related to assert was not correct.
>
> Subject: [PATCH] Use STATIC_ASSERT for OVL_OP_MAX.

Subject line needs "c++:"

Please also include the rationale from your first message before the
ChangeLog entries.

OK with those adjustments.

Jason
[Bug c++/94845] DWARF function name doesn't match demangled name in base type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94845 --- Comment #8 from Tom Tromey --- (In reply to rob...@ocallahan.org from comment #7) > So gdb reads DW_AT_name "func", parses it, reserializes it to > "func", and uses that? Yeah. (Actually it's even worse than that, because at least one compiler doesn't emit the template parameters in the name, so in that case gdb will read the children of the DIE to try to construct this form.) I think the reasoning behind the canonicalization is two-fold. First, I think we tried to get g++ changed, back in the day, without success. Second, gdb has to canonicalize user input anyway, so that things like "print func(3)" or "break func" work. And once you have a canonicalizer it is simpler to just use it to work around the problem.
[Bug fortran/69360] loop optimization produces invalid code when a common array has dimension 1 in some files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69360

--- Comment #7 from Steve Kargl ---
On Thu, Apr 22, 2021 at 10:20:54PM +, johnnorthall263 at gmail dot com wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69360
>
> --- Comment #6 from John Northall ---
> It's deliberate! I think with this level of understanding of fortran use
> in the real world commercial compilers have a bright future!

I'm sure commercial compilers do have a bright future. Of course, with a
commercial compiler and a license with paid user support, a commercial
vendor will gladly deal with garbage-in code.

> If it works with dimension set to 2 (whatever the background
> true value) then it must be easy to make it do so with
> dimension 1?

From the Fortran 2018 standard,

  R873 common-stmt is COMMON [ / [ common-block-name ] / ]
                      common-block-object-list

  R874 common-block-object is variable-name [ ( array-spec ) ]

  C8117 (R874) An array-spec in a common-block-object shall be an
        explicit-shape-spec-list.

  R816 explicit-shape-spec is [ lower-bound : ] upper-bound

This is nearly identical to the language in the Fortran 95 standard.
This statement in setmid.f

   COMMON /SPACE/ XY(1),XYM(1)

is telling gfortran that the array has an upper bound of 1.
Changing the above statement to have XY(2), XYM(2) does not mean
it works. It means you got lucky with processor-defined behavior.
You simply get a different error message if you ask your Fortran
compiler to help you debug your code.

% ./example1
At line 21 of file setmid.f
Fortran runtime error: Index '155' of dimension 1 of array 'xym' above upper
bound of 2

As to "it must be easy to make it do so with dimension 1", you forgot
to attach your patch. gfortran isn't a commercial compiler. It depends
on contributions from volunteers such as yourself,
[Bug target/100093] different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093

Hongtao.liu changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #4 from Hongtao.liu ---
Fixed in GCC12.
[Bug target/100093] different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093 --- Comment #3 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:342de04d993beaa644d0b0087c20bef5dad5bf5f commit r12-78-g342de04d993beaa644d0b0087c20bef5dad5bf5f Author: liuhongt Date: Fri Apr 16 11:29:10 2021 +0800 MASK_AVX256_SPLIT_UNALIGNED_STORE/LOAD should be cleared in opts->x_target_flags when X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL is enabled by target attribute. gcc/ChangeLog: PR target/100093 * config/i386/i386-options.c (ix86_option_override_internal): Clear MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE in x_target_flags when X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL is enabled by target attribute. gcc/testsuite/ChangeLog: PR target/100093 * gcc.target/i386/pr100093.c: New test.
[Bug target/100152] [10.3, 11, 12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

--- Comment #28 from Iain Sandoe ---
reduced test case:

___UTF_8_put(char *a, int b) {
  char *c = a;
  int bytes;
  if (b <= 15)
    bytes = 2;
  else if (b <= 255)
    bytes = 4;
  else if (b <= 4095)
    bytes = 5;
  else
    bytes = 6;
  c = *a = c;
  switch (bytes) {
  case 6:
    *--c = b >>= 6;
  case 5:
    *--c = b;
    b >>= 6;
  }
  *c = bytes + b;
}
___SCMOBJ_to_NONNULLSTRING(d) {
  {
    unsigned long g, h;
    char c;
    h = ((long *)1)[1];
    g = 0;
    for (; g < h; g++)
      i();
    g = 0;
    for (; g < h; g++)
      ___UTF_8_put(c, (long)*((unsigned *)d + (g << 2 >> 2)) << 2 >> 2);
  }
}
[PATCH] Switch AIX configuration to DWARF2 debugging
As requested at the end of Stage 4, this patch changes the debugging
format for AIX configuration of GCC to "DWARF2". This is in
preparation for removing stabs debugging support from GCC.

The rs6000 configuration files remain somewhat intertwined with the
stabs debugging support, but the configuration no longer generates
stabs debugging information.

I have been bootstrapping and testing GCC with this configuration for years.

This patch means that earlier releases (Technology Levels) of AIX 7.1
and 7.2, prior to DWARF support and fixes, cannot build GCC or support
GCC.

Thanks, David

        * config/rs6000/aix71.h (PREFERRED_DEBUGGING_TYPE): Change to
        DWARF2_DEBUG.
        * config/rs6000/aix72.h (PREFERRED_DEBUGGING_TYPE): Same.

diff --git a/gcc/config/rs6000/aix71.h b/gcc/config/rs6000/aix71.h
index 3612ed2593b..807e260a175 100644
--- a/gcc/config/rs6000/aix71.h
+++ b/gcc/config/rs6000/aix71.h
@@ -272,9 +272,9 @@ extern long long int    atoll(const char *);
 
 #define TARGET_AIX_VERSION 71
 
-/* AIX 7.1 supports DWARF3 debugging, but XCOFF remains the default.  */
+/* AIX 7.1 supports DWARF3+ debugging.  */
 #define DWARF2_DEBUGGING_INFO 1
-#define PREFERRED_DEBUGGING_TYPE XCOFF_DEBUG
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 #define DEBUG_INFO_SECTION "0x1"
 #define DEBUG_LINE_SECTION "0x2"
 #define DEBUG_PUBNAMES_SECTION "0x3"
diff --git a/gcc/config/rs6000/aix72.h b/gcc/config/rs6000/aix72.h
index d34909283cc..36c5d994439 100644
--- a/gcc/config/rs6000/aix72.h
+++ b/gcc/config/rs6000/aix72.h
@@ -273,9 +273,9 @@ extern long long int    atoll(const char *);
 
 #define TARGET_AIX_VERSION 72
 
-/* AIX 7.2 supports DWARF3 debugging, but XCOFF remains the default.  */
+/* AIX 7.2 supports DWARF3+ debugging.  */
 #define DWARF2_DEBUGGING_INFO 1
-#define PREFERRED_DEBUGGING_TYPE XCOFF_DEBUG
+#define PREFERRED_DEBUGGING_TYPE DWARF2_DEBUG
 #define DEBUG_INFO_SECTION "0x1"
 #define DEBUG_LINE_SECTION "0x2"
 #define DEBUG_PUBNAMES_SECTION "0x3"
Re: [PATCH] Use STATIC_ASSERT for OVL_OP_MAX.
On Thu, 22 Apr 2021 14:28:24 -0600
Martin Sebor via Gcc-patches wrote:

> > enum E { e = 5 };
> > struct A { E e: 3; };
> >
> > constexpr int number_of_bits ()
> > {
> >   A a = { };
> >   a.e = (E)-1;
> >   return 32 - __builtin_clz(a.e);
> > }
> >
>
> I had the same thought about using clz. It works in this case but
> not if one of the enumerators is negative, or if the underlying
> type is signed.

or for -fshort-enums in this case as is.
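
The limitation raised above can be seen in a small portable sketch that counts the bits needed for the largest enumerator without any builtin. The helper name is hypothetical, and like the clz approach it assumes a non-negative maximum; it says nothing about negative enumerators, a signed underlying type, or -fshort-enums.

```cpp
// Hypothetical helper: number of value bits needed to store max_value.
// Only meaningful for non-negative values, which is exactly the
// restriction pointed out in the thread.
constexpr int bits_needed(unsigned long long max_value)
{
    int bits = 0;
    while (max_value != 0) {
        ++bits;
        max_value >>= 1;
    }
    return bits;
}

enum E { e = 5 };

// 5 is 0b101, so a three-bit unsigned bit-field is just wide enough.
static_assert(bits_needed(e) == 3, "E fits in three bits");
```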
[Bug target/100152] [10.3, 11, 12 Regression] used caller-saved register not preserved across a call.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152

Iain Sandoe changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
            Summary|[10.3, 11, 12 Regression]   |[10.3, 11, 12 Regression]
                   |[Darwin, X86] used          |used caller-saved register
                   |caller-saved register not   |not preserved across a
                   |preserved across a call.    |call.

--- Comment #27 from Iain Sandoe ---
this is caused by, or exposed by,
r10-9246-geddcb627ccfbd97e025cf366cc3f3bad76211785

since that's a change in tree optimisation, the problem might not be confined
to Darwin, but I wasn't able to get the preprocessed source attached here to
produce wrong code on Linux.

For x86_64, the Darwin ABI is supposed to be same as Linux - and as noted,
there is not so much Darwin-specific back end code (but, of course, there is
some).
Re: [PATCH] Fix logic error in 32-bit trampolines, PR target/98952
On Fri, Apr 09, 2021 at 05:09:07PM -0400, Michael Meissner wrote:
> Fix logic error in 32-bit trampolines, PR target/98952.
>
> The test in the PowerPC 32-bit trampoline support is backwards. It aborts
> if the trampoline size is greater than the expected size. It should abort
> when the trampoline size is less than the expected size.

> 	PR target/98952
> 	* config/rs6000/tramp.S (__trampoline_setup): Fix trampoline size
> 	comparison in 32-bit.

> --- a/libgcc/config/rs6000/tramp.S
> +++ b/libgcc/config/rs6000/tramp.S
> @@ -64,8 +64,7 @@ FUNC_START(__trampoline_setup)
>          mflr    r11
>          addi    r7,r11,trampoline_initial-4-.LCF0 /* trampoline address -4 */
>
> -        li      r8,trampoline_size      /* verify that the trampoline is big enough */
> -        cmpw    cr1,r8,r4
> +        cmpwi   cr1,r4,trampoline_size  /* verify that the trampoline is big enough */
>          srwi    r4,r4,2         /* # words to move */
>          addi    r9,r3,-4        /* adjust pointer for lwzu */
>          mtctr   r4

As Will says, it looks like the ELFv2 version has the same bug. Please
fix that the same way.

In the commit message and the changelog, point out that you folded the
cmp with the li while you were at it. It is easier to read code like
this so the change is fine, but do point it out.

Can you test this in a testcase somehow? That would have found the
ELFv2 case, for example.

Okay for trunk. Okay for backport to 11 when that branch opens again.
Does this need more backports? (Those should follow after 11 of course).

Thanks,

Segher
Re: [PATCH] Fix logic error in 32-bit trampolines, PR target/98952
On Mon, Apr 12, 2021 at 05:02:38PM -0500, will schmidt wrote:
> On Fri, 2021-04-09 at 17:09 -0400, Michael Meissner wrote:
> > -        li      r8,trampoline_size      /* verify that the trampoline is big enough */
> > -        cmpw    cr1,r8,r4
> > +        cmpwi   cr1,r4,trampoline_size  /* verify that the trampoline is big enough */
>
> Hmm, I spent several minutes trying to determine how cmpw behaves
> differently than cmpwi before noticing you also swapped the
> order of the r4,r8 operands.
>
> That seems OK.
>
> A statement in the description indicating that you used a cmpwi instead
> of a cmpw since you were in the neighborhood would help call that out.

In general, don't do two unrelated things, esp. when not pointing it out
explicitly.

> The #elif _CALL_ELF == 2 portion of tramp.S (line 159 or so) has a
> similar compare stanza with respect to the order of operands on the
> compare. Will this also have a backwards greater-than less-than issue?
>
>         li      r8,trampoline_size      /* verify that the trampoline is big enough */
>         cmpw    cr1,r8,r4
>         srwi    r4,r4,3         /* # doublewords to move */
>         addi    r9,r3,-8        /* adjust pointer for stdu */
>         mtctr   r4
>         blt     cr1,.Labort

Not sure... It should use cmpwi as well though, and then it is easier
to see.

Segher
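
The operand-order question in this thread can be modelled in a few lines. The sketch below assumes, as in the ELFv2 stanza quoted above, that the compare is followed by `blt cr1,.Labort`, so the abort is taken when the first compare operand is less than the second. The function names are hypothetical; `required` stands for trampoline_size and `provided` for the size passed in r4.

```cpp
// Illustrative model only, not code from tramp.S.

// "cmpw cr1,r8,r4" with r8 = trampoline_size, then blt:
// aborts when the required size is less than the provided size.
constexpr bool aborts_before_fix(int required, int provided)
{
    return required < provided;
}

// "cmpwi cr1,r4,trampoline_size", then blt:
// aborts when the provided size is less than the required size.
constexpr bool aborts_after_fix(int required, int provided)
{
    return provided < required;
}
```

With a required size of 40, a caller providing 48 bytes aborts under the old order and passes under the new one, while a caller providing 32 bytes passes under the old order and aborts under the new one.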
[Bug c++/94845] DWARF function name doesn't match demangled name in base type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94845 --- Comment #7 from robert at ocallahan dot org --- So gdb reads DW_AT_name "func", parses it, reserializes it to "func", and uses that?
gcc-8-20210422 is now available
Snapshot gcc-8-20210422 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/8-20210422/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 8 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 revision ef195a39d0d3b929cc676302d074b42c25460601

You'll find:

 gcc-8-20210422.tar.xz                Complete GCC

  SHA256=036c1791606429af7ce0075e3cbf2b582fe987cd49f8b42d17fe327f19e5f787
  SHA1=e3ea5f9f50ab458368534261d7c36bd58bdfa822

Diffs from 8-20210415 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-8
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Re: State of AutoFDO in GCC
> On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote:
> > GCC documentation for AutoFDO points to create_gcov tool that converts
> > perf.data file into gcov format that can be consumed by gcc with
> > -fauto-profile (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html,
> > https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
> >
> > I noticed that the source code for create_gcov has been deleted from
> > https://github.com/google/autofdo on April 7. I asked about that change in
> > that repo and got the following reply:
> >
> > https://github.com/google/autofdo/pull/107#issuecomment-819108738
> >
> > "Actually we didn't use create_gcov and havn't updated create_gcov for
> > years, and we also didn't have enough tests to guarantee it works (It was
> > gcc-4.8 when we used and verified create_gcov). If you need it, it is
> > welcomed to update create_gcov and add it to the respository."
> >
> > Does this mean that AutoFDO is currently dead in gcc?
>
> Hello.
>
> Yes. I know that even basic test cases have been broken for years in the GCC.
> It's new to me that create_gcov was removed.
>
> I tend to send patch to GCC that will remove AutoFDO from GCC.
> I known Bin spent some time working on AutoFDO, has he came up to something?

The GCC side of auto-FDO is not that hard. We have most of the
infrastructure in place, but the stopping point for me was always the
difficulty of getting gcov-tool working. If some maintainer steps up,
I think I can fix the GCC side.

I am a bit unsure how important a feature it is - we have FDO that works
quite well for most users, but I know there are some users of the LLVM
implementation, and there is potential to tie this with other hardware
events to assist, e.g., if-conversion (where one wants to know how well
the CPU predicts the jump rather than just the jump probability), which
I always found potentially interesting.

Honza
>
> Martin
>
> >
> > Thanks,
> >
> > Eugene
> >
[wwwdocs] IPA/LTO/profile-feedback changes
Hi,
this patch adds a changes entry for IPA/LTO and FDO.

Honza

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 6f58cfe8..bba16ead 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -170,6 +170,37 @@ a work-in-progress.
 use -g together with -gdwarf-2, -gdwarf-3 or -gdwarf-4.
+
+Inter-procedural optimization improvements:
+
+ New IPA-modref pass was added to track side-effects of function calls
+ and improve precision of points-to-analysis. Pass can be controlled
+ by -fipa-modref attribute.
+
+ Identical code folding pass was significantly improved to increase number of
+ unified functions and to reduce compile-time memory use.
+ IPA-CP heuristics improved its estimation of potential usefulness of
+ known loop bounds and strides by taking into account the estimated
+ frequency of these loops.
+
+
+Link-time optimization improvements:
+
+ LTO bytecode file format was optimized for smaller object files and
+ faster streaming.
+ Memory allocation of the linking stage was improved to reduce peak
+ memory use.
+
+
+
+Profile driven optimization improvements:
+
+
+Using https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Optimize-Options.html#index-fprofile-values;>-fprofile-values,
+ was improved by tracking more target values for e.g. indirect calls.
+
+
+
[Bug fortran/69360] loop optimization produces invalid code when a common array has dimension 1 in some files
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69360 --- Comment #6 from John Northall --- It's deliberate! I think with this level of understanding of fortran use in the real world commercial compilers have a bright future! If it works with dimension set to 2 (whatever the background true value) then it must be easy to make it do so with dimension 1? Cheers, John On Sat, Apr 17, 2021 at 2:48 AM kargl at gcc dot gnu.org < gcc-bugzi...@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69360 > > kargl at gcc dot gnu.org changed: > >What|Removed |Added > > > CC||kargl at gcc dot gnu.org > > --- Comment #5 from kargl at gcc dot gnu.org --- > (In reply to John Northall from comment #4) > > O dear - that makes gfortran unusable on many existing codes - not really > > satisfactory is it? > > John > > > > > gfortran is fairly good at finding bugs in a user's program if > one asks gfortran to do so. Add -fcheck=all to your options. > > At line 21 of file setmid.f > Fortran runtime error: Index '155' of dimension 1 of array 'xym' above > upper > bound of 1 > > Now, you have an opportunity to fix your code. > > -- > You are receiving this mail because: > You are on the CC list for the bug.
Re: [PATCH] c++: Hard error with tentative parse and CTAD [PR87709]
On 4/22/21 11:34 AM, Patrick Palka wrote:

As described in detail in comment #4 of this PR, when tentatively
parsing a construct that can either be a type or an expression, if
during the type parse we encounter an unexpected template placeholder,
we need to simulate an error rather than issue a real error because the
subsequent expression parse can still succeed.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

OK.

gcc/cp/ChangeLog:

        PR c++/87709
        * parser.c (cp_parser_type_id_1): If we see a template
        placeholder, first try simulating an error before issuing
        a real error.

gcc/testsuite/ChangeLog:

        PR c++/87709
        * g++.dg/cpp1z/class-deduction86.C: New test.
---
 gcc/cp/parser.c                                | 11 +++
 gcc/testsuite/g++.dg/cpp1z/class-deduction86.C | 16 
 2 files changed, 23 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction86.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index fba516efa23..e1b1617da68 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -23270,10 +23270,13 @@ cp_parser_type_id_1 (cp_parser *parser, cp_parser_flags flags,
           location_t loc = type_specifier_seq.locations[ds_type_spec];
           if (tree tmpl = CLASS_PLACEHOLDER_TEMPLATE (auto_node))
             {
-              error_at (loc, "missing template arguments after %qT",
-                        auto_node);
-              inform (DECL_SOURCE_LOCATION (tmpl), "%qD declared here",
-                      tmpl);
+              if (!cp_parser_simulate_error (parser))
+                {
+                  error_at (loc, "missing template arguments after %qT",
+                            auto_node);
+                  inform (DECL_SOURCE_LOCATION (tmpl), "%qD declared here",
+                          tmpl);
+                }
             }
           else if (parser->in_template_argument_list_p)
             error_at (loc, "%qT not permitted in template argument",
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction86.C b/gcc/testsuite/g++.dg/cpp1z/class-deduction86.C
new file mode 100644
index 000..a198ed24ec6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction86.C
@@ -0,0 +1,16 @@
+// PR c++/87709
+// { dg-do compile { target c++17 } }
+
+template <class T>
+struct lit {
+  lit(T) { }
+};
+
+template <class T>
+int operator+(lit<T>, lit<T>) {
+  return 0;
+}
+
+auto r2 = (lit(0)) + lit(0);
+
+static_assert(sizeof(lit(0)));
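As a rough illustration of the idea behind the parser.c change (stay silent
while a tentative parse is still active, and only diagnose once committed),
here is a toy C++ model. ToyParser, its members and
report_missing_template_args are hypothetical names invented for this
sketch; it is not GCC's parser. The only thing it tries to mirror is the
contract the diff relies on: the simulate-error helper returns true when the
failure was merely recorded, and false when the caller should issue a real
error.

```cpp
#include <string>
#include <vector>

// Toy model only: every name here is hypothetical, not GCC's.
struct ToyParser {
  bool tentative = false;                // true while the parse may be rolled back
  bool tentative_failed = false;         // set instead of emitting a diagnostic
  std::vector<std::string> diagnostics;  // hard errors shown to the user

  // If parsing tentatively, record the failure silently and return true.
  // Otherwise do nothing and return false so the caller issues a real error.
  bool simulate_error() {
    if (!tentative)
      return false;
    tentative_failed = true;
    return true;
  }

  // Stand-in for the placeholder check in the type-id parse.
  void report_missing_template_args(const std::string& name) {
    if (!simulate_error())
      diagnostics.push_back("missing template arguments after '" + name + "'");
  }
};
```

In this model a tentative type parse of something like (lit(0)) leaves
diagnostics empty and only sets tentative_failed, so a later expression
parse is free to succeed; the same call on a committed parser produces the
hard error.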
Re: [PATCH] c++: Refine enum direct-list-initialization [CWG2374]
On 4/22/21 10:49 AM, Patrick Palka wrote:

This implements the wording changes of CWG2374, which clarifies the
wording of P0138 to forbid e.g. direct-list-initialization of a scoped
enumeration from a different scoped enumeration.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

OK.

gcc/cp/ChangeLog:

        DR 2374
        * decl.c (is_direct_enum_init): Check the implicit
        convertibility requirement added by CWG 2374.

gcc/testsuite/ChangeLog:

        DR 2374
        * g++.dg/cpp1z/direct-enum-init2.C: New test.
---
 gcc/cp/decl.c                                  | 8 +++-
 gcc/testsuite/g++.dg/cpp1z/direct-enum-init2.C | 8 
 2 files changed, 15 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/direct-enum-init2.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index b81de8ef934..60dc2bf182d 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6191,7 +6191,13 @@ is_direct_enum_init (tree type, tree init)
       && ENUM_FIXED_UNDERLYING_TYPE_P (type)
       && TREE_CODE (init) == CONSTRUCTOR
       && CONSTRUCTOR_IS_DIRECT_INIT (init)
-      && CONSTRUCTOR_NELTS (init) == 1)
+      && CONSTRUCTOR_NELTS (init) == 1
+      /* DR 2374: The single element needs to be implicitly
+         convertible to the underlying type of the enum.  */
+      && can_convert_arg (ENUM_UNDERLYING_TYPE (type),
+                          TREE_TYPE (CONSTRUCTOR_ELT (init, 0)->value),
+                          CONSTRUCTOR_ELT (init, 0)->value,
+                          LOOKUP_IMPLICIT, tf_none))
     return true;
   return false;
 }
diff --git a/gcc/testsuite/g++.dg/cpp1z/direct-enum-init2.C b/gcc/testsuite/g++.dg/cpp1z/direct-enum-init2.C
new file mode 100644
index 000..5b94a8d00fe
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/direct-enum-init2.C
@@ -0,0 +1,8 @@
+// DR 2374
+// { dg-do compile { target c++17 } }
+
+enum class Orange;
+enum class Apple;
+
+extern Orange o;
+Apple a{o}; // { dg-error "cannot convert" }
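For readers less familiar with P0138, a small C++17 sketch of what stays
valid and what the new check is meant to reject. The variable names are made
up for illustration, and the rejected line is left as a comment because it
is only expected to be diagnosed by a compiler that implements CWG2374.

```cpp
enum class Orange : int {};
enum class Apple : int {};

// P0138 (C++17): a scoped enum with a fixed underlying type can be
// direct-list-initialized from a single non-narrowing value.
Apple from_int{3};

Orange o{7};

// Per CWG2374 the single element must be implicitly convertible to the
// underlying type. Orange does not implicitly convert to int, so a
// conforming compiler is expected to reject:
//   Apple bad{o};

// Spelling the conversion out keeps the initialization well-formed.
Apple from_orange{static_cast<int>(o)};
```

The explicit static_cast is the usual way to keep such code compiling once a
compiler enforces the implicit-convertibility requirement.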
Re: removing toxic emailers
On 15/04/2021 10:40 am, Frosku wrote: On Wed Apr 14, 2021 at 9:49 PM BST, Paul Koning via Gcc wrote: My answer is "it depends". More precisely, in the past I would have favored those who decline because the environment is unpleasant -- with the implied assumption being that their objections are reasonable. Given the emergency of cancel culture, that assumption is no longer automatically valid. This is why I asked the question "who decides?" Given a disagreement in which the proposed remedy is to ostracise a participant, it is necessary to inquire for what reason this should be done (and, perhaps, who is pushing for it to be done). My suggestion is that this judgment can be made by the community (via secret ballot), unless it is decided to delegate that power to a smaller body, considered as trustees, or whatever you choose to call them. paul I think, in general, it's fine to leave this decision to moderators. It's just a little disconcerting when one of the people who would probably be moderating is saying that he could have shut down the discussion if he could only ban jerks, as if to imply that everyone who dares to disagree with his position is a jerk worthy of a ban. A little late to the party, but thought this was worth commenting on- from my perspective, as long as there is some sort of consensus amongst moderators about who is worth banning, as opposed to whether it can be fixed by calling the person out on their ongoing behaviour, it's probably worth doing. If that power is left to one mod, it's not a good thing. 3 or a larger odd number of mods is best for avoiding stalemates, and more is better. As an example of a controversial mod choice and without wanting to reopen wounds here, if I were a mod I could quite easily ban Nathan for the dishonesty and divisiveness of his initial post (see below if you require substantive talk around that), despite the fact that I have no particular love for Stallman or any investment in the topic. 
But another mod might see that contribution as 'the end justifying the means' in terms of bringing in an inevitable debate around Stallman's offputting personal manner, and whether that fits in today's society. Another mod might have another opinion etc. Two or three heads, are better than one, when it comes to behaviour judgement - particularly when an international community is at stake. And the more temperamentally/culturally diverse the mods are - the better for decision-making overall. = 1. 'skeptical that voluntarily pedophilia harms children.’ stallman's own archives 2006-mar-jun I note that children are *incapable* of consenting. That’s what the age of consent means. He has recanted on this as of 2019 (https://www.stallman.org/archives/2019-jul-oct.html#14_September_2019_(Sex_between_an_adult_and_a_child_is_wrong)) because people took the time to point out to him why his opinion was wrong. Omitting his recantation is, by my standards, a lie by omission. It doesn't make what he initially said any less terrible. But it clarifies his actual position. 2. 'end censorship of “child pornography”’. Stallman's archives 2012-jul-oct.html Notice use of “quotes” to down play what is actually being requested. While I don't actually agree with Stallman in the slightest, his stated objection is "it's common practice for teenagers to exchange nude photos with their lovers, and they all potentially could be imprisoned for this. A substantial fraction of them are actually prosecuted. " That's very different from how it's been presented here - a lie by omission. 3. 'gentle expressions of attraction’ Stallman's archives > 2012-jul-oct.html Condoning a variant of the wolf-whistle. Unless one’s talking to one’s lover, ‘gentle invitations for sex’ by a stranger is *grooming* (be it child or of-age). If you ever been to a bar, or an open-air event, or god forbid a party, you are aware that this is an obvious lie (for adults). 
Secondarily, nothing in Richard's text relates to wolf-whistling or variants.
[Bug c++/77435] Dependent reference non-type template parameter not matched for partial specialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77435

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
   Target Milestone|---                         |7.2
             Status|NEW                         |RESOLVED
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka ---
Testcase added to trunk.
[Bug c++/94508] ICE in tsubst_copy, at cp/pt.c:16186
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94508

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppalka at gcc dot gnu.org
   Target Milestone|---                         |11.0
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #4 from Patrick Palka ---
Already fixed for GCC 11.
[Bug c++/94508] ICE in tsubst_copy, at cp/pt.c:16186
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94508

--- Comment #3 from CVS Commits ---
The master branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:4e1aaf32ddf13cc79fcf146d6b62a6e0feb82be0

commit r12-73-g4e1aaf32ddf13cc79fcf146d6b62a6e0feb82be0
Author: Patrick Palka
Date:   Thu Apr 22 17:47:02 2021 -0400

    c++: Add testcase for already fixed PR [PR94508]

    We correctly accept this testcase since r11-8144.

    gcc/testsuite/ChangeLog:

            PR c++/94508
            * g++.dg/cpp2a/concepts-uneval3.C: New test.
[Bug c++/77435] Dependent reference non-type template parameter not matched for partial specialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77435

--- Comment #2 from CVS Commits ---
The master branch has been updated by Patrick Palka:

https://gcc.gnu.org/g:92664c058d705fcaf57875f93b4dfc36cf011afd

commit r12-72-g92664c058d705fcaf57875f93b4dfc36cf011afd
Author: Patrick Palka
Date:   Thu Apr 22 17:47:00 2021 -0400

    c++: Add testcase for already fixed PR [PR77435]

    We correctly accept this testcase since r8-1437.

    gcc/testsuite/ChangeLog:

            PR c++/77435
            * g++.dg/template/partial-specialization9.C: New test.
[PATCH 1/2] Generate overlapping operations between two areas of memory
For op_by_pieces operations between two areas of memory on a non-strict
alignment target, add -foverlap-op-by-pieces=[off|on|max-memset] to
generate overlapping operations and so minimize the number of operations,
provided the operation is not a stack push, which must not overlap.

When operating on LENGTH bytes of memory, -foverlap-op-by-pieces=on
starts with the widest usable integer size, MAX_SIZE, for LENGTH bytes
and finishes with the smallest usable integer size, MIN_SIZE, for the
remaining bytes, where MAX_SIZE >= MIN_SIZE.  If MIN_SIZE > the remaining
bytes, the last operation is performed on MIN_SIZE bytes of memory that
overlap the previous operation.

For memset with a non-zero byte, -foverlap-op-by-pieces=max-memset
generates an overlapping fill with MAX_SIZE if the number of remaining
bytes is greater than one.

Tested on Linux/x86-64 with -foverlap-op-by-pieces both enabled and
disabled by default.

gcc/

        PR middle-end/90773
        * common.opt (-foverlap-op-by-pieces): New.
        * expr.c (by_pieces_ninsns): If -foverlap-op-by-pieces is
        enabled, round up size and alignment to the widest integer
        mode for maximum size.
        (op_by_pieces_d): Add get_usable_mode, m_push and
        m_non_zero_memset.
        (op_by_pieces_d::op_by_pieces_d): Add 2 bool arguments to
        initialize m_push and m_non_zero_memset.
        (op_by_pieces_d::get_usable_mode): New.
        (op_by_pieces_d::run): Use get_usable_mode to get the largest
        usable integer mode and generate overlapping operations for
        -foverlap-op-by-pieces.
        (PUSHG_P): New.
        (move_by_pieces_d::move_by_pieces_d): Updated for op_by_pieces_d
        change.
        (store_by_pieces_d::store_by_pieces_d): Likewise.
        (clear_by_pieces): Likewise.
        * toplev.c (process_options): Issue an error when
        -foverlap-op-by-pieces is used for strict alignment target.
        * doc/invoke.texi: Document -foverlap-op-by-pieces.

gcc/testsuite/

        PR middle-end/90773
        * g++.dg/pr90773-1.h: New test.
        * g++.dg/pr90773-1a.C: Likewise.
        * g++.dg/pr90773-1b.C: Likewise.
        * g++.dg/pr90773-1c.C: Likewise.
        * g++.dg/pr90773-1d.C: Likewise.
        * gcc.target/i386/pr90773-1.c: Likewise.
        * gcc.target/i386/pr90773-2.c: Likewise.
        * gcc.target/i386/pr90773-3.c: Likewise.
        * gcc.target/i386/pr90773-4.c: Likewise.
        * gcc.target/i386/pr90773-5.c: Likewise.
        * gcc.target/i386/pr90773-6.c: Likewise.
        * gcc.target/i386/pr90773-7.c: Likewise.
        * gcc.target/i386/pr90773-8.c: Likewise.
        * gcc.target/i386/pr90773-9.c: Likewise.
        * gcc.target/i386/pr90773-10.c: Likewise.
        * gcc.target/i386/pr90773-11.c: Likewise.
---
 gcc/common.opt                             |  19 +++
 gcc/doc/invoke.texi                        |  14 ++
 gcc/expr.c                                 | 159 -
 gcc/testsuite/g++.dg/pr90773-1.h           |  14 ++
 gcc/testsuite/g++.dg/pr90773-1a.C          |  13 ++
 gcc/testsuite/g++.dg/pr90773-1b.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1c.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1d.C          |  19 +++
 gcc/testsuite/gcc.target/i386/pr90773-1.c  |  17 +++
 gcc/testsuite/gcc.target/i386/pr90773-10.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-11.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-2.c  |  20 +++
 gcc/testsuite/gcc.target/i386/pr90773-3.c  |  23 +++
 gcc/testsuite/gcc.target/i386/pr90773-4.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-5.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-6.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-7.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-8.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-9.c  |  13 ++
 gcc/toplev.c                               |   8 ++
 20 files changed, 383 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1.h
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1a.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1b.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1c.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1d.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-9.c

diff --git a/gcc/common.opt b/gcc/common.opt
index a75b44ee47e..7f5b38c7810 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2123,6 +2123,25 @@
[PATCH 2/2] x86: Enable -foverlap-op-by-pieces by default
Add TARGET_PREFER_OVERLAP_OP_BY_PIECES for Alder Lake and Intel core
processors with AVX2 to enable -foverlap-op-by-pieces by default.

gcc/

        PR middle-end/90773
        * config/i386/i386-options.c (ix86_option_override_internal):
        Enable -foverlap-op-by-pieces by default for
        TARGET_PREFER_OVERLAP_OP_BY_PIECES.
        * config/i386/i386.h (TARGET_PREFER_OVERLAP_OP_BY_PIECES): New.
        * config/i386/x86-tune.def (X86_TUNE_PREFER_OVERLAP_OP_BY_PIECES):
        New.

gcc/testsuite/

        PR middle-end/90773
        * gcc.target/i386/pr90773-12.c: New test.
        * gcc.target/i386/pr90773-13.c: Likewise.
---
 gcc/config/i386/i386-options.c             |  3 +++
 gcc/config/i386/i386.h                     |  2 ++
 gcc/config/i386/x86-tune.def               |  6 ++
 gcc/testsuite/gcc.target/i386/pr90773-12.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr90773-13.c | 11 +++
 5 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-13.c

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 2a12228d195..5949d2d5597 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -2821,6 +2821,9 @@ ix86_option_override_internal (bool main_args_p,
   if (ix86_indirect_branch != indirect_branch_keep)
     SET_OPTION_IF_UNSET (opts, opts_set, flag_jump_tables, 0);
 
+  if (TARGET_PREFER_OVERLAP_OP_BY_PIECES)
+    SET_OPTION_IF_UNSET (opts, opts_set, flag_overlap_op_by_pieces, 1);
+
   return true;
 }
 
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 96b46bac238..cf24fecaddc 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -304,6 +304,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_SINGLE_STRINGOP ix86_tune_features[X86_TUNE_SINGLE_STRINGOP]
 #define TARGET_PREFER_KNOWN_REP_MOVSB_STOSB \
   ix86_tune_features[X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB]
+#define TARGET_PREFER_OVERLAP_OP_BY_PIECES \
+  ix86_tune_features[X86_TUNE_PREFER_OVERLAP_OP_BY_PIECES]
 #define TARGET_MISALIGNED_MOVE_STRING_PRO_EPILOGUES \
   ix86_tune_features[X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES]
 #define TARGET_QIMODE_MATH ix86_tune_features[X86_TUNE_QIMODE_MATH]
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index eb057a67750..848c1b53ad4 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -275,6 +275,12 @@ DEF_TUNE (X86_TUNE_PREFER_KNOWN_REP_MOVSB_STOSB,
           "prefer_known_rep_movsb_stosb",
           m_SKYLAKE | m_ALDERLAKE | m_CORE_AVX512)
 
+/* X86_TUNE_PREFER_OVERLAP_OP_BY_PIECES: Enable -foverlap-op-by-pieces-run
+   by default.  */
+DEF_TUNE (X86_TUNE_PREFER_OVERLAP_OP_BY_PIECES,
+          "prefer_overlap_op_by_pieces",
+          m_CORE_AVX2)
+
 /* X86_TUNE_MISALIGNED_MOVE_STRING_PRO_EPILOGUES: Enable generation of
    compact prologues and epilogues by issuing a misaligned moves.  This
    requires target to handle misaligned moves and partial memory stalls
diff --git a/gcc/testsuite/gcc.target/i386/pr90773-12.c b/gcc/testsuite/gcc.target/i386/pr90773-12.c
new file mode 100644
index 000..e45840a5b8d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90773-12.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mno-avx -msse2 -mtune=skylake" } */
+
+void
+foo (char *dst, char *src)
+{
+  __builtin_memcpy (dst, src, 255);
+}
+
+/* { dg-final { scan-assembler-times "movdqu\[\\t \]+\[0-9\]*\\(%\[\^,\]+\\)," 16 } } */
+/* { dg-final { scan-assembler-not "mov\[bwlq\]" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr90773-13.c b/gcc/testsuite/gcc.target/i386/pr90773-13.c
new file mode 100644
index 000..4d5ae8d1086
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr90773-13.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mno-avx -msse2 -mtune=skylake" } */
+
+void
+foo (char *dst)
+{
+  __builtin_memset (dst, 0, 255);
+}
+
+/* { dg-final { scan-assembler-times "movups\[\\t \]+%xmm\[0-9\]+, \[0-9\]*\\(%\[\^,\]+\\)" 16 } } */
+/* { dg-final { scan-assembler-not "mov\[bwlq\]" } } */
-- 
2.30.2
[PATCH 0/2] Generate overlapping operations between two areas of memory
For op_by_pieces operations between two areas of memory on a non-strict
alignment target, add -foverlap-op-by-pieces=[off|on|max-memset] to
generate overlapping operations and so minimize the number of operations,
provided the operation is not a stack push, which must not overlap.

When operating on LENGTH bytes of memory, -foverlap-op-by-pieces=on
starts with the widest usable integer size, MAX_SIZE, for LENGTH bytes
and finishes with the smallest usable integer size, MIN_SIZE, for the
remaining bytes, where MAX_SIZE >= MIN_SIZE.  If MIN_SIZE > the remaining
bytes, the last operation is performed on MIN_SIZE bytes of memory that
overlap the previous operation.

For memset with a non-zero byte, -foverlap-op-by-pieces=max-memset
generates an overlapping fill with MAX_SIZE if the number of remaining
bytes is greater than one.

Code sizes are reduced slightly on glibc and GCC.  The performance impact
on SPEC CPU 2017 on Intel Xeon is within the noise range.

H.J. Lu (2):
  Generate overlapping operations between two areas of memory
  x86: Enable -foverlap-op-by-pieces by default

 gcc/common.opt                             |  19 +++
 gcc/config/i386/i386-options.c             |   3 +
 gcc/config/i386/i386.h                     |   2 +
 gcc/config/i386/x86-tune.def               |   6 +
 gcc/doc/invoke.texi                        |  14 ++
 gcc/expr.c                                 | 159 -
 gcc/testsuite/g++.dg/pr90773-1.h           |  14 ++
 gcc/testsuite/g++.dg/pr90773-1a.C          |  13 ++
 gcc/testsuite/g++.dg/pr90773-1b.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1c.C          |   5 +
 gcc/testsuite/g++.dg/pr90773-1d.C          |  19 +++
 gcc/testsuite/gcc.target/i386/pr90773-1.c  |  17 +++
 gcc/testsuite/gcc.target/i386/pr90773-10.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-11.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-12.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-13.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-2.c  |  20 +++
 gcc/testsuite/gcc.target/i386/pr90773-3.c  |  23 +++
 gcc/testsuite/gcc.target/i386/pr90773-4.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-5.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-6.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-7.c  |  11 ++
 gcc/testsuite/gcc.target/i386/pr90773-8.c  |  13 ++
 gcc/testsuite/gcc.target/i386/pr90773-9.c  |  13 ++
 gcc/toplev.c                               |   8 ++
 25 files changed, 416 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1.h
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1a.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1b.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1c.C
 create mode 100644 gcc/testsuite/g++.dg/pr90773-1d.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90773-9.c

-- 
2.30.2
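The MAX_SIZE/MIN_SIZE description in the cover letter can be modelled in a
few lines of C++. This is only a sketch of the stated strategy, not the code
in expr.c: Piece, floor_pow2, ceil_pow2 and plan_overlapping_pieces are
hypothetical names, "usable integer sizes" are assumed to be powers of two,
and the max-memset variant is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One operation: SIZE bytes starting at OFFSET (hypothetical type).
struct Piece {
  std::size_t offset;
  std::size_t size;
};

// Largest power of two <= n, for n >= 1.
static std::size_t floor_pow2(std::size_t n) {
  std::size_t p = 1;
  while (p <= n / 2)
    p *= 2;
  return p;
}

// Smallest power of two >= n, for n >= 1.
static std::size_t ceil_pow2(std::size_t n) {
  std::size_t p = 1;
  while (p < n)
    p *= 2;
  return p;
}

// Plan the pieces for LENGTH bytes when no piece may exceed MAX_SIZE bytes.
std::vector<Piece> plan_overlapping_pieces(std::size_t length,
                                           std::size_t max_size) {
  std::vector<Piece> pieces;
  if (length == 0)
    return pieces;
  assert(max_size >= 1);

  // Widest usable size for LENGTH bytes.
  const std::size_t widest = floor_pow2(length < max_size ? length : max_size);

  // Non-overlapping pieces of the widest size.
  std::size_t offset = 0;
  while (length - offset >= widest) {
    pieces.push_back({offset, widest});
    offset += widest;
  }

  // Remaining bytes: one piece of the smallest size that covers them,
  // shifted back so that it ends exactly at LENGTH (it may overlap the
  // previous piece).
  const std::size_t remaining = length - offset;
  if (remaining > 0) {
    const std::size_t min_size = ceil_pow2(remaining);
    pieces.push_back({length - min_size, min_size});
  }
  return pieces;
}
```

Under these assumptions, LENGTH = 255 with MAX_SIZE = 16 gives fifteen
16-byte pieces followed by one 16-byte piece at offset 239, i.e. 16
operations in total, which appears consistent with the count of 16 that
gcc.target/i386/pr90773-12.c scans for.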
[Bug c++/86426] g++ ICE at on valid code in tree_operand_check, at tree.h:3615
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86426

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tangyixuan at mail dot dlut.edu.cn

--- Comment #7 from Patrick Palka ---
*** Bug 94210 has been marked as a duplicate of this bug. ***
[Bug c++/94210] ICE in tsubst, at cp/pt.c:15105
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94210

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka ---
This looks like a dup of PR86426.

*** This bug has been marked as a duplicate of bug 86426 ***
[Bug c++/100161] [10/11 Regression] Impossible to suppress Wtype-limits warning involving template parameter.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100161

Marek Polacek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[10/11/12 Regression]       |[10/11 Regression]
                   |Impossible to suppress      |Impossible to suppress
                   |Wtype-limits warning        |Wtype-limits warning
                   |involving template          |involving template
                   |parameter.                  |parameter.

--- Comment #4 from Marek Polacek ---
Fixed on trunk so far.
[Bug c++/100161] [10/11/12 Regression] Impossible to suppress Wtype-limits warning involving template parameter.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100161

--- Comment #3 from CVS Commits ---
The master branch has been updated by Marek Polacek:

https://gcc.gnu.org/g:244dfb95119106e9267f37583caac565c39eb0ec

commit r12-71-g244dfb95119106e9267f37583caac565c39eb0ec
Author: Marek Polacek
Date:   Tue Apr 20 20:24:09 2021 -0400

    c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

    Recently, we made sure that we never call value_dependent_expression_p
    on an expression that isn't potential_constant_expression.  That caused
    this bogus warning with a non-type template parameter, something that
    users don't want to see.

    The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
    which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
    type of 'n' isn't dependent, so we think the whole 't' expression is
    not dependent.  It seems we need to test both op0 and op1 separately
    to suppress this warning.

    gcc/cp/ChangeLog:

            PR c++/100161
            * pt.c (tsubst_copy_and_build) : Test op0 and
            op1 separately for value- or type-dependence.

    gcc/testsuite/ChangeLog:

            PR c++/100161
            * g++.dg/warn/Wtype-limits6.C: New test.
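To make the "i < n" situation concrete, here is a hypothetical reduction in
C++. It is not the Wtype-limits6.C test from the commit, and count_below is
a made-up name; whether a particular GCC release warns on it depends on the
version and on the warning flags in use. The point is only the shape of the
code: 'n' has a non-dependent type, but its value is known only at
instantiation time, so a comparison that is always false for one
instantiation is not a programming error.

```cpp
// Hypothetical reduction: compare a runtime value against a non-type
// template parameter of non-dependent type 'unsigned'.
template <unsigned n>
unsigned count_below(unsigned limit) {
  unsigned count = 0;
  for (unsigned i = 0; i < limit; ++i)
    if (i < n)  // always false when n == 0, but only for that instantiation
      ++count;
  return count;
}
```

Instantiating this as count_below<0> is the kind of case where a
"comparison is always false" diagnostic would be unwelcome, since the same
template is perfectly meaningful for count_below<3>.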
[Patch] PR fortran/100154 - [9/10/11/12 Regression] ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131
Now with the correct patch attached ...

Sorry for the confusion!

---

Dear Fortranners,

we need to check the arguments to the affected GNU intrinsic extensions
properly, and - as pointed out in the PR by Tobias - we need to allow
function references that have a data pointer result.  Also the argument
names of the character arguments of the subroutine versions needed a
fix ("c" instead of "count").

Regtested on x86_64-pc-linux-gnu.  OK for mainline (12)?
OK for backports after 11.1 release?

Thanks,
Harald


PR fortran/100154 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131

Add appropriate static checks for the character and status arguments to
the GNU Fortran intrinsic extensions fget[c], fput[c].  Extend variable
check to allow a function reference having a data pointer result.

gcc/fortran/ChangeLog:

        PR fortran/100154
        * check.c (variable_check): Allow function reference having a
        data pointer result.
        (arg_strlen_is_zero): New function.
        (gfc_check_fgetputc_sub): Add static check of character and
        status arguments.
        (gfc_check_fgetput_sub): Likewise.
        * intrinsic.c (add_subroutines): Fix argument name for the
        character argument to intrinsic subroutines fget[c], fput[c].

gcc/testsuite/ChangeLog:

        PR fortran/100154
        * gfortran.dg/pr100154.f90: New test.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 82db8e4e1b2..1d30c93df82 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1055,6 +1055,13 @@ variable_check (gfc_expr *e, int n, bool allow_proc)
       return true;
     }
 
+  /* F2018:R902: function reference having a data pointer result.  */
+  if (e->expr_type == EXPR_FUNCTION
+      && e->symtree->n.sym->attr.flavor == FL_PROCEDURE
+      && e->symtree->n.sym->attr.function
+      && e->symtree->n.sym->attr.pointer)
+    return true;
+
   gfc_error ("%qs argument of %qs intrinsic at %L must be a variable",
              gfc_current_intrinsic_arg[n]->name,
              gfc_current_intrinsic, &e->where);
@@ -5689,6 +5691,19 @@ gfc_check_spread (gfc_expr *source, gfc_expr *dim, gfc_expr *ncopies)
 /* Functions for checking FGETC, FPUTC, FGET and FPUT (subroutines and
    functions).  */
 
+bool
+arg_strlen_is_zero (gfc_expr *c, int n)
+{
+  if (gfc_var_strlen (c) == 0)
+    {
+      gfc_error ("%qs argument of %qs intrinsic at %L must have "
+                 "length at least 1", gfc_current_intrinsic_arg[n]->name,
+                 gfc_current_intrinsic, &c->where);
+      return true;
+    }
+  return false;
+}
+
 bool
 gfc_check_fgetputc_sub (gfc_expr *unit, gfc_expr *c, gfc_expr *status)
 {
@@ -5702,13 +5717,19 @@ gfc_check_fgetputc_sub (gfc_expr *unit, gfc_expr *c, gfc_expr *status)
     return false;
   if (!kind_value_check (c, 1, gfc_default_character_kind))
     return false;
+  if (strcmp (gfc_current_intrinsic, "fgetc") == 0
+      && !variable_check (c, 1, false))
+    return false;
+  if (arg_strlen_is_zero (c, 1))
+    return false;
 
   if (status == NULL)
     return true;
 
   if (!type_check (status, 2, BT_INTEGER)
       || !kind_value_check (status, 2, gfc_default_integer_kind)
-      || !scalar_check (status, 2))
+      || !scalar_check (status, 2)
+      || !variable_check (status, 2, false))
     return false;
 
   return true;
@@ -5729,13 +5750,19 @@ gfc_check_fgetput_sub (gfc_expr *c, gfc_expr *status)
     return false;
   if (!kind_value_check (c, 0, gfc_default_character_kind))
     return false;
+  if (strcmp (gfc_current_intrinsic, "fget") == 0
+      && !variable_check (c, 0, false))
+    return false;
+  if (arg_strlen_is_zero (c, 0))
+    return false;
 
   if (status == NULL)
     return true;
 
   if (!type_check (status, 1, BT_INTEGER)
       || !kind_value_check (status, 1, gfc_default_integer_kind)
-      || !scalar_check (status, 1))
+      || !scalar_check (status, 1)
+      || !variable_check (status, 1, false))
     return false;
 
   return true;
diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index 17fd92eb462..219f04f2317 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -3460,7 +3460,7 @@ add_subroutines (void)
   /* Argument names.  These are used as argument keywords and so need to
      match the documentation.  Please keep this list in sorted order.  */
   static const char
-    *a = "a", *c = "count", *cm = "count_max", *com = "command",
+    *a = "a", *c_ = "c", *c = "count", *cm = "count_max", *com = "command",
     *cr = "count_rate", *dt = "date", *errmsg = "errmsg", *f = "from",
     *fp = "frompos", *gt = "get", *h = "harvest", *han = "handler",
     *length = "length", *ln = "len", *md = "mode", *msk = "mask",
@@ -3840,12 +3840,12 @@ add_subroutines (void)
   add_sym_3s ("fgetc", GFC_ISYM_FGETC, CLASS_IMPURE, BT_UNKNOWN, 0,
               GFC_STD_GNU, gfc_check_fgetputc_sub, NULL, gfc_resolve_fgetc_sub,
               ut, BT_INTEGER, di, REQUIRED, INTENT_IN,
-              c, BT_CHARACTER, dc, REQUIRED, INTENT_OUT,
+              c_, BT_CHARACTER, dc, REQUIRED,
[Bug fortran/100154] [9/10/11/12 Regression] ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100154

anlauf at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |rejects-valid

--- Comment #9 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2021-April/055977.html
[Patch] PR fortran/100154 - [9/10/11/12 Regression] ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131
Dear Fortranners,

we need to check the arguments to the affected GNU intrinsic extensions
properly, and - as pointed out in the PR by Tobias - we need to allow
function references that have a data pointer result.  Also the argument
names of the character arguments of the subroutine versions needed a
fix ("c" instead of "count").

Regtested on x86_64-pc-linux-gnu.  OK for mainline (12)?
OK for backports after 11.1 release?

Thanks,
Harald


PR fortran/100154 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131

Add appropriate static checks for the character and status arguments to
the GNU Fortran intrinsic extensions fget[c], fput[c].  Extend variable
check to allow a function reference having a data pointer result.

gcc/fortran/ChangeLog:

        PR fortran/100154
        * check.c (variable_check): Allow function reference having a
        data pointer result.
        (arg_strlen_is_zero): New function.
        (gfc_check_fgetputc_sub): Add static check of character and
        status arguments.
        (gfc_check_fgetput_sub): Likewise.
        * intrinsic.c (add_subroutines): Fix argument name for the
        character argument to intrinsic subroutines fget[c], fput[c].

gcc/testsuite/ChangeLog:

        PR fortran/100154
        * gfortran.dg/pr100154.f90: New test.

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 82db8e4e1b2..1d30c93df82 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -5689,6 +5691,19 @@ gfc_check_spread (gfc_expr *source, gfc_expr *dim, gfc_expr *ncopies)
 /* Functions for checking FGETC, FPUTC, FGET and FPUT (subroutines and
    functions).  */
 
+bool
+arg_strlen_is_zero (gfc_expr *c, int n)
+{
+  if (gfc_var_strlen (c) == 0)
+    {
+      gfc_error ("%qs argument of %qs intrinsic at %L must have "
+                 "length at least 1", gfc_current_intrinsic_arg[n]->name,
+                 gfc_current_intrinsic, &c->where);
+      return true;
+    }
+  return false;
+}
+
 bool
 gfc_check_fgetputc_sub (gfc_expr *unit, gfc_expr *c, gfc_expr *status)
 {
@@ -5702,13 +5717,19 @@ gfc_check_fgetputc_sub (gfc_expr *unit, gfc_expr *c, gfc_expr *status)
     return false;
   if (!kind_value_check (c, 1, gfc_default_character_kind))
     return false;
+  if (strcmp (gfc_current_intrinsic, "fgetc") == 0
+      && !variable_check (c, 1, false))
+    return false;
+  if (arg_strlen_is_zero (c, 1))
+    return false;
 
   if (status == NULL)
     return true;
 
   if (!type_check (status, 2, BT_INTEGER)
       || !kind_value_check (status, 2, gfc_default_integer_kind)
-      || !scalar_check (status, 2))
+      || !scalar_check (status, 2)
+      || !variable_check (status, 2, false))
     return false;
 
   return true;
@@ -5729,13 +5750,19 @@ gfc_check_fgetput_sub (gfc_expr *c, gfc_expr *status)
     return false;
   if (!kind_value_check (c, 0, gfc_default_character_kind))
     return false;
+  if (strcmp (gfc_current_intrinsic, "fget") == 0
+      && !variable_check (c, 0, false))
+    return false;
+  if (arg_strlen_is_zero (c, 0))
+    return false;
 
   if (status == NULL)
     return true;
 
   if (!type_check (status, 1, BT_INTEGER)
       || !kind_value_check (status, 1, gfc_default_integer_kind)
-      || !scalar_check (status, 1))
+      || !scalar_check (status, 1)
+      || !variable_check (status, 1, false))
     return false;
 
   return true;
diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index 17fd92eb462..219f04f2317 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -3460,7 +3460,7 @@ add_subroutines (void)
   /* Argument names.  These are used as argument keywords and so need to
      match the documentation.  Please keep this list in sorted order.  */
   static const char
-    *a = "a", *c = "count", *cm = "count_max", *com = "command",
+    *a = "a", *c_ = "c", *c = "count", *cm = "count_max", *com = "command",
     *cr = "count_rate", *dt = "date", *errmsg = "errmsg", *f = "from",
     *fp = "frompos", *gt = "get", *h = "harvest", *han = "handler",
     *length = "length", *ln = "len", *md = "mode", *msk = "mask",
@@ -3840,12 +3840,12 @@ add_subroutines (void)
   add_sym_3s ("fgetc", GFC_ISYM_FGETC, CLASS_IMPURE, BT_UNKNOWN, 0,
               GFC_STD_GNU, gfc_check_fgetputc_sub, NULL, gfc_resolve_fgetc_sub,
               ut, BT_INTEGER, di, REQUIRED, INTENT_IN,
-              c, BT_CHARACTER, dc, REQUIRED, INTENT_OUT,
+              c_, BT_CHARACTER, dc, REQUIRED, INTENT_OUT,
               st, BT_INTEGER, di, OPTIONAL, INTENT_OUT);
 
   add_sym_2s ("fget", GFC_ISYM_FGET, CLASS_IMPURE, BT_UNKNOWN, 0,
               GFC_STD_GNU, gfc_check_fgetput_sub, NULL, gfc_resolve_fget_sub,
-              c, BT_CHARACTER, dc, REQUIRED, INTENT_OUT,
+              c_, BT_CHARACTER, dc, REQUIRED, INTENT_OUT,
               st, BT_INTEGER, di, OPTIONAL, INTENT_OUT);
 
   add_sym_1s ("flush", GFC_ISYM_FLUSH, CLASS_IMPURE, BT_UNKNOWN, 0, GFC_STD_GNU,
@@ -3855,12 +3855,12 @@ add_subroutines (void)
   add_sym_3s ("fputc", GFC_ISYM_FPUTC, CLASS_IMPURE, BT_UNKNOWN, 0,
               GFC_STD_GNU, gfc_check_fgetputc_sub, NULL,
Re: [PATCH] Use STATIC_ASSERT for OVL_OP_MAX.
On 4/22/21 9:41 AM, Jonathan Wakely wrote: On Thu, 22 Apr 2021 at 15:59, Martin Sebor wrote: On 4/22/21 2:52 AM, Jonathan Wakely wrote: On Thu, 22 Apr 2021, 08:47 Martin Liška, wrote: On 4/21/21 6:11 PM, Martin Sebor wrote: > On 4/21/21 2:15 AM, Martin Liška wrote: >> Hello. >> >> It's addressing the following Clang warning: >> cp/lex.c:170:45: warning: result of comparison of constant 64 with expression of type 'enum ovl_op_code' is always true [-Wtautological-constant-out-of-range-compare] >> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. >> >> Ready to be installed? >> Thanks, >> Martin >> >> gcc/cp/ChangeLog: >> >> * cp-tree.h (STATIC_ASSERT): Prefer static assert. >> * lex.c (init_operators): Remove run-time check. >> --- >> gcc/cp/cp-tree.h | 3 +++ >> gcc/cp/lex.c | 2 -- >> 2 files changed, 3 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h >> index 81ff375f8a5..a8f72448ea9 100644 >> --- a/gcc/cp/cp-tree.h >> +++ b/gcc/cp/cp-tree.h >> @@ -5916,6 +5916,9 @@ enum ovl_op_code { >> OVL_OP_MAX >> }; >> +/* Make sure it fits in lang_decl_fn::operator_code. */ >> +STATIC_ASSERT (OVL_OP_MAX < (1 << 6)); >> + > > I wonder if there's a way to test this directly by something like > > static_assert (number-of-bits (ovl_op_info_t::ovl_op_code) > <= number-of-bits (lang_decl_fn::operator_code)); Good point, but I'm not aware of it. Maybe C++ people can chime in? ovl_op_code is an unscoped enumeration (meaning "enum" not "enum class") with no fixed underlying type (i.e. no enum-base like ": int" or ": long" is specified) which means that the number of bits in is value representation is the number of bits needed to represent the minimum and maximum enumerators: "the values of the enumeration are the values representable by a hypothetical integer type with minimal width M such that all enumerators can be represented." 
There is no function/utility like number-of-bits that can tell you that from the type though. You could use std::underlying_type::type to get the integral type that the compiler used to represent it, but that will probably be 'int' in this case and so all it tells you is an upper bound of no more than 32 bits, which is not useful for this purpose.

I suspected there wasn't a function like that. Thanks for confirming it. I wrote the one below just to see if it could be done. It works for one bit-field but I can't think of a way to generalize it. We'd probably need a built-in for that. Perhaps one might be useful.

enum E { e = 5 };
struct A { E e: 3; };

constexpr int number_of_bits ()
{
  A a = { };
  a.e = (E)-1;
  int n = 0;
  for (; a.e; ++n)
    a.e = (E)((unsigned)a.e ^ (1 << n));
  return n;
}

Martin

Or:

enum E { e = 5 };
struct A { E e: 3; };

constexpr int number_of_bits ()
{
  A a = { };
  a.e = (E)-1;
  return 32 - __builtin_clz(a.e);
}

I had the same thought about using clz. It works in this case but not if one of the enumerators is negative, or if the underlying type is signed.

But you can't get the number-of-bits needed for all the values of the enum E, which is what I was referring to. If you know the enumerators go from 0 to MAX (as is the case for ovl_op_code) you can use (32 - __builtin_clz(MAX)) there too, but in the general case you don't always know the maximum enumerator without checking, and it depends whether the enumeration has a fixed underlying type.

That might be another useful query to add a built-in or trait for to improve introspection: get the min and max enumerator (or even all of them, e.g., as an initializer_list or something like that).

Martin
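For the 0-to-MAX case discussed in this thread, the width check can be written without any built-in. The following is only a sketch under stated assumptions: `op_code`, `OP_MAX`, `op_bits` and `bits_for_max` are hypothetical stand-ins (not names from cp-tree.h), and all enumerators are assumed to be non-negative.

```cpp
#include <cassert>

// Hypothetical stand-in for an unscoped enum whose enumerators run 0..OP_MAX.
enum op_code { OP_NONE, OP_PLUS, OP_MINUS, OP_MAX = 40 };

// Hypothetical width of the bit-field that would store an op_code.
constexpr int op_bits = 6;

// Number of bits needed to represent every value in [0, max].
// Plays the role of 32 - __builtin_clz (max), but is also defined for
// max == 0, where __builtin_clz has an undefined result.
constexpr int bits_for_max (unsigned max)
{
  int n = 0;
  for (; max != 0; max >>= 1)
    ++n;
  return n;
}

static_assert (bits_for_max (OP_MAX) <= op_bits,
               "op_code must fit in the bit-field");
```

This needs C++14 or later for the loop in a constexpr function. It does not answer the general question raised above, since the maximum enumerator still has to be named by hand.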
[Bug fortran/100218] Allow target of the pointer resulting from the evaluation of function-reference in a variable definition context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100218 --- Comment #1 from anlauf at gcc dot gnu.org --- Submitted: https://gcc.gnu.org/pipermail/fortran/2021-April/055976.html
[Patch] PR fortran/100218 - target of pointer from evaluation of function-reference
Dear Fortranners, while analyzing a different PR (PR100154), Tobias pointed out that the target of a pointer from the evaluation of function-reference is allowed to be used in a variable definition context and thus as an actual argument to a function or subroutine. This seems to be a more general issue that seems to have been overlooked. The attached simple patch allows to compile and run the attached example, which is by the way already yet rejected with -std=f2003. Regtested on x86_64-pc-linux-gnu. OK for mainline? Shall we backport this to (at least) 11? Thanks, Harald Fortran - allow target of pointer from evaluation of function-reference Fortran allows the target of a pointer from the evaluation of a function-reference in a variable definition context (e.g. F2018:R902). gcc/fortran/ChangeLog: PR fortran/100218 * expr.c (gfc_check_vardef_context): Extend check to allow pointer from a function reference. gcc/testsuite/ChangeLog: PR fortran/100218 * gfortran.dg/pr100218.f90: New test. diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index 92a6700568d..696b9f1daac 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -6121,7 +6132,9 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj, } if (!pointer && sym->attr.flavor != FL_VARIABLE && !(sym->attr.flavor == FL_PROCEDURE && sym == sym->result) - && !(sym->attr.flavor == FL_PROCEDURE && sym->attr.proc_pointer)) + && !(sym->attr.flavor == FL_PROCEDURE && sym->attr.proc_pointer) + && !(sym->attr.flavor == FL_PROCEDURE + && sym->attr.function && sym->attr.pointer)) { if (context) gfc_error ("%qs in variable definition context (%s) at %L is not" diff --git a/gcc/testsuite/gfortran.dg/pr100218.f90 b/gcc/testsuite/gfortran.dg/pr100218.f90 new file mode 100644 index 000..62b18f6a935 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr100218.f90 @@ -0,0 +1,19 @@ +! { dg-do run } +! { dg-options "-O2 -std=f2008" } +! 
PR fortran/100218 - target of pointer from evaluation of function-reference + +program p + implicit none + integer, target :: z = 0 + call g (f ()) + if (z /= 1) stop 1 +contains + function f () result (r) +integer, pointer :: r +r => z + end function f + subroutine g (x) +integer, intent(out) :: x +x = 1 + end subroutine g +end program p
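The expr.c hunk in this patch can be read as one more exemption in the condition that guards the error branch. The sketch below is a heavily simplified model, written in C++ for illustration: `Flavor`, `SymAttr` and `takes_error_branch` are hypothetical names, not gfortran's real data structures, and the real gfc_check_vardef_context performs many other checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for gfortran's symbol attributes.
enum class Flavor { Variable, Procedure, Other };

struct SymAttr
{
  Flavor flavor;
  bool is_own_result;   // models `sym == sym->result`
  bool proc_pointer;    // models `sym->attr.proc_pointer`
  bool function;        // models `sym->attr.function`
  bool pointer;         // models `sym->attr.pointer`
};

// True when the patched condition would enter the error branch.
bool takes_error_branch (bool pointer_context, const SymAttr &s)
{
  const bool is_proc = s.flavor == Flavor::Procedure;
  return !pointer_context
         && s.flavor != Flavor::Variable
         && !(is_proc && s.is_own_result)
         && !(is_proc && s.proc_pointer)
         // Added by the patch: a function with a data pointer result.
         && !(is_proc && s.function && s.pointer);
}
```

In this model a function whose result is a data pointer, such as `f` in the test above, no longer reaches the error, while a function with a non-pointer result still does.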
Re: State of AutoFDO in GCC
On 4/22/21 9:58 PM, Eugene Rozenfeld via Gcc wrote:
> GCC documentation for AutoFDO points to create_gcov tool that converts
> perf.data file into gcov format that can be consumed by gcc with
> -fauto-profile (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html,
> https://gcc.gnu.org/wiki/AutoFDO/Tutorial).
>
> I noticed that the source code for create_gcov has been deleted from
> https://github.com/google/autofdo on April 7. I asked about that change in
> that repo and got the following reply:
>
> https://github.com/google/autofdo/pull/107#issuecomment-819108738
>
> "Actually we didn't use create_gcov and havn't updated create_gcov for years,
> and we also didn't have enough tests to guarantee it works (It was gcc-4.8
> when we used and verified create_gcov). If you need it, it is welcomed to
> update create_gcov and add it to the respository."
>
> Does this mean that AutoFDO is currently dead in gcc?

Hello.

Yes. I know that even basic test cases have been broken in GCC for years. It's news to me that create_gcov was removed.

I'm inclined to send a patch that removes AutoFDO from GCC. I know Bin spent some time working on AutoFDO; did he come up with anything?

Martin

>
> Thanks,
>
> Eugene
>
[Bug tree-optimization/100221] New: missed optimization for dead code elimination at -O3 (vs. -O1, -Os, -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100221 Bug ID: 100221 Summary: missed optimization for dead code elimination at -O3 (vs. -O1, -Os, -O2) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhendong.su at inf dot ethz.ch Target Milestone: --- [551] % gcctk -v Using built-in specs. COLLECT_GCC=gcctk COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-trunk/configure --disable-bootstrap --prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib --with-system-zlib Thread model: posix Supported LTO compression algorithms: zlib gcc version 12.0.0 20210422 (experimental) [master revision 3cf04d1afa8:0e51007a40c:d42088e453042f4f8ba9190a7e29efd937ea2181] (GCC) [552] % [552] % gcctk -O1 -S -o O1.s small.c [553] % gcctk -O3 -S -o O3.s small.c [554] % [554] % wc O1.s O3.s 49 100 693 O1.s 73 151 1072 O3.s 122 251 1765 total [555] % [555] % grep foo O1.s [556] % grep foo O3.s callfoo [557] % [557] % cat small.c extern void foo(void); int a, b; static int c; static void f() { while (a) for (; b; b--) ; } void i() { if (c) foo(); int *g = { int **h[1] = {}; f(); } } int main() { i(); return 0; }
State of AutoFDO in GCC
GCC documentation for AutoFDO points to create_gcov tool that converts perf.data file into gcov format that can be consumed by gcc with -fauto-profile (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html, https://gcc.gnu.org/wiki/AutoFDO/Tutorial). I noticed that the source code for create_gcov has been deleted from https://github.com/google/autofdo on April 7. I asked about that change in that repo and got the following reply: https://github.com/google/autofdo/pull/107#issuecomment-819108738 "Actually we didn't use create_gcov and havn't updated create_gcov for years, and we also didn't have enough tests to guarantee it works (It was gcc-4.8 when we used and verified create_gcov). If you need it, it is welcomed to update create_gcov and add it to the respository." Does this mean that AutoFDO is currently dead in gcc? Thanks, Eugene
[PATCH] config/i386: Commentary typo fix
From: Bernhard Reutner-Fischer gcc/ChangeLog: * config/i386/x86-tune-sched-bd.c (dispatch_group): Commentary typo fix. --- gcc/config/i386/x86-tune-sched-bd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/i386/x86-tune-sched-bd.c b/gcc/config/i386/x86-tune-sched-bd.c index ad0edf713f5..be38e48b271 100644 --- a/gcc/config/i386/x86-tune-sched-bd.c +++ b/gcc/config/i386/x86-tune-sched-bd.c @@ -67,7 +67,7 @@ along with GCC; see the file COPYING3. If not see #define BIG 100 -/* Dispatch groups. Istructions that affect the mix in a dispatch window. */ +/* Dispatch groups. Instructions that affect the mix in a dispatch window. */ enum dispatch_group { disp_no_group = 0, disp_load, -- 2.31.1
[Bug c++/93083] [C++20] copy deduction rejected when doing CTAD for NTTP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93083

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mateusz.pusz at gmail dot com

--- Comment #9 from Patrick Palka ---
*** Bug 95015 has been marked as a duplicate of this bug. ***
[Bug c++/95015] Partial specializations of class templates with class NTTP fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95015

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |ppalka at gcc dot gnu.org
         Resolution|---                         |DUPLICATE

--- Comment #2 from Patrick Palka ---
(In reply to Johel Ernesto Guerrero Peña from comment #1)
> Seems fixed in trunk, with and without the deduction guide.

Ever since r11-5752, so dup of PR93083 apparently.

*** This bug has been marked as a duplicate of bug 93083 ***
[Bug c++/83818] g++ class template parameter deduction discards const qualifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83818 --- Comment #5 from Jonathan Wakely --- Ha, apparently I forgot that I'd reported the bug and fixed it myself.
[Bug c++/24666] [meta-bug] arrays decay to pointers too early
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24666
Bug 24666 depends on bug 94924, which changed state.

Bug 94924 Summary: Default equality operator for C-array compares addresses, not data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94924

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE
[Bug c++/93480] Defaulted <=> doesn't expand array elements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93480

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rhalbersma at gmail dot com

--- Comment #8 from Patrick Palka ---
*** Bug 94924 has been marked as a duplicate of this bug. ***
[Bug c++/94924] Default equality operator for C-array compares addresses, not data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94924

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #3 from Patrick Palka ---
Looks like a dup of PR93480, which has been fixed for GCC 11.

*** This bug has been marked as a duplicate of bug 93480 ***
[Bug ipa/100220] New: missed optimization for dead code elimination at -O3 (vs. -O1, -Os, -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100220 Bug ID: 100220 Summary: missed optimization for dead code elimination at -O3 (vs. -O1, -Os, -O2) Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: ipa Assignee: unassigned at gcc dot gnu.org Reporter: zhendong.su at inf dot ethz.ch CC: marxin at gcc dot gnu.org Target Milestone: --- [638] % gcctk -v Using built-in specs. COLLECT_GCC=gcctk COLLECT_LTO_WRAPPER=/local/suz-local/software/local/gcc-trunk/libexec/gcc/x86_64-pc-linux-gnu/12.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-trunk/configure --disable-bootstrap --prefix=/local/suz-local/software/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib --with-system-zlib Thread model: posix Supported LTO compression algorithms: zlib gcc version 12.0.0 20210422 (experimental) [master revision 3cf04d1afa8:0e51007a40c:d42088e453042f4f8ba9190a7e29efd937ea2181] (GCC) [639] % [639] % gcctk -O1 -S -o O1.s small.c [640] % gcctk -O3 -S -o O3.s small.c [641] % [641] % wc O1.s O3.s 62 135 857 O1.s 93 200 1337 O3.s 155 335 2194 total [642] % [642] % grep foo O1.s [643] % grep foo O3.s callfoo [644] % [644] % cat small.c extern void foo(void); int b, c, d, e, *h; static int *f = static int a() { return 1; } static void g() { if (!*f) for (; 1; d++) ; foo(); } static void i() { int j, l = 0, k[24] = {0}, *m[2] = {[4], }, n[27]; h = n; if (a() & n[0]) for (; c; c--) ; int p[8]; h = p; p[0] && (h = ); e = 0; } static void o() { int *q, **r = , ***s[1]; s[0] = i(); g(); } int main() { if (b) o(); return 0; }
[Bug c++/80990] cv-qualifiers ignored in variable definition using class template argument deduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80990

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla.gcc.karo at cupdev dot net

--- Comment #8 from Patrick Palka ---
*** Bug 83818 has been marked as a duplicate of this bug. ***
[Bug c++/83818] g++ class template parameter deduction discards const qualifier
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83818

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |DUPLICATE
                 CC|                            |ppalka at gcc dot gnu.org

--- Comment #4 from Patrick Palka ---
(In reply to Jonathan Wakely from comment #3)
> Confirmed. Already seems to be fixed on trunk.

Ever since r8-1170. Looks like this is pretty much a dup of PR80990 then.

*** This bug has been marked as a duplicate of bug 80990 ***
Re: Fix Fortran rounding issues, PR fortran/96983.
On April 22, 2021 9:09:28 PM GMT+02:00, Michael Meissner wrote: >On Wed, Apr 21, 2021 at 10:10:07AM +0200, Tobias Burnus wrote: >> On 20.04.21 08:58, Richard Biener via Fortran wrote: >> >On Mon, Apr 19, 2021 at 9:40 PM Michael Meissner via Fortran >> > wrote: >> Is there any reason to not only send the email to fortran@ _and_ >> gcc-patches@ but sending it to 13 Fortran maintainers explicitly? >(Now >> removed) > >Sorry about that. With PowerPC backend changes, I generally do >explicitly add >the maintainers so things don't get lost. > > >> >>Fix Fortran rounding issues, PR fortran/96983. >> >> >> >>Can I check this change into the GCC trunk? >> >The patch looks fine technically and is definitely an improvement >since the >> >intermediate conversion looks odd. But it might be that the >standard >> >requires such dance though the preceeding cases handled don't seem >to >> >care. I'm thinking of a FP format where round(1.6) == 3 because of >lack >> >of precision but using an intermediate FP format with higher >precision >> >would "correctly" compute 2. >> >> The patched build_round_expr is only called by ANINT / NINT; >> NINT is real → integer; ANINT is real → real >> [And the modified code is only called for NINT, reason: see comment >far below.] >> >> NINT (A[, KIND]) is described (F2018) as "Nearest integer": >> * Result Characteristics. Integer. If KIND is present, the kind type >parameter >> is that specified by the value of KIND; >> otherwise, the kind type parameter is that of default integer type. >> * The result is the integer nearest A, or if there are two >> integers equally near A, the result is whichever such integer has >the greater >> magnitude. >> * Example. NINT (2.783) has the value 3. >> >> ANINT (A[, KIND]) as "Nearest whole number": >> * The result is of type real. If KIND is present, the kind type >parameter is that >> specified by the value of KIND; otherwise, the kind type parameter >is that of A. 
>> * The result is the integer nearest A, or if there are two integers >equally near A, >> the result is whichever such integer has the greater magnitude. >> * Examples. ANINT (2.783) has the value 3.0. ANINT (−2.783) has the >value −3.0. >> >> >Of course the current code doesn't handle this correctly for the >> >case if llroundf either. >> >>I've not contributed to the Fortran front end before. If the >maintainers like >> >>the patch, can somebody point out if I need to do additional things >to commit >> >>the patch? >> Nothing special: a testcase already exists, committing is done as >usual >> and a PR to update you have as well. > >Given GCC 11 has branched, is it ok to backport the patch to the GCC 11 >branch >as well? I assume it is, since it fixes a regression in the compiler. Please wait until after 11.1 is released. Thanks, Richard.
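The rule quoted above (nearest integer, ties resolved toward the greater magnitude) matches what the C library's `lround` family does, and the concern about an intermediate floating-point format can be reproduced in a few lines. The sketch below is plain C++, not the gfortran code. It assumes IEEE 754 binary32/binary64 and a correctly rounding `lround`, and it uses a narrower intermediate type (float) purely because that makes the effect easy to show; `nint_direct`, `nint_via_float` and `just_below_half` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Round in the argument's own precision (ties away from zero).
long nint_direct (double a)
{
  return std::lround (a);
}

// Same rounding, but after an intermediate conversion to float.
long nint_via_float (double a)
{
  float narrowed = static_cast<float> (a);
  return std::lround (narrowed);
}

// Largest double below 0.5. Its nearest integer is 0, but converting it
// to float rounds it up to exactly 0.5f.
inline double just_below_half ()
{
  return std::nextafter (0.5, 0.0);
}
```

Under those assumptions, `nint_direct (just_below_half ())` gives 0 while `nint_via_float (just_below_half ())` gives 1, which is one reason a conversion through a different format before rounding deserves scrutiny.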
[Bug c++/100215] Improve text of -Wmaybe-uninitialized references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100215

Jonathan Wakely changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-04-22
           Keywords|                            |diagnostic
     Ever confirmed|0                           |1
[Bug target/100219] New: Arm/Cortex-M: Suboptimal code returning unaligned struct with non-empty stack frame
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100219 Bug ID: 100219 Summary: Arm/Cortex-M: Suboptimal code returning unaligned struct with non-empty stack frame Product: gcc Version: 10.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: matthijs at stdin dot nl Target Milestone: --- Consider the program below, which deals with functions returning a struct of two members, either using a literal value or by forwarding the return value from another function. When the struct has no alignment, this results in suboptimal code that breaks the struct (stored in a single registrer) apart into its members and reassembles them into the struct into a single register again, where it could just have done absolutely nothing. Giving the struct some alignment somehow prevents this problem from occuring. Consider this program: $ cat Foo.c struct Result { char a, b; } #if defined(ALIGN) __attribute((aligned(ALIGN)))__ #endif ; struct Result other(const int*); struct Result func1() { int x; return other(); } struct Result func2() { struct Result y = {0x12, 0x34}; return y; } struct Result func3() { return other(0); } Which produces the following code: $ arm-linux-gnueabi-gcc-10 --version arm-linux-gnueabi-gcc-10 (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0 $ arm-linux-gnueabi-gcc-10 -fno-stack-protector -mcpu=cortex-m4 -c -O3 ~/Foo.c && objdump -d Foo.o : 0: b500push{lr} 2: b083sub sp, #12 4: a801add r0, sp, #4 6: f7ff fffe bl 0 a: 4603mov r3, r0 c: b2dauxtbr2, r3 e: 2000movsr0, #0 10: f362 0007 bfi r0, r2, #0, #8 14: f3c3 2307 ubfxr3, r3, #8, #8 18: f363 200f bfi r0, r3, #8, #8 1c: b003add sp, #12 1e: f85d fb04 ldr.w pc, [sp], #4 22: bf00nop 0024 : 24: f243 4312 movwr3, #13330 ; 0x3412 28: f003 0212 and.w r2, r3, #18 2c: 2000movsr0, #0 2e: f362 0007 bfi r0, r2, #0, #8 32: 0a1blsrsr3, r3, #8 34: b082sub sp, #8 36: f363 200f bfi r0, r3, #8, #8 3a: b002add sp, #8 3c: 4770bx lr 3e: bf00nop 0040 : 40: b082sub sp, #8 42: 
2000movsr0, #0 44: b002add sp, #8 46: f7ff bffe b.w 0 4a: bf00nop Especially note func2, which correctly builds the struct using a single word literal, and then continues to break it apart and rebuild it. Note that I added -fno-stack-protector to make the generated code more consise, but the problem occurs even without this option. Somehow, the alignment influences this, since adding some alignment makes the problem disappear: $ arm-linux-gnueabi-gcc-10 -fno-stack-protector -mcpu=cortex-m4 -c -O3 ~/Foo.c -DALIGN=2 && objdump -d Foo.o Foo.o: file format elf32-littlearm Disassembly of section .text: : 0: b500push{lr} 2: b083sub sp, #12 4: a801add r0, sp, #4 6: f7ff fffe bl 0 a: b003add sp, #12 c: f85d fb04 ldr.w pc, [sp], #4 0010 : 10: f243 4012 movwr0, #13330 ; 0x3412 14: 4770bx lr 16: bf00nop 0018 : 18: 2000movsr0, #0 1a: f7ff bffe b.w 0 1e: bf00nop Other things I've observed: - When using ALIGN=2 or ALIGN=4, the problem disappears as shown above. ALIGN=1 is equivalent to no alignment. Using ALIGN=8 also makes the problem disappear, but it seams this cause the return value to be passed in memory, rather than in r0 directly. - Using -mcpu=arm8, or arm7tdmi, or some other arm cpus I tried, the problem disappears. With all cortex variants I tried the problem stays, though sometimes it seems slightly less severe. - I could not reproduce this on x86_64. - Using a struct with just 1 char, the problem disappears. - Using a struct with 4 chars, the problem stays (and becomes more pronounced because there's more work to rebuild the struct). - Using a struct with 2 shorts, the problem disappears for func2, but stays for func1. - Writing something equivalent in C++, the problem also appears (I originally saw this problem in C++ and then tried reproducing in C). - When
[Bug libstdc++/100187] Helper lambda in ranges_algo.h misses forwarding return type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100187

Patrick Palka changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
                 CC|                            |ppalka at gcc dot gnu.org
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-04-22

--- Comment #4 from Patrick Palka ---
Confirmed.
Re: Fix Fortran rounding issues, PR fortran/96983.
On Wed, Apr 21, 2021 at 10:10:07AM +0200, Tobias Burnus wrote: > On 20.04.21 08:58, Richard Biener via Fortran wrote: > >On Mon, Apr 19, 2021 at 9:40 PM Michael Meissner via Fortran > > wrote: > Is there any reason to not only send the email to fortran@ _and_ > gcc-patches@ but sending it to 13 Fortran maintainers explicitly? (Now > removed) Sorry about that. With PowerPC backend changes, I generally do explicitly add the maintainers so things don't get lost. > >>Fix Fortran rounding issues, PR fortran/96983. > >> > >>Can I check this change into the GCC trunk? > >The patch looks fine technically and is definitely an improvement since the > >intermediate conversion looks odd. But it might be that the standard > >requires such dance though the preceeding cases handled don't seem to > >care. I'm thinking of a FP format where round(1.6) == 3 because of lack > >of precision but using an intermediate FP format with higher precision > >would "correctly" compute 2. > > The patched build_round_expr is only called by ANINT / NINT; > NINT is real → integer; ANINT is real → real > [And the modified code is only called for NINT, reason: see comment far > below.] > > NINT (A[, KIND]) is described (F2018) as "Nearest integer": > * Result Characteristics. Integer. If KIND is present, the kind type parameter > is that specified by the value of KIND; > otherwise, the kind type parameter is that of default integer type. > * The result is the integer nearest A, or if there are two > integers equally near A, the result is whichever such integer has the > greater > magnitude. > * Example. NINT (2.783) has the value 3. > > ANINT (A[, KIND]) as "Nearest whole number": > * The result is of type real. If KIND is present, the kind type parameter is > that > specified by the value of KIND; otherwise, the kind type parameter is that > of A. > * The result is the integer nearest A, or if there are two integers equally > near A, > the result is whichever such integer has the greater magnitude. 
> * Examples. ANINT (2.783) has the value 3.0. ANINT (−2.783) has the value > −3.0. > > >Of course the current code doesn't handle this correctly for the > >case if llroundf either. > >>I've not contributed to the Fortran front end before. If the maintainers > >>like > >>the patch, can somebody point out if I need to do additional things to > >>commit > >>the patch? > Nothing special: a testcase already exists, committing is done as usual > and a PR to update you have as well. Given GCC 11 has branched, is it ok to backport the patch to the GCC 11 branch as well? I assume it is, since it fixes a regression in the compiler. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
[Bug fortran/100218] New: Allow target of the pointer resulting from the evaluation of function-reference in a variable definition context
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100218 Bug ID: 100218 Summary: Allow target of the pointer resulting from the evaluation of function-reference in a variable definition context Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: anlauf at gcc dot gnu.org Target Milestone: --- While analyzing PR100154, Tobias pointed me to the following: F2018:R902: function-reference shall have a data pointer result A variable is either the data object denoted by designator or the target of the pointer resulting from the evaluation of function-reference; this pointer shall be associated. He also gave an example in that other PR, for which I have a fix. In my interpretation the following code should thus also be valid and is accepted by two other compilers I tested (Intel, Nvidia) and gives the right result. program p implicit none integer, target :: z = 0 call g (f ()) print *, z contains function f () result (r) integer, pointer :: r r => z end function f subroutine g (x) integer, intent(out) :: x x = 1 end subroutine g end program p The following patch seems to help: diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index 92a6700568d..696b9f1daac 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -6121,7 +6132,9 @@ gfc_check_vardef_context (gfc_expr* e, bool pointer, bool alloc_obj, } if (!pointer && sym->attr.flavor != FL_VARIABLE && !(sym->attr.flavor == FL_PROCEDURE && sym == sym->result) - && !(sym->attr.flavor == FL_PROCEDURE && sym->attr.proc_pointer)) + && !(sym->attr.flavor == FL_PROCEDURE && sym->attr.proc_pointer) + && !(sym->attr.flavor == FL_PROCEDURE + && sym->attr.function && sym->attr.pointer)) { if (context) gfc_error ("%qs in variable definition context (%s) at %L is not"
[Bug c++/94845] DWARF function name doesn't match demangled name in base type template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94845 Tom Tromey changed: What|Removed |Added CC||tromey at gcc dot gnu.org --- Comment #6 from Tom Tromey --- gdb does this canonicalization precisely because the form in the DWARF cannot be relied upon. It would be great to remove this, because it is expensive. One idea for a migration route would be for g++ to promise to emit the same form that the demangler emits; then add an attribute to the comp-unit DIE saying that the names have been canonicalized. (Or, I suppose gdb could use producer sniffing; but I'd rather avoid that as much as possible.)
[Bug inline-asm/98847] Miscompilation with c++17, templates, and register keyword
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98847 --- Comment #12 from programmerjake at gmail dot com --- (In reply to Jakub Jelinek from comment #11) > Fixed. Thanks!!
[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #10 from Uroš Bizjak --- (In reply to Uroš Bizjak from comment #9) > (In reply to Jakub Jelinek from comment #8) > > I think there are 8 those peephole2s rather than just 4 (I've been looking > > for > > rtx_equal_p (XEXP.*, 0) in sync.md > > No, the other are not problematic. Actually, you are right. Those other peephole2 sequences also write to the memory and it is assumed, that the memory is not accessed outside the sequence. Additional patch follows: --cut here-- diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md index c7c508c8de8..c95cf50970e 100644 --- a/gcc/config/i386/sync.md +++ b/gcc/config/i386/sync.md @@ -231,7 +231,8 @@ "!TARGET_64BIT && peep2_reg_dead_p (2, operands[0]) && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" - [(set (match_dup 3) (match_dup 5))] + [(set (match_dup 3) (match_dup 5)) + (set (match_dup 4) (match_dup 3))] "operands[5] = gen_lowpart (DFmode, operands[1]);") (define_peephole2 @@ -251,6 +252,7 @@ [(const_int 0)] { emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1])); + emit_move_insn (operands[4], operands[3]); emit_insn (gen_memory_blockage ()); DONE; }) @@ -267,7 +269,8 @@ "!TARGET_64BIT && peep2_reg_dead_p (2, operands[0]) && rtx_equal_p (XEXP (operands[4], 0), XEXP (operands[2], 0))" - [(set (match_dup 3) (match_dup 5))] + [(set (match_dup 3) (match_dup 5)) + (set (match_dup 4) (match_dup 3))] "operands[5] = gen_lowpart (DFmode, operands[1]);") (define_peephole2 @@ -287,6 +290,7 @@ [(const_int 0)] { emit_move_insn (operands[3], gen_lowpart (DFmode, operands[1])); + emit_move_insn (operands[4], operands[3]); emit_insn (gen_memory_blockage ()); DONE; }) --cut here--
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 --- Comment #1 from Jakub Jelinek --- Bet the new s390_md_asm_adjust code is unprepared to see hard registers there.
[Bug target/100217] [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |11.0
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
                 CC|                            |iii at gcc dot gnu.org,
                   |                            |krebbel at gcc dot gnu.org
   Last reconfirmed|                            |2021-04-22
[Bug target/100217] New: [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100217 Bug ID: 100217 Summary: [11/12 Regression] ICE when building valgrind testsuite with -march=z14 since r11-7552 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: jakub at gcc dot gnu.org Target Milestone: --- The following testcase reduced from valgrind ICEs on s390x with -mlong-double-128 -march=z14 (with -O2 too): void foo (void) { register long double f0 asm ("f0"); f0 = 1.0L; asm("" : : "f" (f0)); } starting with r11-7552-g3cb8aab390ccf31e4581863b080db30c6735e51e fpext.i: In function ‘foo’: fpext.i:6:3: internal compiler error: in gen_rtx_SUBREG, at emit-rtl.c:1023 6 | asm("" : : "f" (f0)); | ^~~ 0xcbf9c2 gen_rtx_SUBREG(machine_mode, rtx_def*, poly_int<1u, unsigned long>) ../../gcc/emit-rtl.c:1023 0x1c0a54f gen_tf_to_fprx2(rtx_def*, rtx_def*) /usr/src/gcc-test/objz2/gcc/insn-emit.c:4851 0x1725bcb s390_md_asm_adjust ../../gcc/config/s390/s390.c:16786 0xb88cf4 expand_asm_stmt ../../gcc/cfgexpand.c:3424 0xb8a308 expand_gimple_stmt_1 ../../gcc/cfgexpand.c:3841 0xb8a9af expand_gimple_stmt ../../gcc/cfgexpand.c:4008 0xb92fad expand_gimple_basic_block ../../gcc/cfgexpand.c:6045 0xb94da2 execute ../../gcc/cfgexpand.c:6729
[Bug fortran/82376] Duplicate function call using -fcheck=pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82376 --- Comment #2 from José Rui Faustino de Sousa --- Patch posted: https://gcc.gnu.org/pipermail/fortran/2021-April/055973.html
[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #9 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #8) > I think there are 8 those peephole2s rather than just 4 (I've been looking > for > rtx_equal_p (XEXP.*, 0) in sync.md No, the other are not problematic.
[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 --- Comment #8 from Jakub Jelinek --- I think there are 8 those peephole2s rather than just 4 (I've been looking for rtx_equal_p (XEXP.*, 0) in sync.md
[Bug target/100216] arm: UB in arm_canonicalize_comparison (shift exponent 127 is too large for 64-bit type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100216 --- Comment #1 from Alex Coplan --- GCC built with UBSan here, to be clear.
[Bug target/100216] New: arm: UB in arm_canonicalize_comparison (shift exponent 127 is too large for 64-bit type)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100216 Bug ID: 100216 Summary: arm: UB in arm_canonicalize_comparison (shift exponent 127 is too large for 64-bit type) Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: --- The following fails: $ cat test.c int *a, *b; void c() { int d; for (; d; d++) b = a[d] ? b : a; } $ gcc/xgcc -B gcc test.c -c -march=armv8-a+simd -mfloat-abi=hard -O3 /data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:5532:30: runtime error: shift exponent 127 is too large for 64-bit type 'long unsigned int' #0 0x2369893 in arm_canonicalize_comparison(int*, rtx_def**, rtx_def**, bool) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2369893) #1 0x3137b28 in target_canonicalize_comparison(rtx_code*, rtx_def**, rtx_def**, bool) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x3137b28) #2 0x3198fe1 in simplify_comparison(rtx_code, rtx_def**, rtx_def**) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x3198fe1) #3 0x31601ed in combine_simplify_rtx(rtx_def*, machine_mode, int, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x31601ed) #4 0x315a3fa in subst(rtx_def*, rtx_def*, rtx_def*, int, int, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x315a3fa) #5 0x3159d87 in subst(rtx_def*, rtx_def*, rtx_def*, int, int, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x3159d87) #6 0x3159d87 in subst(rtx_def*, rtx_def*, rtx_def*, int, int, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x3159d87) #7 0x31489f9 in try_combine(rtx_insn*, rtx_insn*, rtx_insn*, rtx_insn*, int*, rtx_insn*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x31489f9) #8 0x313c027 in combine_instructions(rtx_insn*, unsigned int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x313c027) #9 0x31a440c in rest_of_handle_combine() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x31a440c) #10 0x31a453c in (anonymous namespace)::pass_combine::execute(function*) 
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x31a453c) #11 0x17d2355 in execute_one_pass(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2355) #12 0x17d2b6e in execute_pass_list_1(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2b6e) #13 0x17d2be5 in execute_pass_list_1(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2be5) #14 0x17d2c65 in execute_pass_list(function*, opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2c65) #15 0xdf27ac in cgraph_node::expand() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf27ac) #16 0xdf370f in expand_all_functions() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf370f) #17 0xdf4e10 in symbol_table::compile() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf4e10) #18 0xdf55ea in symbol_table::finalize_compilation_unit() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf55ea) #19 0x1ac3baa in compile_file() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac3baa) #20 0x1ac8a15 in do_compile() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8a15) #21 0x1ac8f10 in toplev::main(int, char**) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8f10) #22 0x36a5ee7 in main (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x36a5ee7) #23 0x75ca1bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6) #24 0x980249 in _start (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x980249)
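For readers unfamiliar with this class of UBSan report: shifting a 64-bit value by 64 or more bits is undefined behaviour in C, which is what "shift exponent 127 is too large for 64-bit type" reports. The sketch below is not the code in arm.c and not a proposed fix. It only illustrates one way to guard such a shift, assuming the shift amount is derived from a bit width that can exceed 64. The helper name `signed_max_for_bits` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: store the largest signed value
   representable in BITS bits into *OUT.  Widths that a 64-bit shift
   cannot handle are rejected instead of being shifted.  */
static bool
signed_max_for_bits (unsigned bits, uint64_t *out)
{
  if (bits == 0 || bits > 64)
    return false;		/* A shift by bits - 1 >= 64 would be UB.  */
  *out = (UINT64_C (1) << (bits - 1)) - 1;
  return true;
}
```

With a 128-bit width the helper returns false rather than evaluating a shift by 127, so the caller has to decide explicitly how to treat wide modes.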
[committed] c++: Add testcase for already fixed PR [PR16617]
We correctly diagnose the invalid access since r11-1350. gcc/testsuite/ChangeLog: PR c++/16617 * g++.dg/template/access36.C: New test. --- gcc/testsuite/g++.dg/template/access36.C | 25 1 file changed, 25 insertions(+) create mode 100644 gcc/testsuite/g++.dg/template/access36.C diff --git a/gcc/testsuite/g++.dg/template/access36.C b/gcc/testsuite/g++.dg/template/access36.C new file mode 100644 index 000..72ca23c7017 --- /dev/null +++ b/gcc/testsuite/g++.dg/template/access36.C @@ -0,0 +1,25 @@ +// PR c++/16617 + +class B +{ + protected: + int i; +}; + +template void fr (); + +class D2 : public B +{ + friend void fr (); +}; + +template struct X +{}; + +template void fr () +{ + X<::i> x1; // { dg-error "protected" } + X<::i> x2; // { dg-error "protected" } +} + +template void fr(); -- 2.31.1.362.g311531c9de
[Bug target/100182] [8/9/10/11/12 Regression] Miscompilation of atomic_float/1.cc and atomic_float/wait_notify.cc on i686
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182 Uroš Bizjak changed: What|Removed |Added Status|NEW |ASSIGNED CC|uros at gcc dot gnu.org| Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com --- Comment #7 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #1) > In this particular case it is the sync.md:398 peephole2: > (define_peephole2 > [(set (match_operand:DF 0 "memory_operand") > (match_operand:DF 1 "any_fp_register_operand")) >(set (mem:BLK (scratch:SI)) > (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE)) >(set (match_operand:DF 2 "fp_register_operand") > (unspec:DF [(match_operand:DI 3 "memory_operand")] >UNSPEC_FILD_ATOMIC)) >(set (match_operand:DI 4 "memory_operand") > (unspec:DI [(match_dup 2)] >UNSPEC_FIST_ATOMIC))] > "!TARGET_64BIT >&& peep2_reg_dead_p (4, operands[2]) >&& rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" > [(const_int 0)] > { > emit_insn (gen_memory_blockage ()); > emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]); > DONE; > }) > that triggers here but from what I can read, all the r7-1112 peephole2s > optimize away stores to some memory on the assumption that the memory is > read only once (in another insn matched by the same peephole2). > I'm not 100% sure if we can rely for it on spill slots for which r7-112 > seems to have been written, but for other memory we'd need to prove that the > memory is dead. > Rather than removing those peephole2s altogether, I wonder if we just > shouldn't check that the memory_operand which we'd optimize away stores to > aren't spill slots. Actually, these peepholes are too eager and also remove the store to the memory operand 0 on the assumption that the operand is used only in the peephole2 sequence. As shown in the testcase, this is not always true, and operand 0 can be accessed also after the peephole2'd sequence. The solution is to not remove the store to operand 0. 
Some unneeded stores will probably be left in the code, but IMO this is a small price to pay for correctness. And we still remove the fild/fistp pair. I'm testing the following patch: --cut here-- diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md index c7c508c8de8..538d1f89497 100644 --- a/gcc/config/i386/sync.md +++ b/gcc/config/i386/sync.md @@ -392,7 +392,8 @@ "!TARGET_64BIT && peep2_reg_dead_p (3, operands[2]) && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" - [(set (match_dup 5) (match_dup 1))] + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 5) (match_dup 1))] "operands[5] = gen_lowpart (DFmode, operands[4]);") (define_peephole2 @@ -411,6 +412,7 @@ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" [(const_int 0)] { + emit_move_insn (operands[0], operands[1]); emit_insn (gen_memory_blockage ()); emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]); DONE; @@ -428,7 +430,8 @@ "!TARGET_64BIT && peep2_reg_dead_p (3, operands[2]) && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" - [(set (match_dup 5) (match_dup 1))] + [(set (match_dup 0) (match_dup 1)) + (set (match_dup 5) (match_dup 1))] "operands[5] = gen_lowpart (DFmode, operands[4]);") (define_peephole2 @@ -447,6 +450,7 @@ && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))" [(const_int 0)] { + emit_move_insn (operands[0], operands[1]); emit_insn (gen_memory_blockage ()); emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]); DONE; --cut here--
[committed] c++: Add testcase for already fixed PR [PR84689]
We correctly accept this testcase since r11-1638. gcc/testsuite/ChangeLog: PR c++/84689 * g++.dg/cpp0x/sfinae67.C: New test. --- gcc/testsuite/g++.dg/cpp0x/sfinae67.C | 20 1 file changed, 20 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp0x/sfinae67.C diff --git a/gcc/testsuite/g++.dg/cpp0x/sfinae67.C b/gcc/testsuite/g++.dg/cpp0x/sfinae67.C new file mode 100644 index 000..cfed92ad472 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/sfinae67.C @@ -0,0 +1,20 @@ +// PR c++/84689 +// { dg-do compile { target c++11 } } + +struct base { + void operator()(); +}; + +struct a : base { }; +struct b : base { }; + +struct f : a, b { + using a::operator(); + using b::operator(); +}; + +template auto g(int) -> decltype(T()()); +template auto g(...) -> int; + +using type = decltype(g(0)); +using type = int; -- 2.31.1.362.g311531c9de
[Bug c++/84689] is_invocable is true even for call operator via ambiguous base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84689 Patrick Palka changed: What|Removed |Added Target Milestone|--- |11.0 Status|NEW |RESOLVED Resolution|--- |FIXED CC||ppalka at gcc dot gnu.org --- Comment #5 from Patrick Palka --- Apparently fixed for GCC 11.
[Patch, fortran] PR fortran/82376 - Duplicate function call using -fcheck=pointer
Hi All! Proposed patch to: PR82376 - Duplicate function call using -fcheck=pointer Patch tested only on x86_64-pc-linux-gnu. Evaluate function result and then pass a pointer, instead of a reference to the function itself, thus avoiding multiple evaluations of the function. Thank you very much. Best regards, José Rui Fortran: Fix double function call with -fcheck=pointer [PR] gcc/fortran/ChangeLog: PR fortran/82376 * trans-expr.c (gfc_conv_procedure_call): Evaluate function result and then pass a pointer. gcc/testsuite/ChangeLog: PR fortran/82376 * gfortran.dg/PR82376.f90: New test. diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c index 213f32b0a67..b83b021755d 100644 --- a/gcc/fortran/trans-expr.c +++ b/gcc/fortran/trans-expr.c @@ -6014,11 +6014,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym, || (!e->value.function.esym && e->symtree->n.sym->attr.pointer)) && fsym && fsym->attr.target) - { - gfc_conv_expr (&parmse, e); - parmse.expr = gfc_build_addr_expr (NULL_TREE, parmse.expr); - } - + /* Make sure the function only gets called once. */ + gfc_conv_expr_reference (&parmse, e, false); else if (e->expr_type == EXPR_FUNCTION && e->symtree->n.sym->result && e->symtree->n.sym->result != e->symtree->n.sym diff --git a/gcc/testsuite/gfortran.dg/PR82376.f90 b/gcc/testsuite/gfortran.dg/PR82376.f90 new file mode 100644 index 000..cea1c2ae211 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/PR82376.f90 @@ -0,0 +1,55 @@ +! { dg-do run } +! +! Test the fix for PR82376 +! 
+ +program main_p + + integer, parameter :: n = 10 + + type :: foo_t +integer, pointer :: v =>null() + end type foo_t + + integer, save :: pcnt = 0 + + type(foo_t) :: int + integer :: i + + do i = 1, n +call init(int, i) +if(.not.associated(int%v)) stop 1 +if(int%v/=i) stop 2 +if(pcnt/=i) stop 3 + end do + +contains + + function new(data) result(this) +integer, target, intent(in) :: data + +integer, pointer :: this + +nullify(this) +this => data +pcnt = pcnt + 1 +return + end function new + + subroutine init(this, data) +type(foo_t), intent(out) :: this +integer, intent(in) :: data + +call set(this, new(data)) +return + end subroutine init + + subroutine set(this, that) +type(foo_t), intent(inout) :: this +integer, target, intent(in):: that + +this%v => that +return + end subroutine set + +end program main_p
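As a rough analogy in C for the "evaluate once, then pass a pointer" idea in the patch description above (this is not the gfortran implementation, and every name here is hypothetical): a check that textually repeats its argument calls the function twice, whereas storing the result in a temporary first calls it once. The counter plays the role of `pcnt` in the Fortran test.

```c
#include <stddef.h>

static int call_count;		/* Counts calls, like pcnt in the test.  */
static int value = 7;

/* Hypothetical function with a visible side effect.  */
static int *
new_target (void)
{
  call_count++;
  return &value;
}

/* Repeats EXPR: once for the null check and once for the use.  */
#define DEREF_CHECKED_TWICE(expr) ((expr) != NULL ? *(expr) : -1)

/* Receives an already evaluated pointer, so nothing is re-evaluated.  */
static int
deref_checked_once (int *p)
{
  return p != NULL ? *p : -1;
}

/* Returns how many times new_target ran when checked via the macro.  */
static int
calls_with_macro (void)
{
  call_count = 0;
  (void) DEREF_CHECKED_TWICE (new_target ());
  return call_count;
}

/* Returns how many times new_target ran when its result is first
   stored in a temporary and then checked.  */
static int
calls_with_temporary (void)
{
  call_count = 0;
  int *tmp = new_target ();
  (void) deref_checked_once (tmp);
  return call_count;
}
```

The second form mirrors what the test expects: the counter advances by exactly one per call to `init`.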
[Bug c++/100215] New: Improve text of -Wmaybe-uninitialized references
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100215 Bug ID: 100215 Summary: Improve text of -Wmaybe-uninitialized references Product: gcc Version: 10.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: jengelh at inai dot de Target Milestone: --- Request for enhancements on a particular warning message emitted for shoddy code. Input: ``` #include #include struct C { C() = default; C(C &&o) { memcpy(bar, o.bar, sizeof(bar)); } int foo = -1; char bar[1024]; }; int main() { C z; std::list<C>().push_back(std::move(z)); } ``` Command & observed output: ``` $ g++ -v gcc version 10.3.0 (SUSE Linux) $ g++ test.cpp -O2 -Wall t2.cpp:5:19: warning: ‘*((void*)& z +4)’ may be used uninitialized in this function [-Wmaybe-uninitialized] 5 | C(C &&o) { memcpy(bar, o.bar, sizeof(bar)); } | ~~^ ``` The warning references the original variable name, and the caret is placed in a somewhat arbitrary position. Expected output: ``` t2.cpp:5:19: warning: ‘o.bar’ may be used uninitialized in this function [-Wmaybe-uninitialized] 5 | C(C &&o) { memcpy(bar, o.bar, sizeof(bar)); } | ^~~ ``` - granted, z+4 is a result of the optimizer; it would be nice if it could show o+4 though - more on that, z+4/o+4 could be shown as z.bar/o.bar - caret be adjusted accordingly to point to the 2nd argument of memcpy
[Bug c++/13495] Friendship to class nested within a template is broken
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=13495 Bug 13495 depends on bug 16617, which changed state. Bug 16617 Summary: Fail to do access checking correctly for non-dependent qualified-id https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16617 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug c++/16617] Fail to do access checking correctly for non-dependent qualified-id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16617 Patrick Palka changed: What|Removed |Added Status|NEW |RESOLVED CC||ppalka at gcc dot gnu.org Target Milestone|--- |11.0 Resolution|--- |FIXED --- Comment #10 from Patrick Palka --- Fixed for GCC 11 by the patch for PR41437.
[Bug c++/84689] is_invocable is true even for call operator via ambiguous base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84689 --- Comment #4 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:3275f2e2af24541f55462c23af4c6530ac12c5e2 commit r12-70-g3275f2e2af24541f55462c23af4c6530ac12c5e2 Author: Patrick Palka Date: Thu Apr 22 13:32:44 2021 -0400 c++: Add testcase for already fixed PR [PR84689] We correctly accept this testcase since r11-1638. gcc/testsuite/ChangeLog: PR c++/84689 * g++.dg/cpp0x/sfinae67.C: New test.
[Bug c++/16617] Fail to do access checking correctly for non-dependent qualified-id
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=16617 --- Comment #9 from CVS Commits --- The master branch has been updated by Patrick Palka : https://gcc.gnu.org/g:330cc29c06306ebf7bd3b2d37704cc69944923ff commit r12-69-g330cc29c06306ebf7bd3b2d37704cc69944923ff Author: Patrick Palka Date: Thu Apr 22 13:32:40 2021 -0400 c++: Add testcase for already fixed PR [PR16617] We correctly diagnose the invalid access since r11-1350. gcc/testsuite/ChangeLog: PR c++/16617 * g++.dg/template/access36.C: New test.
testsuite/substr_{9,10}.f90: Move to gfortran.dg/
The test was added in r11-6687-gbdd1b1f55529da00b867ef05a53a08fbfc3d1c2e (PR93340) but placed into the wrong directory. Fixed by moving to gfortran.dg/ in commit r12-68-gac456fd981db6b0c2f7ee1ab0d17d36087a74dc2 after confirming that the tests work. Tobias - Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf commit ac456fd981db6b0c2f7ee1ab0d17d36087a74dc2 Author: Tobias Burnus Date: Thu Apr 22 19:14:58 2021 +0200 testsuite/substr_{9,10}.f90: Move to gfortran.dg/ gcc/testsuite/ * substr_9.f90: Move to ... * gfortran.dg/substr_9.f90: ... here. * substr_10.f90: Move to ... * gfortran.dg/substr_10.f90: ... here. --- gcc/testsuite/{ => gfortran.dg}/substr_10.f90 | 0 gcc/testsuite/{ => gfortran.dg}/substr_9.f90 | 0 2 files changed, 0 insertions(+), 0 deletions(-) diff --git a/gcc/testsuite/substr_10.f90 b/gcc/testsuite/gfortran.dg/substr_10.f90 similarity index 100% rename from gcc/testsuite/substr_10.f90 rename to gcc/testsuite/gfortran.dg/substr_10.f90 diff --git a/gcc/testsuite/substr_9.f90 b/gcc/testsuite/gfortran.dg/substr_9.f90 similarity index 100% rename from gcc/testsuite/substr_9.f90 rename to gcc/testsuite/gfortran.dg/substr_9.f90
[Bug c++/100055] [10/11/12 Regression] ICE on invalid requires expression
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100055 康桓瑋 changed: What|Removed |Added CC||hewillk at gmail dot com --- Comment #3 from 康桓瑋 --- I think this is a dup of PR 99465, which has several ICE forms.
[Bug target/100214] New: UB in arm.c:optimal_immediate_sequence_1 (left shift of 255 by 30 places cannot be represented in type 'int')
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100214 Bug ID: 100214 Summary: UB in arm.c:optimal_immediate_sequence_1 (left shift of 255 by 30 places cannot be represented in type 'int') Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: acoplan at gcc dot gnu.org Target Milestone: --- Bootstrapping on arm --with-build-config=bootstrap-ubsan shows the following problem: $ cat test.c double a; void b() { a += 0.1; } $ gcc/xgcc -B gcc test.c -c /data_sdb/toolchain/src/gcc/gcc/config/arm/arm.c:4745:37: runtime error: left shift of 255 by 30 places cannot be represented in type 'int' #0 0x23660cb in optimal_immediate_sequence_1(rtx_code, unsigned long, four_ints*, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x23660cb) #1 0x2365de4 in optimal_immediate_sequence(rtx_code, unsigned long, four_ints*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2365de4) #2 0x2368e75 in arm_gen_constant(rtx_code, machine_mode, rtx_def*, unsigned long, rtx_def*, rtx_def*, int, int) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2368e75) #3 0x23d4a3c in arm_const_double_inline_cost(rtx_def*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x23d4a3c) #4 0x2e0fa1e in satisfies_constraint_Da(rtx_def*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x2e0fa1e) #5 0x14bdf97 in constraint_satisfied_p(rtx_def*, constraint_num) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14bdf97) #6 0x14c4cee in record_reg_classes(int, int, rtx_def**, machine_mode*, char const**, rtx_insn*, reg_class*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14c4cee) #7 0x14cba27 in record_operand_costs(rtx_insn*, reg_class*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14cba27) #8 0x14cc4e2 in scan_one_insn(rtx_insn*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14cc4e2) #9 0x14cd5c6 in process_bb_for_costs(basic_block_def*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14cd5c6) #10 0x14cd648 in 
process_bb_node_for_costs(ira_loop_tree_node*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14cd648) #11 0x14ad82f in ira_traverse_loop_tree(bool, ira_loop_tree_node*, void (*)(ira_loop_tree_node*), void (*)(ira_loop_tree_node*)) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14ad82f) #12 0x14cdeec in find_costs_and_classes(_IO_FILE*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14cdeec) #13 0x14d28f5 in ira_costs() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14d28f5) #14 0x14b7e37 in ira_build() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14b7e37) #15 0x149f52a in ira(_IO_FILE*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x149f52a) #16 0x14a060a in (anonymous namespace)::pass_ira::execute(function*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x14a060a) #17 0x17d2355 in execute_one_pass(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2355) #18 0x17d2b6e in execute_pass_list_1(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2b6e) #19 0x17d2be5 in execute_pass_list_1(opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2be5) #20 0x17d2c65 in execute_pass_list(function*, opt_pass*) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x17d2c65) #21 0xdf27ac in cgraph_node::expand() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf27ac) #22 0xdf3a23 in cgraph_order_sort::process() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf3a23) #23 0xdf4135 in output_in_order() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf4135) #24 0xdf4e01 in symbol_table::compile() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf4e01) #25 0xdf55ea in symbol_table::finalize_compilation_unit() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0xdf55ea) #26 0x1ac3baa in compile_file() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac3baa) #27 0x1ac8a15 in do_compile() (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8a15) #28 0x1ac8f10 in toplev::main(int, char**) (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x1ac8f10) #29 0x36a5ee7 in main 
(/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x36a5ee7) #30 0x75ca1bf6 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21bf6) #31 0x980249 in _start (/data_sdb/toolchain/cc1s/ubsan-arm/gcc/cc1+0x980249)
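As background on the diagnostic itself: in C, `255 << 30` is evaluated in type `int`, and the mathematical result does not fit in a 32-bit `int`, so the behaviour is undefined. The sketch below is not taken from arm.c and is not a proposed patch. It only shows the usual way to make such a shift well defined by doing it in an unsigned type. The helper name `byte_mask_at` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not GCC code: return an 8-bit mask placed at
   bit position POS, keeping only the low 32 bits.  POS must be below
   32.  The cast makes the shift happen in an unsigned type, where
   bits shifted past the top are discarded instead of overflowing.  */
static uint32_t
byte_mask_at (unsigned pos)
{
  return (uint32_t) 0xff << pos;
}
```

For a position of 30 only the two lowest bits of the byte remain within 32 bits, so the result is 0xC0000000.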
[Bug c++/100210] [[nodiscard]] constructor causes warning on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100210 Marek Polacek changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #2 from Marek Polacek --- Yes, verified that when reverting that change the warning shows up. Thus, fixed.
[Bug c++/100210] [[nodiscard]] constructor causes warning on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100210 Marek Polacek changed: What|Removed |Added CC||mpolacek at gcc dot gnu.org --- Comment #1 from Marek Polacek --- Can't reproduce, I think it has been fixed by r11-7512.
[Bug jit/100096] libgccjit.so.0: Cannot write-enable text segment: Permission denied on NetBSD 9.1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100096 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #28 from Jakub Jelinek --- Fixed.
[Bug sanitizer/99106] [9 Regression] ICE in tree_to_poly_int64, at tree.c:3091
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99106 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Jakub Jelinek --- Fixed.
[Bug c++/99035] [9 Regression] ICE in declare_weak, at varasm.c:5930
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99035 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from Jakub Jelinek --- Fixed.
[Bug rtl-optimization/98601] [8/9 Regression] aarch64: ICE in rtx_addr_can_trap_p_1, at rtlanal.c:467
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98601 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #13 from Jakub Jelinek --- Fixed.
[Bug c/99136] ICE in gimplify_expr, at gimplify.c:14854
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99136 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #7 from Jakub Jelinek --- Fixed.
[Bug middle-end/99007] [8/9 Regression] ICE in dominated_by_p, at dominance.c:1124
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99007 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #11 from Jakub Jelinek --- Fixed.
[Bug c++/98556] [8/9 Regression] ICE: 'verify_gimple' failed since r8-4821-g1af4ebf5985ef2aa
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98556 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #12 from Jakub Jelinek --- Fixed.
[Bug c++/99033] [9 Regression] ICE in tree_to_poly_int64, at tree.c:3091
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99033 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #10 from Jakub Jelinek --- Fixed.
[Bug target/98681] [8/9 Regression] aarch64: Invalid ubfiz instruction rejected by assembler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98681 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #15 from Jakub Jelinek --- Fixed.