[r12-6606 Regression] FAIL: g++.dg/asan/asan_test.C -O2 (test for excess errors) on Linux/x86_64
On Linux/x86_64, 9d6a0f388eb048f8d87f47af78f07b5ce513bfe6 is the first bad commit commit 9d6a0f388eb048f8d87f47af78f07b5ce513bfe6 Author: Martin Sebor Date: Sat Jan 15 16:41:40 2022 -0700 Add -Wdangling-pointer [PR63272]. caused FAIL: gcc.dg/torture/pr57147-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: g++.dg/asan/asan_test.C -O2 (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6606/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c --target_board='unix{-m64\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.
On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches wrote: > > On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote: > > > > Thanks for the suggestion, here is the updated patch that survived > > bootstrap/regtest. > > LGTM for me, but please get the final approval from Hongtao. > Ok, thanks. > Thanks, > Uros. > > > > Please note reg_mentioned_p in the above condition. This function > > > returns nonzero if register op0 appears somewhere within op1 and is > > > critical for the correct operation of your patch. > > I added reg_mentioned_p for all insns except fp16 complex mult, since > > they have constraint & to the dest so it must be allocated different > > register from src. > > > > Uros Bizjak 于2022年1月14日周五 23:49写道: > > > > > > > > > > On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang > > > wrote: > > > > > > > > > Are there any technical obstacles to introduce subst to > > > > > define_{,insn_and_}split? > > > > > > > > gccint says: define_subst can be used only in define_insn and > > > > define_expand, it cannot be used in other expressions (e.g. in > > > > define_insn_and_split). > > > > > > Hm, hm ... annoying ... > > > > > > > I have no idea how to implement it in current infrastructure. > > > > > > > > > In the proposed patch, if the output register is also mentioned in the > > > > > input, then clearing before insn will clear the value in the input > > > > > register. The solution in the i386.md also takes care of this issue. > > > > > > > > > > > > > For this, I think we can add REGNO checks for operands in condition > > > > (which means there is true dependency). > > > > > > Let's go in your direction, considering the limitations of current > > > infrastructure. > > > +{ > > > + if (TARGET_DEST_FALSE_DEPENDENCY > > > + && get_attr_dest_false_dep (insn) == > > > +DEST_FALSE_DEP_TRUE) > > > +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > > > + return "vgetmant\t{%2, %1, > > > %0|%0, %1, %2}"; > > > +} > > > > > > There is no need to pass the information via attributes. IMO, you > > > shoud use subst attribute directly in the condition: > > > > > > { > > > if (TARGET_DEST_FALSE_DEPENDENCY > > > && > > > && !reg_mentioned_p (operands[0], operands[1])) > > >output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > > > return "."; > > > } > > > > > > Assume the above works, so please: > > > > > > - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe > > > following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F > > > - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond" > > > > > > Please note reg_mentioned_p in the above condition. This function > > > returns nonzero if register op0 appears somewhere within op1 and is > > > critical for the correct operation of your patch. > > > > > > Uros. -- BR, Hongtao
[PATCH] Strengthen memory memory order for atomic::wait/notify
This patch updates the memory order of atomic accesses to the waiter's count to match libc++'s usage. It should be backported to GCC11. Tested x86_64-pc-linux-gnu. From f5ed7674f86283db4f4ff49a2cc65d4f852413a1 Mon Sep 17 00:00:00 2001 From: Thomas Rodgers Date: Sat, 15 Jan 2022 17:40:49 -0800 Subject: [PATCH] Strengthen memory memory order for atomic::wait/notify This matches the memory order in libc++. libstdc++-v3/ChangeLog: * libstdc++-v3/include/bits/atomic_wait.h: Change memory order from Acquire/Release with relaxed loads to SeqCst+Release for accesses to the waiter's count. --- libstdc++-v3/include/bits/atomic_wait.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h index 05cf0013d2a..d7de0d7eb9e 100644 --- a/libstdc++-v3/include/bits/atomic_wait.h +++ b/libstdc++-v3/include/bits/atomic_wait.h @@ -209,18 +209,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION void _M_enter_wait() noexcept - { __atomic_fetch_add(&_M_wait, 1, __ATOMIC_ACQ_REL); } + { __atomic_fetch_add(&_M_wait, 1, __ATOMIC_SEQ_CST); } void _M_leave_wait() noexcept - { __atomic_fetch_sub(&_M_wait, 1, __ATOMIC_ACQ_REL); } + { __atomic_fetch_sub(&_M_wait, 1, __ATOMIC_RELEASE); } bool _M_waiting() const noexcept { __platform_wait_t __res; - __atomic_load(&_M_wait, &__res, __ATOMIC_ACQUIRE); - return __res > 0; + __atomic_load(&_M_wait, &__res, __ATOMIC_SEQ_CST); + return __res != 0; } void @@ -258,7 +258,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION __platform_wait(__addr, __old); #else __platform_wait_t __val; - __atomic_load(__addr, &__val, __ATOMIC_RELAXED); + __atomic_load(__addr, &__val, __ATOMIC_SEQ_CST); if (__val == __old) { lock_guard __l(_M_mtx); @@ -309,7 +309,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { if (_M_laundered()) { - __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL); + __atomic_fetch_add(_M_addr, 1, __ATOMIC_SEQ_CST); __all = true; } _M_w._M_notify(_M_addr, __all, __bare); -- 2.31.1
[PATCHv3] libiberty rust-demangle, ignore .suffix
Rust symbols can have a .suffix because of compiler transformations. These can be ignored in the demangled name. Which is what this patch implements. By stopping at the first dot for v0 symbols and searching backwards to the ending 'E' for legacy symbols. An alternative implementation could be to follow what C++ does and represent these as [clone .suffix] tagged onto the demangled name. But this seems somewhat confusing since it results in a demangled name that cannot be mangled again. And it would mean trying to decode compiler internal naming. https://bugs.kde.org/show_bug.cgi?id=445916 https://github.com/rust-lang/rust/issues/60705 libiberty/Changelog * rust-demangle.c (rust_demangle_callback): Ignore everything after '.' char in sym for v0. For legacy symbols search backwards to find the last 'E' before any '.'. * testsuite/rust-demangle-expected: Add new .suffix testcases. --- libiberty/rust-demangle.c | 21 ++--- libiberty/testsuite/rust-demangle-expected | 26 ++ 2 files changed, 44 insertions(+), 3 deletions(-) V3 - Add more testcases - Allow @ in legacy symbols (which can appear in the .suffix) diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c index 18c760491bdc..42c88161da30 100644 --- a/libiberty/rust-demangle.c +++ b/libiberty/rust-demangle.c @@ -1340,13 +1340,19 @@ rust_demangle_callback (const char *mangled, int options, /* Rust symbols (v0) use only [_0-9a-zA-Z] characters. */ for (p = rdm.sym; *p; p++) { + /* Rust v0 symbols can have '.' suffixes, ignore those. */ + if (rdm.version == 0 && *p == '.') +break; + rdm.sym_len++; if (*p == '_' || ISALNUM (*p)) continue; - /* Legacy Rust symbols can also contain [.:$] characters. */ - if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':')) + /* Legacy Rust symbols can also contain [.:$] characters. + Or @ in the .suffix (which will be skipped, see below). */ + if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':' +|| *p == '@')) continue; return 0; @@ -1355,7 +1361,16 @@ rust_demangle_callback (const char *mangled, int options, /* Legacy Rust symbols need to be handled separately. */ if (rdm.version == -1) { - /* Legacy Rust symbols always end with E. */ + /* Legacy Rust symbols always end with E. But can be followed by a + .suffix (which we want to ignore). */ + int dot_suffix = 1; + while (rdm.sym_len > 0 && + !(dot_suffix && rdm.sym[rdm.sym_len - 1] == 'E')) +{ + dot_suffix = rdm.sym[rdm.sym_len - 1] == '.'; + rdm.sym_len--; +} + if (!(rdm.sym_len > 0 && rdm.sym[rdm.sym_len - 1] == 'E')) return 0; rdm.sym_len--; diff --git a/libiberty/testsuite/rust-demangle-expected b/libiberty/testsuite/rust-demangle-expected index 7dca315d0054..b565084cfefa 100644 --- a/libiberty/testsuite/rust-demangle-expected +++ b/libiberty/testsuite/rust-demangle-expected @@ -295,3 +295,29 @@ _RMCs4fqI2P2rA04_13const_genericINtB0_4CharKc2202_E --format=auto _RNvNvMCs4fqI2P2rA04_13const_genericINtB4_3FooKpE3foo3FOO >::foo::FOO +# +# Suffixes +# +--format=rust +_RNvMs0_NtCs5l0EXMQXRMU_21rustc_data_structures17obligation_forestINtB5_16ObligationForestNtNtNtCsdozMG8X9FIu_21rustc_trait_selection6traits7fulfill26PendingPredicateObligationE22register_obligation_atB1v_.llvm.8517020237817239694 +>::register_obligation_at +--format=rust +_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h27f14859c664490dE.llvm.8091179795805947855 +core::ptr::drop_in_place::{{closure}}> +# old style rustc llvm thinlto +--format=rust +_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9 +backtrace::foo +--format=rust +_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9@@16 +backtrace::foo +# new style rustc llvm thinlto +--format=rust +_RC3foo.llvm.9D1C9369 +foo +--format=rust +_RC3foo.llvm.9D1C9369@@16 +foo +--format=rust +_RNvC9backtrace3foo.llvm.A5310EB9 +backtrace::foo -- 2.30.2
Re: gcc/configure: out of date
Martin, I've looked into removing the -Wno-error for this warning for just a subset of targets. It seems doable with some hardcoding in configure.ac but if you're planning to do the cleanup for all of them I'm wondering if we should even bother. What do you think? Martin On 1/14/22 08:46, Martin Liška wrote: Hello. I noticed that when I run: ACLOCAL=~/bin/automake-1.15.1/bin/aclocal AUTOMAKE=~/bin/automake-1.15.1/bin/automake autoconf in gcc subfolder I get the following diff: diff --git a/gcc/configure b/gcc/configure index d19059e13cc..ff570f73ef5 100755 --- a/gcc/configure +++ b/gcc/configure @@ -5352,7 +5352,26 @@ else GDC="$ac_cv_prog_GDC" fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the D compiler works" >&5 +$as_echo_n "checking whether the D compiler works... " >&6; } +if ${acx_cv_d_compiler_works+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat >conftest.d <&1 || echo failure` + if test x"$errors" = x && test -f conftest.$ac_objext; then + acx_cv_d_compiler_works=yes + fi + rm -f conftest.* +fi +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $acx_cv_d_compiler_works" >&5 +$as_echo "$acx_cv_d_compiler_works" >&6; } +if test "x$GDC" != xno && test x$acx_cv_d_compiler_works != xno; then have_gdc=yes else have_gdc=no @@ -19640,7 +19659,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 19643 "configure" +#line 19662 "configure" #include "confdefs.h" #if HAVE_DLFCN_H @@ -19746,7 +19765,7 @@ else lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2 lt_status=$lt_dlunknown cat > conftest.$ac_ext <<_LT_EOF -#line 19749 "configure" +#line 19768 "configure" #include "confdefs.h" #if HAVE_DLFCN_H Am I correct that somebody forgot to re-generate the file? Martin
Re: [PATCH] libiberty rust-demangle, ignore .suffix
Hi Eddy, On Mon, Dec 20, 2021 at 01:50:52PM +0200, Eduard-Mihai Burtescu wrote: > Apologies for the delay, the email fell through the cracks somehow. And then I went on vacation... Sorry this fairly simple patch takes so long. > The updated patch looks like it would work alright, only needs a couple > tests, e.g.: > https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/lib.rs#L413-L422 > https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/v0.rs#L1442-L1444 Thanks. I have added those testcases and noticed we need a tweak for having an @ in the suffix of a legacy symbol. I'll sent the updated patch. Hope it can still be pushed even though stage 3 is closing soon. Cheers, Mark
Re: [PATCH v2 1/2] add -Wuse-after-free
On 1/11/22 15:40, Jason Merrill wrote: On 11/30/21 17:32, Martin Sebor via Gcc-patches wrote: Attached is a revised patch with the following changes based on your comments: 1) Set and use statement uids to determine which statement precedes which in the same basic block. 2) Avoid testing flag_isolate_erroneous_paths_dereference. 3) Use post-dominance to decide whether to use the "maybe" phrasing vs a definite form. David raised (and in our offline discussion today reiterated) an objection to the default setting of the option being the strictest. I have not changed that in this revision. See my rationale for this choice in my reply below: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583176.html In the latest C2x draft I see in the list of undefined behavior "The value of a pointer that refers to space deallocated by a call to the free or realloc function is used (7.22.3)." So the case that would be technically undefined would be comparing the reallocated pointer to the old pointer which has been deallocated. The C++ draft is more nuanced: it says, "When the end of the duration of a region of storage is reached, the values of all pointers representing the address of any part of that region of storage become invalid pointer values (6.8.3). Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior." So the case above is implementation-defined in C++, not undefined. Let's put =2 in -Wall for now. With that change, this and the -Wdangling-pointer patch are OK on Friday afternoon if there are no other comments before then. I've adjusted both patches and pushed r12-6605 and r12-6606 after retesting with a few packages. In -Wdangling-pointer I reworded the warning to refer to "unnamed temporary" instead of "compound literal" since it triggers for those as well. While rebuilding LLVM and a few its projects with the patches I noticed the -Wdangling-pointer change exposes what seems like a Ranger bug for one translation unit (it enters an infinite loop): I opened pr104038 for it. Martin On 11/23/21 2:16 PM, Martin Sebor wrote: On 11/22/21 6:32 PM, Jeff Law wrote: On 11/1/2021 4:17 PM, Martin Sebor via Gcc-patches wrote: Patch 1 in the series detects a small subset of uses of pointers made indeterminate by calls to deallocation functions like free or C++ operator delete. To control the conditions the warnings are issued under the new -Wuse-after-free= option provides three levels. At the lowest level the warning triggers only for unconditional uses of freed pointers and doesn't warn for uses in equality expressions. Level 2 warns also for come conditional uses, and level 3 also for uses in equality expressions. I debated whether to make level 2 or 3 the default included in -Wall. I decided on 3 for two reasons: 1) to raise awareness of both the problem and GCC's new ability to detect it: using a pointer after it's been freed, even only in principle, by a successful call to realloc, is undefined, and 2) because it's trivial to lower the level either globally, or locally by suppressing the warning around such misuses. I've tested the patch on x86_64-linux and by building Glibc and Binutils/GDB. It triggers a number of times in each, all due to comparing invalidated pointers for equality (i.e., level 3). I have suppressed these in GCC (libiberty) by a #pragma, and will see how the Glibc folks want to deal with theirs (I track them in BZ #28521). The tests contain a number of xfails due to limitations I'm aware of. I marked them pr?? until the patch is approved. I will open bugs for them before committing if I don't resolve them in a followup. Martin gcc-63272-1.diff Add -Wuse-after-free. gcc/c-family/ChangeLog * c.opt (-Wuse-after-free): New options. gcc/ChangeLog: * diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle OPT_Wreturn_local_addr and OPT_Wuse_after_free_. * diagnostic-spec.h (NW_DANGLING): New enumerator. * doc/invoke.texi (-Wuse-after-free): Document new option. * gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename... (pass_waccess::check_call_access): ...to this. (pass_waccess::check): Rename... (pass_waccess::check_block): ...to this. (pass_waccess::check_pointer_uses): New function. (pass_waccess::gimple_call_return_arg): New function. (pass_waccess::warn_invalid_pointer): New function. (pass_waccess::check_builtin): Handle free and realloc. (gimple_use_after_inval_p): New function. (get_realloc_lhs): New function. (maybe_warn_mismatched_realloc): New function. (pointers_related_p): New function. (pass_waccess::check_call): Call check_pointer_uses. (pass_waccess::execute): Compute and free dominance info. libcpp/ChangeLog: * files.c (_cpp_find_file): Substitute a valid pointer for
Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)
Hi Harald, An early *ping* ... OK. Thanks for the patch! Best regards Thomas
Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)
An early *ping* ... Am 11.01.22 um 22:17 schrieb Harald Anlauf via Fortran: Dear Fortranners, when digging into the issue pointed out in the PR by Gerhard it turned out that there were several issues with the TRANSFER intrinsics in the case MOLD was CHARACTER(kind=4). Default CHARACTER was fine, though. - the size of the result was wrongly calculated - the string length used in temporaries was wrong - the result characteristics was wrong Fortunately, the few fixes were very local and needed only fix-ups of the respective computations. Since the details of which issue would show up seemed to depend on the properties of a the arguments to TRANSFER, I wrote an extended testcase which is a "hull" of what I used in debugging. (The testcase was and can be cross-checked with the NAG compiler.) Regtested on x86_64-pc-linux-gnu. OK for mainline? Thanks, Harald
[PATCH] i386: Improve and optimize ix86_expand_sse_movcc
Modernize ix86_expand_sse_movcc to use expand_simple_{unop,binop} infrastructure to avoid manual twiddling with output registers. Also fix a couple of inconsistent vector_all_ones_operand usages, break a couple of unnecessary else-if chains, eliminate common subexpressions and do some general code simplifications. 2022-01-15 Uroš Bizjak gcc/ChangeLog: * config/i386/i386-expand.c (ix86_expand_sse_movcc): Use expand_simple_unop and expand_simple_binop instead of manually constructing NOT, AND and IOR RTXes. Use vector_all_ones_operand consistently. Eliminate common subexpressions and simplify code. * config/i386/sse.md (3): New expander. (3): Make public. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index c740d6e5c04..138580da96e 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -3781,6 +3781,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) { machine_mode mode = GET_MODE (dest); machine_mode cmpmode = GET_MODE (cmp); + rtx x; /* Simplify trivial VEC_COND_EXPR to avoid ICE in pr97506. */ if (rtx_equal_p (op_true, op_false)) @@ -3789,8 +3790,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) return; } - rtx t2, t3, x; - /* If we have an integer mask and FP value then we need to cast mask to FP mode. */ if (mode != cmpmode && VECTOR_MODE_P (cmpmode)) @@ -3813,12 +3812,14 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) ? force_reg (mode, op_false) : op_false); if (op_true == CONST0_RTX (mode)) { - rtx n = gen_reg_rtx (cmpmode); if (cmpmode == E_DImode && !TARGET_64BIT) - emit_insn (gen_knotdi (n, cmp)); + { + x = gen_reg_rtx (cmpmode); + emit_insn (gen_knotdi (x, cmp)); + } else - emit_insn (gen_rtx_SET (n, gen_rtx_fmt_e (NOT, cmpmode, cmp))); - cmp = n; + x = expand_simple_unop (cmpmode, NOT, cmp, NULL, 1); + cmp = x; /* Reverse op_true op_false. */ std::swap (op_true, op_false); } @@ -3826,22 +3827,24 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) if (mode == HFmode) emit_insn (gen_movhf_mask (dest, op_true, op_false, cmp)); else - { - rtx vec_merge = gen_rtx_VEC_MERGE (mode, op_true, op_false, cmp); - emit_insn (gen_rtx_SET (dest, vec_merge)); - } + emit_insn (gen_rtx_SET (dest, + gen_rtx_VEC_MERGE (mode, + op_true, op_false, cmp))); return; } - else if (vector_all_ones_operand (op_true, mode) - && op_false == CONST0_RTX (mode)) + + if (vector_all_ones_operand (op_true, mode) + && op_false == CONST0_RTX (mode)) { - emit_insn (gen_rtx_SET (dest, cmp)); + emit_move_insn (dest, cmp); return; } else if (op_false == CONST0_RTX (mode)) { - op_true = force_reg (mode, op_true); - ix86_emit_vec_binop (AND, mode, dest, cmp, op_true); + x = expand_simple_binop (mode, AND, cmp, op_true, + dest, 1, OPTAB_DIRECT); + if (x != dest) + emit_move_insn (dest, x); return; } else if (op_true == CONST0_RTX (mode)) @@ -3851,13 +3854,16 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) ix86_emit_vec_binop (AND, mode, dest, x, op_false); return; } - else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode)) + else if (vector_all_ones_operand (op_true, mode)) { - op_false = force_reg (mode, op_false); - ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false); + x = expand_simple_binop (mode, IOR, cmp, op_false, + dest, 1, OPTAB_DIRECT); + if (x != dest) + emit_move_insn (dest, x); return; } - else if (TARGET_XOP) + + if (TARGET_XOP) { op_true = force_reg (mode, op_true); @@ -3865,16 +3871,17 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, rtx op_false) || !nonimmediate_operand (op_false, mode)) op_false = force_reg (mode, op_false); - emit_insn (gen_rtx_SET (dest, gen_rtx_IF_THEN_ELSE (mode, cmp, - op_true, - op_false))); + emit_insn (gen_rtx_SET (dest, + gen_rtx_IF_THEN_ELSE (mode, cmp, + op_true, op_false))); return; } rtx (*gen) (rtx, rtx, rtx, rtx) = NULL; - rtx d = dest; + machine_mode blend_mode = mode; - if (!vector_operand (op_true, mode)) + if (GET_MODE_SIZE (mode) < 16 +
[Patch][V5][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.
Hi, Richard, This is the updated version for the change of "Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables”. Compared to the previous patch, I mainly made the following change: Delete the 4th parameter of “warn_uninit”, construct the warning message string based on the value of OPT and VAR. This patch has been bootstrapped and regressing tested on both X86 and aarch64. Okay for GCC12? thanks. Qing. == Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables. With -ftrivial-auto-var-init, the address taken auto variable is replaced with a temporary variable during gimplification, and the original auto variable might be eliminated by compiler optimization completely. As a result, the current uninitialized warning analysis cannot get enough information from the IR, therefore the uninitialized warnings for address taken variable cannot be issued based on the current implemenation of -ftrival-auto-var-init. For more info please refer to: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html In order to improve this situation, we can improve uninitialized analysis for address taken auto variables with -ftrivial-auto-var-init as following: for the following stmt: _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]); if (_1 != 0) The original variable DECL has been eliminated from the IR, all the necessary information that is needed for reporting the warnings for DECL can be acquired from the call to .DEFERRED_INIT. A. the name string of DECL from the 3rd parameter of the call; B. the location of the DECL from the location of the call; C. the call can also be used to hold the information on whether the warning has been issued or not to suppress warning messages when needed; The current testing cases for uninitialized warnings + -ftrivial-auto-var-init are adjusted to reflect the fact that we can issue warnings for address taken variables. gcc/ChangeLog: 2022-01-14 qing zhao * tree-ssa-uninit.c (warn_uninit): Delete the 4th parameter. Handle .DEFERRED_INIT call with an anonymous SSA_NAME specially. (check_defs): Handle .DEFERRED_INIT call with an anonymous SSA_NAME specially. (warn_uninit_phi_uses): Delete the 4th actual when call warn_uninit. (warn_uninitialized_vars): Likewise. (warn_uninitialized_phi): Likewise. The complete patch is: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch Description: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch
Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.
On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang wrote: > > Thanks for the suggestion, here is the updated patch that survived > bootstrap/regtest. LGTM for me, but please get the final approval from Hongtao. Thanks, Uros. > > Please note reg_mentioned_p in the above condition. This function > > returns nonzero if register op0 appears somewhere within op1 and is > > critical for the correct operation of your patch. > I added reg_mentioned_p for all insns except fp16 complex mult, since > they have constraint & to the dest so it must be allocated different > register from src. > > Uros Bizjak 于2022年1月14日周五 23:49写道: > > > > > > On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang wrote: > > > > > > > Are there any technical obstacles to introduce subst to > > > > define_{,insn_and_}split? > > > > > > gccint says: define_subst can be used only in define_insn and > > > define_expand, it cannot be used in other expressions (e.g. in > > > define_insn_and_split). > > > > Hm, hm ... annoying ... > > > > > I have no idea how to implement it in current infrastructure. > > > > > > > In the proposed patch, if the output register is also mentioned in the > > > > input, then clearing before insn will clear the value in the input > > > > register. The solution in the i386.md also takes care of this issue. > > > > > > > > > > For this, I think we can add REGNO checks for operands in condition > > > (which means there is true dependency). > > > > Let's go in your direction, considering the limitations of current > > infrastructure. > > +{ > > + if (TARGET_DEST_FALSE_DEPENDENCY > > + && get_attr_dest_false_dep (insn) == > > +DEST_FALSE_DEP_TRUE) > > +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > > + return "vgetmant\t{%2, %1, > > %0|%0, %1, %2}"; > > +} > > > > There is no need to pass the information via attributes. IMO, you > > shoud use subst attribute directly in the condition: > > > > { > > if (TARGET_DEST_FALSE_DEPENDENCY > > && > > && !reg_mentioned_p (operands[0], operands[1])) > >output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > > return "."; > > } > > > > Assume the above works, so please: > > > > - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe > > following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F > > - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond" > > > > Please note reg_mentioned_p in the above condition. This function > > returns nonzero if register op0 appears somewhere within op1 and is > > critical for the correct operation of your patch. > > > > Uros.
Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
On Sat, Jan 15, 2022 at 12:23 PM Jakub Jelinek wrote: > > On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote: > > Yes, that would be nice. XFmode is used for long double, and not obsolete. > > Ok, that seems to work. Compared to the incremental patch I've posted, I > also had to add handling of the case where we have just > x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math). > Apparently even that is worth optimizing. > Tested so far on the new testcases, will run full bootstrap/regtest tonight. > > > > Why? That seems to be a waste of time to me, unless something uses them > > > already during expansion. Because pass_expand::execute > > > runs: > > > /* We need JUMP_LABEL be set in order to redirect jumps, and hence > > > split edges which edge insertions might do. */ > > > rebuild_jump_labels (get_insns ()); > > > which resets all LABEL_NUSES to 0 (well, to: > > > if (LABEL_P (insn)) > > > LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0); > > > and then recomputes them and adds JUMP_LABEL if needed: > > > JUMP_LABEL (insn) = label; > > > > I was not aware of that detail. Thanks for sharing (and I wonder if > > all other cases should be removed from the source). > > I guess it depends, for code that can only be called during the expand pass > dropping it should be just fine, for code that can be called also (or only) > later I think adding JUMP_LABEL and correct LABEL_NUSES is needed because > nothing will fix it up afterwards. > > 2022-01-15 Jakub Jelinek > > PR target/103973 > * tree-cfg.h (cond_only_block_p): Declare. > * tree-ssa-phiopt.c (cond_only_block_p): Move function to ... > * tree-cfg.c (cond_only_block_p): ... here. No longer static. > * optabs.def (spaceship_optab): New optab. > * internal-fn.def (SPACESHIP): New internal function. > * internal-fn.h (expand_SPACESHIP): Declare. > * internal-fn.c (expand_PHI): Formatting fix. > (expand_SPACESHIP): New function. > * tree-ssa-math-opts.c (optimize_spaceship): New function. > (math_opts_dom_walker::after_dom_children): Use it. > * config/i386/i386.md (spaceship3): New define_expand. > * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare. > * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function. > * doc/md.texi (spaceship@var{m}3): Document. > > * gcc.target/i386/pr103973-1.c: New test. > * gcc.target/i386/pr103973-2.c: New test. > * gcc.target/i386/pr103973-3.c: New test. > * gcc.target/i386/pr103973-4.c: New test. > * gcc.target/i386/pr103973-5.c: New test. > * gcc.target/i386/pr103973-6.c: New test. > * gcc.target/i386/pr103973-7.c: New test. > * gcc.target/i386/pr103973-8.c: New test. > * gcc.target/i386/pr103973-9.c: New test. > * gcc.target/i386/pr103973-10.c: New test. > * gcc.target/i386/pr103973-11.c: New test. > * gcc.target/i386/pr103973-12.c: New test. > * gcc.target/i386/pr103973-13.c: New test. > * gcc.target/i386/pr103973-14.c: New test. > * gcc.target/i386/pr103973-15.c: New test. > * gcc.target/i386/pr103973-16.c: New test. > * gcc.target/i386/pr103973-17.c: New test. > * gcc.target/i386/pr103973-18.c: New test. > * gcc.target/i386/pr103973-19.c: New test. > * gcc.target/i386/pr103973-20.c: New test. > * g++.target/i386/pr103973-1.C: New test. > * g++.target/i386/pr103973-2.C: New test. > * g++.target/i386/pr103973-3.C: New test. > * g++.target/i386/pr103973-4.C: New test. > * g++.target/i386/pr103973-5.C: New test. > * g++.target/i386/pr103973-6.C: New test. > * g++.target/i386/pr103973-7.C: New test. > * g++.target/i386/pr103973-8.C: New test. > * g++.target/i386/pr103973-9.C: New test. > * g++.target/i386/pr103973-10.C: New test. > * g++.target/i386/pr103973-11.C: New test. > * g++.target/i386/pr103973-12.C: New test. > * g++.target/i386/pr103973-13.C: New test. > * g++.target/i386/pr103973-14.C: New test. > * g++.target/i386/pr103973-15.C: New test. > * g++.target/i386/pr103973-16.C: New test. > * g++.target/i386/pr103973-17.C: New test. > * g++.target/i386/pr103973-18.C: New test. > * g++.target/i386/pr103973-19.C: New test. > * g++.target/i386/pr103973-20.C: New test. OK (with a comment fix below) for the x86 part. Thanks, Uros. > --- gcc/tree-cfg.h.jj 2022-01-14 23:57:44.491718086 +0100 > +++ gcc/tree-cfg.h 2022-01-15 09:51:25.359468982 +0100 > @@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b > extern basic_block gimple_switch_default_bb (function *, gswitch *); > extern edge gimple_switch_edge (function *, gswitch *, unsigned); > extern edge
Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.
Thanks for the suggestion, here is the updated patch that survived bootstrap/regtest. > Please note reg_mentioned_p in the above condition. This function > returns nonzero if register op0 appears somewhere within op1 and is > critical for the correct operation of your patch. I added reg_mentioned_p for all insns except fp16 complex mult, since they have constraint & to the dest so it must be allocated different register from src. Uros Bizjak 于2022年1月14日周五 23:49写道: > > On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang wrote: > > > > > Are there any technical obstacles to introduce subst to > > > define_{,insn_and_}split? > > > > gccint says: define_subst can be used only in define_insn and > > define_expand, it cannot be used in other expressions (e.g. in > > define_insn_and_split). > > Hm, hm ... annoying ... > > > I have no idea how to implement it in current infrastructure. > > > > > In the proposed patch, if the output register is also mentioned in the > > > input, then clearing before insn will clear the value in the input > > > register. The solution in the i386.md also takes care of this issue. > > > > > > > For this, I think we can add REGNO checks for operands in condition > > (which means there is true dependency). > > Let's go in your direction, considering the limitations of current > infrastructure. > +{ > + if (TARGET_DEST_FALSE_DEPENDENCY > + && get_attr_dest_false_dep (insn) == > +DEST_FALSE_DEP_TRUE) > +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > + return "vgetmant\t{%2, %1, > %0|%0, %1, %2}"; > +} > > There is no need to pass the information via attributes. IMO, you > shoud use subst attribute directly in the condition: > > { > if (TARGET_DEST_FALSE_DEPENDENCY > && > && !reg_mentioned_p (operands[0], operands[1])) >output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands); > return "."; > } > > Assume the above works, so please: > > - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe > following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F > - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond" > > Please note reg_mentioned_p in the above condition. This function > returns nonzero if register op0 appears somewhere within op1 and is > critical for the correct operation of your patch. > > Uros. From ac472527b653030c89c06021f38268441e64a502 Mon Sep 17 00:00:00 2001 From: wwwhhhyyy Date: Mon, 30 Aug 2021 16:41:41 +0800 Subject: [PATCH] [i386] GLC tuning: Break false dependency for dest register. For GoldenCove micro-architecture, force insert zero-idiom in asm template to break false dependency of dest register for several insns. The related insns are: VPERM/D/Q/PS/PD VRANGEPD/PS/SD/SS VGETMANTSS/SD/SH VGETMANDPS/PD - mem version only VPMULLQ VFMULCSH/PH VFCMULCSH/PH gcc/ChangeLog: * config/i386/i386.h (TARGET_DEST_FALSE_DEP_FOR_GLC): New macro. * config/i386/sse.md (__): Insert zero-idiom in output template when attr enabled, set new attribute to true for non-mask/maskz insn. (avx512fp16_sh_v8hf): Likewise. (avx512dq_mul3): Likewise. (_permvar): Likewise. (avx2_perm_1): Likewise. (avx512f_perm_1): Likewise. (avx512dq_rangep): Likewise. (avx512dq_ranges): Likewise. (_getmant): Likewise. (avx512f_vgetmant): Likewise. * config/i386/subst.md (mask3_dest_false_dep_for_glc_cond): New subst_attr. (mask4_dest_false_dep_for_glc_cond): Likewise. (mask6_dest_false_dep_for_glc_cond): Likewise. (mask10_dest_false_dep_for_glc_cond): Likewise. (maskc_dest_false_dep_for_glc_cond): Likewise. (mask_scalar4_dest_false_dep_for_glc_cond): Likewise. (mask_scalarc_dest_false_dep_for_glc_cond): Likewise. * config/i386/x86-tune.def (X86_TUNE_DEST_FALSE_DEP_FOR_GLC): New DEF_TUNE enabled for m_SAPPHIRERAPIDS and m_ALDERLAKE gcc/testsuite/ChangeLog: * gcc.target/i386/avx2-dest-false-dep-for-glc.c: New test. * gcc.target/i386/avx512dq-dest-false-dep-for-glc.c: Ditto. * gcc.target/i386/avx512f-dest-false-dep-for-glc.c: Ditto. * gcc.target/i386/avx512fp16-dest-false-dep-for-glc.c: Ditto. * gcc.target/i386/avx512fp16vl-dest-false-dep-for-glc.c: Ditto. * gcc.target/i386/avx512vl-dest-false-dep-for-glc.c: Ditto. --- gcc/config/i386/i386.h| 2 + gcc/config/i386/sse.md| 75 +++-- gcc/config/i386/subst.md | 7 ++ gcc/config/i386/x86-tune.def | 6 + .../i386/avx2-dest-false-dep-for-glc.c| 24 .../i386/avx512dq-dest-false-dep-for-glc.c| 73 + .../i386/avx512f-dest-false-dep-for-glc.c | 103 ++ .../i386/avx512fp16-dest-false-dep-for-glc.c | 45 .../avx512fp16vl-dest-false-dep-for-glc.c | 24 .../i386/avx512vl-dest-false-dep-for-glc.c| 76 + 10 files changed, 427 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/avx2-dest-false-dep-for-glc.c create mode 100644
Re: [PATCH] c++: ICE with noexcept and canonical types [PR101715]
On Fri, 14 Jan 2022, Marek Polacek via Gcc-patches wrote: > This is a "canonical types differ for identical types" ICE, which started > with r11-4682. It's a bit tricky to explain. Consider: > > template struct S { > S bar() noexcept(T::value); // #1 > S foo() noexcept(T::value); // #2 > }; > > template S S::foo() noexcept(T::value) {} // #3 > > We ICE because #3 and #2 have the same type, but their canonical types > differ: TYPE_CANONICAL (#3) == #2 but TYPE_CANONICAL (#2) == #1. > > The member functions #1 and #2 have the same type. However, since their > noexcept-specifier is deferred, when parsing them, we create a variant for > both of them, because DEFERRED_PARSE cannot be compared. In other words, > build_cp_fntype_variant's > > tree v = TYPE_MAIN_VARIANT (type); > for (; v; v = TYPE_NEXT_VARIANT (v)) > if (cp_check_qualified_type (v, type, type_quals, rqual, raises, late)) > return v; > > will *not* find an existing variant when creating a method_type for #2, so we > have to create a new one. > > But then we perform delayed parsing and call fixup_deferred_exception_variants > for #1 and #2. f_d_e_v will replace TYPE_RAISES_EXCEPTIONS with the newly > parsed noexcept-specifier. It also sets TYPE_CANONICAL (#2) to #1. Both > noexcepts turned out to be the same, so now we have two equivalent variants in > the list! I.e., > > +-+ +-+ +-+ > | main | | #2 | | #1 | > | S S::(S*) |->| S S::(S*) |->| S S::(S*) > |->NULL > |-| | noex(T::value) | | noex(T::value) | > +-+ +-+ +-+ > > Then we get to #3. As for #1 and #2, grokdeclarator calls build_memfn_type, > which ends up calling build_cp_fntype_variant, which will use the loop > above to look for an existing variant. The first one that matches > cp_check_qualified_type will be used, so we use #2 rather than #1, and the > TYPE_CANONICAL mismatch follows. Hopefully that makes sense. > > As for the fix, I didn't think I could rewrite the method_type #2 with #1 > because the type may have escaped via decltype. So my approach is to > elide #2 from the list, so when looking for a matching variant, we always > find #1 (#2 remains live though, which admittedly sounds sort of dodgy). I wonder about instead making build_cp_fntype_variant set the TYPE_CANONICAL for #3 to TYPE_CANONICAL(#2) (i.e. #1) instead of to #2? Something like: -- >8 -- gcc/cp/tree.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c index 7f7de86b4e8..b89135fa121 100644 --- a/gcc/cp/tree.c +++ b/gcc/cp/tree.c @@ -2779,8 +2779,9 @@ build_cp_fntype_variant (tree type, cp_ref_qualifier rqual, else if (TYPE_CANONICAL (type) != type || cr != raises || late) /* Build the underlying canonical type, since it is different from TYPE. */ -TYPE_CANONICAL (v) = build_cp_fntype_variant (TYPE_CANONICAL (type), - rqual, cr, false); +TYPE_CANONICAL (v) + = TYPE_CANONICAL (build_cp_fntype_variant (TYPE_CANONICAL (type), +rqual, cr, false)); else /* T is its own canonical type. */ TYPE_CANONICAL (v) = v;
[PATCH] c++, dyninit, v2: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]
Hi! Sorry for the delay in response. On Thu, Dec 02, 2021 at 02:35:25PM +0100, Richard Biener wrote: > So with > > +/* Mark start and end of dynamic initialization of a variable. */ > +DEF_INTERNAL_FN (DYNAMIC_INIT_START, ECF_LEAF | ECF_NOTHROW, ". r ") > +DEF_INTERNAL_FN (DYNAMIC_INIT_END, ECF_LEAF | ECF_NOTHROW, ". r ") > > there's nothing preventing code motion of unrelated stmts into > the block, but that should be harmless. What it also does > is make 'e' aliased (because it's address is now taken), probably > relevant only for IPA/LTO or for statics. > > The setup does not prevent CSEing the inits with uses from another > initializer - probably OK as well (if not then .DYNAMIC_INIT_END > should also be considered writing to 'e'). > > ". r " also means it clobbers and uses all global memory, I think > we'd like to have it const + looping-pure-or-const. ".cr " would > possibly achieve this, not sure about the looping part. https://eel.is/c++draft/basic.start.static#3 in https://eel.is/c++draft/basic.start.static#3.1 says that the optimization I'd like to do is only allowed if "the dynamic version of the initialization does not change the value of any other object of static or thread storage duration prior to its initialization", so we at least need to ensure that stores from other dynamic initialization or stores from this dynamic initialization aren't moved across the .DYNAMIC_INIT_{START,END} calls (i.e. that if a dynamic initialization of a also stores to b, we don't move the store to b somewhere else and optimize it, or we don't move store to some unrelated var into the block and punt because of that). > > Currently the optimization is only able to optimize cases where the whole > > variable is stored in a single store (typically scalar variables), or > > uses the native_{encode,interpret}* infrastructure to create or update > > the CONSTRUCTOR. This means that except for the first category, we can't > > right now handle unions or anything that needs relocations (vars containing > > pointers to other vars or references). > > I think it would be nice to incrementally add before the native_* fallback > > some attempt to just create or update a CONSTRUCTOR if possible. If we only > > see var.a.b.c.d[10].e = const; style of stores, this shouldn't be that hard > > as the whole access path is recorded there and we'd just need to decide what > > to do with unions if two or more union members are accessed. And do a deep > > copy of the CONSTRUCTOR and try to efficiently update the copy afterwards > > (the CONSTRUCTORs should be sorted on increasing offsets of the > > members/elements, so doing an ordered vec insertion might not be the best > > idea). But MEM_REFs complicate this, parts or all of the access path > > is lost. For non-unions in most cases we could try to guess which field > > it is (do we have some existing function to do that? I vaguely remember > > we've been doing that in some cases in the past in some folding but stopped > > doing so) but with unions it will be harder or impossible. > > I suppose we could, at least for non-overlapping inits, create a > new aggregate type on the fly to be able to compose the CTOR and then > view-convert it to the decls type. Would need to check that > a CTOR wrapped in a V_C_E is handled OK by varasm of course. > > An alternative way of recording the initializer (maybe just emit > it right away into asm?) would be another possibility. It would be nice to come up with something that works for those, but perhaps that can be improved incrementally (so for sure GCC13+ only)? > I also note that loops are quite common in some initializers so more > aggressively unrolling those for initializations might be a good > idea as well. Sure. > > As the middle-end can't easily differentiate between const variables without > > and with mutable members, both of those will have TREE_READONLY on the > > var decl clear (because of dynamic initialization) and TYPE_READONLY set > > on the type, the patch remembers this in an extra argument to > > .DYNAMIC_INIT_START (true if it is ok to set TREE_READONLY on the var decl > > back if the var dynamic initialization could be optimized into > > DECL_INITIAL). > > Thinking more about it, I'm not sure about const vars without mutable > > members with non-trivial destructors, do we register their dtors dynamically > > through __cxa_atexit in the ctors (that would mean the optimization > > currently punts on them), or not (in that case we could put it into .rodata > > even when the dtor will want to perhaps write to them)? > > I think anything like this asks for doing the whole thing at IPA level > to see which functions are "initialization" and thus need not considered > writing when the initializer is made static. I don't see how we could do it at the IPA level. Because for it we need inlining to happen and then perform at least some minimal cleanup passes (forwprop, ccp, fre, dce, cunrolli, etc.) and I
Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]
Jeff, David, do you need any more input from my side? -- Marc Am Sa., 8. Jan. 2022 um 17:32 Uhr schrieb Jeff Law : > > > > On 1/6/2022 6:53 AM, David Malcolm via Gcc-patches wrote: > > On Sun, 2021-12-19 at 22:30 +0100, Marc Nieper-Wißkirchen wrote: > >> This patch fixes a memory leak in the pass manager. In the existing > >> code, > >> the m_name_to_pass_map is allocated in > >> pass_manager::register_pass_name, but > >> never deallocated. This is fixed by adding a deletion in > >> pass_manager::~pass_manager. Moreover the string keys in > >> m_name_to_pass_map are > >> all dynamically allocated. To free them, this patch adds a new hash > >> trait for > >> string hashes that are to be freed when the corresponding hash entry > >> is removed. > >> > >> This fix is particularly relevant for using GCC as a library through > >> libgccjit. > >> The memory leak also occurs when libgccjit is instructed to use an > >> external > >> driver. > >> > >> Before the patch, compiling the hello world example of libgccjit with > >> the external driver under Valgrind shows a loss of 12,611 (48 direct) > >> bytes. After the patch, no memory leaks are reported anymore. > >> (Memory leaks occurring when using the internal driver are mostly in > >> the driver code in gcc/gcc.c and have to be fixed separately.) > >> > >> The patch has been tested by fully bootstrapping the compiler with > >> the > >> frontends C, C++, Fortran, LTO, ObjC, JIT and running the test suite > >> under a x86_64-pc-linux-gnu host. > > Thanks for the patch. > > > > It looks correct to me, given that pass_manager::register_pass_name > > does an xstrdup and puts the result in the map. > > > > That said: > > - I'm not officially a reviewer for this part of gcc (though I probably > > touched this code last) > > - is it cleaner to instead change m_name_to_pass_map's key type from > > const char * to char *, to convey that the map "owns" the name? That > > way we probably wouldn't need struct typed_const_free_remove, and (I > > hope) works better with the type system. > > > > Dave > > > >> gcc/ChangeLog: > >> > >> PR jit/63854 > >> * hash-traits.h (struct typed_const_free_remove): New. > >> (struct free_string_hash): New. > >> * pass_manager.h: Use free_string_hash. > >> * passes.c (pass_manager::register_pass_name): Use > >> free_string_hash. > >> (pass_manager::~pass_manager): Delete allocated > >> m_name_to_pass_map. > My concern (and what I hadn't had time to dig into) was we initially > used nofree_string_hash -- I wanted to make sure there wasn't any path > where the name came from the stack (can't be free'd), was saved > elsewhere (danging pointer) and the like. ie, why were we using > nofree_string_hash to begin with? I've never really mucked around with > these bits, so the analysis side kept falling off the daily todo list. > > If/once you're comfortable with the patch David, then go ahead and apply > it on Marc's behalf. > > jeff >
[r12-6420 Regression] FAIL: g++.dg/vect/pr92595.cc -std=c++2a (test for excess errors) on Linux/x86_64
On Linux/x86_64, d3ff7420e941931d32ce2e332e7968fe67ba20af is the first bad commit commit d3ff7420e941931d32ce2e332e7968fe67ba20af Author: Andre Vieira Date: Thu Dec 2 14:34:15 2021 + [vect] Re-analyze all modes for epilogues caused FAIL: gcc.dg/tree-ssa/gen-vect-26.c (internal compiler error: in operator[], at vec.h:889) FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors) FAIL: gcc.dg/tree-ssa/gen-vect-28.c (internal compiler error: in operator[], at vec.h:889) FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "vectorized 1 loops" 1 FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors) FAIL: gcc.dg/vect/slp-perm-9.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1 FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1 FAIL: gcc.dg/vect/slp-reduc-11.c -flto -ffat-lto-objects scan-tree-dump vect "vectorizing stmts using SLP" FAIL: gcc.dg/vect/slp-reduc-11.c scan-tree-dump vect "vectorizing stmts using SLP" FAIL: gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects scan-tree-dump-times vect "LOOP EPILOGUE VECTORIZED \\(MODE=V16QI\\)" 2 FAIL: gcc.dg/vect/vect-tail-nomask-1.c scan-tree-dump-times vect "LOOP EPILOGUE VECTORIZED \\(MODE=V16QI\\)" 2 FAIL: gcc.target/i386/avx512fp16-recip-1.c scan-assembler-not vdivph FAIL: gcc.target/i386/pr61403.c scan-assembler rsqrtps FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vgatherdpd 4 FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vgatherqpd 4 FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vmulpd 4 FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vgatherdpd 4 FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vgatherqpd 4 FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vmulpd 4 FAIL: gcc.target/i386/pr88531-2b.c scan-assembler-times vmulps 2 FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 2 FAIL: gcc.target/i386/pr94494.c (internal compiler error: in operator[], at vec.h:889) FAIL: gcc.target/i386/pr94494.c (test for excess errors) FAIL: gcc.target/i386/vect-reduc-1.c scan-assembler-times padd 5 FAIL: gcc.target/i386/vect-reduc-1.c scan-tree-dump vect "LOOP EPILOGUE VECTORIZED" FAIL: g++.dg/vect/pr92595.cc -std=c++14 (internal compiler error: in operator[], at vec.h:889) FAIL: g++.dg/vect/pr92595.cc -std=c++14 (test for excess errors) FAIL: g++.dg/vect/pr92595.cc -std=c++17 (internal compiler error: in operator[], at vec.h:889) FAIL: g++.dg/vect/pr92595.cc -std=c++17 (test for excess errors) FAIL: g++.dg/vect/pr92595.cc -std=c++2a (internal compiler error: in operator[], at vec.h:889) FAIL: g++.dg/vect/pr92595.cc -std=c++2a (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6420/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-26.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-26.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-28.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-28.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-perm-9.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-reduc-11.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-reduc-11.c --target_board='unix{-m64\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c --target_board='unix{-m64\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/avx512fp16-recip-1.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/avx512fp16-recip-1.c --target_board='unix{-m64\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="i386.exp=gcc.target/i386/pr61403.c
[r12-6560 Regression] FAIL: gfortran.dg/gomp/allocate-2.f90 -O (test for errors, line 36) on Linux/x86_64
On Linux/x86_64, 69561fc781aca3dea3aa4d5d562ef5a502965924 is the first bad commit commit 69561fc781aca3dea3aa4d5d562ef5a502965924 Author: Hafiz Abid Qadeer Date: Fri Sep 24 10:04:12 2021 +0100 Add support for allocate clause (OpenMP 5.0). caused FAIL: gfortran.dg/gomp/allocate-2.f90 -O (test for errors, line 36) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6560/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/allocate-2.f90 --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/allocate-2.f90 --target_board='unix{-m32\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[r12-6404 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) on Linux/x86_64
On Linux/x86_64, 492954263e39346287a5a2a32bcc5312466a0ee1 is the first bad commit commit 492954263e39346287a5a2a32bcc5312466a0ee1 Author: Francois-Xavier Coudert Date: Sun Jan 2 11:36:23 2022 +0100 Fortran: Allow IEEE_CLASS to identify signaling NaNs caused FAIL: gfortran.dg/ieee/signaling_1.f90 -O0 (test for excess errors) FAIL: gfortran.dg/ieee/signaling_1.f90 -O1 (test for excess errors) FAIL: gfortran.dg/ieee/signaling_1.f90 -O2 (test for excess errors) FAIL: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: gfortran.dg/ieee/signaling_1.f90 -O3 -g (test for excess errors) FAIL: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6404/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[r12-6586 Regression] FAIL: g++.dg/torture/pr57993-2.C -Os (test for excess errors) on Linux/x86_64
On Linux/x86_64, 74abb0beb420830e52dfc6b3ee74e77dae8e31a3 is the first bad commit commit 74abb0beb420830e52dfc6b3ee74e77dae8e31a3 Author: Martin Liska Date: Fri Jan 14 15:21:40 2022 +0100 testsuite: rename 2 files. caused FAIL: g++.dg/torture/pr57993-2.C -O0 (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O1 (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O2 (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -O3 -g (test for excess errors) FAIL: g++.dg/torture/pr57993-2.C -Os (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6586/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[r12-6517 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os execution test on Linux/x86_64
On Linux/x86_64, 6b14100b9504800768da726dcb81f1857db3b493 is the first bad commit commit 6b14100b9504800768da726dcb81f1857db3b493 Author: Francois-Xavier Coudert Date: Wed Jan 12 11:19:37 2022 +0100 Fortran: fix testcase compiler flags caused FAIL: gfortran.dg/ieee/signaling_1.f90 -O0 execution test FAIL: gfortran.dg/ieee/signaling_1.f90 -O1 execution test FAIL: gfortran.dg/ieee/signaling_1.f90 -O2 execution test FAIL: gfortran.dg/ieee/signaling_1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test FAIL: gfortran.dg/ieee/signaling_1.f90 -O3 -g execution test FAIL: gfortran.dg/ieee/signaling_1.f90 -Os execution test with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6517/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 --target_board='unix{-m32\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[r12-6578 Regression] FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++2a (test for excess errors) on Linux/x86_64
On Linux/x86_64, d686d5d85c23451c03799dc55e456b73065f7333 is the first bad commit commit d686d5d85c23451c03799dc55e456b73065f7333 Author: Jakub Jelinek Date: Fri Jan 14 12:07:49 2022 +0100 c++: Reject in constant evaluation address comparisons of start of one var and end of another [PR89074] caused FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++14 (test for excess errors) FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++17 (test for excess errors) FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++2a (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6578/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
[r12-6585 Regression] FAIL: c-c++-common/Walloca-larger-than.c -Wc++-compat (test for excess errors) on Linux/x86_64
On Linux/x86_64, d8b64476138671f3d89cd66f224a9b59e465631b is the first bad commit commit d8b64476138671f3d89cd66f224a9b59e465631b Author: Martin Liska Date: Fri Jan 14 15:04:33 2022 +0100 testsuite: rename files in c-c++-common. caused FAIL: c-c++-common/Walloca-larger-than.c -std=gnu++14 (test for excess errors) FAIL: c-c++-common/Walloca-larger-than.c -std=gnu++17 (test for excess errors) FAIL: c-c++-common/Walloca-larger-than.c -std=gnu++2a (test for excess errors) FAIL: c-c++-common/Walloca-larger-than.c -std=gnu++98 (test for excess errors) FAIL: c-c++-common/Walloca-larger-than.c -Wc++-compat (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6585/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at skpgkp2 at gmail dot com)
Re: [PATCH] OpenMP front-end: allow requires dynamic_allocators
On Mon, Dec 20, 2021 at 03:16:23PM +, Andrew Stubbs wrote: > This patch removes the "sorry" message for the OpenMP "requires > dynamic_allocators" feature in C, C++ and Fortran. > > The clause is supposed to state that the user code will not work without the > omp_alloc/omp_free and omp_init_allocator/omp_destroy_allocator and these > things *are* present, so there should be no problem allowing it. > > I can't see any reason for our implementation to *do* anything with this > statement -- it's fine for the allocators to work the same with or without > it. Well, we should do a lot with it, but that can wait for later. In particular, if OMP_REQUIRES_DYNAMIC_ALLOCATORS is not present, we should be rejecting omp_init_allocator and omp_destroy_allocator in target regions (maybe just a warning though), and for allocate clauses, directives, and the omp_alloc etc. APIs we should in the target region enforce they don't use omp_null_allocator and the allocators they use is mentioned in uses_allocators (and implement uses_allocators). But, right now it is unclear to me how exactly is that supposed to work with declare target to routines, those aren't nested inside of the target and so can't know what uses_allocator is used, so they can't make any allocations? Or is it just that we can't diagnose this in such routines and only can diagnose omp_null_allocator isn't used... > > I think this patch ought to be fine for GCC 12, but if not it can wait until > stage 1 opens. I shall backport this to OG11 shortly. Anyway, your patch is a step forward, as we don't support uses_allocators, the non-requires dynamic_allocators way is unusable for any target region allocations due to that and so requires dynamic_allocators is the only thing we actually support, but we were rejecting that directive... Ok and sorry for the delay. > gcc/c/ChangeLog: > > * c-parser.c (c_parser_omp_requires): Don't "sorry" dynamic_allocators. > > gcc/cp/ChangeLog: > > * parser.c (cp_parser_omp_requires): Don't "sorry" dynamic_allocators. > > gcc/fortran/ChangeLog: > > * openmp.c (gfc_match_omp_requires): Don't "sorry" dynamic_allocators. > > gcc/testsuite/ChangeLog: > > * gfortran.dg/gomp/requires-8.f90: Reinstate dynamic allocators > requirement. Jakub
[PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]
On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote: > Yes, that would be nice. XFmode is used for long double, and not obsolete. Ok, that seems to work. Compared to the incremental patch I've posted, I also had to add handling of the case where we have just x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math). Apparently even that is worth optimizing. Tested so far on the new testcases, will run full bootstrap/regtest tonight. > > Why? That seems to be a waste of time to me, unless something uses them > > already during expansion. Because pass_expand::execute > > runs: > > /* We need JUMP_LABEL be set in order to redirect jumps, and hence > > split edges which edge insertions might do. */ > > rebuild_jump_labels (get_insns ()); > > which resets all LABEL_NUSES to 0 (well, to: > > if (LABEL_P (insn)) > > LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0); > > and then recomputes them and adds JUMP_LABEL if needed: > > JUMP_LABEL (insn) = label; > > I was not aware of that detail. Thanks for sharing (and I wonder if > all other cases should be removed from the source). I guess it depends, for code that can only be called during the expand pass dropping it should be just fine, for code that can be called also (or only) later I think adding JUMP_LABEL and correct LABEL_NUSES is needed because nothing will fix it up afterwards. 2022-01-15 Jakub Jelinek PR target/103973 * tree-cfg.h (cond_only_block_p): Declare. * tree-ssa-phiopt.c (cond_only_block_p): Move function to ... * tree-cfg.c (cond_only_block_p): ... here. No longer static. * optabs.def (spaceship_optab): New optab. * internal-fn.def (SPACESHIP): New internal function. * internal-fn.h (expand_SPACESHIP): Declare. * internal-fn.c (expand_PHI): Formatting fix. (expand_SPACESHIP): New function. * tree-ssa-math-opts.c (optimize_spaceship): New function. (math_opts_dom_walker::after_dom_children): Use it. * config/i386/i386.md (spaceship3): New define_expand. * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare. * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function. * doc/md.texi (spaceship@var{m}3): Document. * gcc.target/i386/pr103973-1.c: New test. * gcc.target/i386/pr103973-2.c: New test. * gcc.target/i386/pr103973-3.c: New test. * gcc.target/i386/pr103973-4.c: New test. * gcc.target/i386/pr103973-5.c: New test. * gcc.target/i386/pr103973-6.c: New test. * gcc.target/i386/pr103973-7.c: New test. * gcc.target/i386/pr103973-8.c: New test. * gcc.target/i386/pr103973-9.c: New test. * gcc.target/i386/pr103973-10.c: New test. * gcc.target/i386/pr103973-11.c: New test. * gcc.target/i386/pr103973-12.c: New test. * gcc.target/i386/pr103973-13.c: New test. * gcc.target/i386/pr103973-14.c: New test. * gcc.target/i386/pr103973-15.c: New test. * gcc.target/i386/pr103973-16.c: New test. * gcc.target/i386/pr103973-17.c: New test. * gcc.target/i386/pr103973-18.c: New test. * gcc.target/i386/pr103973-19.c: New test. * gcc.target/i386/pr103973-20.c: New test. * g++.target/i386/pr103973-1.C: New test. * g++.target/i386/pr103973-2.C: New test. * g++.target/i386/pr103973-3.C: New test. * g++.target/i386/pr103973-4.C: New test. * g++.target/i386/pr103973-5.C: New test. * g++.target/i386/pr103973-6.C: New test. * g++.target/i386/pr103973-7.C: New test. * g++.target/i386/pr103973-8.C: New test. * g++.target/i386/pr103973-9.C: New test. * g++.target/i386/pr103973-10.C: New test. * g++.target/i386/pr103973-11.C: New test. * g++.target/i386/pr103973-12.C: New test. * g++.target/i386/pr103973-13.C: New test. * g++.target/i386/pr103973-14.C: New test. * g++.target/i386/pr103973-15.C: New test. * g++.target/i386/pr103973-16.C: New test. * g++.target/i386/pr103973-17.C: New test. * g++.target/i386/pr103973-18.C: New test. * g++.target/i386/pr103973-19.C: New test. * g++.target/i386/pr103973-20.C: New test. --- gcc/tree-cfg.h.jj 2022-01-14 23:57:44.491718086 +0100 +++ gcc/tree-cfg.h 2022-01-15 09:51:25.359468982 +0100 @@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b extern basic_block gimple_switch_default_bb (function *, gswitch *); extern edge gimple_switch_edge (function *, gswitch *, unsigned); extern edge gimple_switch_default_edge (function *, gswitch *); +extern bool cond_only_block_p (basic_block); /* Return true if the LHS of a call should be removed. */ --- gcc/tree-ssa-phiopt.c.jj2022-01-14 23:57:44.536717549 +0100 +++ gcc/tree-ssa-phiopt.c 2022-01-15 09:51:25.361468954 +0100 @@ -1958,31 +1958,6 @@
Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
On Sat, Jan 15, 2022 at 10:56 AM Jakub Jelinek wrote: > > On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote: > > > --- gcc/config/i386/i386.md.jj 2022-01-14 11:51:34.432384170 +0100 > > > +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100 > > > @@ -23886,6 +23886,18 @@ (define_insn "hreset" > > >[(set_attr "type" "other") > > > (set_attr "length" "4")]) > > > > > > +;; Spaceship optimization > > > +(define_expand "spaceship3" > > > + [(match_operand:SI 0 "register_operand") > > > + (match_operand:MODEF 1 "cmp_fp_expander_operand") > > > + (match_operand:MODEF 2 "cmp_fp_expander_operand")] > > > + "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)) > > > + && TARGET_CMOVE && TARGET_IEEE_FP" > > > > Is there a reason that this pattern is limited to TARGET_IEEE_FP? > > During the expansion in ix86_expand_fp_spaceship, we can just skip > > jump on unordered, while ix86_expand_fp_compare will emit the correct > > comparison mode depending on TARGET_IEEE_FP. > > For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1 > but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2 > but it is still optimized much better than the non-fast-math case > without the patch: > comisd %xmm1, %xmm0 > je .L12 > jb .L13 > movl$1, %edx > movl$2, %eax > cmova %edx, %eax > ret > .p2align 4,,10 > .p2align 3 > .L12: > xorl%eax, %eax > ret > .p2align 4,,10 > .p2align 3 > .L13: > movl$-1, %eax > ret > so just one comparison but admittedly the > movl$1, %edx > movl$2, %eax > cmova %edx, %eax > part is unnecessary. > So below is an incremental patch that handles even the !HONORS_NANS > case at the gimple pattern matching (while for HONOR_NANS we must > obey that for NaN operand(s) >/>=/ we look for the third comparison on the false edge of the second one, > for !HONOR_NANS that is not the case. With the incremental patch we get: > comisd %xmm1, %xmm0 > je .L2 > seta%al > leal-1(%rax,%rax), %eax > ret > .p2align 4,,10 > .p2align 3 > .L2: > xorl%eax, %eax > ret > for -O2 -ffast-math. > Also, I've added || (TARGET_SAHF && TARGET_USE_SAHF), because apparently > we can handle that case nicely too, it is just the IX86_FPCMP_ARITH > case where fp_compare already emits very specific code (and we can't > call ix86_expand_fp_compare 3 times because that would for IEEE_FP > emit different comparisons which couldn't be CSEd). > I'll add also -ffast-math testsuite coverage and retest. > > Also, I wonder if I shouldn't handle XFmode the same, thoughts on that? Yes, that would be nice. XFmode is used for long double, and not obsolete. > > > > + gcc_checking_assert (ix86_fp_comparison_strategy (GT) == > > > IX86_FPCMP_COMI); > > > + rtx gt = ix86_expand_fp_compare (GT, op0, op1); > > > + rtx l0 = gen_label_rtx (); > > > + rtx l1 = gen_label_rtx (); > > > + rtx l2 = gen_label_rtx (); > > > + rtx lend = gen_label_rtx (); > > > + rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode, > > > + gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx); > > > + rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un, > > > + gen_rtx_LABEL_REF (VOIDmode, l2), > > > pc_rtx); > > > + rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > > + add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ()); > > > > Please also add JUMP_LABEL to the insn. > > > > > + rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode, > > > + gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx); > > > + tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq, > > > + gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx); > > > + jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > > + add_reg_br_prob_note (jmp, profile_probability::unlikely ()); > > > + tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt, > > > + gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx); > > > + jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > > + add_reg_br_prob_note (jmp, profile_probability::even ()); > > > + emit_move_insn (dest, constm1_rtx); > > > + emit_jump (lend); > > > + emit_label (l0); > > > > and LABEL_NUSES label. > > Why? That seems to be a waste of time to me, unless something uses them > already during expansion. Because pass_expand::execute > runs: > /* We need JUMP_LABEL be set in order to redirect jumps, and hence > split edges which edge insertions might do. */ > rebuild_jump_labels (get_insns ()); > which resets all LABEL_NUSES to 0 (well, to: > if (LABEL_P (insn)) > LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0); > and then recomputes them and adds JUMP_LABEL if needed: > JUMP_LABEL (insn) = label;
Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote: > > --- gcc/config/i386/i386.md.jj 2022-01-14 11:51:34.432384170 +0100 > > +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100 > > @@ -23886,6 +23886,18 @@ (define_insn "hreset" > >[(set_attr "type" "other") > > (set_attr "length" "4")]) > > > > +;; Spaceship optimization > > +(define_expand "spaceship3" > > + [(match_operand:SI 0 "register_operand") > > + (match_operand:MODEF 1 "cmp_fp_expander_operand") > > + (match_operand:MODEF 2 "cmp_fp_expander_operand")] > > + "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)) > > + && TARGET_CMOVE && TARGET_IEEE_FP" > > Is there a reason that this pattern is limited to TARGET_IEEE_FP? > During the expansion in ix86_expand_fp_spaceship, we can just skip > jump on unordered, while ix86_expand_fp_compare will emit the correct > comparison mode depending on TARGET_IEEE_FP. For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1 but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2 but it is still optimized much better than the non-fast-math case without the patch: comisd %xmm1, %xmm0 je .L12 jb .L13 movl$1, %edx movl$2, %eax cmova %edx, %eax ret .p2align 4,,10 .p2align 3 .L12: xorl%eax, %eax ret .p2align 4,,10 .p2align 3 .L13: movl$-1, %eax ret so just one comparison but admittedly the movl$1, %edx movl$2, %eax cmova %edx, %eax part is unnecessary. So below is an incremental patch that handles even the !HONORS_NANS case at the gimple pattern matching (while for HONOR_NANS we must obey that for NaN operand(s) >/>=/ > + gcc_checking_assert (ix86_fp_comparison_strategy (GT) == > > IX86_FPCMP_COMI); > > + rtx gt = ix86_expand_fp_compare (GT, op0, op1); > > + rtx l0 = gen_label_rtx (); > > + rtx l1 = gen_label_rtx (); > > + rtx l2 = gen_label_rtx (); > > + rtx lend = gen_label_rtx (); > > + rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode, > > + gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx); > > + rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un, > > + gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx); > > + rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > + add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ()); > > Please also add JUMP_LABEL to the insn. > > > + rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode, > > + gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx); > > + tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq, > > + gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx); > > + jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > + add_reg_br_prob_note (jmp, profile_probability::unlikely ()); > > + tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt, > > + gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx); > > + jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp)); > > + add_reg_br_prob_note (jmp, profile_probability::even ()); > > + emit_move_insn (dest, constm1_rtx); > > + emit_jump (lend); > > + emit_label (l0); > > and LABEL_NUSES label. Why? That seems to be a waste of time to me, unless something uses them already during expansion. Because pass_expand::execute runs: /* We need JUMP_LABEL be set in order to redirect jumps, and hence split edges which edge insertions might do. */ rebuild_jump_labels (get_insns ()); which resets all LABEL_NUSES to 0 (well, to: if (LABEL_P (insn)) LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0); and then recomputes them and adds JUMP_LABEL if needed: JUMP_LABEL (insn) = label; --- gcc/config/i386/i386.md.jj 2022-01-15 09:51:25.404468342 +0100 +++ gcc/config/i386/i386.md 2022-01-15 09:56:31.602109421 +0100 @@ -23892,7 +23892,7 @@ (define_expand "spaceship3" (match_operand:MODEF 1 "cmp_fp_expander_operand") (match_operand:MODEF 2 "cmp_fp_expander_operand")] "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)) - && TARGET_CMOVE && TARGET_IEEE_FP" + && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))" { ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]); DONE; --- gcc/config/i386/i386-expand.c.jj2022-01-15 09:51:25.411468242 +0100 +++ gcc/config/i386/i386-expand.c 2022-01-15 10:38:26.924333651 +0100 @@ -2884,18 +2884,23 @@ ix86_expand_setcc (rtx dest, enum rtx_co void ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1) { - gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI); + gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH); rtx gt = ix86_expand_fp_compare (GT, op0, op1); rtx l0 = gen_label_rtx (); rtx l1 = gen_label_rtx (); - rtx l2 = gen_label_rtx (); + rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX; rtx
Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]
On Fri, Jan 14, 2022 at 11:56 PM Jakub Jelinek wrote: > > Hi! > > C++20: > #include > auto cmp4way(double a, double b) > { > return a <=> b; > } > expands to: > ucomisd %xmm1, %xmm0 > jp .L8 > movl$0, %eax > jne .L8 > .L2: > ret > .p2align 4,,10 > .p2align 3 > .L8: > comisd %xmm0, %xmm1 > movl$-1, %eax > ja .L2 > ucomisd %xmm1, %xmm0 > setbe %al > addl$1, %eax > ret > That is 3 comparisons of the same operands. > The following patch improves it to just one comparison: > comisd %xmm1, %xmm0 > jp .L4 > seta%al > movl$0, %edx > leal-1(%rax,%rax), %eax > cmove %edx, %eax > ret > .L4: > movl$2, %eax > ret > While a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2 > where the first comparison is equality and this shouldn't raise > exceptions on qNaN operands, if the operands aren't equal (which > includes unordered cases), then it immediately performs < or > > comparison and that raises exceptions even on qNaNs, so we can just > perform a single comparison that raises exceptions on qNaN. > As the 4 different cases are encoded as > ZF CF PF > 1 1 1 a unordered b > 0 0 0 a > b > 0 1 0 a < b > 1 0 0 a == b > we can emit optimal sequence of comparions, first jp > for the unordered case, then je for the == case and finally jb > for the < case. > > The patch pattern recognizes spaceship-like comparisons during > widening_mul if the spaceship optab is implemented, and replaces > thoose comparisons with comparisons of .SPACESHIP ifn which returns > -1/0/1/2 based on the comparison. This seems to work well both for the > case of just returning the -1/0/1/2 (when we have just a common > successor with a PHI) or when the different cases are handled with > various other basic blocks. The testcases cover both of those cases, > the latter with different function calls in those. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2022-01-14 Jakub Jelinek > > PR target/103973 > * tree-cfg.h (cond_only_block_p): Declare. > * tree-ssa-phiopt.c (cond_only_block_p): Move function to ... > * tree-cfg.c (cond_only_block_p): ... here. No longer static. > * optabs.def (spaceship_optab): New optab. > * internal-fn.def (SPACESHIP): New internal function. > * internal-fn.h (expand_SPACESHIP): Declare. > * internal-fn.c (expand_PHI): Formatting fix. > (expand_SPACESHIP): New function. > * tree-ssa-math-opts.c (optimize_spaceship): New function. > (math_opts_dom_walker::after_dom_children): Use it. > * config/i386/i386.md (spaceship3): New define_expand. > * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare. > * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function. > * doc/md.texi (spaceship@var{m}3): Document. > > * gcc.target/i386/pr103973-1.c: New test. > * gcc.target/i386/pr103973-2.c: New test. > * gcc.target/i386/pr103973-3.c: New test. > * gcc.target/i386/pr103973-4.c: New test. > * g++.target/i386/pr103973-1.C: New test. > * g++.target/i386/pr103973-2.C: New test. > * g++.target/i386/pr103973-3.C: New test. > * g++.target/i386/pr103973-4.C: New test. >--- gcc/config/i386/i386.md.jj 2022-01-14 11:51:34.432384170 +0100 > +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100 > @@ -23886,6 +23886,18 @@ (define_insn "hreset" >[(set_attr "type" "other") > (set_attr "length" "4")]) > > +;; Spaceship optimization > +(define_expand "spaceship3" > + [(match_operand:SI 0 "register_operand") > + (match_operand:MODEF 1 "cmp_fp_expander_operand") > + (match_operand:MODEF 2 "cmp_fp_expander_operand")] > + "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)) > + && TARGET_CMOVE && TARGET_IEEE_FP" Is there a reason that this pattern is limited to TARGET_IEEE_FP? During the expansion in ix86_expand_fp_spaceship, we can just skip jump on unordered, while ix86_expand_fp_compare will emit the correct comparison mode depending on TARGET_IEEE_FP. > +{ > + ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]); > + DONE; > +}) > + > (include "mmx.md") > (include "sse.md") > (include "sync.md") > --- gcc/config/i386/i386-expand.c.jj2022-01-14 11:51:34.429384213 +0100 > +++ gcc/config/i386/i386-expand.c 2022-01-14 21:02:09.446437810 +0100 > @@ -2879,6 +2879,46 @@ ix86_expand_setcc (rtx dest, enum rtx_co >emit_insn (gen_rtx_SET (dest, ret)); > } > > +/* Expand floating point op0 <=> op1 if NaNs are honored. */ > + > +void > +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1) > +{ > + gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI); > + rtx gt =
Re: [PATCH] c++: error message for dependent template members [PR70417]
Hi Jason, Hope you are well. Apologies, I've not had time to sit down and look at this since last month I quit my old job, then I had family around for the whole of the Christmas period, and then even more recently I've had to start my new job. In any case happy that you managed to figure it all out. I noticed the small regression at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104025. To be honest I wouldn't even know how to begin to go about fixing that so perhaps it's best if I leave that to you? I guess it's not playing well with the use of warn_missing_template_keyword. Maybe I didn't set that variable up properly or something. Kind regards, Anthony On Thu, 13 Jan 2022 at 21:01, Jason Merrill wrote: > On 12/9/21 10:51, Jason Merrill wrote: > > On 12/4/21 12:23, Anthony Sharp wrote: > >> Hi Jason, > >> > >> Hope you are well. Apologies for not coming back sooner. > >> > >> >I'd put it just above the definition of saved_token_sentinel in > >> parser.c. > >> > >> Sounds good, done. > >> > >> >Maybe cp_parser_require_end_of_template_parameter_list? Either way > >> is fine. > >> > >> Even better, have changed it. > >> > >> >Hmm, good point; operators that are member functions must be > >> non-static, > >> >so we couldn't be doing a comparison of the address of the function. > >> > >> In that case I have set it to return early there. > >> > >> >So declarator_p should be true there. I'll fix that. > >> > >> Thank you. > >> > >> >> + if (next_token->keyword == RID_TEMPLATE) > >> >> +{ > >> >> + /* But at least make sure it's properly formed (e.g. see > >> PR19397). */ > >> >> + if (cp_lexer_peek_nth_token (parser->lexer, 2)->type == > >> CPP_NAME) > >> >> + return 1; > >> >> + > >> >> + return -1; > >> >> +} > >> >> + > >> >> + /* Could be a ~ referencing the destructor of a class template. > >> */ > >> >> + if (next_token->type == CPP_COMPL) > >> >> +{ > >> >> + /* It could only be a template. */ > >> >> + if (cp_lexer_peek_nth_token (parser->lexer, 2)->type == > >> CPP_NAME) > >> >> + return 1; > >> >> + > >> >> + return -1; > >> >> +} > >> > > >> >Why don't these check for the < ? > >> > >> I think perhaps I could have named the function better; instead of > >> next_token_begins_template_id, how's about > >> next_token_begins_template_name? > >> That's all I intended to check for. > > > > You're using it to control whether we try to parse a template-id, and > > it's used to initialize variables named looks_like_template_id, so I > > think better to keep the old name. > > > >> In the first case, something like "->template some_name" will always be > >> intended as a template, so no need to check for the <. If there were > >> default > >> template arguments you could also validly omit the <> completely, I > think > >> (could be wrong). > > > > Or if the template arguments can be deduced, yes: > > > > template struct A > > { > >template void f(U u); > > }; > > > > template void g(A a) > > { > >a->template f(42); > > } > > > > But 'f' is still not a template-id. > > > > ... > > > > Actually, it occurs to me that you might be better off handling this in > > cp_parser_template_name, something like the below, to avoid the complex > > duplicate logic in the id-expression handling. > > > > Note that in this patch I'm using "any_object_scope" as a proxy for "I'm > > parsing an expression", since !is_declaration doesn't work for that; as > > a result, this doesn't handle the static member function template case. > > For that you'd probably still need to pass down a flag so that > > cp_parser_template_name knows it's being called from > > cp_parser_id_expression. > > > > Your patch has a false positive on > > > > template struct A { }; > > template void f() > > { > >A(); > > }; > > > > which my patch checks in_template_argument_list_p to avoid, though > > checking any_object_scope also currently avoids it. > > > > What do you think? > > I decided that it made more sense to keep the check in > cp_parser_id_expression like you had it, but I moved it to the end to > simplify the logic. Here's what I'm applying, thanks!