FYI: "declare mapper" patch set for Fortran (June 2022) (was: [PATCH v3 11/11] FYI/unfinished: OpenMP 5.0 "declare mapper" support for C++)
On 13.09.22 23:04, Julian Brown wrote: This patch implements OpenMP 5.0 "declare mapper" support for C++. And to complete list of patches belonging to this set, Julian had posted the associated Fortran patch set in June: [PATCH 0/6] OpenMP 5.0: Fortran "declare mapper" support https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596036.html Thanks, Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: [PATCH] [ICE] Check another epilog variable peeling case in vectorizable_nonlinear_induction.
On Wed, Sep 14, 2022 at 3:25 AM liuhongt via Gcc-patches wrote: > > In vectorizable_nonlinear_induction, r13-2503-gc13223b790bbc5 prevent > variable peeling by > only checking LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo). But when > "!vect_use_loop_mask_for_alignment_p (loop_vinfo) && > LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0", vectorizer will > still do variable peeling for epilog, and it hits gcc_assert in > vect_peel_nonlinear_iv_init. > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > The patch also fix ICE of the testcase in the PR for ia64-linux-gnu(verified > by cross-compile). > > Ok for trunk? OK. Thanks, Richard. > gcc/ChangeLog: > > PR tree-optimization/106905 > * tree-vect-loop.cc (vectorizable_nonlinear_induction): Return > false when !vect_use_loop_mask_for_alignment_p (loop_vinfo) && > LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0. > > gcc/testsuite/ChangeLog: > > * gcc.target/i386/pr106905.c: New test. > * gcc.target/ia64/pr106905.c: New test. > --- > gcc/testsuite/gcc.target/i386/pr106905.c | 14 ++ > gcc/testsuite/gcc.target/ia64/pr106905.c | 20 > gcc/tree-vect-loop.cc| 6 -- > 3 files changed, 38 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr106905.c > create mode 100644 gcc/testsuite/gcc.target/ia64/pr106905.c > > diff --git a/gcc/testsuite/gcc.target/i386/pr106905.c > b/gcc/testsuite/gcc.target/i386/pr106905.c > new file mode 100644 > index 000..a190a1c84e6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr106905.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=silvermont -O2 -fvect-cost-model=dynamic" } */ > + > +void > +foo_mul_peel (int *a, int b) > +{ > + int i; > + > + for (i = 0; i < 7; ++i) > +{ > + b *= 2; > + a[i] = b; > +} > +} > diff --git a/gcc/testsuite/gcc.target/ia64/pr106905.c > b/gcc/testsuite/gcc.target/ia64/pr106905.c > new file mode 100644 > index 000..1b9656e1203 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/ia64/pr106905.c > @@ -0,0 +1,20 @@ > +/* { dg-do compile } */ > +/* { dg-options "-std=c99 -O3 -fPIC" } */ > +long ZDICT_fillNoise_p, ZDICT_trainFromBuffer_legacy_result; > +unsigned ZDICT_fillNoise_acc; > +int ZDICT_totalSampleSize_nbFiles; > +static void ZDICT_fillNoise(void *buffer, long length) { > + unsigned prime2 = 9; > + for (ZDICT_fillNoise_p = 0; ZDICT_fillNoise_p < length; > ZDICT_fillNoise_p++) > +ZDICT_fillNoise_acc *= ((char *)buffer)[ZDICT_fillNoise_p] = prime2; > +} > +long ZDICT_trainFromBuffer_legacy() { > + void *newBuff; > + long total = 0; > + for (; ZDICT_totalSampleSize_nbFiles;) > +total += 0; > + long sBuffSize = total; > + newBuff = 0; > + ZDICT_fillNoise(newBuff + sBuffSize, 32); > + return ZDICT_trainFromBuffer_legacy_result; > +} > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index 8f88f1755be..9c434b66c5b 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -8646,8 +8646,10 @@ vectorizable_nonlinear_induction (loop_vec_info > loop_vinfo, >/* Also doens't support peel for neg when niter is variable. > ??? generate something like niter_expr & 1 ? init_expr : -init_expr? */ >niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo); > - if (niters_skip != NULL_TREE > - && TREE_CODE (niters_skip) != INTEGER_CST) > + if ((niters_skip != NULL_TREE > + && TREE_CODE (niters_skip) != INTEGER_CST) > + || (!vect_use_loop_mask_for_alignment_p (loop_vinfo) > + && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)) > { >if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > -- > 2.27.0 >
Re: [PATCH] RISC-V: Don't try to vectorize tree-ssa/gen-vect-34.c
LGTM, Maybe we can try is after RVV supported.> We don't yet support vectorization on RISC-V. > > gcc/testsuite/ChangeLog> > * gcc.dg/tree-ssa/gen-vect-34.c: Skip RISC-V > > targets.> ---> gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c | 2 +-> 1 file > > changed, 1 insertion(+), 1 deletion(-)> > diff --git > > a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c > > b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c> index > > 8d2d36401fe..41877e05efd 100644> --- > > a/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c> +++ > > b/gcc/testsuite/gcc.dg/tree-ssa/gen-vect-34.c> @@ -13,4 +13,4 @@ float > > summul(int n, float *arg1, float *arg2)> return res1; > > > }> > -/* { dg-final { scan-tree-dump-times > > "vectorized 1 loops" 1 "vect" { target { ! { avr-*-* pru-*-* } } } } } */> > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { > > target { ! { avr-*-* pru-*-* riscv*-*-* } } } } } */> -- > 2.34.1
[PATCH] [ICE] Check another epilog variable peeling case in vectorizable_nonlinear_induction.
In vectorizable_nonlinear_induction, r13-2503-gc13223b790bbc5 prevent variable peeling by only checking LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo). But when "!vect_use_loop_mask_for_alignment_p (loop_vinfo) && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0", vectorizer will still do variable peeling for epilog, and it hits gcc_assert in vect_peel_nonlinear_iv_init. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. The patch also fix ICE of the testcase in the PR for ia64-linux-gnu(verified by cross-compile). Ok for trunk? gcc/ChangeLog: PR tree-optimization/106905 * tree-vect-loop.cc (vectorizable_nonlinear_induction): Return false when !vect_use_loop_mask_for_alignment_p (loop_vinfo) && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0. gcc/testsuite/ChangeLog: * gcc.target/i386/pr106905.c: New test. * gcc.target/ia64/pr106905.c: New test. --- gcc/testsuite/gcc.target/i386/pr106905.c | 14 ++ gcc/testsuite/gcc.target/ia64/pr106905.c | 20 gcc/tree-vect-loop.cc| 6 -- 3 files changed, 38 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/pr106905.c create mode 100644 gcc/testsuite/gcc.target/ia64/pr106905.c diff --git a/gcc/testsuite/gcc.target/i386/pr106905.c b/gcc/testsuite/gcc.target/i386/pr106905.c new file mode 100644 index 000..a190a1c84e6 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr106905.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-march=silvermont -O2 -fvect-cost-model=dynamic" } */ + +void +foo_mul_peel (int *a, int b) +{ + int i; + + for (i = 0; i < 7; ++i) +{ + b *= 2; + a[i] = b; +} +} diff --git a/gcc/testsuite/gcc.target/ia64/pr106905.c b/gcc/testsuite/gcc.target/ia64/pr106905.c new file mode 100644 index 000..1b9656e1203 --- /dev/null +++ b/gcc/testsuite/gcc.target/ia64/pr106905.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-std=c99 -O3 -fPIC" } */ +long ZDICT_fillNoise_p, ZDICT_trainFromBuffer_legacy_result; +unsigned ZDICT_fillNoise_acc; +int ZDICT_totalSampleSize_nbFiles; +static void ZDICT_fillNoise(void *buffer, long length) { + unsigned prime2 = 9; + for (ZDICT_fillNoise_p = 0; ZDICT_fillNoise_p < length; ZDICT_fillNoise_p++) +ZDICT_fillNoise_acc *= ((char *)buffer)[ZDICT_fillNoise_p] = prime2; +} +long ZDICT_trainFromBuffer_legacy() { + void *newBuff; + long total = 0; + for (; ZDICT_totalSampleSize_nbFiles;) +total += 0; + long sBuffSize = total; + newBuff = 0; + ZDICT_fillNoise(newBuff + sBuffSize, 32); + return ZDICT_trainFromBuffer_legacy_result; +} diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 8f88f1755be..9c434b66c5b 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -8646,8 +8646,10 @@ vectorizable_nonlinear_induction (loop_vec_info loop_vinfo, /* Also doens't support peel for neg when niter is variable. ??? generate something like niter_expr & 1 ? init_expr : -init_expr? */ niters_skip = LOOP_VINFO_MASK_SKIP_NITERS (loop_vinfo); - if (niters_skip != NULL_TREE - && TREE_CODE (niters_skip) != INTEGER_CST) + if ((niters_skip != NULL_TREE + && TREE_CODE (niters_skip) != INTEGER_CST) + || (!vect_use_loop_mask_for_alignment_p (loop_vinfo) + && LOOP_VINFO_PEELING_FOR_ALIGNMENT (loop_vinfo) < 0)) { if (dump_enabled_p ()) dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -- 2.27.0
[r13-2650 Regression] FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (test for excess errors) on Linux/x86_64
On Linux/x86_64, 5d84a4418aa962a715dc74998fea2a7de9d4042c is the first bad commit commit 5d84a4418aa962a715dc74998fea2a7de9d4042c Author: Patrick Palka Date: Tue Sep 13 11:18:14 2022 -0400 libstdc++: Implement ranges::chunk_view from P2442R1 caused FAIL: g++.dg/modules/xtreme-header-2_a.H module-cmi (gcm.cache/$srcdir/g++.dg/modules/xtreme-header-2_a.H.gcm) FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (internal compiler error: tree check: expected record_type or union_type or qual_union_type, have typename_type in friend_from_decl_list, at cp/module.cc:4737) FAIL: g++.dg/modules/xtreme-header-2_a.H -std=c++2b (test for excess errors) with GCC configured with ../../gcc/configure --prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-2650/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap To reproduce: $ cd {build_dir}/gcc && make check RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H --target_board='unix{-m32}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H --target_board='unix{-m32\ -march=cascadelake}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H --target_board='unix{-m64}'" $ cd {build_dir}/gcc && make check RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-2_a.H --target_board='unix{-m64\ -march=cascadelake}'" (Please do not reply to this email, for question about this report, contact me at haochen dot jiang at intel.com)
[PATCH] xtensa: gcc: implement MI thunk generation for call0 ABI
Suwa-san, could you please take a look? This change fixes the fowllowing testsuite failures when building for call0 ABI: g++.dg/ipa/pr60640-4.C g++.dg/ipa/pr83549.C g++.dg/ipa/pr83667.C g++.dg/torture/pr81812.C gcc/ * config/xtensa/xtensa.cc (xtensa_can_output_mi_thunk) (xtensa_output_mi_thunk): New functions. (TARGET_ASM_CAN_OUTPUT_MI_THUNK) (TARGET_ASM_OUTPUT_MI_THUNK): New macro definitions. (xtensa_prepare_expand_call): Use fixed register a8 as temporary when called with reload_completed set to 1. --- gcc/config/xtensa/xtensa.cc | 116 +++- 1 file changed, 115 insertions(+), 1 deletion(-) diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc index f1b3331ea558..e5abd356a745 100644 --- a/gcc/config/xtensa/xtensa.cc +++ b/gcc/config/xtensa/xtensa.cc @@ -191,6 +191,15 @@ static bool xtensa_can_eliminate (const int from ATTRIBUTE_UNUSED, static HOST_WIDE_INT xtensa_starting_frame_offset (void); static unsigned HOST_WIDE_INT xtensa_asan_shadow_offset (void); static bool xtensa_function_ok_for_sibcall (tree, tree); +static bool xtensa_can_output_mi_thunk (const_tree thunk_fndecl ATTRIBUTE_UNUSED, + HOST_WIDE_INT delta ATTRIBUTE_UNUSED, + HOST_WIDE_INT vcall_offset ATTRIBUTE_UNUSED, + const_tree function ATTRIBUTE_UNUSED); +static void xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, + HOST_WIDE_INT delta, + HOST_WIDE_INT vcall_offset, + tree function); + static rtx xtensa_delegitimize_address (rtx); @@ -351,6 +360,12 @@ static rtx xtensa_delegitimize_address (rtx); #undef TARGET_FUNCTION_OK_FOR_SIBCALL #define TARGET_FUNCTION_OK_FOR_SIBCALL xtensa_function_ok_for_sibcall +#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK +#define TARGET_ASM_CAN_OUTPUT_MI_THUNK xtensa_can_output_mi_thunk + +#undef TARGET_ASM_OUTPUT_MI_THUNK +#define TARGET_ASM_OUTPUT_MI_THUNK xtensa_output_mi_thunk + struct gcc_target targetm = TARGET_INITIALIZER; @@ -2173,7 +2188,16 @@ xtensa_prepare_expand_call (int callop, rtx *operands) addr = gen_sym_PLT (addr); if (!call_insn_operand (addr, VOIDmode)) -XEXP (operands[callop], 0) = copy_to_mode_reg (Pmode, addr); +{ + /* This may be called while generating MI thunk when we pretend +that reload is over. Use a8 as a temporary register in that case. */ + rtx reg = can_create_pseudo_p () + ? copy_to_mode_reg (Pmode, addr) + : copy_to_suggested_reg (addr, +gen_rtx_REG (Pmode, A8_REG), +Pmode); + XEXP (operands[callop], 0) = reg; +} } @@ -4983,6 +5007,96 @@ xtensa_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED, tree exp ATTRIBUTE_U return true; } +static bool +xtensa_can_output_mi_thunk (const_tree thunk_fndecl ATTRIBUTE_UNUSED, + HOST_WIDE_INT delta ATTRIBUTE_UNUSED, + HOST_WIDE_INT vcall_offset ATTRIBUTE_UNUSED, + const_tree function ATTRIBUTE_UNUSED) +{ + if (TARGET_WINDOWED_ABI) +return false; + + return true; +} + +/* Output code to add DELTA to the first argument, and then jump + to FUNCTION. Used for C++ multiple inheritance. */ +static void +xtensa_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, + HOST_WIDE_INT delta, + HOST_WIDE_INT vcall_offset, + tree function) +{ + rtx this_rtx; + rtx funexp; + rtx_insn *insn; + int this_reg_no; + rtx temp0 = gen_rtx_REG (Pmode, A9_REG); + const char *fnname = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (thunk)); + + reload_completed = 1; + + if (aggregate_value_p (TREE_TYPE (TREE_TYPE (function)), function)) +this_reg_no = 3; + else +this_reg_no = 2; + + this_rtx = gen_rtx_REG (Pmode, A0_REG + this_reg_no); + + if (delta) +{ + if (xtensa_simm8 (delta)) + emit_insn (gen_addsi3 (this_rtx, this_rtx, GEN_INT (delta))); + else + { + emit_move_insn (temp0, GEN_INT (delta)); + emit_insn (gen_addsi3 (this_rtx, this_rtx, temp0)); + } +} + + if (vcall_offset) +{ + rtx temp1 = gen_rtx_REG (Pmode, A0_REG + 10); + rtx addr = temp1; + + emit_move_insn (temp0, gen_rtx_MEM (Pmode, this_rtx)); + if (xtensa_uimm8x4 (vcall_offset)) + addr = plus_constant (Pmode, temp0, vcall_offset); + else if (xtensa_simm8 (vcall_offset)) + emit_insn (gen_addsi3 (temp1, temp0, GEN_INT (vcall_offset))); + else + { + emit_move_insn (temp1, GEN_INT (vcall_offset)); + emit_insn (gen_addsi3 (temp1, temp0, temp1)); + } + emit_move_insn (temp1, gen_rtx_MEM (Pmode, addr)); + emit_insn (gen_add2_insn (this
[PATCH v3 10/11] Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE
This patch changes the representation of OMP array sections in the C++ front end to use the new OMP_ARRAY_SECTION tree code instead of a TREE_LIST. This is important for "declare mapper" support, because the array section representation may stick around longer (in "declare mapper" definitions), and special-case handling TREE_LIST becomes necessary in more places, which starts to become unwieldy. Including this patch to support the surrounding unfinished/FYI patches. 2022-02-18 Julian Brown gcc/c-family/ * c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION. gcc/cp/ * parser.cc (cp_parser_omp_var_list_no_open): Use OMP_ARRAY_SECTION code instead of TREE_LIST to represent OpenMP array sections. * pt.cc (tsubst_copy, tsubst_omp_clause_decl, tsubst_copy_and_build): Add OMP_ARRAY_SECTION support. * semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections, cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST where appropriate. * gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been processed out before gimplification. --- gcc/c-family/c-omp.cc | 14 ++ gcc/cp/parser.cc | 15 --- gcc/cp/pt.cc | 52 gcc/cp/semantics.cc | 62 ++- gcc/gimplify.cc | 3 +++ 5 files changed, 112 insertions(+), 34 deletions(-) diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index e3cda716532..2c25b0072bc 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -2671,6 +2671,9 @@ c_omp_split_clauses (location_t loc, enum tree_code code, } else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == TREE_LIST) { + /* TODO: This can go away once we transition all uses of +TREE_LIST for representing OMP array sections to +OMP_ARRAY_SECTION. */ tree t; for (t = OMP_CLAUSE_DECL (c); TREE_CODE (t) == TREE_LIST; t = TREE_CHAIN (t)) @@ -2679,6 +2682,17 @@ c_omp_split_clauses (location_t loc, enum tree_code code, bitmap_clear_bit (&allocate_head, DECL_UID (t)); break; } + else if (TREE_CODE (OMP_CLAUSE_DECL (c)) == OMP_ARRAY_SECTION) + { + tree t; + for (t = OMP_CLAUSE_DECL (c); + TREE_CODE (t) == OMP_ARRAY_SECTION; + t = TREE_OPERAND (t, 0)) + ; + if (DECL_P (t)) + bitmap_clear_bit (&allocate_head, DECL_UID (t)); + break; + } /* FALLTHRU */ case OMP_CLAUSE_PRIVATE: case OMP_CLAUSE_FIRSTPRIVATE: diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 5468502cd26..80d33c3b674 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -36911,11 +36911,14 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); for (int i = dims.length () - 1; i >= 0; i--) - decl = tree_cons (dims[i].low_bound, dims[i].length, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, + TREE_TYPE (decl), decl, dims[i].low_bound, + dims[i].length); } else if (TREE_CODE (decl) == INDIRECT_REF) { bool ref_p = REFERENCE_REF_P (decl); + tree type = TREE_TYPE (decl); /* Turn *foo into the representation previously used for foo[0]. */ @@ -36925,7 +36928,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, /* ...but don't add the [0:1] representation for references (because they have special handling elsewhere). */ if (!ref_p) - decl = tree_cons (integer_zero_node, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, type, + decl, integer_zero_node, integer_one_node); } else if (TREE_CODE (decl) == ARRAY_REF) { @@ -36934,7 +36938,8 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, decl = TREE_OPERAND (decl, 0); STRIP_NOPS (decl); - decl = tree_cons (idx, integer_one_node, decl); + decl = build3_loc (input_location, OMP_ARRAY_SECTION, +TREE_TYPE (decl), decl, idx, integer_one_node); } else if (TREE_CODE (decl) == NON_LVALUE_EXPR || CONVERT_EXPR_P (decl))
[PATCH v3 11/11] FYI/unfinished: OpenMP 5.0 "declare mapper" support for C++
This patch implements OpenMP 5.0 "declare mapper" support for C++. This hasn't been fully revised yet following previous review comments, but I am including it in this series to demonstrate the new approach to gimplifying map clauses after "declare mapper" instantiation. The "gimplify_scan_omp_clauses" function is still called twice: firstly (before scanning the offload region's body) with SUPPRESS_GIMPLIFICATION set to true. This sets up variables in the splay tree, etc. but does not gimplify anything. Then, implicit "declare mappers" are instantiated after scanning the region's body, then "gimplify_scan_omp_clauses" is called again, and does the rest of its previous tasks -- builds struct sibling lists, and gimplifies clauses. Then gimplify_adjust_omp_clauses is called, and compilation continues. Does this approach seem OK? Thanks, Julian --- gcc/cp/cp-gimplify.cc | 6 + gcc/cp/cp-objcp-common.h | 2 + gcc/cp/cp-tree.h | 10 + gcc/cp/decl.cc| 18 +- gcc/cp/mangle.cc | 5 +- gcc/cp/name-lookup.cc | 3 +- gcc/cp/parser.cc | 393 +- gcc/cp/pt.cc | 92 +++- gcc/cp/semantics.cc | 495 +- gcc/fortran/parse.cc | 3 + gcc/gimplify.cc | 379 -- gcc/langhooks-def.h | 3 + gcc/langhooks.cc | 9 + gcc/langhooks.h | 4 + gcc/omp-general.h | 52 ++ gcc/testsuite/c-c++-common/gomp/map-6.c | 6 +- gcc/testsuite/g++.dg/gomp/declare-mapper-1.C | 58 ++ gcc/testsuite/g++.dg/gomp/declare-mapper-2.C | 30 ++ gcc/testsuite/g++.dg/gomp/declare-mapper-3.C | 27 + gcc/testsuite/g++.dg/gomp/declare-mapper-4.C | 74 +++ gcc/testsuite/g++.dg/gomp/ind-base-3.C| 1 - gcc/testsuite/g++.dg/gomp/member-array-2.C| 1 - gcc/tree-core.h | 4 + gcc/tree-pretty-print.cc | 42 ++ gcc/tree.cc | 2 + gcc/tree.def | 7 + gcc/tree.h| 21 + include/gomp-constants.h | 8 +- .../testsuite/libgomp.c++/declare-mapper-1.C | 87 +++ .../testsuite/libgomp.c++/declare-mapper-2.C | 55 ++ .../testsuite/libgomp.c++/declare-mapper-3.C | 63 +++ .../testsuite/libgomp.c++/declare-mapper-4.C | 63 +++ .../testsuite/libgomp.c++/declare-mapper-5.C | 52 ++ .../testsuite/libgomp.c++/declare-mapper-6.C | 37 ++ .../testsuite/libgomp.c++/declare-mapper-7.C | 48 ++ .../testsuite/libgomp.c++/declare-mapper-8.C | 61 +++ 36 files changed, 2161 insertions(+), 60 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/declare-mapper-4.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-1.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-2.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-3.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-4.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-5.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-6.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-7.C create mode 100644 libgomp/testsuite/libgomp.c++/declare-mapper-8.C diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc index c05be833357..050049f3c1a 100644 --- a/gcc/cp/cp-gimplify.cc +++ b/gcc/cp/cp-gimplify.cc @@ -2282,6 +2282,12 @@ cxx_omp_finish_clause (tree c, gimple_seq *, bool /* openacc */) } } +tree +cxx_omp_finish_mapper_clauses (tree clauses) +{ + return finish_omp_clauses (clauses, C_ORT_OMP); +} + /* Return true if DECL's DECL_VALUE_EXPR (if any) should be disregarded in OpenMP construct, because it is going to be remapped during OpenMP lowering. SHARED is true if DECL diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h index 1a67f14d9b3..b6f72e004d3 100644 --- a/gcc/cp/cp-objcp-common.h +++ b/gcc/cp/cp-objcp-common.h @@ -185,6 +185,8 @@ extern tree cxx_simulate_record_decl (location_t, const char *, #define LANG_HOOKS_OMP_CLAUSE_DTOR cxx_omp_clause_dtor #undef LANG_HOOKS_OMP_FINISH_CLAUSE #define LANG_HOOKS_OMP_FINISH_CLAUSE cxx_omp_finish_clause +#undef LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES +#define LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES cxx_omp_finish_mapper_clauses #undef LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE #define LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE cxx_omp_privatize_by_reference #undef LANG_HOOKS_OM
[PATCH v3 06/11] OpenMP: Pointers and member mappings
This patch was previously posted as part of the series supporting "declare mapper" for Fortran, here: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596041.html Implementing the "omp declare mapper" functionality, I noticed some cases where handling of derived type members that are pointers doesn't seem to be quite right. At present, a type such as this: type T integer, pointer, dimension(:) :: arrptr end type T type(T) :: tvar [...] !$omp target map(tofrom: tvar%arrptr) will be mapped using three mapping nodes: GOMP_MAP_TO tvar%arrptr (the descriptor) GOMP_MAP_TOFROM *tvar%arrptr%data (the actual array data) GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (a pointer to the array data) This follows OMP 5.0, 2.19.7.1 "map Clause": "If a list item in a map clause is an associated pointer and the pointer is not the base pointer of another list item in a map clause on the same construct, then it is treated as if its pointer target is implicitly mapped in the same clause. For the purposes of the map clause, the mapped pointer target is treated as if its base pointer is the associated pointer." However, we can also write this: map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8)) and then instead we should follow: "If the structure sibling list item is a pointer then it is treated as if its association status is undefined, unless it appears as the base pointer of another list item in a map clause on the same construct." But, that's not implemented quite right at the moment (and completely breaks once we introduce declare mappers), because we still map the "to: tvar%arrptr" as the descriptor and the entire array, then we map the "tvar%arrptr(3:8)" part using the descriptor (again!) and the array slice. The solution is to detect when we're mapping a smaller part of the array (or a subcomponent) on the same directive, and only map the descriptor in that case. So we get mappings like this instead: map(to: tvar%arrptr) --> GOMP_MAP_ALLOC tvar%arrptr (the descriptor) map(tofrom: tvar%arrptr(3:8) --> GOMP_MAP_TOFROM tvar%arrptr%data(3) (size 8-3+1, etc.) GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (bias 3, etc.) 2022-09-13 Julian Brown gcc/fortran/ * trans-openmp.cc (dependency.h): Include. (gfc_trans_omp_array_section): Do not map descriptors here for OpenMP. (gfc_trans_omp_clauses): Check subcomponent and subarray/element accesses elsewhere in the clause list for pointers to derived types or array descriptors, and map just the pointer/descriptor if we have any. libgomp/ * testsuite/libgomp.fortran/map-subarray.f90: New test. * testsuite/libgomp.fortran/map-subcomponents.f90: New test. * testsuite/libgomp.fortran/struct-elem-map-1.f90: Adjust for descriptor-mapping changes. --- gcc/fortran/trans-openmp.cc | 104 -- .../libgomp.fortran/map-subarray.f90 | 33 ++ .../libgomp.fortran/map-subcomponents.f90 | 32 ++ .../libgomp.fortran/struct-elem-map-1.f90 | 10 +- 4 files changed, 163 insertions(+), 16 deletions(-) create mode 100644 libgomp/testsuite/libgomp.fortran/map-subarray.f90 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subcomponents.f90 diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index 4949f001fed..a5411df4ad7 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3. If not see #include "omp-general.h" #include "omp-low.h" #include "memmodel.h" /* For MEMMODEL_ enums. */ +#include "dependency.h" #undef GCC_DIAG_STYLE #define GCC_DIAG_STYLE __gcc_tdiag__ @@ -2470,22 +2471,18 @@ gfc_trans_omp_array_section (stmtblock_t *block, gfc_omp_namelist *n, } if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))) { - tree desc_node; tree type = TREE_TYPE (decl); ptr2 = gfc_conv_descriptor_data_get (decl); - desc_node = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_DECL (desc_node) = decl; - OMP_CLAUSE_SIZE (desc_node) = TYPE_SIZE_UNIT (type); if (ptr_kind == GOMP_MAP_ALWAYS_POINTER) { - OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO); - node2 = node; - node = desc_node; /* Needs to come first. */ - } - else - { - OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO_PSET); - node2 = desc_node; + /* For OpenMP, the descriptor must be mapped with its own explicit +map clause (e.g. both "map(foo%arr)" and "map(foo%arr(:))" must +be present in the clause list if "foo%arr" is a pointer to an +array). So, we don't create a GOMP_MAP_TO_PSET node here. */ + node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET); + OMP_CLAUSE
[PATCH v3 09/11] FYI/unfinished: OpenMP: lvalue parsing for map clauses (C++)
This patch changes parsing for OpenMP map clauses in C++ to use the generic expression parser, hence adds support for parsing general lvalues (as required by OpenMP 5.0+). This patch hasn't been fully revised following previous review comments yet, but I'm including it in support of the following (declare mapper) patch, and to demonstrate the cases in baseptrs-4.C that work with the lvalue-parsing support enabled. --- gcc/c-family/c-omp.cc | 1 + gcc/cp/error.cc | 9 + gcc/cp/parser.cc | 141 +-- gcc/cp/parser.h | 3 + gcc/cp/semantics.cc | 7 +- gcc/gimplify.cc | 39 - gcc/testsuite/c-c++-common/gomp/map-6.c | 4 +- gcc/testsuite/g++.dg/gomp/ind-base-3.C| 38 gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 ++ gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 ++ gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 ++ gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 37 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 40 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 ++ gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 +++ gcc/testsuite/g++.dg/gomp/member-array-2.C| 92 ++ gcc/testsuite/g++.dg/gomp/pr67522.C | 2 +- gcc/tree-pretty-print.cc | 14 ++ gcc/tree.def | 3 + libgomp/testsuite/libgomp.c++/baseptrs-4.C| 26 ++- libgomp/testsuite/libgomp.c++/ind-base-1.C| 162 ++ libgomp/testsuite/libgomp.c++/ind-base-2.C| 49 ++ libgomp/testsuite/libgomp.c++/map-comma-1.C | 15 ++ .../testsuite/libgomp.c++/map-rvalue-ref-1.C | 22 +++ libgomp/testsuite/libgomp.c++/struct-ref-1.C | 97 +++ .../libgomp.c-c++-common/array-field-1.c | 35 .../libgomp.c-c++-common/array-of-struct-1.c | 65 +++ .../libgomp.c-c++-common/array-of-struct-2.c | 65 +++ 28 files changed, 1009 insertions(+), 35 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/ind-base-3.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-assignment-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-inc-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-static-cast-lvalue-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/map-ternary-1.C create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-2.C create mode 100644 libgomp/testsuite/libgomp.c++/ind-base-1.C create mode 100644 libgomp/testsuite/libgomp.c++/ind-base-2.C create mode 100644 libgomp/testsuite/libgomp.c++/map-comma-1.C create mode 100644 libgomp/testsuite/libgomp.c++/map-rvalue-ref-1.C create mode 100644 libgomp/testsuite/libgomp.c++/struct-ref-1.C create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-field-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-of-struct-1.c create mode 100644 libgomp/testsuite/libgomp.c-c++-common/array-of-struct-2.c diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc index a2373fda537..e3cda716532 100644 --- a/gcc/c-family/c-omp.cc +++ b/gcc/c-family/c-omp.cc @@ -3220,6 +3220,7 @@ c_omp_address_inspector::map_supported_p () || TREE_CODE (t) == SAVE_EXPR || TREE_CODE (t) == POINTER_PLUS_EXPR || TREE_CODE (t) == NON_LVALUE_EXPR +|| TREE_CODE (t) == OMP_ARRAY_SECTION || TREE_CODE (t) == NOP_EXPR) if (TREE_CODE (t) == COMPOUND_EXPR) t = TREE_OPERAND (t, 1); diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc index 94181e76d0e..5a7f41c1ca0 100644 --- a/gcc/cp/error.cc +++ b/gcc/cp/error.cc @@ -2471,6 +2471,15 @@ dump_expr (cxx_pretty_printer *pp, tree t, int flags) pp_cxx_right_bracket (pp); break; +case OMP_ARRAY_SECTION: + dump_expr (pp, TREE_OPERAND (t, 0), flags); + pp_cxx_left_bracket (pp); + dump_expr (pp, TREE_OPERAND (t, 1), flags); + pp_colon (pp); + dump_expr (pp, TREE_OPERAND (t, 2), flags); + pp_cxx_right_bracket (pp); + break; + case UNARY_PLUS_EXPR: dump_unary_op (pp, "+", t, flags); break; diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc index 68fc78e7c5c..5468502cd26 100644 --- a/gcc/cp/parser.cc +++ b/gcc/cp/parser.cc @@ -4320,6 +4320,9 @@ cp_parser_new (cp_lexer *lexer) parser->omp_declare_simd = NULL; parser->oacc_routine = NULL; + /* Allow array slice in expression. */ + parser->omp_array_section_p = false; + /* Not declaring an implicit function template. */ parser->auto_is_implicit_function_template_parm_p = false; parser->fully_implicit_function_template_p = false; @@ -8100,6 +8103,7 @@ cp_parser_postfix_open_square_expression (cp_parser *parser, releasing_vec express
[PATCH v3 07/11] OpenMP/OpenACC: Reindent TO/FROM/_CACHE_ stanza in {c_}finish_omp_clause
This patch trivially adds braces and reindents the OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza in c_finish_omp_clause and finish_omp_clause, in preparation for the following patch (to clarify the diff a little). 2022-09-13 Julian Brown gcc/c/ * c-typeck.cc (c_finish_omp_clauses): Add braces and reindent OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza. gcc/cp/ * semantics.cc (finish_omp_clause): Add braces and reindent OMP_CLAUSE_TO/OMP_CLAUSE_FROM/OMP_CLAUSE__CACHE_ stanza. --- gcc/c/c-typeck.cc | 615 +- gcc/cp/semantics.cc | 778 ++-- 2 files changed, 702 insertions(+), 691 deletions(-) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index 7da8d70b3bd..5ebeeb8d956 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -14999,321 +14999,326 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) case OMP_CLAUSE_TO: case OMP_CLAUSE_FROM: case OMP_CLAUSE__CACHE_: - t = OMP_CLAUSE_DECL (c); - if (TREE_CODE (t) == TREE_LIST) - { - grp_start_p = pc; - grp_sentinel = OMP_CLAUSE_CHAIN (c); + { + t = OMP_CLAUSE_DECL (c); + if (TREE_CODE (t) == TREE_LIST) + { + grp_start_p = pc; + grp_sentinel = OMP_CLAUSE_CHAIN (c); - if (handle_omp_array_sections (c, ort)) - remove = true; - else - { - t = OMP_CLAUSE_DECL (c); - if (!omp_mappable_type (TREE_TYPE (t))) - { - error_at (OMP_CLAUSE_LOCATION (c), - "array section does not have mappable type " - "in %qs clause", - omp_clause_code_name[OMP_CLAUSE_CODE (c)]); - remove = true; - } - else if (TYPE_ATOMIC (TREE_TYPE (t))) - { - error_at (OMP_CLAUSE_LOCATION (c), - "%<_Atomic%> %qE in %qs clause", t, - omp_clause_code_name[OMP_CLAUSE_CODE (c)]); - remove = true; - } - while (TREE_CODE (t) == ARRAY_REF) - t = TREE_OPERAND (t, 0); - if (TREE_CODE (t) == COMPONENT_REF - && TREE_CODE (TREE_TYPE (t)) == ARRAY_TYPE) - { - do - { - t = TREE_OPERAND (t, 0); - if (TREE_CODE (t) == MEM_REF - || TREE_CODE (t) == INDIRECT_REF) - { - t = TREE_OPERAND (t, 0); - STRIP_NOPS (t); - if (TREE_CODE (t) == POINTER_PLUS_EXPR) - t = TREE_OPERAND (t, 0); - } - } - while (TREE_CODE (t) == COMPONENT_REF -|| TREE_CODE (t) == ARRAY_REF); - - if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_IMPLICIT (c) - && (bitmap_bit_p (&map_head, DECL_UID (t)) - || bitmap_bit_p (&map_field_head, DECL_UID (t)) - || bitmap_bit_p (&map_firstprivate_head, - DECL_UID (t - { - remove = true; - break; - } - if (bitmap_bit_p (&map_field_head, DECL_UID (t))) - break; - if (bitmap_bit_p (&map_head, DECL_UID (t))) - { - if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) - error_at (OMP_CLAUSE_LOCATION (c), - "%qD appears more than once in motion " - "clauses", t); - else if (ort == C_ORT_ACC) - error_at (OMP_CLAUSE_LOCATION (c), - "%qD appears more than once in data " - "clauses", t); - else - error_at (OMP_CLAUSE_LOCATION (c), - "%qD appears more than once in map " - "clauses", t); - remove = true; - } - else - { - bitmap_set_bit (&map_head, DECL_UID (t)); - bitmap_set_bit (&map_field_head, DECL_UID (t)); - } - } -
[PATCH v3 05/11] OpenMP: push attaches to end of clause list in "target" regions
This patch moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes to the end of the clause list, for offload regions. This ensures that when we do the attach operation, both the "attachment point" and the target region have both already been mapped on the target. This avoids a pathological case that can otherwise happen with struct sibling-list handling. 2022-09-13 Julian Brown gcc/ * gimplify.cc (omp_segregate_mapping_groups): Update comment. (omp_push_attaches_to_end): New function. (gimplify_scan_omp_clauses): Use omp_push_attaches_to_end for offloaded regions. --- gcc/gimplify.cc | 66 - gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 2 +- 2 files changed, 65 insertions(+), 3 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index c7998c2ccbd..bc7848843b3 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -9641,7 +9641,9 @@ omp_tsort_mapping_groups (vec *groups, /* Split INLIST into two parts, moving groups corresponding to ALLOC/RELEASE/DELETE mappings to one list, and other mappings to another. The former list is then appended to the latter. Each sub-list retains the - order of the original list. */ + order of the original list. + See also omp_push_attaches_to_end below -- we call that later after scanning + omp clauses. */ static omp_mapping_group * omp_segregate_mapping_groups (omp_mapping_group *inlist) @@ -9681,6 +9683,55 @@ omp_segregate_mapping_groups (omp_mapping_group *inlist) return tf_groups; } +/* This function moves GOMP_MAP_ATTACH{_ZERO_LENGTH_ARRAY_SECTION} nodes to the + end of the clause list, for offload regions. This ensures that when we do + the attach, both the "attachment point" and the target region have both + already been mapped on the target. This avoids a pathological case that can + otherwise happen with struct sibling-list handling. + + Do not call this for non-offload regions, e.g. for "enter data" or + "exit data" directives. + + The order of attach nodes and of non-attach nodes is otherwise retained. */ + +static tree +omp_push_attaches_to_end (tree list) +{ + tree nonattach_list = NULL_TREE, attach_list = NULL_TREE; + tree *nonattach_tail = &nonattach_list, *attach_tail = &attach_list; + + for (tree w = list; w;) +{ + tree next = OMP_CLAUSE_CHAIN (w); + + if (OMP_CLAUSE_CODE (w) != OMP_CLAUSE_MAP) + goto nonattach; + + switch (OMP_CLAUSE_MAP_KIND (w)) + { + case GOMP_MAP_ATTACH: + case GOMP_MAP_ATTACH_ZERO_LENGTH_ARRAY_SECTION: + *attach_tail = w; + OMP_CLAUSE_CHAIN (w) = NULL_TREE; + attach_tail = &OMP_CLAUSE_CHAIN (w); + break; + + default: + nonattach: + *nonattach_tail = w; + OMP_CLAUSE_CHAIN (w) = NULL_TREE; + nonattach_tail = &OMP_CLAUSE_CHAIN (w); + } + + w = next; +} + + /* Splice lists together. */ + *nonattach_tail = attach_list; + + return nonattach_list; +} + /* Given a list LIST_P containing groups of mappings given by GROUPS, reorder those groups based on the output list of omp_tsort_mapping_groups -- singly-linked, threaded through each element's NEXT pointer starting at @@ -11950,7 +12001,18 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, list_p = &OMP_CLAUSE_CHAIN (c); } - ctx->clauses = *orig_list_p; + if ((region_type & ORT_TARGET) != 0) +/* If we have a target region, we can push all the attaches to the end of + the list (we may have standalone "attach" operations synthesized for + GOMP_MAP_STRUCT nodes that must be processed after the attachment point + AND the pointed-to block have been mapped). */ +ctx->clauses = omp_push_attaches_to_end (*orig_list_p); + else +/* ...but if we have something else, e.g. "enter data", we need to keep + "attach" nodes together with the previous node they attach to so that + separate "exit data" operations work properly (see libgomp/target.c). */ +ctx->clauses = *orig_list_p; + gimplify_omp_ctxp = ctx; } diff --git a/gcc/testsuite/g++.dg/gomp/target-lambda-1.C b/gcc/testsuite/g++.dg/gomp/target-lambda-1.C index bff7fa7c669..5ce8ceadb19 100644 --- a/gcc/testsuite/g++.dg/gomp/target-lambda-1.C +++ b/gcc/testsuite/g++.dg/gomp/target-lambda-1.C @@ -87,7 +87,7 @@ int main (void) return 0; } -/* { dg-final { scan-tree-dump {#pragma omp target num_teams.* firstprivate\(b\) map\(alloc:MEM.* \[len: 0\]\) map\(firstprivate:iptr \[pointer assign, bias: 0\]\) map\(alloc:MEM.* \[len: 0\]\) map\(firstprivate:this \[pointer assign, bias: 0\]\) map\(to:\*__closure \[len: [0-9]+\]\) map\(firstprivate:__closure \[pointer assign, bias: 0\]\) map\(tofrom:\*_[0-9]+ \[len: [0-9]+\]\) map\(always_pointer:__closure->__this \[pointer assign, bias: 0\]\) map\(from:mapped \[len: [0-9]+\]\) map\(alloc:\*_[0-9]+ \[len: 0\]\) map\(attach_zero_length_array_se
[PATCH v3 04/11] OpenMP/OpenACC: mapping group list-handling improvements
This patch adjusts OpenMP/OpenACC clause list handling in a couple of places, in preparation for the following mapping-clause expansion rework patch. Firstly mapping groups are removed as a whole in the C and C++ front-ends when an error is detected, which avoids leaving "badly-formed" mapping clause groups in the list. Secondly, reindexing of the omp_mapping_group hashmap (during omp_build_struct_sibling_lists) has been reimplemented, fixing some tricky corner-cases where mapping groups are removed from a list at the same time as it is being reordered. Thirdly, code to check if a different clause on the same directive maps the whole of a struct that we have a component mapping for (for example) has been outlined, removing a bit of code duplication. (These changes could be split into different patches if necessary.) 2022-09-13 Julian Brown gcc/ * gimplify.cc (omp_group_last): Allow GOMP_MAP_ATTACH_DETACH after GOMP_MAP_STRUCT (for reindexing). (omp_gather_mapping_groups): Reimplement using... (omp_gather_mapping_groups_1): This new function. Stop processing at GATHER_SENTINEL. (omp_group_base): Allow GOMP_MAP_TO_PSET without any following node. (omp_index_mapping_groups): Reimplement using... (omp_index_mapping_groups_1): This new function. Handle REINDEX_SENTINEL. (omp_reindex_mapping_groups, omp_mapped_by_containing_struct): New functions. (omp_tsort_mapping_groups_1): Adjust handling of base group being the same as current group. Use omp_mapped_by_containing_struct. (omp_build_struct_sibling_lists): Use omp_mapped_by_containing_struct and omp_reindex_mapping_groups. Robustify group deletion for reordered lists. (gimplify_scan_omp_clauses): Update calls to omp_build_struct_sibling_lists. gcc/c/ * c-typeck.cc (c_finish_omp_clauses): Remove whole mapping node group on error. gcc/cp/ * semantics.cc (finish_omp_clauses): Likewise. --- gcc/c/c-typeck.cc | 24 - gcc/cp/semantics.cc | 26 - gcc/gimplify.cc | 227 +++- 3 files changed, 209 insertions(+), 68 deletions(-) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index ee891ee33c2..7da8d70b3bd 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -14229,12 +14229,19 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) break; } + tree *grp_start_p = NULL, grp_sentinel = NULL_TREE; + for (pc = &clauses, c = clauses; c ; c = *pc) { bool remove = false; bool need_complete = false; bool need_implicitly_determined = false; + /* We've reached the end of a list of expanded nodes. Reset the group +start pointer. */ + if (c == grp_sentinel) + grp_start_p = NULL; + switch (OMP_CLAUSE_CODE (c)) { case OMP_CLAUSE_SHARED: @@ -14995,6 +15002,9 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) t = OMP_CLAUSE_DECL (c); if (TREE_CODE (t) == TREE_LIST) { + grp_start_p = pc; + grp_sentinel = OMP_CLAUSE_CHAIN (c); + if (handle_omp_array_sections (c, ort)) remove = true; else @@ -15638,7 +15648,19 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } if (remove) - *pc = OMP_CLAUSE_CHAIN (c); + { + if (grp_start_p) + { + /* If we found a clause to remove, we want to remove the whole +expanded group, otherwise gimplify +(omp_resolve_clause_dependencies) can get confused. */ + *grp_start_p = grp_sentinel; + pc = grp_start_p; + grp_start_p = NULL; + } + else + *pc = OMP_CLAUSE_CHAIN (c); + } else pc = &OMP_CLAUSE_CHAIN (c); } diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc index ae7c8ea7b1f..7302d21fc54 100644 --- a/gcc/cp/semantics.cc +++ b/gcc/cp/semantics.cc @@ -6755,11 +6755,18 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) break; } + tree *grp_start_p = NULL, grp_sentinel = NULL_TREE; + for (pc = &clauses, c = clauses; c ; c = *pc) { bool remove = false; bool field_ok = false; + /* We've reached the end of a list of expanded nodes. Reset the group +start pointer. */ + if (c == grp_sentinel) + grp_start_p = NULL; + switch (OMP_CLAUSE_CODE (c)) { case OMP_CLAUSE_SHARED: @@ -7985,6 +7992,9 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) t = OMP_CLAUSE_DECL (c); if (TREE_CODE (t) == TREE_LIST) { + grp_start_p = pc; + grp_sentinel = OMP_CLAUSE_CHAIN (c); + if (handle_omp_array_sections (c, ort)) re
[PATCH v3 03/11] OpenMP/OpenACC struct sibling list gimplification extension and rework
This patch refactors struct sibling-list processing in gimplify.cc, and adjusts some related mapping-clause processing in the Fortran FE and omp-low.cc accordingly. This version addresses some previous review comments, mostly with regards to the naming of local helper functions. (Previous version was approved already, pending those changes.) 2022-09-13 Julian Brown gcc/fortran/ * trans-openmp.cc (gfc_trans_omp_clauses): Don't create GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER mappings for POINTER_TYPE_P decls. gcc/ * gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS. (GOMP_FIRSTPRIVATE_IMPLICIT): Renumber. (insert_struct_comp_map): Refactor function into... (build_struct_comp_nodes): This new function. Remove list handling and improve self-documentation. (extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters. Move code to strip outer parts of address out of function, but strip no-op conversions. (omp_mapping_group): Add DELETED field for use during reindexing. (omp_strip_components_and_deref, omp_strip_indirections): New functions. (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling. (omp_gather_mapping_groups): Initialise DELETED field for new groups. (omp_index_mapping_groups): Notice DELETED groups when (re)indexing. (omp_siblist_insert_node_after, omp_siblist_move_node_after, omp_siblist_move_nodes_after, omp_siblist_move_concat_nodes_after): New helper functions. (omp_accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT node groups for sibling lists. Outlined from gimplify_scan_omp_clauses. (omp_build_struct_sibling_lists): New function. (gimplify_scan_omp_clauses): Remove struct_map_to_clause, struct_seen_clause, struct_deref_set. Call omp_build_struct_sibling_lists as pre-pass instead of handling sibling lists in the function's main processing loop. (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS handling, unused now. * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect struct references, and references to pointers to structs also. gcc/testsuite/ * g++.dg/goacc/member-array-acc.C: New test. * g++.dg/gomp/member-array-omp.C: New test. * g++.dg/gomp/target-3.C: Update expected output. * g++.dg/gomp/target-lambda-1.C: Likewise. * g++.dg/gomp/target-this-2.C: Likewise. * c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here. * c-c++-common/gomp/target-50.c: New test. libgomp/ * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test. * testsuite/libgomp.oacc-c++/deep-copy-17.C: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move test to here, make "run" test. --- gcc/fortran/trans-openmp.cc | 20 +- gcc/gimplify.cc | 1519 ++--- gcc/omp-low.cc| 16 +- gcc/testsuite/c-c++-common/gomp/target-50.c | 23 + gcc/testsuite/g++.dg/goacc/member-array-acc.C | 13 + gcc/testsuite/g++.dg/gomp/member-array-omp.C | 13 + gcc/testsuite/g++.dg/gomp/target-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C |3 +- gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +- .../testsuite/libgomp.oacc-c++/deep-copy-17.C | 101 ++ .../libgomp.oacc-c-c++-common/deep-copy-15.c | 68 + .../libgomp.oacc-c-c++-common/deep-copy-16.c | 231 +++ .../deep-copy-arrayofstruct.c |2 +- 13 files changed, 1362 insertions(+), 653 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/gomp/target-50.c create mode 100644 gcc/testsuite/g++.dg/goacc/member-array-acc.C create mode 100644 gcc/testsuite/g++.dg/gomp/member-array-omp.C create mode 100644 libgomp/testsuite/libgomp.oacc-c++/deep-copy-17.C create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c rename {gcc/testsuite/c-c++-common/goacc => libgomp/testsuite/libgomp.oacc-c-c++-common}/deep-copy-arrayofstruct.c (98%) diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index de27ed52c02..4949f001fed 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3117,30 +3117,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, tree present = gfc_omp_check_optional_argument (decl, true); if (openacc && n->sym->ts.type == BT_CLASS) { - tree type = TREE_TYPE (decl); if (n->sym->attr.optional) sorry ("opt
[PATCH v3 01/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)
This patch reimplements the omp_target_reorder_clauses function in anticipation of supporting "deeper" struct mappings (that is, with several structure dereference operators, or similar). The idea is that in place of the (possibly quadratic) algorithm in omp_target_reorder_clauses that greedily moves clauses containing addresses that are subexpressions of other addresses before those other addresses, we employ a topological sort algorithm to calculate a proper order for map clauses. This should run in linear time, and hopefully handles degenerate cases where multiple "levels" of indirect accesses are present on a given directive. The new method also takes care to keep clause groups together, addressing the concerns raised in: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570501.html To figure out if some given clause depends on a base pointer in another clause, we strip off the outer layers of the address expression, and check (via a tree_operand_hash hash table we have built) if the result is a "base pointer" as defined in OpenMP 5.0 (1.2.6 Data Terminology). There are some subtleties involved, however: - We must treat MEM_REF with zero offset the same as INDIRECT_REF. This should probably be fixed in the front ends instead so we always use a canonical form (probably INDIRECT_REF). The following patch shows one instance of the problem, but there may be others: https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571382.html - Mapping a whole struct implies mapping each of that struct's elements, which may be base pointers. Because those base pointers aren't necessarily explicitly referenced in the directive in question, we treat the whole-struct mapping as a dependency instead. This version of the patch has been rebased and incorporates several previous review comments: one thing *not* addressed here is the following, which is addressed by the C++ "declare mapper" patch: On Tue, 24 May 2022 15:03:07 +0200 Jakub Jelinek wrote: > I think big question is if we do want to do this map clause reordering > before processing the omp target etc. clauses, or after (during > gimplify_adjust_omp_clauses, when clauses from the implicit mappings > are added too and especially with the declare mapper expansions), > or both before and after. A WIP version of that patch demonstrating a new approach to clause gimplification is included at the end of this series. 2022-09-13 Julian Brown gcc/ * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete functions. (omp_tsort_mark): Add enum. (omp_mapping_group): Add struct. (debug_mapping_group, omp_get_base_pointer, omp_get_attachment, omp_group_last, omp_gather_mapping_groups, omp_group_base, omp_index_mapping_groups, omp_containing_struct, omp_tsort_mapping_groups_1, omp_tsort_mapping_groups, omp_segregate_mapping_groups, omp_reorder_mapping_groups): New functions. (gimplify_scan_omp_clauses): Call above functions instead of omp_target_reorder_clauses, unless we've seen an error. * omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't sorted mapping groups. gcc/testsuite/ * g++.dg/gomp/target-lambda-1.C: Adjust expected output. * g++.dg/gomp/target-this-3.C: Likewise. * g++.dg/gomp/target-this-4.C: Likewise. --- gcc/gimplify.cc | 766 +++- gcc/omp-low.cc | 7 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C | 7 +- gcc/testsuite/g++.dg/gomp/target-this-3.C | 4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C | 4 +- 5 files changed, 774 insertions(+), 14 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index b828ddbfe6c..f3cd0df663e 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8948,6 +8948,7 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } +#if 0 /* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ static bool @@ -9150,6 +9151,743 @@ omp_target_reorder_clauses (tree *list_p) } } } +#endif + +/* Used for topological sorting of mapping groups. UNVISITED means we haven't + started processing the group yet. The TEMPORARY mark is used when we first + encounter a group on a depth-first traversal, and the PERMANENT mark is used + when we have processed all the group's children (i.e. all the base pointers + referred to by the group's mapping nodes, recursively). */ + +enum omp_tsort_mark { + UNVISITED, + TEMPORARY, + PERMANENT +}; + +/* A group of OMP_CLAUSE_MAP nodes that correspond to a single "map" + clause. */ + +struct omp_mapping_group { + tree *grp_start; + tree grp_end; + omp_tsort_mark mark; + struct omp_mapping_group *sibling; + struct omp_mapping_group *next; +}; + +DEBUG_FUNCTION void +debug_mapping_group (omp_mapping_group *grp) +{ + tree tmp = OMP_CLAUSE
[PATCH v3 02/11] Remove omp_target_reorder_clauses
This patch has been split out from the previous one to avoid a confusingly-interleaved diff. The two patches will be committed squashed together. 2022-09-13 Julian Brown gcc/ * gimplify.c (omp_target_reorder_clauses): Delete. --- gcc/gimplify.cc | 205 1 file changed, 205 deletions(-) diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc index f3cd0df663e..2ba50d7d3ca 100644 --- a/gcc/gimplify.cc +++ b/gcc/gimplify.cc @@ -8948,211 +8948,6 @@ extract_base_bit_offset (tree base, tree *base_ref, poly_int64 *bitposp, return base; } -#if 0 -/* Returns true if EXPR is or contains (as a sub-component) BASE_PTR. */ - -static bool -is_or_contains_p (tree expr, tree base_ptr) -{ - if ((TREE_CODE (expr) == INDIRECT_REF && TREE_CODE (base_ptr) == MEM_REF) - || (TREE_CODE (expr) == MEM_REF && TREE_CODE (base_ptr) == INDIRECT_REF)) -return operand_equal_p (TREE_OPERAND (expr, 0), - TREE_OPERAND (base_ptr, 0)); - while (!operand_equal_p (expr, base_ptr)) -{ - if (TREE_CODE (base_ptr) == COMPOUND_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 1); - if (TREE_CODE (base_ptr) == COMPONENT_REF - || TREE_CODE (base_ptr) == POINTER_PLUS_EXPR - || TREE_CODE (base_ptr) == SAVE_EXPR) - base_ptr = TREE_OPERAND (base_ptr, 0); - else - break; -} - return operand_equal_p (expr, base_ptr); -} - -/* Implement OpenMP 5.x map ordering rules for target directives. There are - several rules, and with some level of ambiguity, hopefully we can at least - collect the complexity here in one place. */ - -static void -omp_target_reorder_clauses (tree *list_p) -{ - /* Collect refs to alloc/release/delete maps. */ - auto_vec ard; - tree *cp = list_p; - while (*cp != NULL_TREE) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALLOC - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_RELEASE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_DELETE)) - { - /* Unlink cp and push to ard. */ - tree c = *cp; - tree nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - - /* Any associated pointer type maps should also move along. */ - while (*cp != NULL_TREE - && OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_REFERENCE - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_FIRSTPRIVATE_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ATTACH_DETACH - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_ALWAYS_POINTER - || OMP_CLAUSE_MAP_KIND (*cp) == GOMP_MAP_TO_PSET)) - { - c = *cp; - nc = OMP_CLAUSE_CHAIN (c); - *cp = nc; - ard.safe_push (c); - } - } -else - cp = &OMP_CLAUSE_CHAIN (*cp); - - /* Link alloc/release/delete maps to the end of list. */ - for (unsigned int i = 0; i < ard.length (); i++) -{ - *cp = ard[i]; - cp = &OMP_CLAUSE_CHAIN (ard[i]); -} - *cp = NULL_TREE; - - /* OpenMP 5.0 requires that pointer variables are mapped before - its use as a base-pointer. */ - auto_vec atf; - for (tree *cp = list_p; *cp; cp = &OMP_CLAUSE_CHAIN (*cp)) -if (OMP_CLAUSE_CODE (*cp) == OMP_CLAUSE_MAP) - { - /* Collect alloc, to, from, to/from clause tree pointers. */ - gomp_map_kind k = OMP_CLAUSE_MAP_KIND (*cp); - if (k == GOMP_MAP_ALLOC - || k == GOMP_MAP_TO - || k == GOMP_MAP_FROM - || k == GOMP_MAP_TOFROM - || k == GOMP_MAP_ALWAYS_TO - || k == GOMP_MAP_ALWAYS_FROM - || k == GOMP_MAP_ALWAYS_TOFROM) - atf.safe_push (cp); - } - - for (unsigned int i = 0; i < atf.length (); i++) -if (atf[i]) - { - tree *cp = atf[i]; - tree decl = OMP_CLAUSE_DECL (*cp); - if (TREE_CODE (decl) == INDIRECT_REF || TREE_CODE (decl) == MEM_REF) - { - tree base_ptr = TREE_OPERAND (decl, 0); - STRIP_TYPE_NOPS (base_ptr); - for (unsigned int j = i + 1; j < atf.length (); j++) - if (atf[j]) - { - tree *cp2 = atf[j]; - tree decl2 = OMP_CLAUSE_DECL (*cp2); - - decl2 = OMP_CLAUSE_DECL (*cp2); - if (is_or_contains_p (decl2, base_ptr)) - { - /* Move *cp2 to before *cp. */ - tree c = *cp2; - *cp2 = OMP_CLAUSE_CHAIN (c); - OMP_CLAUSE_CHAIN (c) = *cp; - *cp = c; - - if (*cp2 != NULL_TREE - && OMP_CLAUSE_CODE (*cp2) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (*cp2) == GOMP_MAP_ALWAYS_POINTER) -
[PATCH v3 00/11] OpenMP 5.0: Struct & mapping clause expansion rework
This is a new version of the first few patches of the series (up to "Handle reference-typed struct members"): https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591973.html Relative to the previously-posted series, this version addresses several review comments, but more significantly it largely reworks how OpenMP/OpenACC mapping clause expansion works in the C and C++ front ends and in gimplify.cc, following discovery of some quite serious shortcomings in the existing & previous support for same. (For the new "baseptrs-4.C" test, approximately 135 out of 274 tests failed.) Revisions/rework have been completed up to the 8th patch in the series. The remaining patches address some previous review comments and are included to support the direction of the first eight patches, but are not ready for full re-review yet. Individual patches bootstrapped (except the "trivial" ones) and regression tested with offloading to NVPTX (up to "OpenMP/OpenACC: Rework clause expansion and nested struct handling"). Further commentary on individual patches. Julian Brown (11): OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer) Remove omp_target_reorder_clauses OpenMP/OpenACC struct sibling list gimplification extension and rework OpenMP/OpenACC: mapping group list-handling improvements OpenMP: push attaches to end of clause list in "target" regions OpenMP: Pointers and member mappings OpenMP/OpenACC: Reindent TO/FROM/_CACHE_ stanza in {c_}finish_omp_clause OpenMP/OpenACC: Rework clause expansion and nested struct handling FYI/unfinished: OpenMP: lvalue parsing for map clauses (C++) Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE FYI/unfinished: OpenMP 5.0 "declare mapper" support for C++ gcc/c-family/c-common.h | 68 + gcc/c-family/c-omp.cc | 780 +++- gcc/c/c-typeck.cc | 750 ++-- gcc/cp/cp-gimplify.cc |6 + gcc/cp/cp-objcp-common.h |2 + gcc/cp/cp-tree.h | 10 + gcc/cp/decl.cc| 18 +- gcc/cp/error.cc |9 + gcc/cp/mangle.cc |5 +- gcc/cp/name-lookup.cc |3 +- gcc/cp/parser.cc | 543 ++- gcc/cp/parser.h |3 + gcc/cp/pt.cc | 144 +- gcc/cp/semantics.cc | 1570 --- gcc/fortran/parse.cc |3 + gcc/fortran/trans-openmp.cc | 160 +- gcc/gimplify.cc | 3702 + gcc/langhooks-def.h |3 + gcc/langhooks.cc |9 + gcc/langhooks.h |4 + gcc/omp-general.cc| 426 ++ gcc/omp-general.h | 109 + gcc/omp-low.cc| 26 +- gcc/testsuite/c-c++-common/goacc/mdc-2.c |2 + gcc/testsuite/c-c++-common/gomp/clauses-2.c |2 +- gcc/testsuite/c-c++-common/gomp/map-6.c | 10 +- gcc/testsuite/c-c++-common/gomp/target-50.c | 23 + .../c-c++-common/gomp/target-implicit-map-2.c |2 +- gcc/testsuite/g++.dg/goacc/mdc.C |2 + gcc/testsuite/g++.dg/goacc/member-array-acc.C | 13 + gcc/testsuite/g++.dg/gomp/declare-mapper-1.C | 58 + gcc/testsuite/g++.dg/gomp/declare-mapper-2.C | 30 + gcc/testsuite/g++.dg/gomp/declare-mapper-3.C | 27 + gcc/testsuite/g++.dg/gomp/declare-mapper-4.C | 74 + gcc/testsuite/g++.dg/gomp/ind-base-3.C| 37 + gcc/testsuite/g++.dg/gomp/map-assignment-1.C | 12 + gcc/testsuite/g++.dg/gomp/map-inc-1.C | 10 + gcc/testsuite/g++.dg/gomp/map-lvalue-ref-1.C | 19 + gcc/testsuite/g++.dg/gomp/map-ptrmem-1.C | 37 + gcc/testsuite/g++.dg/gomp/map-ptrmem-2.C | 40 + .../g++.dg/gomp/map-static-cast-lvalue-1.C| 17 + gcc/testsuite/g++.dg/gomp/map-ternary-1.C | 20 + gcc/testsuite/g++.dg/gomp/member-array-2.C| 91 + gcc/testsuite/g++.dg/gomp/member-array-omp.C | 13 + gcc/testsuite/g++.dg/gomp/pr67522.C |2 +- .../g++.dg/gomp/static-component-1.C | 23 + gcc/testsuite/g++.dg/gomp/target-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-lambda-1.C |6 +- gcc/testsuite/g++.dg/gomp/target-this-2.C |2 +- gcc/testsuite/g++.dg/gomp/target-this-3.C |4 +- gcc/testsuite/g++.dg/gomp/target-this-4.C |4 +- gcc/testsuite/gcc.dg/gomp/target-3.c |2 +- gcc/tree-core.h |4 + gcc/tree-pretty-print.cc | 56 + gcc/tree.cc |2 + gcc/tree.def | 10 + gcc/tree.h
Re: libgo patch committed: Make runtime.Version return a useful value
On Tue, Jun 28, 2022 at 10:20 AM Ian Lance Taylor wrote: > > This libgo patch makes runtime.Version return a meaningful string. > This also means that "go version" will print something useful, e.g., > > go version go1.18 gccgo (GCC) 12.0.1 20220216 (experimental) linux/amd64 > > This fixes https://go.dev/issue/51850. > > Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu. Committed > to mainline. I've committed this to the GCC 12 branch, for PR 106747. Ian
[PATCH] Optimize (X<
This patch tweaks the match.pd transformation previously added to fold (X< gcc/ChangeLog * match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the expression (X
[PATCH] PR tree-optimization/71343: Value number X<<2 as X*4.
This patch is the second part of a fix for PR tree-optimization/71343, that implements Richard Biener's suggestion of using tree-ssa's value numbering instead of match.pd. The change is that when assigning a value number for the expression X< gcc/ChangeLog PR tree-optimization/71343 * tree-ssa-sccvn.cc (visit_nary_op) : Make the value number of the expression X << C the same as the value number for the multiplication X * (1< -Original Message- > From: Richard Biener > Sent: 08 August 2022 12:42 > To: Roger Sayle > Cc: GCC Patches > Subject: Re: [PATCH] PR tree-optimization/71343: Optimize (X< (X&Y)< > On Mon, Aug 8, 2022 at 10:07 AM Roger Sayle > wrote: > > > > This patch resolves PR tree-optimization/71343, a missed-optimization > > enhancement request where GCC fails to see that (a<<2)+(b<<2) == a*4+b*4. > > This requires two related (sets of) optimizations to be added to match.pd. > > > > The first is that (X< > for many binary operators, including AND, IOR, XOR, and (if overflow > > isn't an issue) PLUS and MINUS. Likewise, the right shifts (both > > logical and arithmetic) and bit-wise logical operators can be > > simplified in a similar fashion. These all reduce the number of > > GIMPLE binary operations from 3 to 2, by combining/eliminating a shift > operation. > > > > The second optimization reflects that the middle-end doesn't impose a > > canonical form on multiplications by powers of two, vs. left shifts, > > instead leaving these operations as specified by the programmer unless > > there's a good reason to change them. Hence, GIMPLE code may contain > > the expressions "X * 8" and "X << 3" even though these represent the > > same value/computation. The tweak to match.pd is that comparison > > operations whose operands are equivalent non-canonical expressions can > > be taught their equivalence. Hence "(X * 8) == (X << 3)" will always > > evaluate to true, and "(X<<2) > 4*X" will always evaluate to false. > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check, both with and without --target_board=unix{-m32}, > > with no new failures. Ok for mainline? > > +/* Shifts by constants distribute over several binary operations, > + hence (X << C) + (Y << C) can be simplified to (X + Y) << C. */ > +(for op (plus minus) > + (simplify > +(op (lshift:s @0 INTEGER_CST@1) (lshift:s @2 INTEGER_CST@1)) > +(if (INTEGRAL_TYPE_P (type) > +&& TYPE_OVERFLOW_WRAPS (type) > +&& !TYPE_SATURATING (type) > +&& tree_fits_shwi_p (@1) > +&& tree_to_shwi (@1) > 0 > +&& tree_to_shwi (@1) < TYPE_PRECISION (type)) > > I do wonder why we need to restrict this to shifts by constants? > Any out-of-bound shift was already there, no? > > +/* Some tree expressions are intentionally non-canonical. > + We handle the comparison of the equivalent forms here. */ (for cmp > +(eq le ge) > + (simplify > +(cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2)) > +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > +&& tree_fits_shwi_p (@1) > +&& tree_to_shwi (@1) > 0 > +&& tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0)) > +&& wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2))) > + { constant_boolean_node (true, type); }))) > + > +(for cmp (ne lt gt) > + (simplify > +(cmp:c (lshift @0 INTEGER_CST@1) (mult @0 integer_pow2p@2)) > +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > +&& tree_fits_shwi_p (@1) > +&& tree_to_shwi (@1) > 0 > +&& tree_to_shwi (@1) < TYPE_PRECISION (TREE_TYPE (@0)) > +&& wi::to_wide (@1) == wi::exact_log2 (wi::to_wide (@2))) > + { constant_boolean_node (false, type); }))) > > hmm. I wonder if it makes more sense to handle this in value-numbering. > tree-ssa-sccvn.cc:visit_nary_op handles some cases that are not exactly > canonicalization issues but the shift vs mult could be handled there by just > performing the alternate lookup. That would also enable CSE and by means of > that of course the comparisons you do above. > > Thanks, > Richard. > > > > > 2022-08-08 Roger Sayle > > > > gcc/ChangeLog > > PR tree-optimization/71343 > > * match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the > > expression (X< > (op (rshift @0 @1) (rshift @2 @1)): Likwise, simplify (X>>C)^(Y>>C) > > to (X^Y)>>C for binary logical operators, AND, IOR and XOR. > > (cmp:c (lshift @0) (mult @1)): Optimize comparisons between > > shifts by integer constants and multiplications by powers of 2. > > > > gcc/testsuite/ChangeLog > > PR tree-optimization/71343 > > * gcc.dg/pr71343-1.c: New test case. > > * gcc.dg/pr71343-2.c: Likewise. > > > > > > Thanks in advance, > > Roger > > -- diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc index 74b8d8d..266 100644 --- a/gcc/tree-ssa-sccvn.cc +++ b/gcc/tree-ssa-sccvn.cc @@
Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]
On Mon, 12 Sep 2022, Jakub Jelinek via Gcc-patches wrote: > > We don't claim in glibc to support old snapshots from master, so checking > > for __GNUC_PREREQ (13, 0) and failing for such older GCC 13 versions is > > fine. > > Ok, makes sense, especially if it is applied on the glibc side a few months > after it is changed on the GCC side. If it is applied immediately, there Leaving the build of glibc C++ tests broken for months with mainline GCC doesn't seem like a good idea; it's a recipe for missing other regressions. I think a fix should be applied to glibc immediately after the GCC change. -- Joseph S. Myers jos...@codesourcery.com
Re: libgo patch committed: Ignore __morestack in runtime.Callers
On Tue, Sep 6, 2022 at 6:40 PM Ian Lance Taylor wrote: > > This libgo patch ignores the __morestack function in runtime.Callers. > We were ignoring all functions starting with "__morestack_", but not > the function "__morestack" itself. Without this change, some tests > such as recover.go started failing recently, though I'm not sure > exactly what changed. Bootstrapped and ran Go testsuite on > x86_64-pc-linux-gnu. Committed to mainline. I've also committed this patch to GCC 12 branch. Ian
[committed] libgomp: Appease some static analyzers [PR106906]
Hi! While icv_addr[1] = false; assignments where icv_addr has void * element type is correct and matches how it is used (in those cases the void * pointer is then cast to bool and used that way), there is no reason not to add explicit (void *) casts there which are there already for (void *) true. And, there is in fact even no point in actually doing those stores at all because we set that pointer to NULL a few lines earlier. So, this patch adds the explicit casts and then comments those out to show intent. Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk. 2022-09-13 Jakub Jelinek PR libgomp/106906 * env.c (get_icv_member_addr): Cast false to void * before assigning it to icv_addr[1], and comment the whole assignment out. --- libgomp/env.c.jj2022-09-12 10:32:00.935086858 +0200 +++ libgomp/env.c 2022-09-12 13:27:22.893571697 +0200 @@ -1892,14 +1892,14 @@ get_icv_member_addr (struct gomp_initial { case GOMP_ICV_NTEAMS: icv_addr[0] = &icvs->nteams_var; - icv_addr[1] = false; + /* icv_addr[1] = (void *) false; */ break; case GOMP_ICV_DYNAMIC: icv_addr[0] = &(*icvs).dyn_var; break; case GOMP_ICV_TEAMS_THREAD_LIMIT: icv_addr[0] = &icvs->teams_thread_limit_var; - icv_addr[1] = false; + /* icv_addr[1] = (void *) false; */ break; case GOMP_ICV_SCHEDULE: icv_addr[0] = &icvs->run_sched_var; @@ -1907,7 +1907,7 @@ get_icv_member_addr (struct gomp_initial break; case GOMP_ICV_THREAD_LIMIT: icv_addr[0] = &icvs->thread_limit_var; - icv_addr[1] = false; + /* icv_addr[1] = (void *) false; */ icv_addr[2] = (void *) UINT_MAX; break; case GOMP_ICV_NTHREADS: Jakub
[PATCH] c++: Implement C++23 P1169R4 - static operator() [PR106651]
Hi! The following patch attempts to implement C++23 P1169R4 - static operator() paper's compiler side (there is some small library side too not implemented yet). This allows static members as user operator() declarations and static specifier on lambdas without lambda capture. As decl specifier parsing doesn't track about the presence and locations of all specifiers, the patch unfortunately replaces the diagnostics about duplicate mutable with diagnostics about conflicting specifiers because the information whether it was mutable mutable, mutable static, static mutable or static static is lost. Beyond this, the synthetized conversion operator changes for static lambdas as it can just return the operator() static method address, doesn't need to create a thunk for it. The change I'm least sure about is the call.cc (joust) change, one thing is that we ICEd because we assumed that len could be different only if both candidates are direct calls but it can be one direct and one indirect call, and then I'm trying to implement my understanding of the [over.match.best.general]/1 and [over.best.ics.general] changes from the paper (implemented both for C++23 and when the static member function is operator() which we accept with pedwarn in earlier standards too). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2022-09-13 Jakub Jelinek PR c++/106651 gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): Predefine __cpp_static_call_operator=202207L for C++23. gcc/cp/ * cp-tree.h (LAMBDA_EXPR_STATIC_P): Implement C++23 P1169R4 - static operator(). Define. * parser.cc (CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR): Document that it also allows static. (cp_parser_lambda_declarator_opt): Handle static lambda specifier. (cp_parser_decl_specifier_seq): Allow RID_STATIC for CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR. * decl.cc (grok_op_properties): If operator() isn't a method, use a different error wording, if it is static member function, allow it (for C++20 and older with a pedwarn unless it is a lambda function or template instantiation). * call.cc (joust): Don't ICE if one candidate is static member function and the other is an indirect call. For C++23 or if the static member is operator() and the parameter conversion on the other candidate is user defined conversion, ellipsis or bad conversion, make static member function candidate a winner for that parameter. * lambda.cc (maybe_add_lambda_conv_op): Handle static lambdas. * error.cc (dump_lambda_function): Print static for static lambdas. gcc/testsuite/ * g++.dg/template/error30.C: Adjust expected diagnostics. * g++.dg/cpp1z/constexpr-lambda13.C: Likewise. * g++.dg/cpp23/feat-cxx2b.C: Test __cpp_static_call_operator. * g++.dg/cpp23/static-operator-call1.C: New test. * g++.dg/cpp23/static-operator-call2.C: New test. * g++.old-deja/g++.jason/operator.C: Adjust expected diagnostics. --- gcc/cp/cp-tree.h.jj 2022-09-13 09:21:28.052541628 +0200 +++ gcc/cp/cp-tree.h2022-09-13 12:14:31.674733861 +0200 @@ -504,6 +504,7 @@ extern GTY(()) tree cp_global_trees[CPTI OVL_NESTED_P (in OVERLOAD) DECL_MODULE_EXPORT_P (in _DECL) PACK_EXPANSION_FORCE_EXTRA_ARGS_P (in *_PACK_EXPANSION) + LAMBDA_EXPR_STATIC_P (in LAMBDA_EXPR) 4: IDENTIFIER_MARKED (IDENTIFIER_NODEs) TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR, CALL_EXPR, or FIELD_DECL). @@ -1488,6 +1489,10 @@ enum cp_lambda_default_capture_mode_type #define LAMBDA_EXPR_CAPTURE_OPTIMIZED(NODE) \ TREE_LANG_FLAG_2 (LAMBDA_EXPR_CHECK (NODE)) +/* Predicate tracking whether the lambda was declared 'static'. */ +#define LAMBDA_EXPR_STATIC_P(NODE) \ + TREE_LANG_FLAG_3 (LAMBDA_EXPR_CHECK (NODE)) + /* True if this TREE_LIST in LAMBDA_EXPR_CAPTURE_LIST is for an explicit capture. */ #define LAMBDA_CAPTURE_EXPLICIT_P(NODE) \ --- gcc/cp/parser.cc.jj 2022-09-13 09:21:01.276920558 +0200 +++ gcc/cp/parser.cc2022-09-13 12:14:31.683733738 +0200 @@ -1994,7 +1994,7 @@ enum constexpr. */ CP_PARSER_FLAGS_ONLY_TYPE_OR_CONSTEXPR = 0x8, /* When parsing a decl-specifier-seq, only allow mutable, constexpr or - for C++20 consteval. */ + for C++20 consteval or for C++23 static. */ CP_PARSER_FLAGS_ONLY_MUTABLE_OR_CONSTEXPR = 0x10, /* When parsing a decl-specifier-seq, allow missing typename. */ CP_PARSER_FLAGS_TYPENAME_OPTIONAL = 0x20, @@ -11714,13 +11714,26 @@ cp_parser_lambda_declarator_opt (cp_pars omitted_parms_loc = UNKNOWN_LOCATION; } - if (lambda_specs.storage_class == sc_mutable) + if (lambda_specs.storage_class == sc_mutable + || lambda_specs.storage_class == sc_static) { - LAMBDA_EXPR_MUTABLE_P (lambda_expr) = 1; - quals = TYPE_UNQUALIFIED; + if (lambda_specs
[PATCH] LoongArch: Prepare static PIE support
Static PIE allows us to extend the ASLR to cover static executables and it's not too difficult to support it. On GCC side, we just pass a group of options to the linker, like other ports with static PIE support. The real implementation of static PIE (rcrt1.o) will be added into Glibc later. gcc/ChangeLog: * config/loongarch/gnu-user.h (GNU_USER_TARGET_LINK_SPEC): For -static-pie, pass -static -pie --no-dynamic-linker -z text to the linker, and do not pass --dynamic-linker. --- gcc/config/loongarch/gnu-user.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/gcc/config/loongarch/gnu-user.h b/gcc/config/loongarch/gnu-user.h index 664dc9206ad..c5b1afe530d 100644 --- a/gcc/config/loongarch/gnu-user.h +++ b/gcc/config/loongarch/gnu-user.h @@ -40,8 +40,10 @@ along with GCC; see the file COPYING3. If not see #undef GNU_USER_TARGET_LINK_SPEC #define GNU_USER_TARGET_LINK_SPEC \ "%{G*} %{shared} -m " GNU_USER_LINK_EMULATION \ - "%{!shared: %{static} %{!static: %{rdynamic:-export-dynamic} " \ - "-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}" + "%{!shared: %{static} " \ + "%{!static: %{!static-pie: %{rdynamic:-export-dynamic} " \ + "-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} " \ + "%{static-pie: -static -pie --no-dynamic-linker -z text}}" /* Similar to standard Linux, but adding -ffast-math support. */ -- 2.37.3
[committed] c++: remove single-parameter version of mark_used
Tested on x86_64-pc-linux-gnu, committed to trunk as obvious. gcc/cp/ChangeLog: * cp-tree.h (mark_used): Remove single-parameter overload. Add default argument to the two-parameter overload. * decl2.cc (mark_used): Likewise. --- gcc/cp/cp-tree.h | 4 ++-- gcc/cp/decl2.cc | 8 +--- 2 files changed, 3 insertions(+), 9 deletions(-) diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 1eb176d4a50..f19ecafc266 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -6945,8 +6945,8 @@ extern bool decl_constant_var_p (tree); extern bool decl_maybe_constant_var_p (tree); extern void no_linkage_error (tree); extern void check_default_args (tree); -extern bool mark_used (tree); -extern bool mark_used (tree, tsubst_flags_t); +extern bool mark_used (tree, +tsubst_flags_t = tf_warning_or_error); extern bool mark_single_function (tree, tsubst_flags_t); extern void finish_static_data_member_decl (tree, tree, bool, tree, int); extern tree cp_build_parm_decl (tree, tree, tree); diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc index 684a2d06dde..9f18466192f 100644 --- a/gcc/cp/decl2.cc +++ b/gcc/cp/decl2.cc @@ -5575,7 +5575,7 @@ mark_single_function (tree expr, tsubst_flags_t complain) wrong, true otherwise. */ bool -mark_used (tree decl, tsubst_flags_t complain) +mark_used (tree decl, tsubst_flags_t complain /* = tf_warning_or_error */) { /* If we're just testing conversions or resolving overloads, we don't want any permanent effects like forcing functions to be @@ -5817,12 +5817,6 @@ mark_used (tree decl, tsubst_flags_t complain) return true; } -bool -mark_used (tree decl) -{ - return mark_used (decl, tf_warning_or_error); -} - tree vtv_start_verification_constructor_init_function (void) { -- 2.37.3.542.gdd3f6c4cae
Re: [PATCH, libgomp] Fix chunk_size<1 for dynamic schedule
On Thu, Aug 04, 2022 at 09:17:09PM +0800, Chung-Lin Tang wrote: > On 2022/6/28 10:06 PM, Jakub Jelinek wrote: > > On Thu, Jun 23, 2022 at 11:47:59PM +0800, Chung-Lin Tang wrote: > > > with the way that chunk_size < 1 is handled for gomp_iter_dynamic_next: > > > > > > (1) chunk_size <= -1: wraps into large unsigned value, seems to work > > > though. > > > (2) chunk_size == 0: infinite loop > > > > > > The (2) behavior is obviously not desired. This patch fixes this by > > > changing > > > > Why? It is a user error, undefined behavior, we shouldn't slow down valid > > code for users who don't bother reading the standard. > > This is loop init code, not per-iteration. The overhead really isn't that > much. But still, we slow down valid code for the sake of garbage. That is a task for sanitizers, not production libraries. > > The question should be, if GCC having infinite loop behavior is reasonable, > even if it is undefined in the spec. Sure, it is perfectly valid implementation of the undefined behavior. UB can leads to tons of surprising behavior, hangs etc. and this is exactly the same category. On Thu, Aug 04, 2022 at 01:31:50PM +, Koning, Paul via Gcc-patches wrote: > I wouldn't think so. The way I see "undefined code" is that you can't > complain > about "wrong code" produced by the compiler. But for the compiler to > malfunction > on wrong input is an entirely differerent matter. For one thing, it's hard > to fix > your code if the compiler fails. How would you locate the offending source > line? The compiler isn't malfunctioning here. It is similar to calling a library function with bogus arguments, say memcpy with NULL source or destination or some invalid pointer not pointing anywhere valid, etc. The spec clearly says zero or negative chunk size is not valid, you use it, you get what you ask for. Furthermore, it is easy to find out when it hangs on which construct it is and check if that construct is valid. I'm strongly against slowing valid code for this. If you want to implement -fsanitize=openmp and either in that case perform checks on the generated code side or link with an instrumented version of libgomp that explains users what errors they do, that is fine. Jakub
Re: [PATCH] c++: some missing-SFINAE fixes
On 9/13/22 09:46, Patrick Palka wrote: On Tue, 13 Sep 2022, Jason Merrill wrote: On 9/13/22 07:45, Patrick Palka wrote: It looks like we aren't respecting SFINAE for: * an invalid/non-constant conditional explicit-specifier * a non-constant conditional noexcept-specifier * a non-constant argument to __integer_pack This patch fixes these issues in the usual way, by passing complain and propagating error_mark_node appropriately. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? gcc/cp/ChangeLog: * decl.cc (build_explicit_specifier): Pass complain to cxx_constant_value. * except.cc (build_noexcept_spec): Likewise. * pt.cc (expand_integer_pack): Likewise. (tsubst_function_decl): Propagate error_mark_node returned from build_explicit_specifier. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/noexcept-type26.C: New test. * g++.dg/cpp2a/explicit19.C: New test. * g++.dg/ext/integer-pack6.C: New test. --- gcc/cp/decl.cc | 2 +- gcc/cp/except.cc | 2 +- gcc/cp/pt.cc | 6 -- gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 gcc/testsuite/g++.dg/cpp2a/explicit19.C | 12 gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 + 6 files changed, 43 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 936f1cf0197..5404d7e084c 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t complain) expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr); + expr = cxx_constant_value (expr, NULL_TREE, complain); The patch is OK, but perhaps we want another overload that omits the object parameter? Thanks -- sounds good, like the following incremental patch? OK. -- >8 -- Subject: [PATCH] c++: two-parameter version of cxx_constant_value Since some callers need the complain parameter but not the object parameter, let's introduce and use an overload of cxx_constant_value that omits the latter. gcc/cp/ChangeLog: * cp-tree.h (cxx_constant_value): Define two-parameter version that omits the object parameter. * decl.cc (build_explicit_specifier): Omit NULL_TREE object argument to cxx_constant_value. * except.cc (build_noexcept_spec): Likewise. * pt.cc (expand_integer_pack): Likewise. (fold_targs_r): Likewise. * semantics.cc (finish_if_stmt_cond): Likewise. --- gcc/cp/cp-tree.h| 2 ++ gcc/cp/decl.cc | 2 +- gcc/cp/except.cc| 2 +- gcc/cp/pt.cc| 4 ++-- gcc/cp/semantics.cc | 2 +- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 2ba44e80e20..1eb176d4a50 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -8414,6 +8414,8 @@ extern bool require_rvalue_constant_expression (tree); extern bool require_potential_rvalue_constant_expression (tree); extern tree cxx_constant_value(tree, tree = NULL_TREE, tsubst_flags_t = tf_error); +inline tree cxx_constant_value (tree t, tsubst_flags_t complain) +{ return cxx_constant_value (t, NULL_TREE, complain); } extern void cxx_constant_dtor (tree, tree); extern tree cxx_constant_init (tree, tree = NULL_TREE); extern tree maybe_constant_value (tree, tree = NULL_TREE, bool = false); diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 5404d7e084c..006e9affcba 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t complain) expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr, NULL_TREE, complain); + expr = cxx_constant_value (expr, complain); return expr; } diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc index 4d7f0ce102d..048612de400 100644 --- a/gcc/cp/except.cc +++ b/gcc/cp/except.cc @@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain) { expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr, NULL_TREE, complain); + expr = cxx_constant_value (expr, complain); } if (TREE_CODE (expr) == INTEGER_CST) { diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 074179288b6..db4e808adec 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.c
Re: [PATCH] c++: some missing-SFINAE fixes
On Tue, 13 Sep 2022, Jason Merrill wrote: > On 9/13/22 07:45, Patrick Palka wrote: > > It looks like we aren't respecting SFINAE for: > > > >* an invalid/non-constant conditional explicit-specifier > >* a non-constant conditional noexcept-specifier > >* a non-constant argument to __integer_pack > > > > This patch fixes these issues in the usual way, by passing complain > > and propagating error_mark_node appropriately. > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for > > trunk? > > > > gcc/cp/ChangeLog: > > > > * decl.cc (build_explicit_specifier): Pass complain to > > cxx_constant_value. > > * except.cc (build_noexcept_spec): Likewise. > > * pt.cc (expand_integer_pack): Likewise. > > (tsubst_function_decl): Propagate error_mark_node returned > > from build_explicit_specifier. > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp1z/noexcept-type26.C: New test. > > * g++.dg/cpp2a/explicit19.C: New test. > > * g++.dg/ext/integer-pack6.C: New test. > > --- > > gcc/cp/decl.cc | 2 +- > > gcc/cp/except.cc | 2 +- > > gcc/cp/pt.cc | 6 -- > > gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 > > gcc/testsuite/g++.dg/cpp2a/explicit19.C | 12 > > gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 + > > 6 files changed, 43 insertions(+), 4 deletions(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C > > create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C > > create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C > > > > diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc > > index 936f1cf0197..5404d7e084c 100644 > > --- a/gcc/cp/decl.cc > > +++ b/gcc/cp/decl.cc > > @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t > > complain) > > expr = build_converted_constant_bool_expr (expr, complain); > > expr = instantiate_non_dependent_expr (expr, complain); > > - expr = cxx_constant_value (expr); > > + expr = cxx_constant_value (expr, NULL_TREE, complain); > > The patch is OK, but perhaps we want another overload that omits the object > parameter? Thanks -- sounds good, like the following incremental patch? -- >8 -- Subject: [PATCH] c++: two-parameter version of cxx_constant_value Since some callers need the complain parameter but not the object parameter, let's introduce and use an overload of cxx_constant_value that omits the latter. gcc/cp/ChangeLog: * cp-tree.h (cxx_constant_value): Define two-parameter version that omits the object parameter. * decl.cc (build_explicit_specifier): Omit NULL_TREE object argument to cxx_constant_value. * except.cc (build_noexcept_spec): Likewise. * pt.cc (expand_integer_pack): Likewise. (fold_targs_r): Likewise. * semantics.cc (finish_if_stmt_cond): Likewise. --- gcc/cp/cp-tree.h| 2 ++ gcc/cp/decl.cc | 2 +- gcc/cp/except.cc| 2 +- gcc/cp/pt.cc| 4 ++-- gcc/cp/semantics.cc | 2 +- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 2ba44e80e20..1eb176d4a50 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -8414,6 +8414,8 @@ extern bool require_rvalue_constant_expression (tree); extern bool require_potential_rvalue_constant_expression (tree); extern tree cxx_constant_value (tree, tree = NULL_TREE, tsubst_flags_t = tf_error); +inline tree cxx_constant_value (tree t, tsubst_flags_t complain) +{ return cxx_constant_value (t, NULL_TREE, complain); } extern void cxx_constant_dtor (tree, tree); extern tree cxx_constant_init (tree, tree = NULL_TREE); extern tree maybe_constant_value (tree, tree = NULL_TREE, bool = false); diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 5404d7e084c..006e9affcba 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t complain) expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr, NULL_TREE, complain); + expr = cxx_constant_value (expr, complain); return expr; } diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc index 4d7f0ce102d..048612de400 100644 --- a/gcc/cp/except.cc +++ b/gcc/cp/except.cc @@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain) { expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr, NULL_TREE, complain); + expr = cxx_constant_value (expr, complain); } if (TREE_CODE (expr) == INTEGER_CST) { diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index
Re: [PATCH] c++: some missing-SFINAE fixes
On 9/13/22 07:45, Patrick Palka wrote: It looks like we aren't respecting SFINAE for: * an invalid/non-constant conditional explicit-specifier * a non-constant conditional noexcept-specifier * a non-constant argument to __integer_pack This patch fixes these issues in the usual way, by passing complain and propagating error_mark_node appropriately. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? gcc/cp/ChangeLog: * decl.cc (build_explicit_specifier): Pass complain to cxx_constant_value. * except.cc (build_noexcept_spec): Likewise. * pt.cc (expand_integer_pack): Likewise. (tsubst_function_decl): Propagate error_mark_node returned from build_explicit_specifier. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/noexcept-type26.C: New test. * g++.dg/cpp2a/explicit19.C: New test. * g++.dg/ext/integer-pack6.C: New test. --- gcc/cp/decl.cc | 2 +- gcc/cp/except.cc | 2 +- gcc/cp/pt.cc | 6 -- gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 gcc/testsuite/g++.dg/cpp2a/explicit19.C | 12 gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 + 6 files changed, 43 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 936f1cf0197..5404d7e084c 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t complain) expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr); + expr = cxx_constant_value (expr, NULL_TREE, complain); The patch is OK, but perhaps we want another overload that omits the object parameter? return expr; } diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc index 7fdbc747c22..4d7f0ce102d 100644 --- a/gcc/cp/except.cc +++ b/gcc/cp/except.cc @@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain) { expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr); + expr = cxx_constant_value (expr, NULL_TREE, complain); } if (TREE_CODE (expr) == INTEGER_CST) { diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 1c6f4b84612..f16f5ffca20 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -3869,7 +3869,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t complain, else { hi = instantiate_non_dependent_expr (hi, complain); - hi = cxx_constant_value (hi); + hi = cxx_constant_value (hi, NULL_TREE, complain); int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1; /* Calculate the largest value of len that won't make the size of the vec @@ -14312,7 +14312,9 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain, /*function_p=*/false, /*i_c_e_p=*/true); spec = build_explicit_specifier (spec, complain); - if (instantiation_dependent_expression_p (spec)) + if (spec == error_mark_node) + return error_mark_node; + else if (instantiation_dependent_expression_p (spec)) store_explicit_specifier (r, spec); else { diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C new file mode 100644 index 000..73058395f0a --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C @@ -0,0 +1,12 @@ +// Verify a non-constant conditional noexcept-specifier of a function type +// is a SFINAE error. +// { dg-do compile { target c++17 } } + +template void f(void() noexcept(T::value)) = delete; +template void f(...); + +struct B { static bool value; }; + +int main() { + f(nullptr); +} diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit19.C b/gcc/testsuite/g++.dg/cpp2a/explicit19.C new file mode 100644 index 000..47903813680 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/explicit19.C @@ -0,0 +1,12 @@ +// Verify a conditional explicit-specifier is a SFINAE context. +// { dg-do compile { target c++20 } } + +struct A { + template explicit(T::value) A(T) = delete; + A(...); +}; + +struct B { static bool value; }; + +A x(0); +A y(B{}); diff --git a/gcc/testsuite/g++.dg/ext/integer-pack6.C b/gcc/testsuite/g++.dg/ext/integer-pack6.C new file mode 100644 index 000..7e16bc44b40 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/integer-pack6.C @@ -0,0 +1,13 @@ +// Verify a non-constant argument to __integer_pack is a SFINAE error. +// { dg-do compile { target c++11 } } + +temp
[PATCH] c++: some missing-SFINAE fixes
It looks like we aren't respecting SFINAE for: * an invalid/non-constant conditional explicit-specifier * a non-constant conditional noexcept-specifier * a non-constant argument to __integer_pack This patch fixes these issues in the usual way, by passing complain and propagating error_mark_node appropriately. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? gcc/cp/ChangeLog: * decl.cc (build_explicit_specifier): Pass complain to cxx_constant_value. * except.cc (build_noexcept_spec): Likewise. * pt.cc (expand_integer_pack): Likewise. (tsubst_function_decl): Propagate error_mark_node returned from build_explicit_specifier. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/noexcept-type26.C: New test. * g++.dg/cpp2a/explicit19.C: New test. * g++.dg/ext/integer-pack6.C: New test. --- gcc/cp/decl.cc | 2 +- gcc/cp/except.cc | 2 +- gcc/cp/pt.cc | 6 -- gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C | 12 gcc/testsuite/g++.dg/cpp2a/explicit19.C | 12 gcc/testsuite/g++.dg/ext/integer-pack6.C | 13 + 6 files changed, 43 insertions(+), 4 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C create mode 100644 gcc/testsuite/g++.dg/cpp2a/explicit19.C create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack6.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index 936f1cf0197..5404d7e084c 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -18557,7 +18557,7 @@ build_explicit_specifier (tree expr, tsubst_flags_t complain) expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr); + expr = cxx_constant_value (expr, NULL_TREE, complain); return expr; } diff --git a/gcc/cp/except.cc b/gcc/cp/except.cc index 7fdbc747c22..4d7f0ce102d 100644 --- a/gcc/cp/except.cc +++ b/gcc/cp/except.cc @@ -1257,7 +1257,7 @@ build_noexcept_spec (tree expr, tsubst_flags_t complain) { expr = build_converted_constant_bool_expr (expr, complain); expr = instantiate_non_dependent_expr (expr, complain); - expr = cxx_constant_value (expr); + expr = cxx_constant_value (expr, NULL_TREE, complain); } if (TREE_CODE (expr) == INTEGER_CST) { diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index 1c6f4b84612..f16f5ffca20 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -3869,7 +3869,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t complain, else { hi = instantiate_non_dependent_expr (hi, complain); - hi = cxx_constant_value (hi); + hi = cxx_constant_value (hi, NULL_TREE, complain); int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1; /* Calculate the largest value of len that won't make the size of the vec @@ -14312,7 +14312,9 @@ tsubst_function_decl (tree t, tree args, tsubst_flags_t complain, /*function_p=*/false, /*i_c_e_p=*/true); spec = build_explicit_specifier (spec, complain); - if (instantiation_dependent_expression_p (spec)) + if (spec == error_mark_node) + return error_mark_node; + else if (instantiation_dependent_expression_p (spec)) store_explicit_specifier (r, spec); else { diff --git a/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C new file mode 100644 index 000..73058395f0a --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1z/noexcept-type26.C @@ -0,0 +1,12 @@ +// Verify a non-constant conditional noexcept-specifier of a function type +// is a SFINAE error. +// { dg-do compile { target c++17 } } + +template void f(void() noexcept(T::value)) = delete; +template void f(...); + +struct B { static bool value; }; + +int main() { + f(nullptr); +} diff --git a/gcc/testsuite/g++.dg/cpp2a/explicit19.C b/gcc/testsuite/g++.dg/cpp2a/explicit19.C new file mode 100644 index 000..47903813680 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/explicit19.C @@ -0,0 +1,12 @@ +// Verify a conditional explicit-specifier is a SFINAE context. +// { dg-do compile { target c++20 } } + +struct A { + template explicit(T::value) A(T) = delete; + A(...); +}; + +struct B { static bool value; }; + +A x(0); +A y(B{}); diff --git a/gcc/testsuite/g++.dg/ext/integer-pack6.C b/gcc/testsuite/g++.dg/ext/integer-pack6.C new file mode 100644 index 000..7e16bc44b40 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/integer-pack6.C @@ -0,0 +1,13 @@ +// Verify a non-constant argument to __integer_pack is a SFINAE error. +// { dg-do compile { target c++11 } } + +template struct A { }; + +template auto f(int) -> A<__integer_pack(T::value)...> = delete; +template void f(...); + +struct B { static int value; }; + +int main() { + f(0);
Re: [OG12][PATCH] openmp: Fix handling of target constructs in static member
On 13/09/2022 12:03, Paul-Antoine Arras wrote: Hello, This patch intends to backport e90af965e5c by Jakub Jelinek to devel/omp/gcc-12. The original patch was described here: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601189.html I've merged and committed it for you. Andrew
Re: [PATCH 2/2] xtensa: Implement new target hook: TARGET_CONSTANT_OK_FOR_CPROP_P
On Mon, Sep 12, 2022 at 8:00 PM Takayuki 'January June' Suwa wrote: > On 2022/09/13 4:34, Max Filippov wrote: > > On Sun, Sep 11, 2022 at 1:50 PM Takayuki 'January June' Suwa > > wrote: > >> > >> This patch implements new target hook TARGET_CONSTANT_OK_FOR_CPROP_P in > >> order to exclude CONST_INTs that cannot fit into a MOVI machine instruction > >> from cprop. > >> > >> gcc/ChangeLog: > >> > >> * config/xtensa/xtensa.c (TARGET_CONSTANT_OK_FOR_CPROP_P): > >> New macro definition. > >> (xtensa_constant_ok_for_cprop_p): > >> Implement the hook as mentioned above. > >> --- > >> gcc/config/xtensa/xtensa.cc | 20 +--- > >> 1 file changed, 17 insertions(+), 3 deletions(-) > > > > Regtested for target=xtensa-linux-uclibc, no new regressions. > > Committed to master. > > > > Oops, sorry, this patch doesn't have the prerequisite patch merged in, so > please revert (that target hook isn't working yet). OK, reverted. -- Thanks. -- Max
Re: [PATCH 4/4] libstdc++: Implement ranges::slide_view from P2442R1
On Mon, 12 Sept 2022 at 17:48, Patrick Palka via Libstdc++ wrote: > > This also implements the LWG 3711 and 3712 changes to slide_view. OK, thanks.
Re: [PATCH 3/4] libstdc++: Implement ranges::chunk_view from P2442R1
On Mon, 12 Sept 2022 at 17:48, Patrick Palka via Libstdc++ wrote: > > This also implements the LWG 3707, 3710 and 3712 changes to chunk_view. > + > + template > +requires input_range<_Vp> > + class chunk_view<_Vp>::_OuterIter > + { > +chunk_view* _M_parent; > + > +constexpr explicit > +_OuterIter(chunk_view& __parent) This can be noexcept. > + : _M_parent(std::__addressof(__parent)) > +{ } > + > +friend chunk_view; > + > + public: > +using iterator_concept = input_iterator_tag; > +using difference_type = range_difference_t<_Vp>; > + > +struct value_type; > + > +_OuterIter(_OuterIter&&) = default; > +_OuterIter& operator=(_OuterIter&&) = default; > + > +constexpr value_type > +operator*() const > +{ > + __glibcxx_assert(*this != default_sentinel); > + return value_type(*_M_parent); > +} > + > +constexpr _OuterIter& > +operator++() > +{ > + __glibcxx_assert(*this != default_sentinel); > + ranges::advance(*_M_parent->_M_current, _M_parent->_M_remainder, > + ranges::end(_M_parent->_M_base)); > + _M_parent->_M_remainder = _M_parent->_M_n; > + return *this; > +} > + > +constexpr void > +operator++(int) > +{ ++*this; } > + > +friend constexpr bool > +operator==(const _OuterIter& __x, default_sentinel_t) > +{ > + return *__x._M_parent->_M_current == > ranges::end(__x._M_parent->_M_base) > + && __x._M_parent->_M_remainder != 0; > +} > + > +friend constexpr difference_type > +operator-(default_sentinel_t, const _OuterIter& __x) > +requires sized_sentinel_for, iterator_t<_Vp>> > +{ > + const auto __dist = ranges::end(__x._M_parent->_M_base) - > *__x._M_parent->_M_current; > + > + if (__dist < __x._M_parent->_M_remainder) > + return __dist == 0 ? 0 : 1; > + > + return 1 + __detail::__div_ceil(__dist - __x._M_parent->_M_remainder, > + __x._M_parent->_M_n); > +} > + > +friend constexpr difference_type > +operator-(const _OuterIter& __x, default_sentinel_t __y) > +requires sized_sentinel_for, iterator_t<_Vp>> > +{ return -(__y - __x); } > + }; > + > + template > +requires input_range<_Vp> > + struct chunk_view<_Vp>::_OuterIter::value_type : view_interface > + { > + private: > +chunk_view* _M_parent; > + > +constexpr explicit > +value_type(chunk_view& __parent) And this. > +: _M_parent(std::__addressof(__parent)) > +{ } > + > +friend _OuterIter; > + > + template > +requires input_range<_Vp> > + class chunk_view<_Vp>::_InnerIter > + { > +chunk_view* _M_parent; > + > +constexpr explicit > +_InnerIter(chunk_view& __parent) noexcept And this already is, so that's nice. > +: _M_parent(std::__addressof(__parent)) > +{ } > + > +friend _OuterIter::value_type; > + > + public: > +using iterator_concept = input_iterator_tag; > +using difference_type = range_difference_t<_Vp>; > +using value_type = range_value_t<_Vp>; > + > +_InnerIter(_InnerIter&&) = default; > +_InnerIter& operator=(_InnerIter&&) = default; > + > +constexpr const iterator_t<_Vp>& > +base() const & > +{ return *_M_parent->_M_current; } > + > +constexpr range_reference_t<_Vp> > +operator*() const > +{ > + __glibcxx_assert(*this != default_sentinel); > + return **_M_parent->_M_current; > +} > + > +constexpr _InnerIter& > +operator++() > +{ > + __glibcxx_assert(*this != default_sentinel); > + ++*_M_parent->_M_current; > + if (*_M_parent->_M_current == ranges::end(_M_parent->_M_base)) > + _M_parent->_M_remainder = 0; > + else > + --_M_parent->_M_remainder; > + return *this; > +} > + > +constexpr void > +operator++(int) > +{ ++*this; } > + > +friend constexpr bool > +operator==(const _InnerIter& __x, default_sentinel_t) noexcept OK with those tweaks.
[OG12][PATCH] openmp: Fix handling of target constructs in static member
Hello, This patch intends to backport e90af965e5c by Jakub Jelinek to devel/omp/gcc-12. The original patch was described here: https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601189.html Thanks, -- Paul-Antoine ArrasFrom c1fb6ff897d0b929807d52cf52d4894e252e7d96 Mon Sep 17 00:00:00 2001 From: Jakub Jelinek Date: Wed, 7 Sep 2022 08:54:13 +0200 Subject: [PATCH] openmp: Fix handling of target constructs in static member functions [PR106829] Just calling current_nonlambda_class_type in static member functions returns non-NULL, but something that isn't *this and if unlucky can match part of the IL and can be added to target clauses. if (DECL_NONSTATIC_MEMBER_P (decl) && current_class_ptr) is a guard used elsewhere (in check_accessibility_of_qualified_id). 2022-09-07 Jakub Jelinek PR c++/106829 * semantics.cc (finish_omp_target_clauses): If current_function_decl isn't a nonstatic member function, don't set data.current_object to non-NULL. * g++.dg/gomp/pr106829.C: New test. (cherry picked from commit e90af965e5c858ba02c0cdfbac35d0a19da1c2f6) --- gcc/cp/ChangeLog.omp | 10 ++ gcc/cp/semantics.cc | 17 - gcc/testsuite/ChangeLog.omp | 8 gcc/testsuite/g++.dg/gomp/pr106829.C | 15 +++ 4 files changed, 41 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/g++.dg/gomp/pr106829.C diff --git gcc/cp/ChangeLog.omp gcc/cp/ChangeLog.omp index 35504f4c92b..c3214f31252 100644 --- gcc/cp/ChangeLog.omp +++ gcc/cp/ChangeLog.omp @@ -1,3 +1,13 @@ +2022-09-09 Paul-Antoine Arras + + Backport from mainline: + 2022-09-07 Jakub Jelinek + + PR c++/106829 + * semantics.cc (finish_omp_target_clauses): If current_function_decl + isn't a nonstatic member function, don't set data.current_object to + non-NULL. + 2022-09-07 Tobias Burnus Backport from mainline: diff --git gcc/cp/semantics.cc gcc/cp/semantics.cc index 7717a820f0d..dad04dc4757 100644 --- gcc/cp/semantics.cc +++ gcc/cp/semantics.cc @@ -9727,16 +9727,15 @@ finish_omp_target_clauses (location_t loc, tree body, tree *clauses_ptr) { omp_target_walk_data data; data.this_expr_accessed = false; + data.current_object = NULL_TREE; - tree ct = current_nonlambda_class_type (); - if (ct) -{ - tree object = maybe_dummy_object (ct, NULL); - object = maybe_resolve_dummy (object, true); - data.current_object = object; -} - else -data.current_object = NULL_TREE; + if (DECL_NONSTATIC_MEMBER_P (current_function_decl) && current_class_ptr) +if (tree ct = current_nonlambda_class_type ()) + { + tree object = maybe_dummy_object (ct, NULL); + object = maybe_resolve_dummy (object, true); + data.current_object = object; + } if (DECL_LAMBDA_FUNCTION_P (current_function_decl)) { diff --git gcc/testsuite/ChangeLog.omp gcc/testsuite/ChangeLog.omp index ed42b20b099..9ca4d936ff5 100644 --- gcc/testsuite/ChangeLog.omp +++ gcc/testsuite/ChangeLog.omp @@ -1,3 +1,11 @@ +2022-09-09 Paul-Antoine Arras + + Backport from mainline: + 2022-09-07 Jakub Jelinek + + PR c++/106829 + * g++.dg/gomp/pr106829.C: New test. + 2022-09-07 Tobias Burnus Backport from mainline: diff --git gcc/testsuite/g++.dg/gomp/pr106829.C gcc/testsuite/g++.dg/gomp/pr106829.C new file mode 100644 index 000..0295efb88ee --- /dev/null +++ gcc/testsuite/g++.dg/gomp/pr106829.C @@ -0,0 +1,15 @@ +// PR c++/106829 + +namespace std +{ + template class complex; + template <> struct complex { complex (double); _Complex double d; }; +} +struct S { void static foo (); }; + +void +S::foo () +{ +#pragma omp target + std::complex c = 0.0; +} -- 2.31.1
Re: [PATCH 2/4] libstdc++: Implement LWG 3569 changes to join_view::_Iterator
On Mon, 12 Sept 2022 at 17:46, Patrick Palka via Libstdc++ wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk only? I briefly wondered whether we could just use a union there and provide the special members to init/copy/destroy it properly, but we already use in so it's probably not worth it. OK for trunk. > > libstdc++-v3/ChangeLog: > > * include/std/ranges (join_view::_Iterator::_M_satisfy): > Adjust resetting _M_inner as per LWG 3569. > (join_view::_Iterator::_M_inner): Wrap in std::optional > as per LWG 3569. > (join_view::_Iterator::_Iterator): Relax constraints as > per LWG 3569. > (join_view::_Iterator::operator*): Adjust as per LWG 3569. > (join_view::_Iterator::operator->): Likewise. > (join_view::_Iterator::operator++): Likewise. > (join_view::_Iterator::operator--): Likewise. > (join_view::_Iterator::iter_move): Likewise. > (join_view::_Iterator::iter_swap): Likewise. > * testsuite/std/ranges/adaptor/join.cc (test14): New test. > --- > libstdc++-v3/include/std/ranges | 28 +-- > .../testsuite/std/ranges/adaptors/join.cc | 17 +++ > 2 files changed, 30 insertions(+), 15 deletions(-) > > diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges > index 20eb4e82ac8..6297ce7cee3 100644 > --- a/libstdc++-v3/include/std/ranges > +++ b/libstdc++-v3/include/std/ranges > @@ -2746,7 +2746,7 @@ namespace views::__adaptor > } > > if constexpr (_S_ref_is_glvalue) > - _M_inner = _Inner_iter(); > + _M_inner.reset(); > } > > static constexpr auto > @@ -2769,7 +2769,7 @@ namespace views::__adaptor > using _Inner_iter = join_view::_Inner_iter<_Const>; > > _Outer_iter _M_outer = _Outer_iter(); > - _Inner_iter _M_inner = _Inner_iter(); > + optional<_Inner_iter> _M_inner; > _Parent* _M_parent = nullptr; > > public: > @@ -2780,9 +2780,7 @@ namespace views::__adaptor > = common_type_t, > range_difference_t>>; > > - _Iterator() requires (default_initializable<_Outer_iter> > - && default_initializable<_Inner_iter>) > - = default; > + _Iterator() requires default_initializable<_Outer_iter> = default; > > constexpr > _Iterator(_Parent* __parent, _Outer_iter __outer) > @@ -2801,7 +2799,7 @@ namespace views::__adaptor > > constexpr decltype(auto) > operator*() const > - { return *_M_inner; } > + { return **_M_inner; } > > // _GLIBCXX_RESOLVE_LIB_DEFECTS > // 3500. join_view::iterator::operator->() is bogus > @@ -2809,7 +2807,7 @@ namespace views::__adaptor > operator->() const > requires __detail::__has_arrow<_Inner_iter> > && copyable<_Inner_iter> > - { return _M_inner; } > + { return *_M_inner; } > > constexpr _Iterator& > operator++() > @@ -2820,7 +2818,7 @@ namespace views::__adaptor > else > return *_M_parent->_M_inner; > }(); > - if (++_M_inner == ranges::end(__inner_range)) > + if (++*_M_inner == ranges::end(__inner_range)) > { > ++_M_outer; > _M_satisfy(); > @@ -2850,9 +2848,9 @@ namespace views::__adaptor > { > if (_M_outer == ranges::end(_M_parent->_M_base)) > _M_inner = ranges::end(*--_M_outer); > - while (_M_inner == ranges::begin(*_M_outer)) > - _M_inner = ranges::end(*--_M_outer); > - --_M_inner; > + while (*_M_inner == ranges::begin(*_M_outer)) > + *_M_inner = ranges::end(*--_M_outer); > + --*_M_inner; > return *this; > } > > @@ -2879,14 +2877,14 @@ namespace views::__adaptor > > friend constexpr decltype(auto) > iter_move(const _Iterator& __i) > - noexcept(noexcept(ranges::iter_move(__i._M_inner))) > - { return ranges::iter_move(__i._M_inner); } > + noexcept(noexcept(ranges::iter_move(*__i._M_inner))) > + { return ranges::iter_move(*__i._M_inner); } > > friend constexpr void > iter_swap(const _Iterator& __x, const _Iterator& __y) > - noexcept(noexcept(ranges::iter_swap(__x._M_inner, __y._M_inner))) > + noexcept(noexcept(ranges::iter_swap(*__x._M_inner, > *__y._M_inner))) > requires indirectly_swappable<_Inner_iter> > - { return ranges::iter_swap(__x._M_inner, __y._M_inner); } > + { return ranges::iter_swap(*__x._M_inner, *__y._M_inner); } > > friend _Iterator; > template friend struct _Sentinel; > diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/join.cc > b/libstdc++-v3/testsuite/std/ra
Re: [PATCH] libstdc++: Squelch -Wparentheses warning with debug iterators
On Mon, 12 Sept 2022 at 18:40, Patrick Palka via Libstdc++ wrote: > > I noticed compiling e.g. std/ranges/adaptors/join.cc with > -D_GLIBCXX_DEBUG -Wsystem-headers -Wall gives the warning: > > gcc/libstdc++-v3/include/debug/safe_iterator.h:477:9: warning: suggest > parentheses around ‘&&’ within ‘||’ [-Wparentheses] > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk? OK, thanks. > > libstdc++-v3/ChangeLog: > > * include/debug/safe_iterator.h (_GLIBCXX_DEBUG_VERIFY_OPERANDS): > Add parentheses to squelch -Wparentheses. > --- > libstdc++-v3/include/debug/safe_iterator.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libstdc++-v3/include/debug/safe_iterator.h > b/libstdc++-v3/include/debug/safe_iterator.h > index 33f7a86478a..117dc93de60 100644 > --- a/libstdc++-v3/include/debug/safe_iterator.h > +++ b/libstdc++-v3/include/debug/safe_iterator.h > @@ -40,7 +40,7 @@ > #endif > > #define _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, _BadMsgId, _DiffMsgId) \ > - _GLIBCXX_DEBUG_VERIFY(!_Lhs._M_singular() && !_Rhs._M_singular() \ > + _GLIBCXX_DEBUG_VERIFY((!_Lhs._M_singular() && !_Rhs._M_singular()) \ > || (_Lhs._M_value_initialized() \ > && _Rhs._M_value_initialized()),\ > _M_message(_BadMsgId) \ > -- > 2.37.3.542.gdd3f6c4cae >
Re: [PATCH] PR target/106877: Robustify reg-stack to malformed asm.
On Tue, Sep 13, 2022 at 11:57 AM Roger Sayle wrote: > > > This patch resolves PR target/106877 an ICE-on-invalid inline-asm > regression. An innocent upstream change means that the test case > from PR inline-asm/84683 now hits a different assert in reg-stack.cc's > move_for_stack_reg. Fixed by duplicating Jakub's solution to PR 84683 > https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495193.html > at this second (similar) gcc_assert. > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32}, > with no new failures. Ok for mainline? > > > 2022-09-13 Roger Sayle > > gcc/ChangeLog > PR target/106877 > * reg-stack.cc (move_for_stack_reg): Check for any_malformed_asm > in gcc_assert. > > gcc/testsuite/ChangeLog > PR target/106877 > * g++.dg/ext/pr106877.C: New test case. OK. Thanks, Uros.
[PATCH] PR target/106877: Robustify reg-stack to malformed asm.
This patch resolves PR target/106877 an ICE-on-invalid inline-asm regression. An innocent upstream change means that the test case from PR inline-asm/84683 now hits a different assert in reg-stack.cc's move_for_stack_reg. Fixed by duplicating Jakub's solution to PR 84683 https://gcc.gnu.org/pipermail/gcc-patches/2018-March/495193.html at this second (similar) gcc_assert. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32}, with no new failures. Ok for mainline? 2022-09-13 Roger Sayle gcc/ChangeLog PR target/106877 * reg-stack.cc (move_for_stack_reg): Check for any_malformed_asm in gcc_assert. gcc/testsuite/ChangeLog PR target/106877 * g++.dg/ext/pr106877.C: New test case. Thanks, Roger -- diff --git a/gcc/reg-stack.cc b/gcc/reg-stack.cc index fd03250..95e0e61 100644 --- a/gcc/reg-stack.cc +++ b/gcc/reg-stack.cc @@ -1073,7 +1073,8 @@ move_for_stack_reg (rtx_insn *insn, stack_ptr regstack, rtx pat) break; /* The destination must be dead, or life analysis is borked. */ - gcc_assert (get_hard_regnum (regstack, dest) < FIRST_STACK_REG); + gcc_assert (get_hard_regnum (regstack, dest) < FIRST_STACK_REG + || any_malformed_asm); /* If the source is not live, this is yet another case of uninitialized variables. Load up a NaN instead. */ diff --git a/gcc/testsuite/g++.dg/ext/pr106877.C b/gcc/testsuite/g++.dg/ext/pr106877.C new file mode 100644 index 000..6bffed9 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/pr106877.C @@ -0,0 +1,13 @@ +// PR target/106877 +// { dg-do compile { target i?86-*-* x86_64-*-* } } +// { dg-options "-O1 -m16 -mtune=sandybridge -flive-range-shrinkage -fno-dce" } + +void +foo (float b, double c) +{ + for (int e = 0; e < 2; e++) +{ + asm volatile ("" : "+f" (c));// { dg-error "must specify a single register" } + asm ("" : "+rm" (c = b)); +} +}
[V2 PATCH] RISC-V:Add '-m[no]-csr-check' option in gcc.
From: Jiawei Add -m[no]-csr-check option in gcc part, when enable -mcsr-check option, it will add csr-check in .option section and pass this to assembler. V2: Add assembler support check info for -mcsr-check. Thanks for Kito's suggestions. gcc/ChangeLog: * config.in: New def. * config/riscv/riscv.cc (riscv_file_start): New .option. * config/riscv/riscv.opt: New options. * configure.ac: New check. * doc/invoke.texi: New def. --- gcc/config.in | 6 ++ gcc/config/riscv/riscv.cc | 5 + gcc/config/riscv/riscv.opt | 6 ++ gcc/configure.ac | 5 + gcc/doc/invoke.texi| 6 ++ 5 files changed, 28 insertions(+) diff --git a/gcc/config.in b/gcc/config.in index 9c53319b544..a4c39e1384d 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -616,6 +616,12 @@ #endif +/* Define if your assembler supports -mcsr-check. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_MCSR_CHECK +#endif + + /* Define if your Mac OS X assembler supports -mllvm -x86-pad-for-align=false. */ #ifndef USED_FOR_TARGET diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 675d92c0961..e98e6b1f561 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -5135,6 +5135,11 @@ riscv_file_start (void) if (! riscv_mrelax) fprintf (asm_out_file, "\t.option norelax\n"); + /* If the user specifies "-mcsr-check" on the command line then enable csr + check in the assembler. */ + if (riscv_mcsr_check) +fprintf (asm_out_file, "\t.option csr-check\n"); + if (riscv_emit_attribute_p) riscv_emit_attribute (); } diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index fbca91b956c..3a12dd47310 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -132,6 +132,12 @@ Target Bool Var(riscv_mrelax) Init(1) Take advantage of linker relaxations to reduce the number of instructions required to materialize symbol addresses. +mcsr-check +Target Bool Var(riscv_mcsr_check) Init(1) +Enable the CSR checking for the ISA-dependent CRS and the read-only CSR. +The ISA-dependent CSR are only valid when the specific ISA is set. The +read-only CSR can not be written by the CSR instructions. + Mask(64BIT) Mask(MUL) diff --git a/gcc/configure.ac b/gcc/configure.ac index 50bb61c1b61..1a9288ee659 100644 --- a/gcc/configure.ac +++ b/gcc/configure.ac @@ -5269,6 +5269,11 @@ configured with --enable-newlib-nano-formatted-io.]) [-march=rv32i_zifencei2p0],,, [AC_DEFINE(HAVE_AS_MARCH_ZIFENCEI, 1, [Define if the assembler understands -march=rv*_zifencei.])]) +gcc_GAS_CHECK_FEATURE([-mcsr-check], + gcc_cv_as_riscv_csr_check, + [-mcsr-check],,, + [AC_DEFINE(HAVE_AS_MCSR_CHECK, 1, +[Define if the assembler understands -mcsr-check.])]) ;; loongarch*-*-*) gcc_GAS_CHECK_FEATURE([.dtprelword support], diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index dd3302fcd15..7caade26b94 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1224,6 +1224,7 @@ See RS/6000 and PowerPC Options. -mbig-endian -mlittle-endian @gol -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{reg} @gol -mstack-protector-guard-offset=@var{offset}} +-mcsr-check -mno-csr-check @gol @emph{RL78 Options} @gccoptlist{-msim -mmul=none -mmul=g13 -mmul=g14 -mallregs @gol @@ -28551,6 +28552,11 @@ linker relaxations. Emit (do not emit) RISC-V attribute to record extra information into ELF objects. This feature requires at least binutils 2.32. +@item -mcsr-check +@itemx -mno-csr-check +@opindex mcsr-check +Enables or disables the CSR checking. + @item -malign-data=@var{type} @opindex malign-data Control how GCC aligns variables and constants of array, structure, or union -- 2.34.1
[PATCH] middle-end/106909 - CTRL altering flag after folding
The following makes sure to clear the CTRL altering flag when folding emits a __builitin_unreachable in place of a virtual call which now might become a trap. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR middle-end/106909 * gimple-fold.cc (gimple_fold_call): Clear the ctrl-altering flag of a unreachable call. --- gcc/gimple-fold.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index a1704784bc9..9055cd8982d 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -5512,6 +5512,7 @@ gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace) { location_t loc = gimple_location (stmt); gimple *new_stmt = gimple_build_builtin_unreachable (loc); + gimple_call_set_ctrl_altering (new_stmt, false); /* If the call had a SSA name as lhs morph that into an uninitialized value. */ if (lhs && TREE_CODE (lhs) == SSA_NAME) -- 2.35.3
[PATCH] tree-optimization/106913 - ICE with -da and -Wuninitialized
The following avoids setting and not clearing an auto_bb_flag on EXIT_BLOCK which we don't verify for such stale flags but dump_bb_info still asserts on them. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/106913 * tree-ssa-uninit.cc (warn_uninitialized_vars): Do not set ft_reachable on EXIT_BLOCK. --- gcc/tree-ssa-uninit.cc | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/gcc/tree-ssa-uninit.cc b/gcc/tree-ssa-uninit.cc index 4a1c333d9cb..eae29f88f9d 100644 --- a/gcc/tree-ssa-uninit.cc +++ b/gcc/tree-ssa-uninit.cc @@ -1013,11 +1013,9 @@ warn_uninitialized_vars (bool wmaybe_uninit) if (ee) bb = ee->dest; else - { - bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); - if (!bb || bb->index == EXIT_BLOCK) - break; - } + bb = get_immediate_dominator (CDI_POST_DOMINATORS, bb); + if (!bb || bb->index == EXIT_BLOCK) + break; } FOR_EACH_BB_FN (bb, cfun) -- 2.35.3
[pushed] aarch64: Vector move fixes for +nosimd
This patch fixes various issues around the handling of vectors and (particularly) vector structures with +nosimd. Previously, passing and returning structures would trigger an ICE, since: * we didn't allow the structure modes to be stored in FPRs * we didn't provide +nosimd move patterns * splitting the moves into word-sized pieces (the default strategy without move patterns) doesn't work because the registers are doubleword sized. The patch is a bit of a hodge-podge since a lot of the handling of moves, register costs, and register legitimacy is so interconnected. It didn't seem feasible to split things further. Some notes: * The patch recognises vector and tuple modes based on TARGET_FLOAT rather than TARGET_SIMD, and instead adds TARGET_SIMD to places that really do need the vector ISA. This is necessary for the modes to be handled correctly in register arguments and returns. * The 64-bit (DREG) STP peephole required TARGET_SIMD but the LDP peephole didn't. I think the LDP one is right, since DREG moves could involve GPRs as well as FPRs. * The patch keeps the existing choices of instructions for TARGET_SIMD, just in case they happen to be better than FMOV on some uarches. * Before the patch, +nosimd Q<->Q moves of 128-bit scalars went via a GPR, thanks to a secondary reload pattern. This approach might not be ideal, but there's no reason that 128-bit vectors should behave differently from 128-bit scalars. The patch therefore extends the current scalar approach to vectors. * Multi-vector LD1 and ST1 require TARGET_SIMD, so the TARGET_FLOAT structure moves need to use LDP/STP and LDR/STR combinations instead. That's also what we do for big-endian even with TARGET_SIMD, so most of the code was already there. The patterns for structures of 64-bit vectors are identical, but the patterns for structures of 128-bit vectors need to cope with the lack of 128-bit Q<->Q moves. It isn't feasible to move multi-vector tuples via GPRs, so the patch moves them via memory instead. This contaminates the port with its first secondary memory reload. Tested on aarch64-linux-gnu & pushed. Richard gcc/ * config/aarch64/aarch64.cc (aarch64_classify_vector_mode): Use TARGET_FLOAT instead of TARGET_SIMD. (aarch64_vectorize_related_mode): Restrict ADVSIMD handling to TARGET_SIMD. (aarch64_hard_regno_mode_ok): Don't allow tuples of 2 64-bit vectors in GPRs. (aarch64_classify_address): Treat little-endian structure moves like big-endian for TARGET_FLOAT && !TARGET_SIMD. (aarch64_secondary_memory_needed): New function. (aarch64_secondary_reload): Handle 128-bit Advanced SIMD vectors in the same way as TF, TI and TD. (aarch64_rtx_mult_cost): Restrict ADVSIMD handling to TARGET_SIMD. (aarch64_rtx_costs): Likewise. (aarch64_register_move_cost): Treat a pair of 64-bit vectors separately from a single 128-bit vector. Handle the cost implied by aarch64_secondary_memory_needed. (aarch64_simd_valid_immediate): Restrict ADVSIMD handling to TARGET_SIMD. (aarch64_expand_vec_perm_const_1): Likewise. (TARGET_SECONDARY_MEMORY_NEEDED): New macro. * config/aarch64/iterators.md (VTX): New iterator. * config/aarch64/aarch64.md (arches): Add fp_q as a synonym of simd. (arch_enabled): Adjust accordingly. (@aarch64_reload_mov): Extend to... (@aarch64_reload_mov): ...this. * config/aarch64/aarch64-simd.md (mov): Require TARGET_FLOAT rather than TARGET_SIMD. (movmisalign): Likewise. (load_pair): Likewise. (vec_store_pair): Likewise. (load_pair): Likewise. (vec_store_pair): Likewise. (@aarch64_split_simd_mov): Likewise. (aarch64_get_low): Likewise. (aarch64_get_high): Likewise. (aarch64_get_half): Likewise. Canonicalize to a move for lowpart extracts. (*aarch64_simd_mov): Require TARGET_FLOAT rather than TARGET_SIMD. Use different w<-w and r<-w instructions for !TARGET_SIMD. Disable immediate moves for !TARGET_SIMD but add an alternative specifically for w<-Z. (*aarch64_simd_mov): Require TARGET_FLOAT rather than TARGET_SIMD. Likewise for the associated define_splits. Disable FPR moves and immediate moves for !TARGET_SIMD but add an alternative specifically for w<-Z. (aarch64_simd_mov_from_high): Require TARGET_FLOAT rather than TARGET_SIMD. Restrict the existing alternatives to TARGET_SIMD but add a new r<-w one for !TARGET_SIMD. (*aarch64_get_high): New pattern. (load_pair_lanes): Require TARGET_FLOAT rather than TARGET_SIMD. (store_pair_lanes): Likewise. (*aarch64_combine_internal): Likewise. Restrict existing w<-w, w<-r and w<-m alternatives to TARGET_SIMD but a
[pushed] aarch64: Disassociate ls64 from simd
The ls64-related move expanders and splits required TARGET_SIMD. That isn't necessary, since the 64-byte values are stored entirely in GPRs. (The associated define_insn was already correct.) I wondered about moving the patterns to aarch64.md, but it wasn't clear-cut. Tested on aarch64-linux-gnu & pushed. Richard gcc/ * config/aarch64/aarch64-simd.md (movv8di): Remove TARGET_SIMD condition. Likewise for the related define_split. Tweak formatting. gcc/testsuite/ * gcc.target/aarch64/acle/ls64_asm_2.c: New test. --- gcc/config/aarch64/aarch64-simd.md | 18 +- .../gcc.target/aarch64/acle/ls64_asm_2.c | 9 + 2 files changed, 18 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 587a45d7772..d4662c76a58 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -7087,7 +7087,7 @@ (define_expand "mov" (define_expand "movv8di" [(set (match_operand:V8DI 0 "nonimmediate_operand") (match_operand:V8DI 1 "general_operand"))] - "TARGET_SIMD" + "" { if (can_create_pseudo_p () && MEM_P (operands[0])) operands[1] = force_reg (V8DImode, operands[1]); @@ -7479,7 +7479,7 @@ (define_split (define_split [(set (match_operand:V8DI 0 "nonimmediate_operand") (match_operand:V8DI 1 "general_operand"))] - "TARGET_SIMD && reload_completed" + "reload_completed" [(const_int 0)] { if (register_operand (operands[0], V8DImode) @@ -7489,15 +7489,15 @@ (define_split DONE; } else if ((register_operand (operands[0], V8DImode) -&& memory_operand (operands[1], V8DImode)) - || (memory_operand (operands[0], V8DImode) -&& register_operand (operands[1], V8DImode))) + && memory_operand (operands[1], V8DImode)) + || (memory_operand (operands[0], V8DImode) + && register_operand (operands[1], V8DImode))) { for (int offset = 0; offset < 64; offset += 16) -emit_move_insn (simplify_gen_subreg (TImode, operands[0], - V8DImode, offset), -simplify_gen_subreg (TImode, operands[1], - V8DImode, offset)); + emit_move_insn (simplify_gen_subreg (TImode, operands[0], +V8DImode, offset), + simplify_gen_subreg (TImode, operands[1], +V8DImode, offset)); DONE; } else diff --git a/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c b/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c new file mode 100644 index 000..1b4277180a6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/acle/ls64_asm_2.c @@ -0,0 +1,9 @@ +/* { dg-do compile } */ +/* { dg-options "-O" } */ + +#pragma GCC target "+ls64+nofp" + +#include "ls64_asm.c" + +/* { dg-final { scan-assembler-times {\tldp\t} 12 } } */ +/* { dg-final { scan-assembler-times {\tstp\t} 4 } } */ -- 2.25.1
RE: [PATCH] Enhance final_value_replacement_loop to handle bitop with an invariant induction.[PR105735]
Hi Richard, Thanks you so much for reviewing this patch. I really appreciate it. For these review comments, I have made some changes. > That's a single-stmt match, you shouldn't use match.pd matching for this. > Instead just do > > if (is_gimple_assign (stmt) > && ((code = gimple_assign_rhs_code (stmt)), true) > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > BIT_XOR_EXPR)) Yes, I fixed it and dropped modification for match.pd. > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > redundant btw. - while the name suggests "with invariant" you don't actually > check for that. But again, given canonicalization rules the invariant will > be rhs2 > so above add > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST For " with invariant", this needed op1 is invariant, and I used `expr_invariant_in_loop_p (loop, match_op[0])` for check. And op2 just be PHI is ok. If op2 is INTEGER_CST, existing gcc can be directly optimized and do not need modification. > you probably need dg-require-effective-target longlong, but is it necessary to > use long long for the testcases in the first place? > The IV seems to be unused, if it should match the variables bit size use > sizeof > (type) * 8 Yes, It is not necessary to use long long for the testcases. I changed type to unsigned int. > > + inv = PHI_ARG_DEF_FROM_EDGE (header_phi, loop_preheader_edge > > + (loop)); return fold_build2 (code1, type, inv, match_op[0]); } > > The } goes to the next line. Sorry, It might be something wrong with my use of gcc send-email format. > > + tree bitinv_def; > > + if ((bitinv_def > > please use else if here Sorry, If use the else if here, there is no corresponding above if. I'm not sure if you mean change bitwise induction expression if to else if. Do you agree with these changes? Thanks again for taking a look. Thanks, Lingling > -Original Message- > From: Richard Biener > Sent: Tuesday, August 23, 2022 3:27 PM > To: Kong, Lingling > Cc: Liu, Hongtao ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] Enhance final_value_replacement_loop to handle bitop > with an invariant induction.[PR105735] > > On Thu, Aug 18, 2022 at 8:48 AM Kong, Lingling via Gcc-patches patc...@gcc.gnu.org> wrote: > > > > Hi, > > > > This patch is for pr105735/pr101991. It will enable below optimization: > > { > > - long unsigned int bit; > > - > > - [local count: 32534376]: > > - > > - [local count: 1041207449]: > > - # tmp_10 = PHI > > - # bit_12 = PHI > > - tmp_7 = bit2_6(D) & tmp_10; > > - bit_8 = bit_12 + 1; > > - if (bit_8 != 32) > > -goto ; [96.97%] > > - else > > -goto ; [3.03%] > > - > > - [local count: 1009658865]: > > - goto ; [100.00%] > > - > > - [local count: 32534376]: > > - # tmp_11 = PHI > > - return tmp_11; > > + tmp_11 = tmp_4(D) & bit2_6(D); > > + return tmp_11; > > > > } > > > > Ok for master ? > > > > gcc/ChangeLog: > > > > PR middle-end/105735 > > * match.pd (bitop_with_inv_p): New match. > > * tree-scalar-evolution.cc (gimple_bitop_with_inv_p): Declare. > > (analyze_and_compute_bitop_with_inv_effect): New function. > > (final_value_replacement_loop): Enhanced to handle bitop > > with inv induction. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.target/i386/pr105735-1.c: New test. > > * gcc.target/i386/pr105735-2.c: New test. > > --- > > gcc/match.pd | 4 + > > gcc/testsuite/gcc.target/i386/pr105735-1.c | 88 ++ > gcc/testsuite/gcc.target/i386/pr105735-2.c | 28 +++ > > gcc/tree-scalar-evolution.cc | 59 +++ > > 4 files changed, 179 insertions(+) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr105735-2.c > > > > diff --git a/gcc/match.pd b/gcc/match.pd index > > 562138a8034..cfe593ebb02 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -8050,6 +8050,10 @@ and, > > (bit_not > >(nop_convert1? (bit_xor@0 (convert2? (lshift integer_onep@1 @2)) > > @3 > > > > +(for bit_op (bit_and bit_ior bit_xor) (match (bitop_with_inv_p @0 > > +@1) > > + (bit_op:c @0 @1))) > > + > > That's a single-stmt match, you shouldn't use match.pd matching for this. > Instead just do > > if (is_gimple_assign (stmt) > && ((code = gimple_assign_rhs_code (stmt)), true) > && (code == BIT_AND_EXPR || code == BIT_IOR_EXPR || code == > BIT_XOR_EXPR)) >.. > > and pick gimple_assign_rhs{1,2} (stmt) as the operands. The :c in bit_op:c is > redundant btw. - while the name suggests "with invariant" you don't actually > check for that. But again, given canonicalization rules the invariant will > be rhs2 > so above add > > && TREE_CODE (gimple_assign_rhs2 (stmt)) == INTEGER_CST > > for example. > > > /* n - (((n > C1) ? n : C1) & -C2) -> n & C1 for unsigned case. > >
[committed] libgomp.texi: move item from gcn to nvptx
Placed the reverse-offload -march=sm_30 vs. >=sm_35 remark accidentally to the 'gcn' section. The surrounding wording it very similar, hence, I did not spot it. Committed as r13-2636. Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 commit eec36f27c3ca85d3b6b469e7161b63b69a5823ac Author: Tobias Burnus Date: Tue Sep 13 09:08:57 2022 +0200 libgomp.texi: move item from gcn to nvptx I misplaced one remark into 'gcn' instead of 'nvptx' in commit r13-2625-g6b43f556f392a7165582aca36a19fe7389d995b2 libgomp/ChangeLog: * libgomp.texi (gcn): Move misplaced -march=sm_30 remark to ... (nvptx): ... here. diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi index 1f402d6df79..5aef987f00a 100644 --- a/libgomp/libgomp.texi +++ b/libgomp/libgomp.texi @@ -4386,9 +4386,6 @@ The implementation remark: @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported using the C library @code{printf} functions and the Fortran @code{print}/@code{write} statements. -@item Compilation OpenMP code that contains @code{requires reverse_offload} - requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30} - is not supported. @end itemize @@ -4435,6 +4432,9 @@ The implementation remark: @item I/O within OpenMP target regions and OpenACC parallel/kernels is supported using the C library @code{printf} functions. Note that the Fortran @code{print}/@code{write} statements are not supported, yet. +@item Compilation OpenMP code that contains @code{requires reverse_offload} + requires at least @code{-march=sm_35}, compiling for @code{-march=sm_30} + is not supported. @end itemize
Re: [PATCH v2, rs6000] Change insn condition from TARGET_64BIT to TARGET_POWERPC64 for VSX scalar extract/insert instructions
Hi > On 13 Sep 2022, at 03:34, HAO CHEN GUI via Gcc-patches > wrote: > On 7/9/2022 上午 1:19, Segher Boessenkool wrote: >> make -k -j60 check >> RUNTESTFLAGS="--target_board=unix'{-m64,-m32,-m32/-mpowerpc64}'" >> >> It is fine to not test -m32/-mpowerpc64 so often, and certaionly not >> something I will ask everyone to always do :-) > > IMO, if we add "-mpowerpc64" into dg-options, the "-m32/-mpowerpc64" will be > tested > automatically. It will increase the test coverage. So the concern is it > increases test > time? Do you mean for the powerpc-target tests, or for the entire suite? - 50% increase is pretty significant to the entire suite. It would probably only be acceptable on the newer powerful machines. The stategy would need to be a little more subtle than adding it everywhere - it is not applicable to 32b targets running on 32b hardware - however, it *is* potentially applicable to 32b targets running on 64b hardware (I have one of those). There are tricks you can play with dejagnu to automate this for your setup… by pointing the DEJAGNU environment to a script (or by using .dejagnurc in your home directory). You can add customisation in this way, even to set different parameters per host. This saves me from remembering to put the RUNTESTFLAGS=… extras for the target on each test line. — so here’s an example as I use it.. I have an environment/script setup like this (which I use to control what is done across a range of machines with capabilities that vary by at least a factor of 10): DEJAGNU=/path/to/gcc-runtest-site.exp gcc-runtest-site.exp: global target_list global tool_timeout case "$target_triplet" in { { "*-*-darwin[5-8]*" } { set target_list { "unix" } set tool_timeout 240 puts "tool timeout 3mins" } { "*-*-darwin9*" } { set target_list { "unix{-m32,-m64}" } set tool_timeout 240 } { "i?86-*-darwin10*" } { set target_list { "unix" } } { "powerpc-*-darwin10*" } { set target_list { "unix" } } { "*-*-darwin1[01234567]*" } { set target_list { "unix{-m64,-m32}" } set tool_timeout 120 puts "gcc-runtest-site : tool timeout 2 mins" } { "*-*-darwin1[89]*" } { set target_list { "unix" } set tool_timeout 60 puts "gcc-runtest-site : tool timeout 1 min" } { "x86_64-*-linux-gnu" } { set target_list { "unix{-m32,-m64}" } set tool_timeout 60 } default { # Works for most unixy things and for darwin < 9. set target_list { "unix" } } } - sometimes I override this with specific values (especially for timeout) in ~/..dejagnurc on very slow boxes. 0.02GBP only, cheers Iain
Re: [Patch] libgomp/nvptx: Prepare for reverse-offload callback handling
@Alexander/@Tom – Can you comment on both libgomp/config/nvptx + libgomp/plugin/plugin-nvptx.c ? (Comments on the rest are welcome, too) (Updated patch enclosed) Because Jakub asked: I'm afraid you need Alexander or Tom here, I don't feel I can review it; I could rubber stamp it if they are ok with it. Regarding: How did the standardization process for this feature look like, how did it pass if it's not efficiently implementable for the major offloading targets? It doesn't have to be implementable on all major offloading targets, it is enough when it can work on some. As one needs to request the reverse offloading through a declarative directive, it is always possible in that case to just pretend devices that don't support it don't exist. First, I think it is better to provide a feature even if it is slow – than not providing it at all. Secondly, as Jakub mentioned, it is not required that all devices support this feature well. It is sufficient some do. I believe on of the main uses is debugging and for this use, the performance is not critical. This patch attempts to have no overhead if the feature is not used (no 'omp requires reverse_offload' and no actual reverse offload). Additionally, for GCN, it can be implemented with almost no overhead by using the feature used for I/O. (CUDA implements 'printf' internally – but does not permit to piggyback on this feature.) * * * I think in the future, we could additionally pass information to GOMP_target_ext whether a target region is known not to do reverse offload – both by checking what's in the region and by utilizing an 'absent(target)' assumption places in the outer target regsion on an '(begin) assume(s)' directive. That should at least help with the common case of having no reverse offload – even if it does not for some large kernel which does use reverse offload for non-debugging purpose (e.g. to trigger file I/O or inter-node communication). * * * Regarding the implementation: I left in 'usleep(1)' for now – 1µs seems to be not too bad and I have no idea what's better. I also don't have any idea what's the overhead for accessing concurrently accessible managed memory from the host (is it on the host until requested from the device – or is it always on the device and needs to be copied/migrated to the host for every access). Likewise, I don't know how much overhead it is to D2H copy the memory via the second CUDA stream. Suggestions are welcome. But as this code is strictly confined to a single function, it can easily be modified later. Documentation: I have not mentioned caveats in https://gcc.gnu.org/onlinedocs/libgomp/nvptx.html as the reverse offload is not yet enabled, even with this patch. On 26.08.22 11:07, Tobias Burnus wrote: PRE-REMARK As nvptx (and all other plugins) returns <= 0 for GOMP_OFFLOAD_get_num_devices if GOMP_REQUIRES_REVERSE_OFFLOAD is set. This patch is currently still a no op. The patch is almost stand alone, except that it either needs a void *rev_fn_table = NULL; in GOMP_OFFLOAD_load_image or the following patch: [Patch][2/3] nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600348.html (which in turn needs the '[1/3]' patch). Not required to be compilable, but the patch is based on the ideas/code from the reverse-offload ME patch; the latter adds calls to GOMP_target_ext (omp_initial_device, which is for host fallback code processed by the normal target_ext and for device code by the target_ext of this patch. → "[Patch] OpenMP: Support reverse offload (middle end part)" https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598662.html * * * This patch adds initial offloading support for nvptx. When the nvptx's device GOMP_target_ext is called - it creates a lock, fills a struct with the argument pointers (addr, kinds, sizes), its device number and the set the function pointer address. On the host side, the last address is checked - if fn_addr != NULL, it passes all arguments on to the generic (target.c) gomp_target_rev to do the actual offloading. CUDA does lockup when trying to copy data from the currently running stream; hence, a new stream is generated to do the memory copying. Just having managed memory is not enough - it needs to be concurrently accessible - otherwise, it will segfault on the host when migrated to the device. OK for mainline? * * * Future work for nvptx: * Adjust 'sleep', possibly using different values with and without USM and to do shorter sleeps than usleep(1)? * Set a flag whether there is any offload function at all, avoiding to run the more expensive check if there is 'requires reverse_offload' without actual reverse-offloading functions present. (Recall that the '2/3' patch, mentioned above, only has fn != NULL for reverse-offload functions.) * Document → libgomp.texi that reverse offload may cause some performance overhead for all target regions. + That reverse of