Re: [PATCH v2] fix spelling of -linker-output-auto-nolto-rel
Hi Rasmus, The VxWorks part is ok. The plugin part looks good to me as well, now in line with Richard's comment, but is not for me to approve. Thanks, and Thanks Richard for the original review. Cheers, Olivier > On 2 Dec 2021, at 08:22, Rasmus Villemoes wrote: > > The transposition nolto -> notlo is confusing and it makes the long > name even harder to read than it already is - I kept reading it as > "not lo" until I realized it was a simple typo. > > Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) > > lto-plugin/ > * lto-plugin.c: Fix -linker-output-auto-notlo-rel -> > -linker-output-auto-nolto-rel typo. > (process_option): Adjust accordingly, accepting both old and > new spelling. > > gcc/ > * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected > spelling of -linker-output-auto-nolto-rel. > --- > v2: still accept -linker-output-auto-notlo-rel in the plugin. > > Like this? > > gcc/config/vxworks.h| 2 +- > lto-plugin/lto-plugin.c | 6 -- > 2 files changed, 5 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h > index e41f16a51e8..bddf2c37f42 100644 > --- a/gcc/config/vxworks.h > +++ b/gcc/config/vxworks.h > @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx > begin, rtx end); >further incremental LTO linking. We do not do repeated incremental linking >so silence the warning (instead of passing -flinker-output=nolto-rel). */ > #undef LTO_PLUGIN_SPEC > -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}" > +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}" > diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c > index 6ab9822f369..b73483de994 100644 > --- a/lto-plugin/lto-plugin.c > +++ b/lto-plugin/lto-plugin.c > @@ -32,7 +32,7 @@ along with this program; see the file COPYING3. If not see >-nop: Instead of running lto-wrapper, pass the original to the plugin. This >only works if the input files are hybrid. >-linker-output-known: Do not determine linker output > - -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without > + -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without >warning. This is used on systems like VxWorks (kernel) where the link is >always partial and repeated incremental linking is generally not used. >-sym-style={none,win32,underscore|uscore} > @@ -1321,7 +1321,9 @@ process_option (const char *option) > { > if (strcmp (option, "-linker-output-known") == 0) > linker_output_known = true; > - else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0) > + /* Also accept "notlo" for backwards compatibility. */ > + else if ((strcmp (option, "-linker-output-auto-nolto-rel") == 0) > + || (strcmp (option, "-linker-output-auto-notlo-rel") == 0)) > linker_output_auto_nolto_rel = true; > else if (strcmp (option, "-debug") == 0) > debug = true; > -- > 2.31.1 >
[PATCH v2] fix spelling of -linker-output-auto-nolto-rel
The transposition nolto -> notlo is confusing and it makes the long name even harder to read than it already is - I kept reading it as "not lo" until I realized it was a simple typo. Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) lto-plugin/ * lto-plugin.c: Fix -linker-output-auto-notlo-rel -> -linker-output-auto-nolto-rel typo. (process_option): Adjust accordingly, accepting both old and new spelling. gcc/ * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected spelling of -linker-output-auto-nolto-rel. --- v2: still accept -linker-output-auto-notlo-rel in the plugin. Like this? gcc/config/vxworks.h| 2 +- lto-plugin/lto-plugin.c | 6 -- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h index e41f16a51e8..bddf2c37f42 100644 --- a/gcc/config/vxworks.h +++ b/gcc/config/vxworks.h @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx begin, rtx end); further incremental LTO linking. We do not do repeated incremental linking so silence the warning (instead of passing -flinker-output=nolto-rel). */ #undef LTO_PLUGIN_SPEC -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}" +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}" diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c index 6ab9822f369..b73483de994 100644 --- a/lto-plugin/lto-plugin.c +++ b/lto-plugin/lto-plugin.c @@ -32,7 +32,7 @@ along with this program; see the file COPYING3. If not see -nop: Instead of running lto-wrapper, pass the original to the plugin. This only works if the input files are hybrid. -linker-output-known: Do not determine linker output - -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without + -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without warning. This is used on systems like VxWorks (kernel) where the link is always partial and repeated incremental linking is generally not used. -sym-style={none,win32,underscore|uscore} @@ -1321,7 +1321,9 @@ process_option (const char *option) { if (strcmp (option, "-linker-output-known") == 0) linker_output_known = true; - else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0) + /* Also accept "notlo" for backwards compatibility. */ + else if ((strcmp (option, "-linker-output-auto-nolto-rel") == 0) + || (strcmp (option, "-linker-output-auto-notlo-rel") == 0)) linker_output_auto_nolto_rel = true; else if (strcmp (option, "-debug") == 0) debug = true; -- 2.31.1
Re: [PATCH] c++: Handle auto(x) in parameter-declaration-clause [PR103401]
On 12/1/21 10:16, Marek Polacek wrote: In C++23, auto(x) is valid, so decltype(auto(x)) should also be valid, so void f(decltype(auto(0))); should be just as void f(int); but currently, everytime we see 'auto' in a parameter-declaration-clause, we try to synthesize_implicit_template_parm for it, creating a new template parameter list. The code above actually has us calling s_i_t_p twice; once from cp_parser_decltype_expr -> cp_parser_postfix_expression which fails and then again from cp_parser_decltype_expr -> cp_parser_expression. So it looks like we have f and we accept ill-formed code. So we need to be more careful about synthesizing the implicit template parameter. cp_parser_postfix_expression looked like a sensible place. Does this cover other uses of auto in decltype, such as void f(decltype(new auto{0})); ? Should we adjust this flag in cp_parser_decltype along with all the other flags? The r11-1913 change is OK: we need to make sure that we see '(auto)' after decltype to go ahead with 'decltype(auto)'. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/103401 gcc/cp/ChangeLog: * parser.c (cp_parser_postfix_expression): Set auto_is_implicit_function_template_parm_p when parsing a postfix expression. gcc/testsuite/ChangeLog: * g++.dg/cpp23/auto-fncast7.C: New test. * g++.dg/cpp23/auto-fncast8.C: New test. --- gcc/cp/parser.c | 2 ++ gcc/testsuite/g++.dg/cpp23/auto-fncast7.C | 9 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C | 28 +++ 3 files changed, 39 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast7.C create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 55e6a1a8b3a..c43b180f888 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -7508,6 +7508,8 @@ cp_parser_postfix_expression (cp_parser *parser, bool address_p, bool cast_p, looking at a functional cast. We could also be looking at an id-expression. So, we try the functional cast, and if that doesn't work we fall back to the primary-expression. */ + auto cleanup = make_temp_override + (parser->auto_is_implicit_function_template_parm_p, false); cp_parser_parse_tentatively (parser); /* Look for the simple-type-specifier. */ ++parser->prevent_constrained_type_specifiers; diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C new file mode 100644 index 000..763164f3e5b --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C @@ -0,0 +1,9 @@ +// PR c++/103401 +// { dg-do compile { target c++23 } } + +void f(decltype(auto(0))) { } + +int main() +{ + f(0); // { dg-error "no matching function" } +} diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C new file mode 100644 index 000..760827a5d6e --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C @@ -0,0 +1,28 @@ +// PR c++/103401 +// { dg-do compile { target c++23 } } + +void f1 (decltype(auto(0))); +void f2 (decltype(auto{0})); +void f3 (int = auto(42)); +void f4 (int = auto{42}); +void f5 (decltype(auto(0)) = auto(42)); +void f6 (auto (x)); +void f7 (int[auto(10)]); +void f8 (int[auto{10}]); + +void +g () +{ + f1 (1); + f2 (1); + f3 (); + f3 (1); + f4 (); + f4 (1); + f5 (); + f5 (1); + f6 ('a'); + int a[10]; + f7 (&a[0]); + f8 (&a[0]); +} base-commit: e5440bc08e07fd491dcccd47e1b86a5985ee117c
Re: [PATCH] c++: ICE with unnamed tparm and concept [PR103408]
On 12/1/21 10:16, Marek Polacek wrote: Here we crash when issuing the "constraint C has type T, not bool" error, because pp_cxx_parameter_mapping wasn't prepared to see an anonymous template parameter. With this patch we print error: constraint 'auto() [with = 0]' has type '', not 'bool' The "" is what this patch adds. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? OK. PR c++/103408 gcc/cp/ChangeLog: * cxx-pretty-print.c (pp_cxx_parameter_mapping): Print "" rather than crash on an unnamed template parameter. gcc/testsuite/ChangeLog: * g++.dg/cpp23/concepts-err1.C: New test. --- gcc/cp/cxx-pretty-print.c | 4 +++- gcc/testsuite/g++.dg/cpp23/concepts-err1.C | 6 ++ 2 files changed, 9 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp23/concepts-err1.C diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c index 25cabfee39f..3ea357deb80 100644 --- a/gcc/cp/cxx-pretty-print.c +++ b/gcc/cp/cxx-pretty-print.c @@ -2891,8 +2891,10 @@ pp_cxx_parameter_mapping (cxx_pretty_printer *pp, tree map) if (TYPE_P (parm)) pp->type_id (parm); + else if (tree name = DECL_NAME (TEMPLATE_PARM_DECL (parm))) + pp_cxx_tree_identifier (pp, name); else - pp_cxx_tree_identifier (pp, DECL_NAME (TEMPLATE_PARM_DECL (parm))); + pp->translate_string (""); pp_cxx_whitespace (pp); pp_equal (pp); diff --git a/gcc/testsuite/g++.dg/cpp23/concepts-err1.C b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C new file mode 100644 index 000..e5bdc542bad --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C @@ -0,0 +1,6 @@ +// PR c++/103408 +// { dg-do compile { target c++23 } } + +template +concept C = auto([]{}); // { dg-error "constraint" } +static_assert(C<0>); // { dg-error "non-constant condition for static assertion" } base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca
Re: [PATCH] c++: Fix for decltype(auto) and parenthesized expr [PR103403]
On 12/1/21 10:16, Marek Polacek wrote: In r11-4758, I tried to fix this problem: int &&i = 0; decltype(auto) j = i; // should behave like int &&j = i; error wherein do_auto_deduction was getting confused with a REFERENCE_REF_P and it didn't realize its operand was a name, not an expression, and deduced the wrong type. Unfortunately that fix broke this: int&& r = 1; decltype(auto) rr = (r); where 'rr' should be 'int &' since '(r)' is an expression, not a name. But because I stripped the INDIRECT_REF with the r11-4758 change, we deduced 'rr's type as if decltype had gotten a name, resulting in 'int &&'. I suspect I thought that the REF_PARENTHESIZED_P check when setting 'bool id' in do_auto_deduction would handle the (r) case, but that's not the case; while the documentation for REF_PARENTHESIZED_P specifically says it can be set in INDIRECT_REF, we don't actually do so. This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that do_auto_deduction can use it. It also removes code in maybe_undo_parenthesized_ref that I think is dead -- and we don't hit it while running dg.exp. To adduce more data, it also looks dead here: https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html Agreed, that code is dead since r9-1417. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 11? PR c++/103403 gcc/cp/ChangeLog: * cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref doesn't change its argument. * pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees. Also REF_PARENTHESIZED_P for REFERENCE_REF_P. * semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on REFERENCE_REF_P trees too. (maybe_undo_parenthesized_ref): Remove dead code. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/decltype-auto2.C: New test. * g++.dg/cpp1y/decltype-auto3.C: New test. --- gcc/cp/cp-gimplify.c| 3 ++- gcc/cp/pt.c | 5 ++--- gcc/cp/semantics.c | 18 -- gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C | 12 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C | 12 5 files changed, 32 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c index 0a002db14e7..e3ede02a48e 100644 --- a/gcc/cp/cp-gimplify.c +++ b/gcc/cp/cp-gimplify.c @@ -2421,7 +2421,8 @@ cp_fold (tree x) if (REF_PARENTHESIZED_P (x)) { tree p = maybe_undo_parenthesized_ref (x); - return cp_fold (p); + if (p != x) + return cp_fold (p); } goto unary; diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f4b9d9673fb..c5b41b57028 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -29889,11 +29889,10 @@ do_auto_deduction (tree type, tree init, tree auto_node, else if (AUTO_IS_DECLTYPE (auto_node)) { tree stripped_init = tree_strip_any_location_wrapper (init); - if (REFERENCE_REF_P (stripped_init)) - stripped_init = TREE_OPERAND (stripped_init, 0); bool id = (DECL_P (stripped_init) || ((TREE_CODE (init) == COMPONENT_REF - || TREE_CODE (init) == SCOPE_REF) + || TREE_CODE (init) == SCOPE_REF + || REFERENCE_REF_P (init)) && !REF_PARENTHESIZED_P (init))); This change seems wrong; not all references are id-expressions or member accesses. For instance, a call to a function returning a reference is not. I think we still want to depend on the TREE_CODE, but we need to look at the TREE_CODE of stripped_init, not init. tree deduced = finish_decltype_type (init, id, complain); deduced = canonicalize_type_argument (deduced, complain); diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index cd1956497f8..edba4b60e10 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -2049,7 +2049,8 @@ force_paren_expr (tree expr, bool even_uneval) return expr; if (TREE_CODE (expr) == COMPONENT_REF - || TREE_CODE (expr) == SCOPE_REF) + || TREE_CODE (expr) == SCOPE_REF + || REFERENCE_REF_P (expr)) REF_PARENTHESIZED_P (expr) = true; else if (DECL_P (tree_strip_any_location_wrapper (expr))) { @@ -2072,19 +2073,8 @@ maybe_undo_parenthesized_ref (tree t) if (cxx_dialect < cxx14) return t; - if (INDIRECT_REF_P (t) && REF_PARENTHESIZED_P (t)) -{ - t = TREE_OPERAND (t, 0); - while (TREE_CODE (t) == NON_LVALUE_EXPR -|| TREE_CODE (t) == NOP_EXPR) - t = TREE_OPERAND (t, 0); - - gcc_assert (TREE_CODE (t) == ADDR_EXPR - || TREE_CODE (t) == STATIC_CAST_EXPR); - t = TREE_OPERAND (t, 0); -} - else if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT
Re: [PATCH] c++: Fix bogus error with __integer_pack [PR94490]
On 12/1/21 12:22, Marek Polacek wrote: Here we issue a bogus: error: '(0 ? fake_tuple_size_v : fake_tuple_size_v)' is not a constant expression because cxx_constant_value in expand_integer_pack gets *(0 ? VIEW_CONVERT_EXPR(fake_tuple_size_v) : VIEW_CONVERT_EXPR(fake_tuple_size_v)) which is a REFERENCE_REF_P and we evaluate its operand to 3, so we end up with *3 and that fails. Sounds like we need to get rid of the REFERENCE_REF_P then. That is what tsubst_copy_and_build/INDIRECT_REF will do: if (REFERENCE_REF_P (t)) { /* A type conversion to reference type will be enclosed in such an indirect ref, but the substitution of the cast will have also added such an indirect ref. */ r = convert_from_reference (r); } so I think it's reasonable to call instantiate_non_dependent_expr_sfinae. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11? OK. PR c++/94490 gcc/cp/ChangeLog: * pt.c (expand_integer_pack): Call instantiate_non_dependent_expr_sfinae. gcc/testsuite/ChangeLog: * g++.dg/ext/integer-pack5.C: New test. --- gcc/cp/pt.c | 1 + gcc/testsuite/g++.dg/ext/integer-pack5.C | 29 2 files changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack5.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f4b9d9673fb..6b560952639 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -3792,6 +3792,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t complain, } else { + hi = instantiate_non_dependent_expr_sfinae (hi, complain); hi = cxx_constant_value (hi); int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1; diff --git a/gcc/testsuite/g++.dg/ext/integer-pack5.C b/gcc/testsuite/g++.dg/ext/integer-pack5.C new file mode 100644 index 000..84938649d31 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/integer-pack5.C @@ -0,0 +1,29 @@ +// PR c++/94490 +// { dg-do compile { target c++14 } } + +template +constexpr int fake_tuple_size_v = 3; +template struct intseq {}; + +// So that it compiles with clang++. +#if __has_builtin(__make_integer_seq) +using size_t = decltype(sizeof(1)); +template +using _IdxTuple = intseq<_Indices...>; + +template using genseq = __make_integer_seq<_IdxTuple, size_t, N>; +#else +template using genseq = intseq<__integer_pack(N)...>; +#endif + +template> +struct arith_result +{ }; + +template +auto Mul(const T&) +{ +return [](auto) { return arith_result> { }; }(0); +} + +auto x = Mul(0); base-commit: 54ebec35abec09a24b47b997172622ca0d8e2318
Re: [PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]
Hi Haochen, on 2021/12/1 下午5:01, HAO CHEN GUI via Gcc-patches wrote: > Hi, > This patch modifies the combine pattern with a helper - > change_pseudo_and_mask when recog fails. > The helper converts a single pseudo to the pseudo AND with a mask if the > outer operator is IOR/XOR/PLUS > and the inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match > shift + ior pattern. > Thanks for improving this! I would expect this patch can make us remove the below existing splitting, which was added for a similar purpose but only works for more insns combination. (define_split [(set (match_operand:GPR 0 "gpc_reg_operand") (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand") (match_operand:SI 2 "const_int_operand")) (match_operand:GPR 3 "gpc_reg_operand")))] "nonzero_bits (operands[3], mode) < HOST_WIDE_INT_1U << INTVAL (operands[2])" [(set (match_dup 0) (ior:GPR (and:GPR (match_dup 3) (match_dup 4)) (ashift:GPR (match_dup 1) (match_dup 2] { operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1); }) As the description, this seems to check some special operations and mainly for those targets having Power style rotate* operations? If the understanding is correct, not sure it's a good idea to make this under a target hook/macro? > Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. > Is this okay for trunk? > Any recommendations? Thanks a lot. > > ChangeLog > 2021-12-01 Haochen Gui > > gcc/ > * combine.c (change_pseudo_and_mask): New. > (recog_for_combine): If recog fails, try again with the pattern > modified by change_pseudo_and_mask. > > gcc/testsuite/ > * gcc.target/powerpc/20050603-3.c: Modify dump check conditions. > * gcc.target/powerpc/rlwimi-2.c: Likewise. > > > patch.diff > diff --git a/gcc/combine.c b/gcc/combine.c > index 03e9a780919..82ee3b2e9db 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -11539,6 +11539,37 @@ change_zero_ext (rtx pat) >return changed; > } > > +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is > + ASHIFT/LSHIFTRT/AND, convert a pseudo to psuedo AND with a mask if its > + nonzero_bits is less than its mode mask. The nonzero_bits in other pass > + doesn't return the same value as it does in combine pass. */ > +static bool > +change_pseudo_and_mask (rtx pat) > +{ > + bool changed = false; Maybe we can remove this variable ... > + > + rtx src = SET_SRC (pat); > + if ((GET_CODE (src) == IOR > + || GET_CODE (src) == XOR > + || GET_CODE (src) == PLUS) > + && (((GET_CODE (XEXP (src, 0)) == ASHIFT > + || GET_CODE (XEXP (src, 0)) == LSHIFTRT > + || GET_CODE (XEXP (src, 0)) == AND) > +&& REG_P (XEXP (src, 1) > +{ > + rtx *reg = &XEXP (SET_SRC (pat), 1); > + machine_mode mode = GET_MODE (*reg); > + unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode); > + if (nonzero < GET_MODE_MASK (mode)) > + { > + rtx x = gen_rtx_AND (mode, *reg, GEN_INT (nonzero)); > + SUBST (*reg, x); > + changed = true; ..., directly return true here ... > + } > + } > + return changed; and return false here. > +} > + > /* Like recog, but we receive the address of a pointer to a new pattern. > We try to match the rtx that the pointer points to. > If that fails, we may try to modify or replace the pattern, > @@ -11586,7 +11617,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx > *pnotes) > } > } >else > - changed = change_zero_ext (pat); > + { > + if (change_pseudo_and_mask (pat)) > + { > + maybe_swap_commutative_operands (SET_SRC (pat)); maybe we can call maybe_swap_commutative_operands directly in function change_pseudo_and_mask and make the code here compact. BR, Kewen > + changed = true; > + } > + changed |= change_zero_ext (pat); > + } > } >else if (GET_CODE (pat) == PARALLEL) > { > diff --git a/gcc/testsuite/gcc.target/powerpc/20050603-3.c > b/gcc/testsuite/gcc.target/powerpc/20050603-3.c > index 4017d34f429..e628be11532 100644 > --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c > +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c > @@ -12,7 +12,7 @@ void rotins (unsigned int x) >b.y = (x<<12) | (x>>20); > } > > -/* { dg-final { scan-assembler-not {\mrlwinm} } } */ > +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */ > /* { dg-final { scan-assembler-not {\mrldic} } } */ > /* { dg-final { scan-assembler-not {\mrot[lr]} } } */ > /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c > b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c > index bafa371db73..ffb5f9e450f 100644 > --- a/
[PR103149] introduce asmnesia internal function
This is now used by -fharden-conditional-branches and -fharden-compares to detach the values used in redundant hardening compares from their original locations. It eventually still expands to an asm statement, as before, but the temporary is only introduced at expand time, preventing gimple optimizations from diverging input and output, which may prevent their unification in asm match operands. Additional logic to check asm operands matches, and to keep them matchable, which may improve some cases of "+m" and "+X" in asm that are currently rejected because the single operand, split into input and output, ends up diverging in ways that can't be merged back by reload. While investigating regressions hit by an earlier iteration of this patch, I came across pr93027.c, and noticed that, with optimization (unlike the test), reload overwrites one of the inputs with another. This preexisted this patch, and I have not attempted to fix it. The changes above paved the way to fix PR103149: a vectorized loop ends up using vector booleans on aarch64, and the hardening pass attempts to "copy-and-forget" them with an asm statement (thus ASMNESIA) for the redundant hardening compare. The problem there is that vector booleans, on the affected platform, can't go in GENERAL_REGS, but the asm statement we used to emit could only use those registers or memory. The work-around introduced in the newly-introduced ASMNESIA expander is to check whether the mode is suitable for GENERAL_REGS, and force the asm operands to use MEM otherwise. This is not ideal, but functionality to map a mode to a constraint string that selects a suitable register class doesn't seem readily available. Maybe in a future patch... ASMNESIA currently works only with gimple registers of types with scalar modes. I have not attempted to make it more general than that, though I envision future uses that might end up requiring it. I'll cross that bridge if/when I get to it. Regstrapped on x86_64-linux-gnu. Also bootstrapped along with a patch that enables both compare hardening passes. Built crosses to aarch64- and riscv64-, and verified that the mentioned aarch64 PR is fixed. Ok to install? (Manual inspection of the asm output for riscv PR103302 suggests that the reported problem may be gone with the given testcase, but I have not built enough of the target environment to enable me to run that.) for gcc/ChangeLog PR middle-end/103149 * cfgexpand.c (expand_asm_stmt): Pass original input constraint, and constraint array, to asm_operand_ok. * gimple-hander-conditionals.cc (detach_value): Call ASMNESIA internal function. * internal-fn.c (expand_ASMNESIA): New. * internal-fn.def (ASMNESIA): New. * recog.c (asm_operand_ok): Turn into wrapper of... (asm_operand_ok_match): ... renamed. Skip non-constraint characters faster. Don't require register_operand for matched operands. Combine results with... (check_match): New. (check_asm_operands): Enable match checking. for gcc/testsuite/ChangeLog PR middle-end/103149 * gcc.target/aarch64/pr103149.c: New. * gcc.target/i386/pr85030.c: Also accept impossible constraint error. --- gcc/cfgexpand.c |6 + gcc/gimple-harden-conditionals.cc | 17 gcc/internal-fn.c | 81 +++ gcc/internal-fn.def |4 + gcc/recog.c | 114 +++ gcc/testsuite/gcc.target/aarch64/pr103149.c | 13 +++ gcc/testsuite/gcc.target/i386/pr85030.c |2 7 files changed, 200 insertions(+), 37 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index fb84d469f1e77..f0272ba1ac9ba 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -3374,10 +3374,10 @@ expand_asm_stmt (gasm *stmt) tree val = input_tvec[i]; tree type = TREE_TYPE (val); bool allows_reg, allows_mem, ok; - const char *constraint; + const char *constraint, *orig_constraint; rtx op; - constraint = constraints[i + noutputs]; + orig_constraint = constraint = constraints[i + noutputs]; ok = parse_input_constraint (&constraint, i, ninputs, noutputs, 0, constraints.address (), &allows_mem, &allows_reg); @@ -3397,7 +3397,7 @@ expand_asm_stmt (gasm *stmt) else if (MEM_P (op)) op = validize_mem (op); - if (asm_operand_ok (op, constraint, NULL) <= 0) + if (asm_operand_ok (op, orig_constraint, constraints.address ()) <= 0) { if (allows_reg && TYPE_MODE (type) != BLKmode) op = force_reg (TYPE_MODE (type), op); diff --git a/gcc/gimple-harden-conditionals.cc b/gcc/gimple-harden-conditionals.cc index cfa2361d65be0..3768
Re: [PATCH] i386: Improve V8HI and V8HF inserts [PR102811]
On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak wrote: > > Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar > element 0 inserts to from a GP register, SSE register or memory. Also > add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is > split after reload to a sequence of PBROADCASTW and PBLENDW. > > The V8HF inserts from memory improve from: > > - vpbroadcastw4(%esp), %xmm1 > - vpblendw$16, %xmm1, %xmm0, %xmm0 > + vpinsrw $4, 4(%esp), %xmm0, %xmm0 > > and V8HF inserts from SSE register to element 0 improve from: > > vpxor %xmm2, %xmm2, %xmm2 > - vpbroadcastw%xmm0, %xmm0 > vpblendw$1, %xmm0, %xmm2, %xmm0 > > Based on the above improvements, the register allocator is able to determine > the optimal instruction (or instruction sequence) based on the register set > of the input value, so there is no need to manually expand V8HI and V8HF > inserts to the sequence of VEC_DUPLICATE and VEC_MERGE RTXes. > Thanks. > 2021-12-01 Uroš Bizjak > > gcc/ChangeLog: > > PR target/102811 > * config/i386/sse.md (VI2F): Remove mode iterator. > (VI2F_256_512): New mode iterator. > (vec_set_0): New insn pattern. > (vec_set_0>): Rename from vec_setmode. > Use VI2F_256_512 mode iterator instead of VI2F. > (*axv512fp16_movsh): Remove. > (_pinsr): Add (x,x,x) AVX2 alternative. > Do not disable V8HF mode insn on AVX2 targets. > (pinsrw -> pbroadcast + pblendw peephole2): New peephole. > (pinsrw -> pbroadcast + pblendw splitter): New post-reload splitter. > * config/i386/i386.md (extendhfsf): Call gen_vec_setv8hf_0. > * config/i386/i386-expand.c (ix86_expand_vector_set) > : Use vec_merge path for TARGET_AVX2. > > gcc/testsuite/ChangeLog: > > PR target/102881 > * gcc.target/i386/pr102811-1.c: New test. > * gcc.target/i386/avx512fp16-1c.c (dg-final): Update > scan-assembler-times scan strings for ia32 targets. > * gcc.target/i386/pr102327-1.c (dg-final): Ditto. > * gcc.target/i386/pr102811.c: Rename from ... > * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: ... this. > > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and > w/o -mf16c. > > Pushed to master. > > Uros. -- BR, Hongtao
[committed] libphobos: Add missing ControlState variable for AArch64
Hi, This patch fixes a typo that occurred during the splitting of the std.math module into a package. Bootstrapped and regression tested on aarch64-linux-gnu, and committed to mainline. Regards, Iain. --- libphobos/ChangeLog: * src/std/math/hardware.d (FloatingPointControl.getControlState): Add missing ControlState variable for AArch64. --- libphobos/src/std/math/hardware.d | 1 + 1 file changed, 1 insertion(+) diff --git a/libphobos/src/std/math/hardware.d b/libphobos/src/std/math/hardware.d index 90bc96df148..b768969d4cf 100644 --- a/libphobos/src/std/math/hardware.d +++ b/libphobos/src/std/math/hardware.d @@ -912,6 +912,7 @@ private: } else version (AArch64) { +ControlState cont; asm pure nothrow @nogc { "mrs %0, FPCR;" : "=r" (cont); -- 2.30.2
[committed] d: Disable the D runtime garbage collector after initializing (PR103520)
Hi, This patch disables the D runtime garbage collector after initializing. Not all targets that support building libdruntime have a stable garbage collector, so to avoid running into problems where live memory allocated by the D GC is freed, disable all in-flight collections until a time when scanning is more reliably implemented everywhere. Bootstrapped and regression tested on aarch64-linux-gnu, and committed to mainline. Regards, Iain. --- gcc/d/ChangeLog: PR d/103520 * d-frontend.h (gc_disable): Declare. * d-lang.cc (d_init_options): Disable the D runtime garbage collector after initializing. --- gcc/d/d-frontend.h | 2 +- gcc/d/d-lang.cc| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/d/d-frontend.h b/gcc/d/d-frontend.h index 3edf812212a..e7695d5718b 100644 --- a/gcc/d/d-frontend.h +++ b/gcc/d/d-frontend.h @@ -21,7 +21,7 @@ along with GCC; see the file COPYING3. If not see /* These functions are defined in D runtime. */ extern "C" int rt_init (void); extern "C" int rt_term (void); -//extern "C" void gc_disable (void); +extern "C" void gc_disable (void); extern "C" void *gc_malloc (size_t sz, unsigned ba = 0, const void *ti = NULL); extern "C" void gc_free (void *); extern "C" void gc_collect (void); diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc index d20370e5d8a..dbf7a8b60b9 100644 --- a/gcc/d/d-lang.cc +++ b/gcc/d/d-lang.cc @@ -288,7 +288,7 @@ d_init_options (unsigned int, cl_decoded_option *decoded_options) { /* Initialize the D runtime. */ rt_init (); -// gc_disable (); + gc_disable (); /* Set default values. */ global._init (); -- 2.30.2
[committed] d: Prefix object files from the root package with 'root-'
Hi, This patch prefixes D object files from the root package with 'root-'. None of the front-end module names in either the dmd or root package collide just yet, but that does not mean they won't in the future. Bootstrapped and regression tested on x86_64-linux-gnu, and committed to mainline. Regards, Iain. --- gcc/d/ChangeLog: * Make-lang.in (D_FRONTEND_OBJS): Prefix object files from the root package with root-. (d/root-%.o): New recipe. --- gcc/d/Make-lang.in | 34 +- 1 file changed, 17 insertions(+), 17 deletions(-) diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in index 4c0a0321eba..4ce11e3cada 100644 --- a/gcc/d/Make-lang.in +++ b/gcc/d/Make-lang.in @@ -74,19 +74,16 @@ endif # D Frontend object files. D_FRONTEND_OBJS = \ - d/aav.o \ d/access.o \ d/aggregate.o \ d/aliasthis.o \ d/apply.o \ - d/array.o \ d/arrayop.o \ d/arraytypes.o \ d/attrib.o \ d/ast_node.o \ d/astcodegen.o \ d/astenums.o \ - d/bitarray.o \ d/blockexit.o \ d/builtin.o \ d/canthrow.o \ @@ -99,7 +96,6 @@ D_FRONTEND_OBJS = \ d/cparse.o \ d/cppmangle.o \ d/ctfeexpr.o \ - d/ctfloat.o \ d/ctorflow.o \ d/dcast.o \ d/dclass.o \ @@ -124,13 +120,10 @@ D_FRONTEND_OBJS = \ d/escape.o \ d/expression.o \ d/expressionsem.o \ - d/file.o \ - d/filename.o \ d/foreachvar.o \ d/func.o \ d/globals.o \ d/gluelayer.o \ - d/hash.o \ d/hdrgen.o \ d/iasm.o \ d/iasmgcc.o \ @@ -145,7 +138,6 @@ D_FRONTEND_OBJS = \ d/json.o \ d/lambdacomp.o \ d/lexer.o \ - d/longdouble.o \ d/mtype.o \ d/nogc.o \ d/nspace.o \ @@ -153,29 +145,37 @@ D_FRONTEND_OBJS = \ d/objc.o \ d/opover.o \ d/optimize.o \ - d/outbuffer.o \ d/parse.o \ d/parsetimevisitor.o \ d/permissivevisitor.o \ - d/port.o \ d/printast.o \ - d/region.o \ - d/rmem.o \ - d/rootobject.o \ + d/root-aav.o \ + d/root-array.o \ + d/root-bitarray.o \ + d/root-ctfloat.o \ + d/root-file.o \ + d/root-filename.o \ + d/root-hash.o \ + d/root-longdouble.o \ + d/root-outbuffer.o \ + d/root-port.o \ + d/root-region.o \ + d/root-rmem.o \ + d/root-rootobject.o \ + d/root-speller.o \ + d/root-string.o \ + d/root-stringtable.o \ d/safe.o \ d/sapply.o \ d/semantic2.o \ d/semantic3.o \ d/sideeffect.o \ - d/speller.o \ d/statement.o \ d/statement_rewrite_walker.o \ d/statementsem.o \ d/staticassert.o \ d/staticcond.o \ d/stmtstate.o \ - d/string.o \ - d/stringtable.o \ d/target.o \ d/templateparamsem.o \ d/tokens.o \ @@ -393,6 +393,6 @@ d/%.o: d/dmd/%.d $(DCOMPILE) $(D_INCLUDES) $< $(DPOSTCOMPILE) -d/%.o: d/dmd/root/%.d +d/root-%.o: d/dmd/root/%.d $(DCOMPILE) $(D_INCLUDES) $< $(DPOSTCOMPILE) -- 2.30.2
[PATCH v2] c++: ICE with auto(0) in requires-expression [PR103408]
On Wed, Dec 01, 2021 at 12:17:47PM -0500, Patrick Palka wrote: > On Wed, Dec 1, 2021 at 10:26 AM Marek Polacek via Gcc-patches > wrote: > > > > Here we ICE on > > > > int f() requires (auto(0)); > > > > in do_auto_deduction when handling the auto: we're in a non-templated > > requires-expression which are parsed under processing_template_decl == 1 > > and empty current_template_parms, so 'current_template_args ()' will > > crash. This code is invalid as per "C++20 CA378: Remove non-templated > > constrained functions", but of course we shouldn't crash. > > FWIW it looks like we can trip over the same bug with valid code: > > static_assert(requires { auto(0); }); Right, I hadn't thought of this at all. So it's not just an ICE-on-invalid! > > Since in the scenario above it's expected that current_template_parms is > > null, I've just added a check, and let grokfndecl issue an error. > > I guess another approch would be to fake up a template parameter list > > before calling do_auto_deduction. > > > > For good measure, I've added several well-formed cases with auto(x) in > > a requires-expression. > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > PR c++/103408 > > > > gcc/cp/ChangeLog: > > > > * pt.c (do_auto_deduction): Check current_template_parms before > > current_template_args (). > > > > gcc/testsuite/ChangeLog: > > > > * g++.dg/cpp23/auto-fncast9.C: New test. > > --- > > gcc/cp/pt.c | 2 +- > > gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++ > > 2 files changed, 28 insertions(+), 1 deletion(-) > > create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C > > > > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c > > index f4b9d9673fb..012ca5d06c0 100644 > > --- a/gcc/cp/pt.c > > +++ b/gcc/cp/pt.c > > @@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree > > auto_node, > > (but we still may have used them for constraint checking above). > > */; > >else if (context == adc_unify) > > targs = add_to_template_args (outer_targs, targs); > > - else if (processing_template_decl) > > + else if (processing_template_decl && current_template_parms) > > targs = add_to_template_args (current_template_args (), targs); > >return tsubst (type, targs, complain, NULL_TREE); > > Won't this mean the call to tsubst here will end up lowering the level > of the auto from 2 to 1 rather than replacing it with the actual > deduced type? Yes, exactly -- targs has depth 1 but the level of auto is 2. :( > It also looks like this approach doesn't handle static_assert(requires > { auto(auto(0)); }), probably due to this substitution issue. I guess > we could add a dummy level to 'targs' to work around this.. Thanks for pointing this out. Here's a v2 which uses a dummy level. The description hopefully explains what's going on here... Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Here we ICE on int f() requires (auto(0)); in do_auto_deduction when handling the auto: we're in a non-templated requires-expression which are parsed under processing_template_decl == 1 and empty current_template_parms, so 'current_template_args ()' will crash. This code is invalid as per "C++20 CA378: Remove non-templated constrained functions", but of course we shouldn't crash. However, as Patrick pointed out, this is valid: static_assert(requires { auto(0); }); // #1 static_assert(requires { auto(auto(0)); }); // #2 To make #1 and #2 work, I'm playing games with faking up a dummy template parameter level to current_template_parms, which need elaborating a bit. autos are created with the level processing_template_decl + 1. When we call do_auto_deduction for the inner auto(0) in #2, the auto will therefore have level 2. type_unification_real in do_auto_deduction deduces targs to (TMPL_ARGS_DEPTH = 1), so the return tsubst (type, targs, complain, NULL_TREE); will only reduce the level of the auto to 1. So it remains undeduced, causing the "invalid use of auto" error. If I create the dummy template parameter level, then this at the end of do_auto_deduction else if (processing_template_decl) targs = add_to_template_args (current_template_args (), targs); transforms into <, int>, which has TMPL_ARGS_DEPTH = 2 and so the following tsubst actually replaces the auto with int, and things work. For good measure, I've added several well-formed cases with auto(x) in a requires-expression. PR c++/103408 gcc/cp/ChangeLog: * cp-tree.h (add_dummy_template_parameter_level): Declare. * pt.c (add_dummy_template_parameter_level): New function. (begin_template_parm_list): Use it. * semantics.c (finish_compound_literal): Call add_dummy_template_parameter_level if there are no template parameters. * typeck2.c (build_functional_cast_1): Likewise. gcc/testsuite/ChangeLog: * g++.dg/cpp23/auto-
[RFC]PR103271 ICE in assign_stack_temp_for_type with -ftrivial-auto-var-init=pattern and VLAs and -mno-strict-align on riscv64
Hi, As I studied, the issue with this bug is similar as PR102285: For the testing case: int qy (void) { int tw = 4; int fb[tw]; return fb[2]; /* { dg-warning "uninitialized" } */ } if compiled with /home/opc/Install/cross/bin/riscv64-unknown-linux-gnu-gcc -c -O1 -mno-strict-align -ftrivial-auto-var-init=pattern -fdump-tree-all t.c **t.c.032t.cpp1: int qy () { unsigned char fb.3[16]; int tw; int[0:D.1494] * fb.1; sizetype D.1494; void * saved_stack.2_16; int _27; : saved_stack.2_16 = __builtin_stack_save (); fb.1_25 = &fb.3; (*fb.1_25) = .DEFERRED_INIT (16, 1, 1); _27 = (*fb.1_25)[2]; fb.3 ={v} {CLOBBER}; __builtin_stack_restore (saved_stack.2_16); return _27; } **t.c.033t.forwprop1: int qy () { unsigned char fb.3[16]; int tw; int[0:D.1494] * fb.1; sizetype D.1494; void * saved_stack.2_16; int _27; : saved_stack.2_16 = __builtin_stack_save (); MEM[(int[0:D.1494] *)&fb.3] = .DEFERRED_INIT (16, 1, 1); _27 = MEM[(int[0:D.1494] *)&fb.3][2]; fb.3 ={v} {CLOBBER}; __builtin_stack_restore (saved_stack.2_16); return _27; } the problem with the above IR for .DEFERRED_INIT expansion is: for MEM[(int[0:D.1494] *)&fb.3] = .DEFERRED_INIT (16, 1, 1) after the “forwprop1” phase, the original address that point to VLA fb.1 is replaced with the the new address that point to fixed array fb.3, however, the TREE_TYPE of the MEM still is kept the old VLA type int[0:D.1984]. Such inconsistency causes the ICE during expanding of call to the above .DEFERRED_INIT. If during the forwprop, when replacing “fb.1” with “&fb.3”, the TREE_TYPE of the MEM_REF can be updated accordingly, we should fix this inconsistency of the IR, and therefore fix the issue in DEFERRED_INIT expansion. So, I think that the fix should be in “forwprop”, and update the TREE_TYPE of the MEM_REF accordingly. What do you think on this? thanks. Qing
[committed] analyzer: fix false leak seen in Juliet 1.3 [PR102471]
Juliet 1.3's CWE415_Double_Free__malloc_free_*_67a.c were showing leak false positives in non-LTO builds; fixed thusly. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-5701-g860c56b5bc356960a4d0445dadc43ceddbe3c7e2. gcc/analyzer/ChangeLog: PR analyzer/102471 * region-model-reachability.cc (reachable_regions::handle_parm): Treat all svalues within a compound parm has reachable, and those wrapped in a cast. gcc/testsuite/ChangeLog: PR analyzer/102471 * gcc.dg/analyzer/leak-3.c: New test. Signed-off-by: David Malcolm --- gcc/analyzer/region-model-reachability.cc | 13 +++ gcc/testsuite/gcc.dg/analyzer/leak-3.c| 41 +++ 2 files changed, 54 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/analyzer/leak-3.c diff --git a/gcc/analyzer/region-model-reachability.cc b/gcc/analyzer/region-model-reachability.cc index b5ae787cac9..f82f7e6556d 100644 --- a/gcc/analyzer/region-model-reachability.cc +++ b/gcc/analyzer/region-model-reachability.cc @@ -258,6 +258,19 @@ reachable_regions::handle_parm (const svalue *sval, tree param_type) const region *pointee_reg = parm_ptr->get_pointee (); add (pointee_reg, is_mutable); } + /* Treat all svalues within a compound_svalue as reachable. */ + if (const compound_svalue *compound_sval + = sval->dyn_cast_compound_svalue ()) +{ + for (compound_svalue::iterator_t iter = compound_sval->begin (); + iter != compound_sval->end (); ++iter) + { + const svalue *iter_sval = (*iter).second; + handle_sval (iter_sval); + } +} + if (const svalue *cast = sval->maybe_undo_cast ()) +handle_sval (cast); } /* Update the store to mark the clusters that were found to be mutable diff --git a/gcc/testsuite/gcc.dg/analyzer/leak-3.c b/gcc/testsuite/gcc.dg/analyzer/leak-3.c new file mode 100644 index 000..d11cc03b36f --- /dev/null +++ b/gcc/testsuite/gcc.dg/analyzer/leak-3.c @@ -0,0 +1,41 @@ +#include + +/* Reduced from Juliet 1.3's CWE415_Double_Free__malloc_free_char_67a.c + goodG2B which was showing a false leak report in a non-LTO build. */ + +struct s1 +{ +char * structFirst; +}; +void external_fn_1(struct s1 myStruct); +void test_1() +{ +char * data; +struct s1 myStruct; +data = (char *)malloc(100*sizeof(char)); +if (data == NULL) + exit(-1); +myStruct.structFirst = data; +external_fn_1(myStruct); +} /* { dg-bogus "leak of 'data'" } */ + +/* As above, but with padding before the field. */ + +struct s2 +{ + void *padding; + char *ptr; +}; +void external_fn_2(struct s2 myStruct); +void test_2() +{ +char * data; +struct s2 myStruct; +data = (char *)malloc(100*sizeof(char)); +if (data == NULL) + exit(-1); +myStruct.padding = NULL; +myStruct.ptr = data; +external_fn_2(myStruct); +} /* { dg-bogus "leak of 'data'" } */ + -- 2.26.3
Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c
On Thu, Nov 18, 2021 at 10:36:52AM -0600, Bill Schmidt wrote: > Hi! This is the last patch broken out of the previous test suite patch > for the new builtins support. Whew :-) > One advantage of the new builtins support is uniform error messages for > arguments with restricted values. Previously this was done in many places > in an ad hoc manner, with little uniformity. This patch adjusts the > expected error messages accordingly. > > All such error messages are now one of the following: > "argument %d must be a %d-bit unsigned literal" > "argument %d must be a literal between %d and %d, inclusive" > "argument %d must be a variable or a literal between %d and %d, inclusive" > "argument %d must be either a literal %d or a literal %d" > > These messages were chosen to require the fewest changes from previous > messages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. In > particular, some messages are improved because previously they did not > admit the possibility that an argument could hold a variable. Same comment as on the previous patch. But, okay for trunk. Thanks! Segher
Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c
Hi! On 12/1/21 4:29 PM, Segher Boessenkool wrote: > On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote: >> Hi! This patch is broken out from the test case patch for the new >> builtins support. >> >> One advantage of the new builtins support is uniform error messages for >> arguments with restricted values. Previously this was done in many places >> in an ad hoc manner, with little uniformity. This patch adjusts the >> expected error messages accordingly. >> >> All error messages are now one of the following: >> "argument %d must be a %d-bit unsigned literal" >> "argument %d must be a literal between %d and %d, inclusive" >> "argument %d must be a variable or a literal between %d and %d, inclusive" >> "argument %d must be either a literal %d or a literal %d" >> >> These messages were chosen to require the fewest changes from previous >> messages while still introducing uniformity. This patch adjusts error >> messages for some cases where this produces changed messages. >> >> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with >> no regressions. is this okay for trunk? > We should have opnly the middle two of those messages. But, okay for > trunk if you put this on some to-do list. Thanks! The last one is actually needed also, because we have at least one case where the two supported values aren't contiguous. We can do without the first one, but that will affect quite a number of test cases, so agree that this should be done later. (We already had a whole lot of tests of this form.) Thanks for the review! Bill > > Segher
Re: [PATCH] rs6000: Builtins test changes for pragma_misc9.c
On Thu, Nov 18, 2021 at 10:18:06AM -0600, Bill Schmidt wrote: > Hi! This patch is broken out from the test suite patch for the new > builtins support. This one is just a minor adjustment for the error > message wording. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) > with no regressions. Is this okay for trunk? Okay. Thanks! Segher
Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c
On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote: > Hi! This patch is broken out from the test case patch for the new > builtins support. > > One advantage of the new builtins support is uniform error messages for > arguments with restricted values. Previously this was done in many places > in an ad hoc manner, with little uniformity. This patch adjusts the > expected error messages accordingly. > > All error messages are now one of the following: > "argument %d must be a %d-bit unsigned literal" > "argument %d must be a literal between %d and %d, inclusive" > "argument %d must be a variable or a literal between %d and %d, inclusive" > "argument %d must be either a literal %d or a literal %d" > > These messages were chosen to require the fewest changes from previous > messages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with > no regressions. is this okay for trunk? We should have opnly the middle two of those messages. But, okay for trunk if you put this on some to-do list. Thanks! Segher
Re: [PATCH] rs6000: Builtins test changes for compare-bytes tests
On Thu, Nov 18, 2021 at 07:47:38AM -0600, Bill Schmidt wrote: > Hi! This patch is broken out from the patch with test suite changes for the > new builtins support. > > With the old builtins support, cmpb-2.c produces: > warning: implicit declaration of function '__builtin_cmpb; did you mean > '__builtin_bcmp'? > > With the new support, it produces: > error: '__builtin_p6_cmpb requires the '-mcpu=power6' option and either the > '-m64' or '-mpowerpc64' option > note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb' I am still not happy with this at all, it is clearly worse than what we had. But, okay for trunk, and hopefully we can fix it before GCC 12 release. Thanks! Segher
[PATCH] i386: Improve V8HI and V8HF inserts [PR102811]
Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar element 0 inserts to from a GP register, SSE register or memory. Also add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is split after reload to a sequence of PBROADCASTW and PBLENDW. The V8HF inserts from memory improve from: - vpbroadcastw4(%esp), %xmm1 - vpblendw$16, %xmm1, %xmm0, %xmm0 + vpinsrw $4, 4(%esp), %xmm0, %xmm0 and V8HF inserts from SSE register to element 0 improve from: vpxor %xmm2, %xmm2, %xmm2 - vpbroadcastw%xmm0, %xmm0 vpblendw$1, %xmm0, %xmm2, %xmm0 Based on the above improvements, the register allocator is able to determine the optimal instruction (or instruction sequence) based on the register set of the input value, so there is no need to manually expand V8HI and V8HF inserts to the sequence of VEC_DUPLICATE and VEC_MERGE RTXes. 2021-12-01 Uroš Bizjak gcc/ChangeLog: PR target/102811 * config/i386/sse.md (VI2F): Remove mode iterator. (VI2F_256_512): New mode iterator. (vec_set_0): New insn pattern. (vec_set_0>): Rename from vec_setmode. Use VI2F_256_512 mode iterator instead of VI2F. (*axv512fp16_movsh): Remove. (_pinsr): Add (x,x,x) AVX2 alternative. Do not disable V8HF mode insn on AVX2 targets. (pinsrw -> pbroadcast + pblendw peephole2): New peephole. (pinsrw -> pbroadcast + pblendw splitter): New post-reload splitter. * config/i386/i386.md (extendhfsf): Call gen_vec_setv8hf_0. * config/i386/i386-expand.c (ix86_expand_vector_set) : Use vec_merge path for TARGET_AVX2. gcc/testsuite/ChangeLog: PR target/102881 * gcc.target/i386/pr102811-1.c: New test. * gcc.target/i386/avx512fp16-1c.c (dg-final): Update scan-assembler-times scan strings for ia32 targets. * gcc.target/i386/pr102327-1.c (dg-final): Ditto. * gcc.target/i386/pr102811.c: Rename from ... * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: ... this. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and w/o -mf16c. Pushed to master. Uros. diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 354a9a713b7..cba2880dc8d 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -16204,18 +16204,8 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx val, int elt) } return; -case E_V8HFmode: - if (TARGET_AVX2) - { - mmode = SImode; - gen_blendm = gen_sse4_1_pblendph; - blendm_const = true; - } - else - use_vec_merge = true; - break; - case E_V8HImode: +case E_V8HFmode: case E_V2HImode: use_vec_merge = TARGET_SSE2; break; diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 4e9fae80479..9d7d1161d15 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -4656,15 +4656,7 @@ rtx tmp = gen_reg_rtx (V8HFmode); rtx zero = force_reg (V8HFmode, CONST0_RTX (V8HFmode)); - if (TARGET_AVX2) - { - rtx dup = gen_reg_rtx (V8HFmode); - emit_move_insn (dup, gen_rtx_VEC_DUPLICATE (V8HFmode, operands[1])); - emit_move_insn (tmp, gen_rtx_VEC_MERGE (V8HFmode, dup, - zero, const1_rtx)); - } - else - emit_insn (gen_sse2_pinsrph (tmp, zero, operands[1], const1_rtx)); + emit_insn (gen_vec_setv8hf_0 (tmp, zero, operands[1])); emit_insn (gen_vcvtph2ps (res, gen_lowpart (V8HImode, tmp))); emit_move_insn (operands[0], gen_lowpart (SFmode, res)); DONE; diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 08bdcddc111..f8b34a15cc6 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -827,7 +827,7 @@ (V32HF "TARGET_AVX512BW")]) ;; Int-float size matches -(define_mode_iterator VI2F [V8HI V16HI V32HI V8HF V16HF V32HF]) +(define_mode_iterator VI2F_256_512 [V16HI V32HI V16HF V32HF]) (define_mode_iterator VI4F_128 [V4SI V4SF]) (define_mode_iterator VI8F_128 [V2DI V2DF]) (define_mode_iterator VI4F_256 [V8SI V8SF]) @@ -10170,13 +10170,84 @@ ] (symbol_ref "true")))]) +(define_insn "vec_set_0" + [(set (match_operand:V8_128 0 "register_operand" + "=v,v,v,x,x,Yr,*x,x,x,x,v,v") + (vec_merge:V8_128 + (vec_duplicate:V8_128 + (match_operand: 2 "nonimmediate_operand" + " r,m,v,r,m,Yr,*x,r,m,x,r,m")) + (match_operand:V8_128 1 "reg_or_0_operand" + " C,C,v,0,0,0 ,0 ,x,x,x,v,v") + (const_int 1)))] + "TARGET_SSE2" + "@ + vmovw\t{%k2, %0|%0, %k2} + vmovw\t{%2, %0|%0, %2} + vmovsh\t{%2, %1, %0|%0, %1, %2} + pinsrw\t{$0, %k2, %0|%0, %k2, 0} + pinsrw\t{$0, %2, %0|%0, %2, 0} + pblendw\t{$1, %2, %0|%0, %2, 1} + pblendw\t{$1, %2, %0|%0, %2, 1} + vpinsrw\t{$0, %k2, %1, %0|%0, %1, %k2, 0} + vpinsrw\t{$0, %2, %1, %0|%0, %1, %2, 0} + vpblendw\t{$1, %2, %
Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems
Hi! On 12/1/21 3:08 PM, Segher Boessenkool wrote: > On Tue, Nov 16, 2021 at 12:56:52PM -0600, Bill Schmidt wrote: >> Hi! I previously posted [1] to correct some problems with the new builtins >> support targeting 32-bit code gen. Based on the discussion, I've made some >> adjustments and would like to submit this for consideration. >> >> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for >> certain >> HTM builtins should be removed. All of the registers TEXASR, TEXASRU, TFHAR, >> and TFIAR are now accessed using the unsigned long data type in all >> configurations. >> gcc/ >> * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit. >> (BPERMD): Flag as 32bit (needing special handling for 32-bit). >> (UNPACK_TD): Return unsigned long long instead of unsigned long. >> (GET_TEXASR): Return unsigned long instead of unsigned long long. >> (GET_TEXASRU): Likewise. >> (GET_TFHAR): Likewise. >> (GET_TFIAR): Likewise. >> (SET_TEXASR): Pass unsigned long instead of unsigned long long. >> (SET_TEXASRU): Likewise. >> (SET_TFHAR): Likewise. >> (SET_TFIAR): Likewise. >> (TABORTDC): Likewise. >> (TABORTDCI): Likewise. >> * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error >> handling for no32bit. Add 32bit handling for RS6000_BIF_BPERMD. >> >> gcc/testsuite/ >> * gcc.target/powerpc/cmpb-3.c: Adjust error message. > Okay for trunk. Thanks! > > > Could you put some short blurb about the changed prototype of the HTM > reg builtins in the release notes please? Thanks x2 :-) Already done! Bill > > Segher
Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests
On Wed, Nov 17, 2021 at 02:58:54PM -0600, Bill Schmidt wrote: > Hi! This patch is broken out of the previous patch for all the builtins test > suite adjustments. Here we have some slight changes in error messages due to > how the internals have changed between the old and new builtins methods. > > For scalar-extract-exp-2.c we change: > error: '__builtin_vec_scalar_extract_exp is not supported in this compiler > configuration' > > to: > error: '__builtin_vsx_scalar_extract_exp' requires the '-mcpu=power9' > option and either the '-m64' or '-mpowerpc64' option > note: builtin '__builtin_vec_scalar_extract_exp' requires builtin > '__builtin_vsx_scalar_extract_exp' So, okay for trunk. Thanks! One thing though: > gcc/testsuite/ > * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error > message. Don't wrap unnecessarily please. > * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise. Segher
Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests
On Thu, Nov 18, 2021 at 03:59:41PM -0600, Bill Schmidt wrote: > On 11/18/21 3:32 PM, Segher Boessenkool wrote: > > On Thu, Nov 18, 2021 at 03:30:48PM -0600, Bill Schmidt wrote: > >> On 11/18/21 3:16 PM, Segher Boessenkool wrote: > >>> On Wed, Nov 17, 2021 at 05:06:05PM -0600, Bill Schmidt wrote: > > I don't like that at all. The user didn't write the _vsx thing, and it > > isn't documented either (neither is the _vec one, but that is a separate > > issue, specific to this builtin). > I feel like I haven't explained this well. This kind of thing has been > in > existence forever even in the old builtins code. The combination of the > error showing the internal builtin name, and the note tying the overload > name to the internal builtin name, has been there all along. The name of > the internal builtin is pretty meaningless. The only thing that's > interesting > in this case is that we previously didn't get this *for this specific > case* > because the old code went to a generic fallback. But in many other cases > you get exactly this same kind of error message for the old code. > >>> Yes. And it still is a regression (in *this* case). > >> Sorry, I don't understand. Why specifically is this a regression? > > It is wrong now, in ways that it wasn't wrong before. That is the > > definition of regression! > > I'm sorry, I disagree. With clarification of the note, I don't see how > this can be considered a regression. It is providing information in a > different way, but it is still clear that the user has misused the builtin > in the context, and, unlike before, it now tells them *what* is wrong with > the options that were used (not just "unavailable in this configuration"). > The fact that an internal builtin name is *also* disclosed as part of > this does not make it wrong. It is claiming something is wrong with something the user didn't write. The blow is softened a lot by the inform(), but it still is a regression. But you said you'll look into it, and it will be okay for now, it isn't like this is super important. Just an ugly wart :-) > The way that overloads work, we can only tell whether a builtin is > enabled with the current set of options by looking at the true builtin > that the overload maps to. The enablement checking code doesn't have > any knowledge that an overloaded function maps to it. So that error > message is produced without knowledge of the overloading. The note > diagnostic is added by the overload processing code that *is* aware > that the mapping exists. Yeah, but you can keep track of what the original name was, somewhere. Hopefully, anyway :-) > The enablement checking code (rs6000_invalid_builtin in the old code, > rs6000_invalid_new_builtin in the new code) is called from multiple > places, and not always in an overload context, so we can't assume > this is the case. Not all builtins are mapped to by overloads, but > they still need enablement checking. A direct call would not get the "this comes from overload" field set, so that is easy to see in the error message code. > Would it be possible to change things so that we pass in the overload > name to be used in the error message when appropriate? Yes. But > this would have a much larger impact on the test suite than the > current method, because all error tests for overloads would now > have to change. That is, there are many existing tests that are > already "wrong" in the sense that they report the internal builtin > name. > > I suggest that we add that to the list of future cleanups due to > the size of the impact. I agree that it has never been ideal to > mention the internal builtin name on these messages. It's just > not unique to the changes I've made here; it's a pre-existing > condition that needs work to cleanse it everywhere. > > Can we move forward this way? We can yes :-) And thanks in advance! It would be nice if we could see this in GCC 12 still. It is fine for stage 4 still, if it is error message only (and a lot of testsuite :-) ) Segher
[committed] wwwdocs: git: Go has a new home
Pushed. --- htdocs/git.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/git.html b/htdocs/git.html index de8f0584..493b5734 100644 --- a/htdocs/git.html +++ b/htdocs/git.html @@ -329,7 +329,7 @@ in Git. gccgo This branch is for the Go front end to gcc. For more information about the Go programming language, -see https://golang.org/";>https://golang.org. The +see https://go.dev/";>go.dev. The branch is maintained by Ian Lance Taylor. Patches should be marked with the tag [gccgo] wwwdocs: in the Subject line. -- 2.34.0
[committed] wwwdocs: gcc-12: Update link to our Bugzilla
On the way tweak language a bit. Pushed. --- htdocs/gcc-12/changes.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index 45a8d99a..000501fb 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -399,7 +399,7 @@ a work-in-progress. has been rewritten to be easier and less error-prone to maintain. Every attempt has been made to ensure that the new behavior matches the old behavior, but inevitably some bugs can be expected. Please report any -problems using https://gcc.gnu.org/bugzilla";>GCC Bugzilla. +problems via https://gcc.gnu.org/bugzilla/";>GCC Bugzilla. The built-in functions __builtin_get_texasr, -- 2.34.0
Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems
On Tue, Nov 16, 2021 at 12:56:52PM -0600, Bill Schmidt wrote: > Hi! I previously posted [1] to correct some problems with the new builtins > support targeting 32-bit code gen. Based on the discussion, I've made some > adjustments and would like to submit this for consideration. > > We eventually agreed that the strange behavior for -m32 -mpowerpc64 for > certain > HTM builtins should be removed. All of the registers TEXASR, TEXASRU, TFHAR, > and TFIAR are now accessed using the unsigned long data type in all > configurations. > gcc/ > * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit. > (BPERMD): Flag as 32bit (needing special handling for 32-bit). > (UNPACK_TD): Return unsigned long long instead of unsigned long. > (GET_TEXASR): Return unsigned long instead of unsigned long long. > (GET_TEXASRU): Likewise. > (GET_TFHAR): Likewise. > (GET_TFIAR): Likewise. > (SET_TEXASR): Pass unsigned long instead of unsigned long long. > (SET_TEXASRU): Likewise. > (SET_TFHAR): Likewise. > (SET_TFIAR): Likewise. > (TABORTDC): Likewise. > (TABORTDCI): Likewise. > * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error > handling for no32bit. Add 32bit handling for RS6000_BIF_BPERMD. > > gcc/testsuite/ > * gcc.target/powerpc/cmpb-3.c: Adjust error message. Okay for trunk. Thanks! Could you put some short blurb about the changed prototype of the HTM reg builtins in the release notes please? Thanks x2 :-) Segher
RE: [PATCH] Final value replacement improvements for until-wrap loops.
Hi Richard, Many thanks for the review. Here's the final version that I've committed, including your suggested improvements, following another make bootstrap and make -k check on x86_64-pc-linux-gnu with no new failures. Thanks again. 2021-12-01 Roger Sayle Richard Biener gcc/ChangeLog * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): Check if simplify_using_initial_conditions allows us to simplify the expression for may_be_zero. * match.pd (X != C ? -X : -C -> -X): New transform. (X != C ? ~X : ~C -> ~X): Likewise. ((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise. gcc/testsuite/ChangeLog * gcc.dg/fold-condneg-1.c: New test case. * gcc.dg/fold-condneg-2.c: New test case. * gcc.dg/fold-condnot-1.c: New test case. * gcc.dg/pr101145-1.c: New test case. * gcc.dg/pr101145-2.c: New test case. Roger -- > -Original Message- > From: Richard Biener > Sent: 30 November 2021 10:00 > To: Roger Sayle > Cc: GCC Patches > Subject: Re: [PATCH] Final value replacement improvements for until-wrap > loops. > > On Mon, Nov 29, 2021 at 10:07 AM Roger Sayle > wrote: > > > > > > This middle-end patch is inspired by Richard Biener's until-wrap loop > > example in PR tree-optimization/101145. > > > > unsigned foo(unsigned val, unsigned start) { > > unsigned cnt = 0; > > for (unsigned i = start; i > val; ++i) > > cnt++; > > return cnt; > > } > > > > For this loop, the tree optimizers currently generate: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > unsigned int _1; > > unsigned int _5; > > > >[local count: 118111600]: > > if (start_3(D) > val_4(D)) > > goto ; [89.00%] > > else > > goto ; [11.00%] > > > >[local count: 105119324]: > > _1 = start_3(D) + 1; > > _5 = -start_3(D); > > cnt_2 = _1 > val_4(D) ? _5 : 1; > > > >[local count: 118111600]: > > # cnt_11 = PHI > > return cnt_11; > > } > > > > or perhaps slightly easier to read: > > > > if (start > val) { > > cnt = (start+1) > val ? -start : 1; > > } else cnt = 0; > > > > In this snippet, if we know start > val, then (start+1) > val unless > > start+1 overflows, i.e. (start+1) == 0 and start == ~0. > > We can use this (loop header) context to simplify the ternary > > expression to "(start != -1) ? -start : 1", which with a little help > > from match.pd can be folded to -start. Hence the optimal final value > > replacement should be: > > > > cnt = (start > val) ? -start : 0; > > > > Or as now generated by this patch: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > > >[local count: 118111600]: > > if (start_3(D) > val_4(D)) > > goto ; [89.00%] > > else > > goto ; [11.00%] > > > >[local count: 105119324]: > > cnt_2 = -start_3(D); > > > >[local count: 118111600]: > > # cnt_11 = PHI > > return cnt_11; > > } > > > > > > We can also improve until-wrap loops that don't have a (suitable) loop > > header, as determined by simplify_using_initial_conditions. > > > > unsigned bar(unsigned val, unsigned start) { > > unsigned cnt = 0; > > unsigned i = start; > > do { > > cnt++; > > i++; > > } while (i > val); > > return cnt; > > } > > > > which is currently optimized to: > > > > unsigned int foo (unsigned int val, unsigned int start) { > > unsigned int cnt; > > unsigned int _9; > > unsigned int _10; > > > >[local count: 118111600]: > > _9 = start_4(D) + 1; > > _10 = -start_4(D); > > cnt_3 = val_7(D) < _9 ? _10 : 1; > > return cnt_3; > > } > > > > Here we have "val < (start+1) ? -start : 1", which again with the help > > of match.pd can be slightly simplified to "val <= start ? -start : 1" > > when dealing with unsigned types, because at the complicating value > > where start == ~0, we fortunately have -start == 1, hence it doesn't > > matter whether the second or third operand of the ternary operator is > returned. > > > > To summarize, this patch (in addition to tweaking may_be_zero in > > number_of_iterations_until_wrap) adds three new constant folding > > transforms to match.pd. > > > > X != C1 ? -X : C2 simplifies to -X when -C1 == C2. > > which is the generalized form of the simplification above. > > > > X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2. > > which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case. > > > > and the "until-wrap final value replacement without context": > > > > (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when X is unsigned, > > as when X + 1 overflows, X is -1, so -X == 1. > > > > > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > > and make -k check with no new failures. Ok for mainline? > > +/* X != C1 ? -X : C2 simplifies to -X when -C1 == C2. */ (simplify > +(cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2) (if > +((!wi::only_sign_bit_p (wi::to_wide (@1)) > +
Re: [RFC/PATCH] C++ constexpr vs. floating point exceptions.
On Tue, Sep 21, 2021 at 5:19 PM Jason Merrill wrote: > On Tue, Sep 21, 2021 at 4:30 PM Joseph Myers > wrote: > >> On Tue, 21 Sep 2021, Jakub Jelinek via Gcc-patches wrote: >> >> > On Tue, Sep 21, 2021 at 02:15:59PM +0100, Roger Sayle wrote: >> > > Can you double check? Integer division by zero is undefined, but >> isn't floating point >> > > division by zero defined by the appropriate IEEE standards? >> > >> > https://eel.is/c++draft/expr.mul#4 doesn't make the division by zero >> > behavior conditional on integral types. >> > C has similar wording. >> >> The position for C is that Annex F semantics take precedence over all the >> ways in which floating-point arithmetic has undefined behavior in the >> main >> body of the standard. So floating-point overflow and division by zero >> are >> fully defined in the presence of Annex F support, while out-of-range >> conversions from floating point to integer types produce an unspecified >> value (not necessarily the same unspecified value for different >> executions >> of the conversion in the abstract machine - as discussed in bug 93806, >> GCC >> can get that wrong and act as if a single execution of such a conversion >> in the abstract machine produces more than one result). >> >> In C, as specified in Annex F, initializers for floating-point objects >> with static or thread storage duration are evaluated with exceptions >> discarded and the default rounding mode in effect; 7.0 / 0.0 is a fully >> valid initializer for such an object to initialize it to positive >> infinity. As I understand it, the question for this thread is whether >> C++ >> constexpr should have a similar rule to C static initializers (which >> ought >> to apply to 1.0 / 3.0, raising inexact, just as much as to 7.0 / 0.0). >> > > The C rules seem to be > > F.8.2 Translation > During translation the IEC 60559 default modes are in effect: > — The rounding direction mode is rounding to nearest. > — The rounding precision mode (if supported) is set so that results are > not shortened. > — Trapping or stopping (if supported) is disabled on all floating-point > exceptions. > Recommended practice: > The implementation should produce a diagnostic message for each > translation-time floating-point exception, other than “inexact”; the > implementation should then proceed with the translation of the program. > > I think following the same rules for C++ would be appropriate in a > diagnosing context: warn and continue. In a template argument deduction > (SFINAE) context, where errors become silent substitution failures, it's > probably better to treat them as non-constant. > I've now added -fconstexpr-fp-except to treat manifestly-constant-evaluated expressions the same as C static initializers for floating point arithmetic. This does not distinguish between diagnosing and SFINAE context as I was suggesting above, and indeed neither front end currently warns about compile-time exceptions as Annex F recommends. Jason
Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]
On 12/1/21 1:07 PM, Richard Biener wrote: > On December 1, 2021 6:42:21 PM GMT+01:00, Peter Bergner > wrote: >> On 12/1/21 3:01 AM, Richard Biener wrote: >>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init >>> instead of fixing up things at expansion time. >> >> The following fixes the ICE. The bootstrap/regtesting is still running >> though. > > OK. Great, pushed. For posterity, below is what I committed. Peter middle-end: Skip initialization of opaque type variables [PR103127] For -ftrivial-auto-var-init=*, skip initializing the variable if it is an opaque type, because CONST0_RTX(mode) is not defined for opaque modes. 2021-12-01 Peter Bergner gcc/ PR middle-end/103127 * gimplify.c (is_var_need_auto_init): Handle opaque types. gcc/testsuite/ PR middle-end/103127 * gcc.target/powerpc/pr103127.c: New test. diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 8624f8221fd..326476f0238 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl) || !DECL_HARD_REGISTER (decl)) && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED) && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl))) + && !OPAQUE_TYPE_P (TREE_TYPE (decl)) && !is_empty_type (TREE_TYPE (decl))) return true; return false; diff --git a/gcc/testsuite/gcc.target/powerpc/pr103127.c b/gcc/testsuite/gcc.target/powerpc/pr103127.c new file mode 100644 index 000..801fc0a4620 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103127.c @@ -0,0 +1,19 @@ +/* PR target/103127 */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=power10 -ftrivial-auto-var-init=zero" } */ + +/* Verify we do not ICE on the following tests. */ + +void +foo (__vector_quad *dst) +{ + __vector_quad acc; + *dst = acc; +} + +void +bar (__vector_pair *dst) +{ + __vector_pair pair; + *dst = pair; +}
Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]
On December 1, 2021 6:42:21 PM GMT+01:00, Peter Bergner wrote: >On 12/1/21 3:01 AM, Richard Biener wrote: >> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init >> instead of fixing up things at expansion time. > >The following fixes the ICE. The bootstrap/regtesting is still running though. OK. Thanks, Richard. >Peter > > >diff --git a/gcc/gimplify.c b/gcc/gimplify.c >index 8624f8221fd..326476f0238 100644 >--- a/gcc/gimplify.c >+++ b/gcc/gimplify.c >@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl) > || !DECL_HARD_REGISTER (decl)) > && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED) > && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl))) >+ && !OPAQUE_TYPE_P (TREE_TYPE (decl)) > && !is_empty_type (TREE_TYPE (decl))) > return true; > return false;
Re: [committed] libstdc++: Optimize ref-count updates in COW std::string
On Wed, 1 Dec 2021 at 18:16, Florian Weimer wrote: > > * Jonathan Wakely via Libstdc: > > > diff --git a/libstdc++-v3/include/bits/cow_string.h > > b/libstdc++-v3/include/bits/cow_string.h > > index ced395b80b8..4fae1d02981 100644 > > --- a/libstdc++-v3/include/bits/cow_string.h > > +++ b/libstdc++-v3/include/bits/cow_string.h > > @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > * destroy the empty-string _Rep object. > > * > > * All but the last paragraph is considered pretty conventional > > - * for a C++ string implementation. > > + * for a Copy-On-Write C++ string implementation. > >*/ > >// 21.3 Template class basic_string > >template > > @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > // so we need to use an atomic load. However, _M_is_leaked > > // predicate does not change concurrently (i.e. the string is either > > // leaked or not), so a relaxed load is enough. > > - return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; > > -#else > > - return this->_M_refcount < 0; > > + if (!__gnu_cxx::__is_single_threaded()) > > + return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; > > #endif > > + return this->_M_refcount < 0; > > } > > Relaxed MO loads of word-size values on all current architectures only > have a compiler barrier, so I think the optimization makes things worse? Hmm, yes. > (I doubt the conditional lack of a compiler barrier leads to > optimization improvements elsewhere.) Probably not. I'll revert the change to _M_is_leaked() and just keep it for _M_is_shared(). Thanks for pointing that out.
Re: [PATCH] Loop unswitching: support gswitch statements.
On 12/1/21 09:48, Martin Liška wrote: On 12/1/21 15:34, Richard Biener wrote: On Wed, Dec 1, 2021 at 3:25 PM Martin Liška wrote: On 12/1/21 15:19, Richard Biener wrote: which is compute the range of 'lhs' on edge_true into predicate->true_range, assign that same range to ->false_range and then invert it to get the range on the false_edge. What I am saying is that for better precision you should do ranger->range_on_edge (predicate->false_range, edge_false, lhs); rather than prematurely optimize this to the inversion of the true range since yes, ranger is CFG sensitive and only the_last_ predicate on a long CFG path is actually inverted. What am I missing? I might be misunderstood, but I think it's the problem defined here: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html where I used the ranger->range_on_edge on the false_edge. Ah, OK. But then even the true_edge range is possibly wrong, no? You are of course correct, I've just proved that in debugger :// Consider for (;;) { if (a < 100) if (a > 50) // unswitch on this /* .. */ if (a < 120) /* ... */ } then you record [51, 99] for true_range of the a > 50 predicate and thus simplification will simplify the if (a < 120) check, no? Yep. You can only record the range from the (CFG independent) a > 50 check, thus [51, +INF] but of course at simplification time you can also use the CFG context at each simplification location. @Andrew: How can I easily get irange based just on a stmt? Something like fold_range with int_range_max as the 3rd argument? Sorry, I miss these things if I'm not directly CC'd a lot :-) So you just want to know the basic range the stmt generates without context? Sure, what you say would be fine, but your want to initialize it to the type range: int_range_max range (TREE_TYPE (name)); you can also simply trigger it using the current SSA_NAME_RANGE_INFO global values query instead of the default current contextual one... which , if there isnt a global range, will automatically use the range of the type of the argument.. so maybe just try fold_range (r, stmt, get_global_range_query ()) Andrew
Re: [committed] libstdc++: Optimize ref-count updates in COW std::string
* Jonathan Wakely via Libstdc: > diff --git a/libstdc++-v3/include/bits/cow_string.h > b/libstdc++-v3/include/bits/cow_string.h > index ced395b80b8..4fae1d02981 100644 > --- a/libstdc++-v3/include/bits/cow_string.h > +++ b/libstdc++-v3/include/bits/cow_string.h > @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > * destroy the empty-string _Rep object. > * > * All but the last paragraph is considered pretty conventional > - * for a C++ string implementation. > + * for a Copy-On-Write C++ string implementation. >*/ >// 21.3 Template class basic_string >template > @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > // so we need to use an atomic load. However, _M_is_leaked > // predicate does not change concurrently (i.e. the string is either > // leaked or not), so a relaxed load is enough. > - return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; > -#else > - return this->_M_refcount < 0; > + if (!__gnu_cxx::__is_single_threaded()) > + return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; > #endif > + return this->_M_refcount < 0; > } Relaxed MO loads of word-size values on all current architectures only have a compiler barrier, so I think the optimization makes things worse? (I doubt the conditional lack of a compiler barrier leads to optimization improvements elsewhere.) Thanks, Florian
Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support
On 12/1/21 11:21 AM, Segher Boessenkool wrote: > Hi! > > On Wed, Dec 01, 2021 at 09:29:42AM -0600, Bill Schmidt wrote: >> Recently Kewen fixed a problem in the old builtins support where >> rs6000_builtin_decl prematurely indicated that a target builtin is >> unavailable. This also needs to be done for the new builtins support, but in >> this case we have to ensure the error message is still produced from the >> overload support in rs6000-c.c. Unfortunately, this is less straightforward >> than it could be, because header file includes need to be adjusted to make >> this >> happen. Someday we'll consolidate all the builtin code in one file and this >> won't have to be so ugly. >> >> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is >> this >> okay for trunk? > This is okay for trunk. Thanks! > > Is there some place we can store what original builtin was used when > some overload is resolved? Just in the new builtin code, don't spend > time on the old stuff :-) I think we can do much better, with a little work. The macros are a little problematic, but I have ideas about how to make this better in a future patch. (There's a fair amount of test suite fallout, so it should wait.) Thanks for the review! Bill > > > Segher
Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]
On 12/1/21 3:01 AM, Richard Biener wrote: > Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init > instead of fixing up things at expansion time. The following fixes the ICE. The bootstrap/regtesting is still running though. Peter diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 8624f8221fd..326476f0238 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl) || !DECL_HARD_REGISTER (decl)) && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED) && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl))) + && !OPAQUE_TYPE_P (TREE_TYPE (decl)) && !is_empty_type (TREE_TYPE (decl))) return true; return false;
Re: [PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls
On Tue, Nov 30 2021, Richard Biener wrote: > On Tue, Nov 30, 2021 at 3:24 PM Martin Jambor wrote: >> >> Hi, >> >> in PR 103267 Honza found out that IPA-SRA does not look at >> ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side >> effects. Fixed with this patch. The testcase infinitely loops in a >> const function, so it would not make a good addition to the testsuite. >> >> Bootstrapped and tested on x86_64-linux. OK for trunk? > > OK. > Thank you. For reference, I am also about to commit (after having bootstrapped and tested it on x86_64-linx) the following backport to the old IPA-SRA we have in GCC 9. Thanks, Martin gcc/ChangeLog: 2021-12-01 Martin Jambor PR ipa/103267 * tree-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE flag. --- gcc/tree-sra.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 909b4fef9a8..252953b7512 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -1530,7 +1530,8 @@ scan_function (void) } if (final_bbs - && (flags & (ECF_CONST | ECF_PURE)) == 0) + && ((flags & (ECF_CONST | ECF_PURE)) == 0 + || (flags & ECF_LOOPING_CONST_OR_PURE))) bitmap_set_bit (final_bbs, bb->index); } -- 2.33.1
Re: [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html
On Wed, Dec 1, 2021 at 10:54 AM Gerald Pfeifer wrote: > > Hi Eric, > > On Wed, 24 Nov 2021, Eric Gallager wrote: > > This next patch does more than just removing old stuff: it adds an > > extra sentence to describe a shell command used to generate a list, so > > to verify that I've got the shell command right, I'm asking for a > > review. > > -There are several other files in this size range, which I have left > +There are several other files in this size range, which are left > out because touching them at all is unwise (reload, the Fortran front > end). You can try, but I am not responsible for any damage to your > sanity which may result. > > while we are touching this, how about toning the second half of this > paragraph down and moving away from first person? > > How about something like "...breaking them up likely may prove rather > difficult" (or similar, happy to leave this to you as native spaker)? > OK yeah I did remove the first instance of the first person in that paragraph, but couldn't think of a better wording for the second one, so that suggestion is helpful; thanks. > > +Note that the list of large files in this section is generated with the > +following shell command, run from the gcc subdirectory: > + > + > + du -sh *.{c,h,cc} */*.{c,h,cc} | sort -hr | grep -v fortran | head -n > 14 > + > > This looks like it does what you want it to do. > > (Pulling up the database courses I have given in a previous life, I'd pull > in the grep before the sort - O(n) filtering before O(n·log n) processing > - but even on my notebook both are instantenous. ;-) ) > > > Please consider the two suggestions above and commit the result; just > share the final patch here (no review required). > > Thanks, > Gerald OK thanks, I committed the attached patch as f4b4d0f: https://gcc.gnu.org/git/?p=gcc-wwwdocs.git;a=commitdiff;h=f4b4d0f783246dd6f58944cdb542446d5e7589d3 patch-beginner-projects-02.diff Description: Binary data
[PATCH] c++: Fix bogus error with __integer_pack [PR94490]
Here we issue a bogus: error: '(0 ? fake_tuple_size_v : fake_tuple_size_v)' is not a constant expression because cxx_constant_value in expand_integer_pack gets *(0 ? VIEW_CONVERT_EXPR(fake_tuple_size_v) : VIEW_CONVERT_EXPR(fake_tuple_size_v)) which is a REFERENCE_REF_P and we evaluate its operand to 3, so we end up with *3 and that fails. Sounds like we need to get rid of the REFERENCE_REF_P then. That is what tsubst_copy_and_build/INDIRECT_REF will do: if (REFERENCE_REF_P (t)) { /* A type conversion to reference type will be enclosed in such an indirect ref, but the substitution of the cast will have also added such an indirect ref. */ r = convert_from_reference (r); } so I think it's reasonable to call instantiate_non_dependent_expr_sfinae. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11? PR c++/94490 gcc/cp/ChangeLog: * pt.c (expand_integer_pack): Call instantiate_non_dependent_expr_sfinae. gcc/testsuite/ChangeLog: * g++.dg/ext/integer-pack5.C: New test. --- gcc/cp/pt.c | 1 + gcc/testsuite/g++.dg/ext/integer-pack5.C | 29 2 files changed, 30 insertions(+) create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack5.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f4b9d9673fb..6b560952639 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -3792,6 +3792,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t complain, } else { + hi = instantiate_non_dependent_expr_sfinae (hi, complain); hi = cxx_constant_value (hi); int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1; diff --git a/gcc/testsuite/g++.dg/ext/integer-pack5.C b/gcc/testsuite/g++.dg/ext/integer-pack5.C new file mode 100644 index 000..84938649d31 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/integer-pack5.C @@ -0,0 +1,29 @@ +// PR c++/94490 +// { dg-do compile { target c++14 } } + +template +constexpr int fake_tuple_size_v = 3; +template struct intseq {}; + +// So that it compiles with clang++. +#if __has_builtin(__make_integer_seq) +using size_t = decltype(sizeof(1)); +template +using _IdxTuple = intseq<_Indices...>; + +template using genseq = __make_integer_seq<_IdxTuple, size_t, N>; +#else +template using genseq = intseq<__integer_pack(N)...>; +#endif + +template> +struct arith_result +{ }; + +template +auto Mul(const T&) +{ +return [](auto) { return arith_result> { }; }(0); +} + +auto x = Mul(0); base-commit: 54ebec35abec09a24b47b997172622ca0d8e2318 -- 2.33.1
Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support
Hi! On Wed, Dec 01, 2021 at 09:29:42AM -0600, Bill Schmidt wrote: > Recently Kewen fixed a problem in the old builtins support where > rs6000_builtin_decl prematurely indicated that a target builtin is > unavailable. This also needs to be done for the new builtins support, but in > this case we have to ensure the error message is still produced from the > overload support in rs6000-c.c. Unfortunately, this is less straightforward > than it could be, because header file includes need to be adjusted to make > this > happen. Someday we'll consolidate all the builtin code in one file and this > won't have to be so ugly. > > Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this > okay for trunk? This is okay for trunk. Thanks! Is there some place we can store what original builtin was used when some overload is resolved? Just in the new builtin code, don't spend time on the old stuff :-) Segher
Re: [PATCH] c++: ICE with auto(0) in requires-expression [PR103408]
On Wed, Dec 1, 2021 at 10:26 AM Marek Polacek via Gcc-patches wrote: > > Here we ICE on > > int f() requires (auto(0)); > > in do_auto_deduction when handling the auto: we're in a non-templated > requires-expression which are parsed under processing_template_decl == 1 > and empty current_template_parms, so 'current_template_args ()' will > crash. This code is invalid as per "C++20 CA378: Remove non-templated > constrained functions", but of course we shouldn't crash. FWIW it looks like we can trip over the same bug with valid code: static_assert(requires { auto(0); }); > > Since in the scenario above it's expected that current_template_parms is > null, I've just added a check, and let grokfndecl issue an error. > I guess another approch would be to fake up a template parameter list > before calling do_auto_deduction. > > For good measure, I've added several well-formed cases with auto(x) in > a requires-expression. > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > PR c++/103408 > > gcc/cp/ChangeLog: > > * pt.c (do_auto_deduction): Check current_template_parms before > current_template_args (). > > gcc/testsuite/ChangeLog: > > * g++.dg/cpp23/auto-fncast9.C: New test. > --- > gcc/cp/pt.c | 2 +- > gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++ > 2 files changed, 28 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C > > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c > index f4b9d9673fb..012ca5d06c0 100644 > --- a/gcc/cp/pt.c > +++ b/gcc/cp/pt.c > @@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree > auto_node, > (but we still may have used them for constraint checking above). */; >else if (context == adc_unify) > targs = add_to_template_args (outer_targs, targs); > - else if (processing_template_decl) > + else if (processing_template_decl && current_template_parms) > targs = add_to_template_args (current_template_args (), targs); >return tsubst (type, targs, complain, NULL_TREE); Won't this mean the call to tsubst here will end up lowering the level of the auto from 2 to 1 rather than replacing it with the actual deduced type? It also looks like this approach doesn't handle static_assert(requires { auto(auto(0)); }), probably due to this substitution issue. I guess we could add a dummy level to 'targs' to work around this.. > } > diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C > b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C > new file mode 100644 > index 000..45a0ff9b460 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C > @@ -0,0 +1,27 @@ > +// PR c++/103408 > +// { dg-do compile { target c++23 } } > + > +int bad1() requires (auto(true)); // { dg-error "constraints on a > non-templated function" } > + > +template > +int f1() requires (auto(B)); > + > +template > +struct S { T t; constexpr operator bool() { return true; } }; > + > +int bad2() requires (bool(S{1})); // { dg-error "constraints on a > non-templated function" } > +int bad3() requires (bool(S(1))); // { dg-error "constraints on a > non-templated function" } > + > +template > +int f2() requires (bool(S{N})); > + > +template > +int f3() requires (bool(S(N))); > + > +void > +g () > +{ > + f1(); > + f2<42>(); > + f3<42>(); > +} > > base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca > -- > 2.33.1 >
Re: [PATCH] path solver: Minimize exported ranges to subsequent blocks.
On 12/1/2021 9:18 AM, Aldy Hernandez via Gcc-patches wrote: I'm going to hold off pushing this until I hear from the global maintainers and/or release managers. This patch fixes a pathological performance case, but it may also be handled by Andrew's fixes in this area. It's up to y'all to decide if it's something we want in this release. An alternative would be to keep track of any pathological issues going forward, and see if this fixes the issue. I'm going to be taking leave any day now :). I'm leaving this to you and Andrew to decide. Y'all know this code better than anyone. And yes, I keep expecting you to disappear without notice :-) Jeff
[PATCH, PING] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c
Hi! I'd like to ping this patch. Thanks! Bill On 11/18/21 10:36 AM, Bill Schmidt wrote: > Hi! This is the last patch broken out of the previous test suite patch > for the new builtins support. > > One advantage of the new builtins support is uniform error messages for > arguments with restricted values. Previously this was done in many places > in an ad hoc manner, with little uniformity. This patch adjusts the > expected error messages accordingly. > > All such error messages are now one of the following: > "argument %d must be a %d-bit unsigned literal" > "argument %d must be a literal between %d and %d, inclusive" > "argument %d must be a variable or a literal between %d and %d, inclusive" > "argument %d must be either a literal %d or a literal %d" > > These messages were chosen to require the fewest changes from previous > messages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. In > particular, some messages are improved because previously they did not > admit the possibility that an argument could hold a variable. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) > with no regressions. Is this okay for trunk? > > Thanks! > Bill > > > 2021-11-17 Bill Schmidt > > gcc/testsuite/ > * gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error > messages. > * gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise. > --- > .../powerpc/test_fpscr_drn_builtin_error.c | 4 ++-- > .../gcc.target/powerpc/test_fpscr_rn_builtin_error.c | 12 ++-- > 2 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c > b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c > index 028ab0b6d66..4f9d9e08e8a 100644 > --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c > +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c > @@ -9,8 +9,8 @@ int main () > __builtin_set_fpscr_drn() also support a variable as an argument but > can't test variable value at compile time. */ > > - __builtin_set_fpscr_drn(-1); /* { dg-error "Argument must be a value > between 0 and 7" } */ > - __builtin_set_fpscr_drn(8); /* { dg-error "Argument must be a value > between 0 and 7" } */ > + __builtin_set_fpscr_drn(-1); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 7, inclusive" } */ > + __builtin_set_fpscr_drn(8); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 7, inclusive" } */ > > } > > diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c > b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c > index aea65091b0c..10391b71008 100644 > --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c > +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c > @@ -8,13 +8,13 @@ int main () > int arguments. The builtins __builtin_set_fpscr_rn() also supports a > variable as an argument but can't test variable value at compile time. > */ > > - __builtin_mtfsb0(-1); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > - __builtin_mtfsb0(32); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > + __builtin_mtfsb0(-1); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > + __builtin_mtfsb0(32); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > > - __builtin_mtfsb1(-1); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > - __builtin_mtfsb1(32); /* { dg-error "Argument must be a constant between > 0 and 31" } */ > + __builtin_mtfsb1(-1); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > + __builtin_mtfsb1(32); /* { dg-error "argument 1 must be a 5-bit unsigned > literal" } */ > > - __builtin_set_fpscr_rn(-1); /* { dg-error "Argument must be a value > between 0 and 3" } */ > - __builtin_set_fpscr_rn(4); /* { dg-error "Argument must be a value > between 0 and 3" } */ > + __builtin_set_fpscr_rn(-1); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 3, inclusive" } */ > + __builtin_set_fpscr_rn(4); /* { dg-error "argument 1 must be a variable > or a literal between 0 and 3, inclusive" } */ > } >
[PATCH, PING] rs6000: Builtins test changes for pragma_misc9.c
Hi! I'd like to ping this patch. Thanks! Bill On 11/18/21 10:18 AM, Bill Schmidt wrote: > Hi! This patch is broken out from the test suite patch for the new > builtins support. This one is just a minor adjustment for the error > message wording. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) > with no regressions. Is this okay for trunk? > > Thanks! > Bill > > > 2021-11-17 Bill Schmidt > > gcc/testsuite/ > * gcc.target/powerpc/pragma_misc9.c: Adjust error message. > --- > gcc/testsuite/gcc.target/powerpc/pragma_misc9.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c > b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c > index e03099bd084..c1667d9f7db 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c > +++ b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c > @@ -20,7 +20,7 @@ vector bool int > test2 (vector signed int a, vector signed int b) > { >return vec_cmpnez (a, b); > - /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' > option" "" { target *-*-* } .-1 } */ > + /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' > and '-mvsx' options" "" { target *-*-* } .-1 } */ > } > > #pragma GCC target ("cpu=power7") > @@ -28,7 +28,7 @@ vector signed int > test3 (vector signed int a, vector signed int b) > { >return vec_mergee (a, b); > - /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the > '-mpower8-vector' option" "" { target *-*-* } .-1 } */ > + /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mcpu=power8' > and '-mvsx' options" "" { target *-*-* } .-1 } */ > } > > #pragma GCC target ("cpu=power6")
[PATCH, PING] rs6000: Builtins test changes for pr80315-*.c, pr88100.c
Hi! I'd like to ping this patch. Thanks! Bill On 11/18/21 10:15 AM, Bill Schmidt wrote: > Hi! This patch is broken out from the test case patch for the new > builtins support. > > One advantage of the new builtins support is uniform error messages for > arguments with restricted values. Previously this was done in many places > in an ad hoc manner, with little uniformity. This patch adjusts the > expected error messages accordingly. > > All error messages are now one of the following: > "argument %d must be a %d-bit unsigned literal" > "argument %d must be a literal between %d and %d, inclusive" > "argument %d must be a variable or a literal between %d and %d, inclusive" > "argument %d must be either a literal %d or a literal %d" > > These messages were chosen to require the fewest changes from previous > messages while still introducing uniformity. This patch adjusts error > messages for some cases where this produces changed messages. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with > no regressions. is this okay for trunk? > > Thanks! > Bill > > > 2021-11-17 Bill Schmidt > > gcc/testsuite/ > * gcc.target/powerpc/pr80315-1.c: Adjust error message. > * gcc.target/powerpc/pr80315-2.c: Likewise. > * gcc.target/powerpc/pr80315-3.c: Likewise. > * gcc.target/powerpc/pr80315-4.c: Likewise. > * gcc.target/powerpc/pr88100.c: Likewise. > --- > gcc/testsuite/gcc.target/powerpc/pr80315-1.c | 2 +- > gcc/testsuite/gcc.target/powerpc/pr80315-2.c | 2 +- > gcc/testsuite/gcc.target/powerpc/pr80315-3.c | 2 +- > gcc/testsuite/gcc.target/powerpc/pr80315-4.c | 2 +- > gcc/testsuite/gcc.target/powerpc/pr88100.c | 12 ++-- > 5 files changed, 10 insertions(+), 10 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c > b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c > index e2db0ff4b5f..f37f1f169a2 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c > @@ -10,6 +10,6 @@ main() >int mask; > >/* Argument 2 must be 0 or 1. Argument 3 must be in range 0..15. */ > - res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument > 3 must be in the range \[0, 15\]} } */ > + res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument > 3 must be a 4-bit unsigned literal} } */ >return 0; > } > diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c > b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c > index 144b705c012..0819a0511b7 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c > @@ -10,6 +10,6 @@ main () >int mask; > >/* Argument 2 must be 0 or 1. Argument 3 must be in range 0..15. */ > - res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument > 3 must be in the range \[0, 15\]} } */ > + res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument > 3 must be a 4-bit unsigned literal} } */ >return 0; > } > diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c > b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c > index 99a3e24eadd..cc2e46cf5cb 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c > @@ -12,6 +12,6 @@ main () >int mask; > >/* Argument 2 must be 0 or 1. Argument 3 must be in range 0..15. */ > - res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be > in the range \[0, 15\]} } */ > + res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a > 4-bit unsigned literal} } */ >return res; > } > diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c > b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c > index 7f5f6f75029..ac12910741b 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c > @@ -12,6 +12,6 @@ main () >int mask; > >/* Argument 2 must be 0 or 1. Argument 3 must be in range 0..15. */ > - res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be > in the range \[0, 15\]} } */ > + res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a > 4-bit unsigned literal} } */ >return res; > } > diff --git a/gcc/testsuite/gcc.target/powerpc/pr88100.c > b/gcc/testsuite/gcc.target/powerpc/pr88100.c > index 4452145ce95..764c897a497 100644 > --- a/gcc/testsuite/gcc.target/powerpc/pr88100.c > +++ b/gcc/testsuite/gcc.target/powerpc/pr88100.c > @@ -10,35 +10,35 @@ > vector unsigned char > splatu1 (void) > { > - return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a 5-bit > signed literal" } */ > + return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a literal > between -16 and 15, inclusive" } */ > } > > vector unsigned short > splatu2 (void) > { > - return vec_splat_u16(0x1);/* { dg-error "argument 1 must be a 5-bit > signed literal" } */ >
Patch ping related to OpenMP
Patch ping - lightly sorted by priority, simplicity and dependency: * [PATCH, v5, OpenMP 5.0] Improve OpenMP target support for C++ [PR92120 v5] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584602.html * [PATCH, v2, OpenMP 5.0] Remove array section base-pointer mapping semantics, and other front-end adjustments (mainline trunk) https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584994.html * [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' construct https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585361.html * libgomp.texi: Update OMP_PLACES https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582121.html * [patch] Fortran/OpenMP: Support most of 5.1 atomic extensions https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584450.html * [PATCH 00/16] OpenMP: lvalues in "map" clauses and struct handling rework https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585439.html Thanks, Tobias - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
[PATCH, PING] rs6000: Builtins test changes for compare-bytes tests
Hi! I'd like to ping this patch. Thanks! Bill On 11/18/21 7:47 AM, Bill Schmidt wrote: > Hi! This patch is broken out from the patch with test suite changes for the > new builtins support. > > With the old builtins support, cmpb-2.c produces: > warning: implicit declaration of function '__builtin_cmpb; did you mean > '__builtin_bcmp'? > > With the new support, it produces: > error: '__builtin_p6_cmpb requires the '-mcpu=power6' option and either the > '-m64' or '-mpowerpc64' option > note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb' > > The reason for this is that this builtin wasn't even initialized in the > old support. This reflects a difference in philosophy between the old and > new methods. The old support often doesn't initialize builtins for which > the conditions don't apply based on compile options, but this can backfire > in general when such constructs as "#pragma target" are used. The new > support initializes all builtins, and waits until expand time to determine > whether or not they are enabled. Besides added flexibility, we also get > better error messages as a result. > > The case for cmpb32-2.c is similar. > > Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with > no regressions. Is this okay for trunk? > > Thanks! > Bill > > > 2021-11-17 Bill Schmidt > > gcc/testsuite/ > * gcc.target/powerpc/cmpb-2.c: Adjust error message. > * gcc.target/powerpc/cmpb32-2.c: Likewise. > --- > gcc/testsuite/gcc.target/powerpc/cmpb-2.c | 2 +- > gcc/testsuite/gcc.target/powerpc/cmpb32-2.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c > b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c > index 113ab6a5f99..02b84d0731d 100644 > --- a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c > @@ -8,7 +8,7 @@ void abort (); > unsigned long long int > do_compare (unsigned long long int a, unsigned long long int b) > { > - return __builtin_cmpb (a, b); /* { dg-warning "implicit declaration > of function '__builtin_cmpb'" } */ > + return __builtin_cmpb (a, b); /* { dg-error "'__builtin_p6_cmpb' > requires the '-mcpu=power6' option" } */ > } > > void > diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c > b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c > index 37b54745e0e..d4264ab6e7d 100644 > --- a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c > @@ -7,7 +7,7 @@ void abort (); > unsigned int > do_compare (unsigned int a, unsigned int b) > { > - return __builtin_cmpb (a, b); /* { dg-warning "implicit declaration of > function '__builtin_cmpb'" } */ > + return __builtin_cmpb (a, b); /* { dg-error "'__builtin_p6_cmpb_32' > requires the '-mcpu=power6' option" } */ > } > > void
[PATCH, PING] rs6000: Builtins test changes for BFP scalar tests
Hi! I'd like to ping this patch. Segher had objected to the change in diagnostics, but I hope we've solved that now with the better informational message [1]. Thanks! Bill [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html On 11/17/21 2:58 PM, Bill Schmidt wrote: > Hi! This patch is broken out of the previous patch for all the builtins test > suite adjustments. Here we have some slight changes in error messages due to > how the internals have changed between the old and new builtins methods. > > For scalar-extract-exp-2.c we change: > error: '__builtin_vec_scalar_extract_exp is not supported in this compiler > configuration' > > to: > error: '__builtin_vsx_scalar_extract_exp' requires the '-mcpu=power9' > option and either the '-m64' or '-mpowerpc64' option > note: builtin '__builtin_vec_scalar_extract_exp' requires builtin > '__builtin_vsx_scalar_extract_exp' > > The new message provides more information. In both cases, it is less than > ideal that we don't refer to scalar_extract_exp, which is referenced in > the source line, but this is because scalar_extract_exp is #define'd to > __builtin_vec_scalar_extract_exp, so it's unavoidable. Certainly this is no > worse than before, and arguably better. > > The cases for: > scalar-insert-exp-2.c > scalar-insert-exp-5.c > scalar-insert-exp-8.c > are all similar. > > For scalar-extract-sig-2.c we again change: > error: '__builtin_vec_scalar_extract_sig' is not supported in this compiler > configuration' > > to: > error: '__builtin_vsx_scalar_extract_sig' requires the '-mcpu=power9' > option and either the '-m64' or '-mpowerpc64' option > note: builtin '__builtin_vec_scalar_extract_sig' requires builtin > '__builtin_vsx_scalar_extract_sig' > > Here it is clearer because there is no #define to muddy things up, and > again the new message is arguably better than the old. > > For scalar-test-neg-{2,3,5}.c, we actually change the test case. This is > because we deliberately removed some undocumented and pointless > overloads, > where each overload mapped to a single builtin. These were: > __builtin_vec_scalar_test_neg_sp > __builtin_vec_scalar_test_neg_dp > __builtin_vec_scalar_test_neg_qp > which are redundant with the "real" overload: > __builtin_vec_scalar_test_neg > The latter maps to three builtins of the appropriate type. > > The revised test case uses the "real" overload instead, and otherwise the > changes to the error messages are the same as for all the other cases. > > 2021-11-17 Bill Schmidt > > gcc/testsuite/ > * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error > message. > * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise. > * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise. > * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise. > --- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c | 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c | 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c | 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c | 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c | 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c| 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c| 2 +- > gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c| 2 +- > 8 files changed, 8 insertions(+), 8 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c > b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c > index 922180675fc..53b67c95cf9 100644 > --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c > @@ -14,7 +14,7 @@ get_exponent (double *p) > { >double source = *p; > > - return scalar_extract_exp (source);/* { dg-error > "'__builtin_vec_scalar_extract_exp' is not supported in this compiler > configuration" } */ > + return scalar_extract_exp (source);/* { dg-error > "'__builtin_vsx_scalar_extract_exp' requires the" } */ > } > > > diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c > b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c > index e24d4bd23fe..39ee74c94dc 100644 > --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c > +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c > @@ -12,5 +12,5 @@ get_significand (double *p) > { >double source = *p; > > - return __builtin_vec_scalar_extract_sig (source); /* { dg-error > "'__builtin_vec_scalar_extract_sig' is not supported in this compiler
Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems
Hi! I'd like to ping this patch. By the way, the diagnostics are improved [1] since I sent it, so that we now inform the user that the overloaded function is implemented by the instantiated function. Thanks! Bill [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html On 11/16/21 12:56 PM, Bill Schmidt wrote: > Hi! I previously posted [1] to correct some problems with the new builtins > support targeting 32-bit code gen. Based on the discussion, I've made some > adjustments and would like to submit this for consideration. > > We eventually agreed that the strange behavior for -m32 -mpowerpc64 for > certain > HTM builtins should be removed. All of the registers TEXASR, TEXASRU, TFHAR, > and TFIAR are now accessed using the unsigned long data type in all > configurations. > > Segher didn't like the change in the error message for the cmpb-3.c test case, > but I think this should be fine. The test case just tests for the error > message, > but there is also a "note" message that provides additional information. The > diagnostics that the user sees will look like this: > > cmpb-3.c:11:3: error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option > and either the '-m64' or '-mpowerpc64' option > cmpb-3.c:11:3: note: builtin '__builtin_cmpb' requires builtin > '__builtin_p6_cmpb' > > So it's clear to the user that their use of __builtin_cmpb at line 11 > triggered > the error. > > Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu > using -m32/-m64. Is this okay for trunk? > > Thanks! > Bill > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583905.html > > > 2021-11-16 Bill Schmidt > > gcc/ > * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit. > (BPERMD): Flag as 32bit (needing special handling for 32-bit). > (UNPACK_TD): Return unsigned long long instead of unsigned long. > (GET_TEXASR): Return unsigned long instead of unsigned long long. > (GET_TEXASRU): Likewise. > (GET_TFHAR): Likewise. > (GET_TFIAR): Likewise. > (SET_TEXASR): Pass unsigned long instead of unsigned long long. > (SET_TEXASRU): Likewise. > (SET_TFHAR): Likewise. > (SET_TFIAR): Likewise. > (TABORTDC): Likewise. > (TABORTDCI): Likewise. > * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error > handling for no32bit. Add 32bit handling for RS6000_BIF_BPERMD. > > gcc/testsuite/ > * gcc.target/powerpc/cmpb-3.c: Adjust error message. > --- > gcc/config/rs6000/rs6000-builtin-new.def | 30 +++ > gcc/config/rs6000/rs6000-call.c | 9 --- > gcc/testsuite/gcc.target/powerpc/cmpb-3.c | 2 +- > 3 files changed, 22 insertions(+), 19 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000-builtin-new.def > b/gcc/config/rs6000/rs6000-builtin-new.def > index 58dfce1ca37..30556e5c7f2 100644 > --- a/gcc/config/rs6000/rs6000-builtin-new.def > +++ b/gcc/config/rs6000/rs6000-builtin-new.def > @@ -273,7 +273,7 @@ > ; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing). > [power6-64] >const signed long __builtin_p6_cmpb (signed long, signed long); > -CMPB cmpbdi3 {} > +CMPB cmpbdi3 {no32bit} > > > ; AltiVec builtins. > @@ -2018,7 +2018,7 @@ > ADDG6S addg6s {} > >const signed long __builtin_bpermd (signed long, signed long); > -BPERMD bpermd_di {} > +BPERMD bpermd_di {32bit} > >const unsigned int __builtin_cbcdtd (unsigned int); > CBCDTD cbcdtd {} > @@ -2971,7 +2971,7 @@ >void __builtin_set_fpscr_drn (const int[0,7]); > SET_FPSCR_DRN rs6000_set_fpscr_drn {} > > - const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>); > + const unsigned long long __builtin_unpack_dec128 (_Decimal128, const > int<1>); > UNPACK_TD unpacktd {} > > > @@ -3014,39 +3014,39 @@ > > > [htm] > - unsigned long long __builtin_get_texasr (); > + unsigned long __builtin_get_texasr (); > GET_TEXASR nothing {htm,htmspr} > > - unsigned long long __builtin_get_texasru (); > + unsigned long __builtin_get_texasru (); > GET_TEXASRU nothing {htm,htmspr} > > - unsigned long long __builtin_get_tfhar (); > + unsigned long __builtin_get_tfhar (); > GET_TFHAR nothing {htm,htmspr} > > - unsigned long long __builtin_get_tfiar (); > + unsigned long __builtin_get_tfiar (); > GET_TFIAR nothing {htm,htmspr} > > - void __builtin_set_texasr (unsigned long long); > + void __builtin_set_texasr (unsigned long); > SET_TEXASR nothing {htm,htmspr} > > - void __builtin_set_texasru (unsigned long long); > + void __builtin_set_texasru (unsigned long); > SET_TEXASRU nothing {htm,htmspr} > > - void __builtin_set_tfhar (unsigned long long); > + void __builtin_set_tfhar (unsigned long); > SET_TFHAR nothing {htm,htmspr} > > - void __builtin_set_tfiar (unsigned long long); > + void __builtin_set_tfiar (unsigned long); >
Re: [PATCH] path solver: Minimize exported ranges to subsequent blocks.
I'm going to hold off pushing this until I hear from the global maintainers and/or release managers. This patch fixes a pathological performance case, but it may also be handled by Andrew's fixes in this area. It's up to y'all to decide if it's something we want in this release. An alternative would be to keep track of any pathological issues going forward, and see if this fixes the issue. I'm going to be taking leave any day now :). Aldy On Sat, Nov 27, 2021 at 9:32 AM Aldy Hernandez wrote: > > Currently we have a set of interesting names in the path that we solve > for. However, solving *all* of them on exit from each block is > unnecessary, even if an edge range is available. We can reduce the > queried edges by only solving SSA names from the current block > that will be needed in subsequent blocks. > > With this patch we avoid the explosion in PR103254, without the need for > Andrew's patch (which is still independently needed). > > In my testbed we loose 6 threads in the non-resolving threading passes, > but the overall thread count is the same, as we pick them up later. We > could probably try harder to get them, but it's probably not worth the > effort. > > Benchmarking this patch shows no difference in overall compilation speed > and a tiny slowdown (< 0.80%) for some of the threading passes. > Benchmarked for both -O2, -O3, and -O2 --param ranger-logical-depth=10. > > I'm unsure what the review requirements are in stage3, so... > > OK for trunk? > > p.s. I saw some cleanups that could be done (intersecting more bitmaps, > putting the rest of the bitmaps in the new obstack, etc etc), but I > avoided doing them to minimize the churn in stage3. > > Regstrapped on x86-64 & ppc64le Linux. > > PR tree-optimization/103254 > > gcc/ChangeLog: > > * gimple-range-path.cc (path_range_query::path_range_query): > Initialize obstack. > (path_range_query::~path_range_query): Release obstack. > (path_range_query::needs_on_entry_for_block): New. > (path_range_query::compute_needs_on_entry_set): New. > (path_range_query::compute_ranges_in_block): Only calculate > outgoing edges for m_needs_on_entry set. > (path_range_query::compute_ranges): Call > compute_needs_on_entry_set. > * gimple-range-path.h: Add needs_on_entry_for_block, > compute_needs_on_entry_set, m_bitmaps, and m_needs_on_entry. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr103254.c: Add -fdump-tree-vrp2-details. > * gcc.dg/pr103254-2.c: New test. > --- > gcc/gimple-range-path.cc | 67 +-- > gcc/gimple-range-path.h | 4 ++ > gcc/testsuite/gcc.dg/pr103254-2.c | 29 + > gcc/testsuite/gcc.dg/pr103254.c | 6 ++- > 4 files changed, 102 insertions(+), 4 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/pr103254-2.c > > diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc > index b9c71226c1c..32226df05b5 100644 > --- a/gcc/gimple-range-path.cc > +++ b/gcc/gimple-range-path.cc > @@ -48,6 +48,7 @@ path_range_query::path_range_query (bool resolve, > gimple_ranger *ranger) > m_ranger = ranger; > >m_oracle = new path_oracle (m_ranger->oracle ()); > + bitmap_obstack_initialize (&m_bitmaps); > } > > path_range_query::~path_range_query () > @@ -57,6 +58,7 @@ path_range_query::~path_range_query () > delete m_ranger; >BITMAP_FREE (m_has_cache_entry); >delete m_cache; > + bitmap_obstack_release (&m_bitmaps); > } > > // Return TRUE if NAME is in the import bitmap. > @@ -401,6 +403,62 @@ path_range_query::compute_ranges_in_phis (basic_block bb) > } > } > > +// Return the set of SSA names that would be useful to have resolved > +// on entry to this block. These are the names that will ultimately > +// help resolve the final conditional on the path. > + > +void > +path_range_query::needs_on_entry_for_block (bitmap on_entry, unsigned bbn) > +{ > + gori_compute &g = m_ranger->gori (); > + basic_block next = m_path[bbn]; > + bitmap_copy (on_entry, g.imports (next)); > + > + // Any incoming PHI args are needed on entry, but are not in the > + // g.imports() above, so add them manually. > + for (auto iter = gsi_start_phis (next); !gsi_end_p (iter); gsi_next > (&iter)) > +{ > + gphi *phi = iter.phi (); > + tree result = gimple_phi_result (phi); > + > + if (bitmap_bit_p (m_imports, SSA_NAME_VERSION (result))) > + for (size_t i = 0; i < gimple_phi_num_args (phi); ++i) > + { > + tree arg = gimple_phi_arg (phi, i)->def; > + > + if (TREE_CODE (arg) == SSA_NAME > + && m_path.contains (gimple_phi_arg_edge (phi, i)->src)) > + bitmap_set_bit (on_entry, SSA_NAME_VERSION (arg)); > + } > +} > + bitmap_and_into (on_entry, m_imports); > +} > + > +// Compute the set of needed on-entry names for all path blocks. > + > +void > +path_range_query::compute_needs_on_entr
Re: [PATCH] path solver: Use only one ssa_global_cache.
No one has said anything in 48 hours. By the power vested in me, I declare it approved. Pushed. On Mon, Nov 29, 2021 at 5:21 PM Aldy Hernandez wrote: > > We're using a temporary range cache while computing ranges for PHIs to > make sure the real cache doesn't get set until all PHIs are computed. > With the ltrans beast in LTO mode this causes undue overhead. > > Since we already have a bitmap to indicate whether there's a cache > entry, we can avoid the extra cache object by clearing it while PHIs > are being calculated. > > Tested on x86-64 Linux. > > OK for trunk? > > gcc/ChangeLog: > > PR tree-optimization/103409 > * gimple-range-path.cc (path_range_query::compute_ranges_in_phis): > Do all the work with just one ssa_global_cache. > * gimple-range-path.h: Remove m_tmp_phi_cache. > --- > gcc/gimple-range-path.cc | 23 +++ > gcc/gimple-range-path.h | 2 -- > 2 files changed, 11 insertions(+), 14 deletions(-) > > diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc > index b9c71226c1c..15ef72fd492 100644 > --- a/gcc/gimple-range-path.cc > +++ b/gcc/gimple-range-path.cc > @@ -375,30 +375,29 @@ void > path_range_query::compute_ranges_in_phis (basic_block bb) > { >int_range_max r; > - gphi_iterator iter; > + auto_bitmap phi_set; > >// PHIs must be resolved simultaneously on entry to the block >// because any dependencies must be satistifed with values on entry. >// Thus, we calculate all PHIs first, and then update the cache at >// the end. > > - m_tmp_phi_cache.clear (); > - for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter)) > + for (auto iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter)) > { >gphi *phi = iter.phi (); >tree name = gimple_phi_result (phi); > >if (import_p (name) && range_defined_in_block (r, name, bb)) > - m_tmp_phi_cache.set_global_range (name, r); > -} > - for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter)) > -{ > - gphi *phi = iter.phi (); > - tree name = gimple_phi_result (phi); > - > - if (m_tmp_phi_cache.get_global_range (r, name)) > - set_cache (r, name); > + { > + unsigned v = SSA_NAME_VERSION (name); > + set_cache (r, name); > + bitmap_set_bit (phi_set, v); > + // Pretend we don't have a cache entry for this name until > + // we're done with all PHIs. > + bitmap_clear_bit (m_has_cache_entry, v); > + } > } > + bitmap_ior_into (m_has_cache_entry, phi_set); > } > > // Compute ranges defined in the current block, or exported to the > diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h > index 57a9ae9bdcd..77c92c07bc1 100644 > --- a/gcc/gimple-range-path.h > +++ b/gcc/gimple-range-path.h > @@ -82,8 +82,6 @@ private: >// Range cache for SSA names. >ssa_global_cache *m_cache; > > - ssa_global_cache m_tmp_phi_cache; > - >// Set for each SSA that has an active entry in the cache. >bitmap m_has_cache_entry; > > -- > 2.31.1 >
Re: [PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix
On Thu, Nov 04, 2021 at 04:23:43PM +0800, Chung-Lin Tang wrote: > 2021-11-04 Chung-Lin Tang > Thomas Schwinge > > PR fortran/90030 > > gcc/fortran/ChangeLog: > > * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer > to char_type_node, add gcc_assert of POINTER_TYPE_P. > (gfc_trans_omp_array_section): Likewise. > (gfc_trans_omp_clauses): Likewise. > > gcc/testsuite/ChangeLog: > > * gfortran.dg/goacc/finalize-1.f: Adjust scan test. > * gfortran.dg/gomp/affinity-clause-1.f90: Likewise. > * gfortran.dg/gomp/affinity-clause-5.f90: Likewise. > * gfortran.dg/gomp/defaultmap-4.f90: Likewise. > * gfortran.dg/gomp/defaultmap-5.f90: Likewise. > * gfortran.dg/gomp/defaultmap-6.f90: Likewise. > * gfortran.dg/gomp/map-3.f90: Likewise. > * gfortran.dg/gomp/pr78260-2.f90: Likewise. > * gfortran.dg/gomp/pr78260-3.f90: Likewise. > > libgomp/ChangeLog: > > * testsuite/libgomp.oacc-fortran/pr90030.f90: New test. > * testsuite/libgomp.fortran/pr90030.f90: New test. Ok, thanks. Jakub
Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]
On 12/1/21 9:06 AM, Qing Zhao wrote: >> On Dec 1, 2021, at 3:01 AM, Richard Biener >> wrote: >> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init >> instead of fixing up things at expansion time. > > Agreed. > > So, Peter, will you update the routine “is_var_need_auto_init” in gimplify.c > to exclude OPAQUE_TYPE to fix this issue? Sure, I can give that a try and verify it fixes the ICE too. Peter
Re: [PATCH] Allow loop crossing paths in back threader copier.
On Wed, Dec 1, 2021 at 2:36 PM Richard Biener wrote: > > On Tue, Nov 30, 2021 at 4:48 PM Aldy Hernandez via Gcc-patches > wrote: > > > > We are currently restricting loop crossing paths in the generic copier > > used by the back threader, but we should be able to handle them after > > loop_done has completed. > > But this isn't a cost thing but a correctness (as in, can we update > loops properly) > thing. We are passing 'loop' down to copy_bbs and very much rely on > seeing either BBs of that loop or copying complete subloops of it. > > You will likely see excess loop fixup caused by this or ICEs in case the > caller of this function will not always set LOOPS_NEED_FIXUP > (not that copy_bbs handling of loops is perfect). Ughh, fair enough. In which case at least I've commented in the PR what the problem is here. Perhaps we could benefit from a BB copier that could handle loops, which is way above my circle of competence ;-). Aldy
Re: [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html
Hi Eric, On Wed, 24 Nov 2021, Eric Gallager wrote: > This next patch does more than just removing old stuff: it adds an > extra sentence to describe a shell command used to generate a list, so > to verify that I've got the shell command right, I'm asking for a > review. -There are several other files in this size range, which I have left +There are several other files in this size range, which are left out because touching them at all is unwise (reload, the Fortran front end). You can try, but I am not responsible for any damage to your sanity which may result. while we are touching this, how about toning the second half of this paragraph down and moving away from first person? How about something like "...breaking them up likely may prove rather difficult" (or similar, happy to leave this to you as native spaker)? +Note that the list of large files in this section is generated with the +following shell command, run from the gcc subdirectory: + + + du -sh *.{c,h,cc} */*.{c,h,cc} | sort -hr | grep -v fortran | head -n 14 + This looks like it does what you want it to do. (Pulling up the database courses I have given in a previous life, I'd pull in the grep before the sort - O(n) filtering before O(n·log n) processing - but even on my notebook both are instantenous. ;-) ) Please consider the two suggestions above and commit the result; just share the final patch here (no review required). Thanks, Gerald
Re: [PATCH] dwarf: Multi-register CFI address support.
On Thu, Nov 11, 2021 at 06:12:50PM +, Hafiz Abid Qadeer wrote: > * dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg. > (dw_frame_pointer_regnum): Likewise. > (new_cfi_row): Use set_by_dwreg. > (get_cfa_from_loc_descr): Use set_by_dwreg. Support register spans. > handle DW_OP_bregx with DW_OP_breg{0-31}. Support DW_OP_lit*, 2 spaces instead of 1 before Support above. > + cfa->reg.set_by_dwreg ((op == DW_OP_bregx) > + ? (ptr->dw_loc_oprnd1.v.val_int) : (op - DW_OP_breg0)); Formatting. All those 3 () inner pairs are unnecessary, and it would be nicer to use a temporary, like: unsigned regno = (op == DW_OP_bregx ? ptr->dw_loc_oprnd1.v.val_int : op - DW_OP_breg0); cfa->reg.set_by_dwreg (regno); > + unsigned int regno = (op == DW_OP_bregx) > + ? (ptr->dw_loc_oprnd1.v.val_int) : (op - DW_OP_breg0); With the ()s similarly, and also ? should be below op. > + gcc_assert (regno == (cfa->reg.reg - 1)); Again, the inner () pair is unnecessary. > + cfa->reg.span_width = (cfa->offset.to_constant () / 8); And here the outer () pair. > + > +#if CHECKING_P Please use if (CHECKING_P) instead of #if. > + /* Ensure that the above assumption is accurate. */ > + for (unsigned int i = 0; i < result.span; i++) > + { > + gcc_assert (known_eq (GET_MODE_SIZE (GET_MODE (XVECEXP (span, > + 0, i))), > + result.span_width)); > + gcc_assert (REG_P (XVECEXP (span, 0, i))); > + gcc_assert (dwf_regno (XVECEXP (span, 0, i)) == result.reg + i); > + } > +#endif > +} > + > + return result; > @@ -3135,7 +3267,8 @@ static unsigned int > execute_dwarf2_frame (void) > { >/* Different HARD_FRAME_POINTER_REGNUM might coexist in the same file. */ > - dw_frame_pointer_regnum = DWARF_FRAME_REGNUM (HARD_FRAME_POINTER_REGNUM); > + dw_frame_pointer_regnum = dwf_cfa_reg (gen_rtx_REG > + (Pmode, HARD_FRAME_POINTER_REGNUM)); If at all possible, avoid function name on one line and ( with first argument on another one. In the above case it can be easily avoided by dw_frame_pointer_regnum = dwf_cfa_reg (gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM)); > +static void > +build_breg_loc (struct dw_loc_descr_node **head, unsigned int regno) > +{ > + if (regno <= 31) > +add_loc_descr (head, new_loc_descr ((enum dwarf_location_atom) > + (DW_OP_breg0 + regno), 0, 0)); Bad formatting, (DW_OP_breg0 should be below (enum > + > + /* deal with the remaining registers in the span. */ Capital letter on Deal > + for (int i = (reg.span - 2); i >= 0; i--) Please remove the redundant inner () pair. > --- a/gcc/dwarf2out.h > +++ b/gcc/dwarf2out.h > @@ -119,6 +119,38 @@ struct GTY(()) dw_fde_node { > }; > > > +/* This represents a register, in DWARF_FRAME_REGNUM space, for use in CFA > + definitions and expressions. > + Most architectures only need a single register number, but some (amdgcn) > + have pointers that span multiple registers. DWARF permits arbitrary > + register sets but existing use-cases only require contiguous register > + sets, as represented here. */ > +struct GTY(()) cfa_reg { > + unsigned int reg; > + unsigned short span; > + unsigned short span_width; /* A.K.A. register mode size. */ > + > + cfa_reg& set_by_dwreg (unsigned int r) > +{ > + reg = r; > + span = 1; > + span_width = 0; /* Unknown size (permitted when span == 1). */ > + return *this; > +} > + > + bool operator== (const cfa_reg other) const The normal C++ way would be (const cfa_reg &other), wouldn't it? Or otherwise the const keyword doesn't make much sense. > +{ > + return (reg == other.reg && span == other.span > + && (span_width == other.span_width > + || (span == 1 > + && (span_width == 0 || other.span_width == 0; > +} > + bool operator!= (const cfa_reg other) const Likewise. Please add an empty line above operator!= > +{ > + return !(*this == other); > +} > +}; > + Otherwise LGTM. Jakub
[PATCH] rs6000: Mirror fix for PR102347 into the new builtins support
Hi! Recently Kewen fixed a problem in the old builtins support where rs6000_builtin_decl prematurely indicated that a target builtin is unavailable. This also needs to be done for the new builtins support, but in this case we have to ensure the error message is still produced from the overload support in rs6000-c.c. Unfortunately, this is less straightforward than it could be, because header file includes need to be adjusted to make this happen. Someday we'll consolidate all the builtin code in one file and this won't have to be so ugly. Bootstrapped and tested on powerpc64le-linux-gnu with no regressions. Is this okay for trunk? Thanks! Bill 2021-12-01 Bill Schmidt gcc/ PR target/102347 * config/rs6000/rs6000-c.c (rs6000-builtins.h): Stop including. (rs6000-internal.h): Include. (altivec_resolve_new_overloaded_builtin): Move call to rs6000_invalid_new_builtin here from rs6000_new_builtin_decl. * config/rs6000/rs6000-call.c (rs6000-builtins.h): Stop including. (rs6000_invalid_new_builtin): Remove static qualifier. (rs6000_new_builtin_decl): Remove test for supported builtin. * config/rs6000/rs6000-internal.h (rs6000-builtins.h): Include. (rs6000_invalid_new_builtin): Declare. * config/rs6000/rs6000.c (rs6000-builtins.h): Don't include. --- gcc/config/rs6000/rs6000-c.c| 11 +++ gcc/config/rs6000/rs6000-call.c | 9 + gcc/config/rs6000/rs6000-internal.h | 3 +++ gcc/config/rs6000/rs6000.c | 1 - 4 files changed, 11 insertions(+), 13 deletions(-) diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c index 5eeac9d4c06..8e83d97e72f 100644 --- a/gcc/config/rs6000/rs6000-c.c +++ b/gcc/config/rs6000/rs6000-c.c @@ -35,7 +35,7 @@ #include "langhooks.h" #include "c/c-tree.h" -#include "rs6000-builtins.h" +#include "rs6000-internal.h" static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *); @@ -2987,11 +2987,14 @@ altivec_resolve_new_overloaded_builtin (location_t loc, tree fndecl, const char *name = rs6000_overload_info[adj_fcode].ovld_name; if (!supported) { + /* Indicate that the instantiation of the overloaded builtin + name is not available with the target flags in effect. */ + rs6000_gen_builtins fcode = (rs6000_gen_builtins) instance->bifid; + rs6000_invalid_new_builtin (fcode); + /* Provide clarity of the relationship between the overload + and the instantiation. */ const char *internal_name = rs6000_builtin_info_x[instance->bifid].bifname; - /* An error message making reference to the name of the - non-overloaded function has already been issued. Add - clarification of the previous message. */ rich_location richloc (line_table, input_location); inform (&richloc, "overloaded builtin %qs is implemented by builtin %qs", diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index cd477fa4876..01688c4169d 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -69,7 +69,6 @@ #include "opts.h" #include "rs6000-internal.h" -#include "rs6000-builtins.h" #if TARGET_MACHO #include "gstab.h" /* for N_SLINE */ @@ -11905,7 +11904,7 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode) /* Raise an error message for a builtin function that is called without the appropriate target options being set. */ -static void +void rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode) { size_t j = (size_t) fncode; @@ -16624,12 +16623,6 @@ rs6000_new_builtin_decl (unsigned code, bool /* initialize_p */) if (fcode >= RS6000_OVLD_MAX) return error_mark_node; - if (!rs6000_new_builtin_is_supported (fcode)) -{ - rs6000_invalid_new_builtin (fcode); - return error_mark_node; -} - return rs6000_builtin_decls_x[code]; } diff --git a/gcc/config/rs6000/rs6000-internal.h b/gcc/config/rs6000/rs6000-internal.h index 88cf9bd5692..a880fd37618 100644 --- a/gcc/config/rs6000/rs6000-internal.h +++ b/gcc/config/rs6000/rs6000-internal.h @@ -22,6 +22,8 @@ #ifndef GCC_RS6000_INTERNAL_H #define GCC_RS6000_INTERNAL_H +#include "rs6000-builtins.h" + /* Structure used to define the rs6000 stack */ typedef struct rs6000_stack { int reload_completed;/* stack info won't change from here on */ @@ -140,6 +142,7 @@ extern void rs6000_output_mi_thunk (FILE *file, extern bool rs6000_output_addr_const_extra (FILE *file, rtx x); extern bool rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi); extern tree rs6000_build_builtin_va_list (void); +extern void rs6000_invalid_new_builtin (rs6000_gen_builtins fncode); extern void rs6000_va_start (tree valist, rtx nextarg); extern tree rs6000_gimplify_va_arg (tree valist, tree type, gimple_
[PATCH] c++: Fix for decltype(auto) and parenthesized expr [PR103403]
In r11-4758, I tried to fix this problem: int &&i = 0; decltype(auto) j = i; // should behave like int &&j = i; error wherein do_auto_deduction was getting confused with a REFERENCE_REF_P and it didn't realize its operand was a name, not an expression, and deduced the wrong type. Unfortunately that fix broke this: int&& r = 1; decltype(auto) rr = (r); where 'rr' should be 'int &' since '(r)' is an expression, not a name. But because I stripped the INDIRECT_REF with the r11-4758 change, we deduced 'rr's type as if decltype had gotten a name, resulting in 'int &&'. I suspect I thought that the REF_PARENTHESIZED_P check when setting 'bool id' in do_auto_deduction would handle the (r) case, but that's not the case; while the documentation for REF_PARENTHESIZED_P specifically says it can be set in INDIRECT_REF, we don't actually do so. This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that do_auto_deduction can use it. It also removes code in maybe_undo_parenthesized_ref that I think is dead -- and we don't hit it while running dg.exp. To adduce more data, it also looks dead here: https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 11? PR c++/103403 gcc/cp/ChangeLog: * cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref doesn't change its argument. * pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees. Also REF_PARENTHESIZED_P for REFERENCE_REF_P. * semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on REFERENCE_REF_P trees too. (maybe_undo_parenthesized_ref): Remove dead code. gcc/testsuite/ChangeLog: * g++.dg/cpp1y/decltype-auto2.C: New test. * g++.dg/cpp1y/decltype-auto3.C: New test. --- gcc/cp/cp-gimplify.c| 3 ++- gcc/cp/pt.c | 5 ++--- gcc/cp/semantics.c | 18 -- gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C | 12 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C | 12 5 files changed, 32 insertions(+), 18 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c index 0a002db14e7..e3ede02a48e 100644 --- a/gcc/cp/cp-gimplify.c +++ b/gcc/cp/cp-gimplify.c @@ -2421,7 +2421,8 @@ cp_fold (tree x) if (REF_PARENTHESIZED_P (x)) { tree p = maybe_undo_parenthesized_ref (x); - return cp_fold (p); + if (p != x) + return cp_fold (p); } goto unary; diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f4b9d9673fb..c5b41b57028 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -29889,11 +29889,10 @@ do_auto_deduction (tree type, tree init, tree auto_node, else if (AUTO_IS_DECLTYPE (auto_node)) { tree stripped_init = tree_strip_any_location_wrapper (init); - if (REFERENCE_REF_P (stripped_init)) - stripped_init = TREE_OPERAND (stripped_init, 0); bool id = (DECL_P (stripped_init) || ((TREE_CODE (init) == COMPONENT_REF - || TREE_CODE (init) == SCOPE_REF) + || TREE_CODE (init) == SCOPE_REF + || REFERENCE_REF_P (init)) && !REF_PARENTHESIZED_P (init))); tree deduced = finish_decltype_type (init, id, complain); deduced = canonicalize_type_argument (deduced, complain); diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index cd1956497f8..edba4b60e10 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -2049,7 +2049,8 @@ force_paren_expr (tree expr, bool even_uneval) return expr; if (TREE_CODE (expr) == COMPONENT_REF - || TREE_CODE (expr) == SCOPE_REF) + || TREE_CODE (expr) == SCOPE_REF + || REFERENCE_REF_P (expr)) REF_PARENTHESIZED_P (expr) = true; else if (DECL_P (tree_strip_any_location_wrapper (expr))) { @@ -2072,19 +2073,8 @@ maybe_undo_parenthesized_ref (tree t) if (cxx_dialect < cxx14) return t; - if (INDIRECT_REF_P (t) && REF_PARENTHESIZED_P (t)) -{ - t = TREE_OPERAND (t, 0); - while (TREE_CODE (t) == NON_LVALUE_EXPR -|| TREE_CODE (t) == NOP_EXPR) - t = TREE_OPERAND (t, 0); - - gcc_assert (TREE_CODE (t) == ADDR_EXPR - || TREE_CODE (t) == STATIC_CAST_EXPR); - t = TREE_OPERAND (t, 0); -} - else if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT_EXPR) -&& REF_PARENTHESIZED_P (t)) + if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT_EXPR) + && REF_PARENTHESIZED_P (t)) t = TREE_OPERAND (t, 0); return t; diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C b/gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C new file mode 100644 index 000..56e011e36f4 --- /dev/null +++ b/gcc/testsuit
[PATCH] c++: ICE with unnamed tparm and concept [PR103408]
Here we crash when issuing the "constraint C has type T, not bool" error, because pp_cxx_parameter_mapping wasn't prepared to see an anonymous template parameter. With this patch we print error: constraint 'auto() [with = 0]' has type '', not 'bool' The "" is what this patch adds. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/103408 gcc/cp/ChangeLog: * cxx-pretty-print.c (pp_cxx_parameter_mapping): Print "" rather than crash on an unnamed template parameter. gcc/testsuite/ChangeLog: * g++.dg/cpp23/concepts-err1.C: New test. --- gcc/cp/cxx-pretty-print.c | 4 +++- gcc/testsuite/g++.dg/cpp23/concepts-err1.C | 6 ++ 2 files changed, 9 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp23/concepts-err1.C diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c index 25cabfee39f..3ea357deb80 100644 --- a/gcc/cp/cxx-pretty-print.c +++ b/gcc/cp/cxx-pretty-print.c @@ -2891,8 +2891,10 @@ pp_cxx_parameter_mapping (cxx_pretty_printer *pp, tree map) if (TYPE_P (parm)) pp->type_id (parm); + else if (tree name = DECL_NAME (TEMPLATE_PARM_DECL (parm))) + pp_cxx_tree_identifier (pp, name); else - pp_cxx_tree_identifier (pp, DECL_NAME (TEMPLATE_PARM_DECL (parm))); + pp->translate_string (""); pp_cxx_whitespace (pp); pp_equal (pp); diff --git a/gcc/testsuite/g++.dg/cpp23/concepts-err1.C b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C new file mode 100644 index 000..e5bdc542bad --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C @@ -0,0 +1,6 @@ +// PR c++/103408 +// { dg-do compile { target c++23 } } + +template +concept C = auto([]{}); // { dg-error "constraint" } +static_assert(C<0>); // { dg-error "non-constant condition for static assertion" } base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca -- 2.33.1
[PATCH] c++: ICE with auto(0) in requires-expression [PR103408]
Here we ICE on int f() requires (auto(0)); in do_auto_deduction when handling the auto: we're in a non-templated requires-expression which are parsed under processing_template_decl == 1 and empty current_template_parms, so 'current_template_args ()' will crash. This code is invalid as per "C++20 CA378: Remove non-templated constrained functions", but of course we shouldn't crash. Since in the scenario above it's expected that current_template_parms is null, I've just added a check, and let grokfndecl issue an error. I guess another approch would be to fake up a template parameter list before calling do_auto_deduction. For good measure, I've added several well-formed cases with auto(x) in a requires-expression. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/103408 gcc/cp/ChangeLog: * pt.c (do_auto_deduction): Check current_template_parms before current_template_args (). gcc/testsuite/ChangeLog: * g++.dg/cpp23/auto-fncast9.C: New test. --- gcc/cp/pt.c | 2 +- gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++ 2 files changed, 28 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f4b9d9673fb..012ca5d06c0 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree auto_node, (but we still may have used them for constraint checking above). */; else if (context == adc_unify) targs = add_to_template_args (outer_targs, targs); - else if (processing_template_decl) + else if (processing_template_decl && current_template_parms) targs = add_to_template_args (current_template_args (), targs); return tsubst (type, targs, complain, NULL_TREE); } diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C new file mode 100644 index 000..45a0ff9b460 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C @@ -0,0 +1,27 @@ +// PR c++/103408 +// { dg-do compile { target c++23 } } + +int bad1() requires (auto(true)); // { dg-error "constraints on a non-templated function" } + +template +int f1() requires (auto(B)); + +template +struct S { T t; constexpr operator bool() { return true; } }; + +int bad2() requires (bool(S{1})); // { dg-error "constraints on a non-templated function" } +int bad3() requires (bool(S(1))); // { dg-error "constraints on a non-templated function" } + +template +int f2() requires (bool(S{N})); + +template +int f3() requires (bool(S(N))); + +void +g () +{ + f1(); + f2<42>(); + f3<42>(); +} base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca -- 2.33.1
[PATCH] c++: Handle auto(x) in parameter-declaration-clause [PR103401]
In C++23, auto(x) is valid, so decltype(auto(x)) should also be valid, so void f(decltype(auto(0))); should be just as void f(int); but currently, everytime we see 'auto' in a parameter-declaration-clause, we try to synthesize_implicit_template_parm for it, creating a new template parameter list. The code above actually has us calling s_i_t_p twice; once from cp_parser_decltype_expr -> cp_parser_postfix_expression which fails and then again from cp_parser_decltype_expr -> cp_parser_expression. So it looks like we have f and we accept ill-formed code. So we need to be more careful about synthesizing the implicit template parameter. cp_parser_postfix_expression looked like a sensible place. The r11-1913 change is OK: we need to make sure that we see '(auto)' after decltype to go ahead with 'decltype(auto)'. Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? PR c++/103401 gcc/cp/ChangeLog: * parser.c (cp_parser_postfix_expression): Set auto_is_implicit_function_template_parm_p when parsing a postfix expression. gcc/testsuite/ChangeLog: * g++.dg/cpp23/auto-fncast7.C: New test. * g++.dg/cpp23/auto-fncast8.C: New test. --- gcc/cp/parser.c | 2 ++ gcc/testsuite/g++.dg/cpp23/auto-fncast7.C | 9 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C | 28 +++ 3 files changed, 39 insertions(+) create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast7.C create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 55e6a1a8b3a..c43b180f888 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -7508,6 +7508,8 @@ cp_parser_postfix_expression (cp_parser *parser, bool address_p, bool cast_p, looking at a functional cast. We could also be looking at an id-expression. So, we try the functional cast, and if that doesn't work we fall back to the primary-expression. */ + auto cleanup = make_temp_override + (parser->auto_is_implicit_function_template_parm_p, false); cp_parser_parse_tentatively (parser); /* Look for the simple-type-specifier. */ ++parser->prevent_constrained_type_specifiers; diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C new file mode 100644 index 000..763164f3e5b --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C @@ -0,0 +1,9 @@ +// PR c++/103401 +// { dg-do compile { target c++23 } } + +void f(decltype(auto(0))) { } + +int main() +{ + f(0); // { dg-error "no matching function" } +} diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C new file mode 100644 index 000..760827a5d6e --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C @@ -0,0 +1,28 @@ +// PR c++/103401 +// { dg-do compile { target c++23 } } + +void f1 (decltype(auto(0))); +void f2 (decltype(auto{0})); +void f3 (int = auto(42)); +void f4 (int = auto{42}); +void f5 (decltype(auto(0)) = auto(42)); +void f6 (auto (x)); +void f7 (int[auto(10)]); +void f8 (int[auto{10}]); + +void +g () +{ + f1 (1); + f2 (1); + f3 (); + f3 (1); + f4 (); + f4 (1); + f5 (); + f5 (1); + f6 ('a'); + int a[10]; + f7 (&a[0]); + f8 (&a[0]); +} base-commit: e5440bc08e07fd491dcccd47e1b86a5985ee117c -- 2.33.1
Patch ping
Hi! I'd like to ping following patches: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583289.html c++, dyninit: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876] While Jason has added -fimplicit-constexpr which can fix up some cases, it still doesn't handle non-inline functions, or functions defined after the dynamic initialization, or functions defined in other TUs, so I think trying to optimize those after IPA is useful and as has been said, other compilers do it late as well. https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584802.html pch, v2: Add support for PCH for relocatable executables I do support Iain's patches to make PCH configure time disableable, but if people do want it and can't use it because of PIEs, it seems a pitty not to support it when it is so easy. Thanks Jakub
PING [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html
Hi, I'd like to ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585294.html On Wed, Nov 24, 2021 at 2:11 AM Eric Gallager wrote: > > On Tue, Nov 23, 2021 at 6:27 PM Eric Gallager wrote: > > > > On Fri, Nov 19, 2021 at 8:14 AM Eric Gallager wrote: > > > > > > On Fri, Nov 19, 2021 at 1:48 AM Gerald Pfeifer wrote: > > > > > > > > Cool, thank you! > > > > > > > > Please feel free to commit patches like this without asking for > > > > approval (though I'm happy to review and approve). > > > > > > > > Gerald > > > > > > > > > > OK thanks; committed as dbaebcd > > > > I've also committed one to remove the section on traditional C now: > > https://gcc.gnu.org/git/?p=gcc-wwwdocs.git;a=commitdiff;h=ca83c13ad6bf0d351220dafa36264ebc7a6b7816 > > This next patch does more than just removing old stuff: it adds an > extra sentence to describe a shell command used to generate a list, so > to verify that I've got the shell command right, I'm asking for a > review. > > Thanks, > Eric Gallager
[committed] libstdc++: Clear RB tree after moving elements [PR103501]
Tested x86_64-linux, pushed to trunk. If the allocator-extended move constructor move-constructs each element into the new container, the contents of the old container are left in moved-from states. We cannot know if those states preserve the container's ordering and uniqueness guarantees, so just erase all moved-from elements. libstdc++-v3/ChangeLog: PR libstdc++/103501 * include/bits/stl_tree.h (_Rb_tree(_Rb_tree&&, false_type)): Clear container if elements have been moved-from. * testsuite/23_containers/map/allocator/move_cons.cc: Expect moved-from container to be empty. * testsuite/23_containers/multimap/allocator/move_cons.cc: Likewise. * testsuite/23_containers/multiset/allocator/103501.cc: New test. * testsuite/23_containers/set/allocator/103501.cc: New test. --- libstdc++-v3/include/bits/stl_tree.h | 6 ++-- .../23_containers/map/allocator/move_cons.cc | 2 +- .../multimap/allocator/move_cons.cc | 2 +- .../multiset/allocator/103501.cc | 32 +++ .../23_containers/set/allocator/103501.cc | 32 +++ 5 files changed, 70 insertions(+), 4 deletions(-) create mode 100644 libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc create mode 100644 libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h index 0692525be57..55b8c9c7cb2 100644 --- a/libstdc++-v3/include/bits/stl_tree.h +++ b/libstdc++-v3/include/bits/stl_tree.h @@ -1644,9 +1644,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _M_move_data(__x, true_type()); else { + constexpr bool __move = !__move_if_noexcept_cond::value; _Alloc_node __an(*this); - _M_root() = - _M_copy::value>(__x, __an); + _M_root() = _M_copy<__move>(__x, __an); + if _GLIBCXX17_CONSTEXPR (__move) + __x.clear(); } } diff --git a/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc b/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc index 5beb26276b0..b82d3532135 100644 --- a/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc +++ b/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc @@ -41,7 +41,7 @@ void test01() VERIFY(1 == v1.get_allocator().get_personality()); VERIFY(2 == v2.get_allocator().get_personality()); - VERIFY( v1[1].empty() ); + VERIFY( v1.empty() ); VERIFY( v2[1] == str ); } diff --git a/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc b/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc index 91f2d0be508..37db0f005d1 100644 --- a/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc +++ b/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc @@ -41,7 +41,7 @@ void test01() VERIFY(1 == v1.get_allocator().get_personality()); VERIFY(2 == v2.get_allocator().get_personality()); - VERIFY( v1.begin()->second.empty() ); + VERIFY( v1.empty() ); VERIFY( v2.begin()->second == str ); } diff --git a/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc b/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc new file mode 100644 index 000..24f657eceba --- /dev/null +++ b/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc @@ -0,0 +1,32 @@ +// { dg-do run { target c++11 } } + +// PR libstdc++/103501 + +#include +#include +#include + +struct Y +{ + int i; + + Y(int i) : i(i) { } + Y(const Y& y) noexcept : i(y.i) { } + Y(Y&& y) noexcept : i(y.i) { y.i = -y.i; } + + bool operator<(const Y& rhs) const { return i < rhs.i; } +}; + +int main() +{ + using Alloc = __gnu_test::uneq_allocator; + std::multiset, Alloc> s1{ {1, 2, 3}, Alloc(1)}; + std::multiset, Alloc> s2{ std::move(s1), Alloc(2) }; + const Y* prev = nullptr; + for (const Y& y : s1) + { +if (prev) + VERIFY( !(y < *prev) ); +prev = &y; + } +} diff --git a/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc b/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc new file mode 100644 index 000..7267cf9663f --- /dev/null +++ b/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc @@ -0,0 +1,32 @@ +// { dg-do run { target c++11 } } + +// PR libstdc++/103501 + +#include +#include +#include + +struct X +{ + int i; + + X(int i) : i(i) { } + X(const X& x) noexcept : i(x.i) { } + X(X&& x) noexcept : i(x.i) { x.i = -1; } + + bool operator<(const X& rhs) const { return i < rhs.i; } +}; + +int main() +{ + using Alloc = __gnu_test::uneq_allocator; + std::set, Alloc> s1{ {1, 2, 3}, Alloc(1)}; + std::set, Alloc> s2{ std::move(s1), Alloc(2) }; + const X* prev = nullptr; + for (const X& x : s1) + { +if (prev) + VERIFY( *prev < x ); +prev = &x; + } +} -- 2.31.1
[committed] libstdc++: Define std::__is_constant_evaluated() for internal use
Tested x86_64-linux, pushed to trunk. This adds std::__is_constant_evaluated() as a C++11 wrapper for __builtin_is_constant_evaluated, but just returning false if the built-in isn't supported by the compiler. This allows us to use it throughout the library without checking __has_builtin every time. Some uses in std::vector and std::string can only be constexpr when the std::is_constant_evaluated() function actually works, so we might as well guard them with a relevant macro and call that function directly, rather than the built-in or std::__is_constant_evaluated(). The remaining checks of the __cpp_lib_is_constant_evaluated macro could now be replaced by checking __cplusplus >= 202002 instead, but there's no practical difference. We still need some kind of preprocessor check there anyway. libstdc++-v3/ChangeLog: * doc/doxygen/user.cfg.in (PREDEFINED): Change macro name. * include/bits/allocator.h (allocate, deallocate): Use std::__is_constant_evaluated() unconditionally, instead of checking whether std::is_constant_evaluated() (or the built-in) can be used. * include/bits/basic_string.h: Check new macro. call std::is_constant_evaluated() directly in C++20-only code that is guarded by a suitable macro. * include/bits/basic_string.tcc: Likewise. * include/bits/c++config (__is_constant_evaluated): Define. (_GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED): Replace with ... (_GLIBCXX_HAVE_IS_CONSTANT_EVALUATED): New macro. * include/bits/char_traits.h (char_traits): Replace conditional calls to std::is_constant_evaluated with unconditional calls to std::__is_constant_evaluated. * include/bits/cow_string.h: Use new macro. * include/bits/ranges_algobase.h (__copy_or_move): Replace conditional calls to std::is_constant_evaluated with unconditional calls to std::__is_constant_evaluated. (__copy_or_move_backward, __fill_n_fn): Likewise. * include/bits/ranges_cmp.h (ranges::less): Likewise. * include/bits/stl_algobase.h (lexicographical_compare_three_way): Likewise. * include/bits/stl_bvector.h: Call std::is_constant_evaluated directly in C++20-only code that is guarded by a suitable macro. * include/bits/stl_construct.h (_Construct, _Destroy, _Destroy_n): Replace is_constant_evaluated with __is_constant_evaluated. * include/bits/stl_function.h (greater, less, greater_equal) (less_equal): Replace __builtin_is_constant_evaluated and __builtin_constant_p with __is_constant_evaluated. * include/bits/stl_vector.h: Call std::is_constant_evaluated() in C++20-only code. * include/debug/helper_functions.h (__check_singular): Use __is_constant_evaluated instead of built-in, or remove check entirely. * include/std/array (operator<=>): Use __is_constant_evaluated unconditionally. * include/std/bit (__bit_ceil): Likewise. * include/std/type_traits (is_constant_evaluated): Define using 'if consteval' if possible. * include/std/version: Use new macro. * libsupc++/compare: Use __is_constant_evaluated instead of __builtin_is_constant_evaluated. * testsuite/23_containers/array/tuple_interface/get_neg.cc: Adjust dg-error lines. --- libstdc++-v3/doc/doxygen/user.cfg.in | 2 +- libstdc++-v3/include/bits/allocator.h | 8 +- libstdc++-v3/include/bits/basic_string.h | 6 +- libstdc++-v3/include/bits/basic_string.tcc| 4 +- libstdc++-v3/include/bits/c++config | 30 +++- libstdc++-v3/include/bits/char_traits.h | 129 +- libstdc++-v3/include/bits/cow_string.h| 2 +- libstdc++-v3/include/bits/ranges_algobase.h | 12 +- libstdc++-v3/include/bits/ranges_cmp.h| 5 +- libstdc++-v3/include/bits/stl_algobase.h | 5 +- libstdc++-v3/include/bits/stl_bvector.h | 8 +- libstdc++-v3/include/bits/stl_construct.h | 12 +- libstdc++-v3/include/bits/stl_function.h | 24 +--- libstdc++-v3/include/bits/stl_vector.h| 5 +- libstdc++-v3/include/debug/helper_functions.h | 15 +- libstdc++-v3/include/std/array| 4 +- libstdc++-v3/include/std/bit | 5 +- libstdc++-v3/include/std/type_traits | 11 +- libstdc++-v3/include/std/version | 4 +- libstdc++-v3/libsupc++/compare| 2 +- .../array/tuple_interface/get_neg.cc | 6 +- 21 files changed, 142 insertions(+), 157 deletions(-) diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in b/libstdc++-v3/doc/doxygen/user.cfg.in index 17cd6fc1c0e..2f15f2c1b82 100644 --- a/libstdc++-v3/doc/doxygen/user.cfg.in +++ b/libstdc++-v3/doc/doxygen/user.cfg.in @@ -2399,7 +2399,7 @@ PREDEFINED = __cplusplus=202002L \ "__has_builtin(x)
[committed] libstdc++: Optimize ref-count updates in COW std::string
Tested x86_64-linux, pushed to trunk. Most ref-count updates in the COW string are done via the functions in , which will use non-atomic ops when the program is known to be single-threaded. The _M_is_leaked() and _M_is_shared() functions use __atomic_load_n directly, because doesn't provide a load operation. Those functions can check the __is_single_threaded() predicate to avoid using __atomic_load_n when not needed. The move constructor for the fully-dynamic-string increments the ref-count by either 2 or 1, for leaked or non-leaked strings respectively. That can be changed to use a non-atomic store of 1 for all non-shared strings. It can be non-atomic because even if the program is multi-threaded, conflicting access to the rvalue object while it's being moved from would be data race anyway. It can store 1 directly for all non-shared strings because it doesn't matter whether the initial refcount was -1 or 0, it should be 1 after the move constructor creates a second owner. libstdc++-v3/ChangeLog: * include/bits/cow_string.h (basic_string::_M_is_leaked): Use non-atomic load when __is_single_threaded() is true. (basic_string::_M_is_shared): Likewise. (basic_string::(basic_string&&)) [_GLIBCXX_FULLY_DYNAMIC_STRING]: Use non-atomic store when rvalue is not shared. --- libstdc++-v3/include/bits/cow_string.h | 24 1 file changed, 12 insertions(+), 12 deletions(-) diff --git a/libstdc++-v3/include/bits/cow_string.h b/libstdc++-v3/include/bits/cow_string.h index ced395b80b8..4fae1d02981 100644 --- a/libstdc++-v3/include/bits/cow_string.h +++ b/libstdc++-v3/include/bits/cow_string.h @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * destroy the empty-string _Rep object. * * All but the last paragraph is considered pretty conventional - * for a C++ string implementation. + * for a Copy-On-Write C++ string implementation. */ // 21.3 Template class basic_string template @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // so we need to use an atomic load. However, _M_is_leaked // predicate does not change concurrently (i.e. the string is either // leaked or not), so a relaxed load is enough. - return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; -#else - return this->_M_refcount < 0; + if (!__gnu_cxx::__is_single_threaded()) + return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0; #endif + return this->_M_refcount < 0; } bool @@ -222,10 +222,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION // but one reference concurrently with this check, so we need this // load to be acquire to synchronize with release fetch_and_add in // _M_dispose. - return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0; -#else - return this->_M_refcount > 0; + if (!__gnu_cxx::__is_single_threaded()) + return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0; #endif + return this->_M_refcount > 0; } void @@ -629,12 +629,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION #else // Rather than allocate an empty string for the rvalue string, // just share ownership with it by incrementing the reference count. - // If the rvalue string was "leaked" then it was the unique owner, - // so need an extra increment to indicate shared ownership. - if (_M_rep()->_M_is_leaked()) - __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 2); - else + // If the rvalue string was the unique owner then there are exactly + // two owners now. + if (_M_rep()->_M_is_shared()) __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1); + else + _M_rep()->_M_refcount = 1; #endif } -- 2.31.1
[committed] libstdc++: Avoid unwanted allocations in filesystem::path
Tested x86_64-linux, pushed to trunk. When using COW strings, accessing _M_pathname[0] and similar non-const accessors can cause the string to "leak", meaning it reallocates itself if it shares ownership with another string object. This causes test failures for --enable-fully-dynamic-string builds: /home/jwakely/src/gcc/libstdc++-v3/testsuite/experimental/filesystem/path/construct/90634.cc:62: void test01(): Assertion 'bytes_allocated == 0' failed. FAIL: experimental/filesystem/path/construct/90634.cc execution test This FAIL happens because the fully-dynamic move constructor results in shared ownership, so for path(std::move(std::string("foo"))) the _M_pathname member shares ownership with the temporary, and the non-const accesses in _M_split_cmpts() cause a new copy of the string to be allocated. This un-sharing is wasteful, and entirely unnecessary when sharing ownership with an rvalue that is about to release its ownership anyway. Even for lvalues, sharing ownership is not a problem and reallocating a unique copy of the string is wasteful. This removes non-const accesses of _M_pathname in the path::_M_split_cmpts() members. libstdc++-v3/ChangeLog: * src/c++17/fs_path.cc (path::_M_split_cmpts()): Remove micro-optimization for "/" path. * src/filesystem/path.cc (path::_M_split_cmpts()): Only access the contents of _M_pathname using const member functions. --- libstdc++-v3/src/c++17/fs_path.cc | 5 - libstdc++-v3/src/filesystem/path.cc | 31 - 2 files changed, 17 insertions(+), 19 deletions(-) diff --git a/libstdc++-v3/src/c++17/fs_path.cc b/libstdc++-v3/src/c++17/fs_path.cc index 506ff25f9a6..967d9471134 100644 --- a/libstdc++-v3/src/c++17/fs_path.cc +++ b/libstdc++-v3/src/c++17/fs_path.cc @@ -1872,11 +1872,6 @@ path::_M_split_cmpts() _M_cmpts.type(_Type::_Filename); return; } - if (_M_pathname.length() == 1 && _M_pathname[0] == preferred_separator) -{ - _M_cmpts.type(_Type::_Root_dir); - return; -} _Parser parser(_M_pathname); diff --git a/libstdc++-v3/src/filesystem/path.cc b/libstdc++-v3/src/filesystem/path.cc index a935573740f..8e8806a953f 100644 --- a/libstdc++-v3/src/filesystem/path.cc +++ b/libstdc++-v3/src/filesystem/path.cc @@ -337,15 +337,18 @@ path::_M_split_cmpts() _M_type = _Type::_Multi; _M_cmpts.clear(); - if (_M_pathname.empty()) + // Use const-reference to access _M_pathname, to avoid "leaking" COW string. + const auto& pathname = _M_pathname; + + if (pathname.empty()) return; { // Approximate count of components, to reserve space in _M_cmpts vector: int count = 1; -bool saw_sep_last = _S_is_dir_sep(_M_pathname[0]); +bool saw_sep_last = _S_is_dir_sep(pathname[0]); bool saw_non_sep = !saw_sep_last; -for (value_type c : _M_pathname) +for (value_type c : pathname) { if (_S_is_dir_sep(c)) saw_sep_last = true; @@ -363,13 +366,13 @@ path::_M_split_cmpts() } size_t pos = 0; - const size_t len = _M_pathname.size(); + const size_t len = pathname.size(); // look for root name or root directory - if (_S_is_dir_sep(_M_pathname[0])) + if (_S_is_dir_sep(pathname[0])) { // look for root name, such as "//" or "//foo" - if (len > 1 && _M_pathname[1] == _M_pathname[0]) + if (len > 1 && pathname[1] == pathname[0]) { if (len == 2) { @@ -378,11 +381,11 @@ path::_M_split_cmpts() return; } - if (!_S_is_dir_sep(_M_pathname[2])) + if (!_S_is_dir_sep(pathname[2])) { // got root name, find its end pos = 3; - while (pos < len && !_S_is_dir_sep(_M_pathname[pos])) + while (pos < len && !_S_is_dir_sep(pathname[pos])) ++pos; if (pos == len) { @@ -409,7 +412,7 @@ path::_M_split_cmpts() ++pos; } #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS - else if (len > 1 && _M_pathname[1] == L':') + else if (len > 1 && pathname[1] == L':') { // got disk designator if (len == 2) @@ -418,7 +421,7 @@ path::_M_split_cmpts() return; } _M_add_root_name(2); - if (len > 2 && _S_is_dir_sep(_M_pathname[2])) + if (len > 2 && _S_is_dir_sep(pathname[2])) _M_add_root_dir(2); pos = 2; } @@ -426,9 +429,9 @@ path::_M_split_cmpts() else { size_t n = 1; - for (; n < _M_pathname.size() && !_S_is_dir_sep(_M_pathname[n]); ++n) + for (; n < pathname.size() && !_S_is_dir_sep(pathname[n]); ++n) { } - if (n == _M_pathname.size()) + if (n == pathname.size()) { _M_type = _Type::_Filename; return; @@ -438,7 +441,7 @@ path::_M_split_cmpts() size_t back = pos; while (pos < len) { - if (_S_is_dir_sep(_M_pathname[pos])) + if (_S_is_dir_sep(pathname[pos])) { if
Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]
> On Dec 1, 2021, at 3:01 AM, Richard Biener wrote: > > On Tue, Nov 30, 2021 at 11:35 PM Peter Bergner wrote: >> >> On 11/30/21 2:44 PM, Qing Zhao via Gcc-patches wrote: >>> Sorry for the confusing… >>> My major question is: >>> >>> for a variable of type __vector_pair, could it be in a register? >> >> Yes. To be pedantic, it will live in a vector register pair. >> >> >>> If it could be in a register, can we initialize this register with some >>> constant value? >> >> For a __vector_pair, no, not as it is setup now. We also do not have a >> use case where we would want to initialize a __vector_pair to a constant. >> Our normal (only?) use case with a __vector_pair is to load it up with >> some actual data from memory that represents a (partial) row of a matrix. >> >> For __vector_quad, it too lives in a register (accumulator register) and >> represents a small matrix. We have the __builtin_mma_xxsetaccz (&acc) >> builtin to initialize it to a zero constant. > > Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init > instead of fixing up things at expansion time. Agreed. So, Peter, will you update the routine “is_var_need_auto_init” in gimplify.c to exclude OPAQUE_TYPE to fix this issue? thanks. Qing > > Richard. > >> Peter
Re: [PATCH] x86: Speed up target attribute handling by using a cache
On Mon, Nov 22, 2021 at 10:36 AM Jakub Jelinek wrote: > > Hi! > > The target attribute handling is very expensive and for the common case > from x86intrin.h where many functions get implicitly the same target > attribute, we can speed up compilation a lot by caching it. > > The following patches both create a single entry cache, where they cache > for a particular target attribute argument list the resulting > DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION > values from ix86_valid_target_attribute_p and use the cache if the > args are the same as last time and we start either from NULL values > of those, or from the recorded values for those from last time. > > Compiling a simple: > #include > > int i; > testcase with ./cc1 -quiet -O2 -isystem include/ test.c > takes on my WS without the patches ~0.392s and with either of the > patches ~0.182s, i.e. roughly half the time as before. > For ./cc1plus -quiet -O2 -isystem include/ test.c > it is slightly worse, the speed up is from ~0.613s to ~0.403s. > > The difference between the 2 patches is that the first one uses copy_list > while the second one uses a vec, so I think the second one has the advantage > of creating less GC garbage. > I've verified both patches achieve the same content of those > DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION > nodes as before on x86intrin.h by doing debug_tree on those and comparing > the stderr from without these patches to with these patches. > > Both patches were bootstrapped/regtested on x86_64-linux and i686-linux, > ok for trunk (and which one)? > > 2021-11-22 Jakub Jelinek > > * attribs.h (simple_cst_list_equal): Declare. > * attribs.c (simple_cst_list_equal): No longer static. > * config/i386/i386-options.c (target_attribute_cache): New variable. > (ix86_valid_target_attribute_p): Cache DECL_FUNCTION_SPECIFIC_TARGET > and DECL_FUNCTION_SPECIFIC_OPTIMIZATION based on args. LGTM, but really I'll just rubber-stamp the patch. Uros. > --- gcc/attribs.h.jj2021-11-11 14:35:37.442350841 +0100 > +++ gcc/attribs.h 2021-11-19 11:52:08.843252645 +0100 > @@ -60,6 +60,7 @@ extern tree build_type_attribute_variant > extern tree build_decl_attribute_variant (tree, tree); > extern tree build_type_attribute_qual_variant (tree, tree, int); > > +extern bool simple_cst_list_equal (const_tree, const_tree); > extern bool attribute_value_equal (const_tree, const_tree); > > /* Return 0 if the attributes for two types are incompatible, 1 if they > --- gcc/attribs.c.jj2021-11-11 14:35:37.442350841 +0100 > +++ gcc/attribs.c 2021-11-19 11:51:43.473615692 +0100 > @@ -1290,7 +1290,7 @@ cmp_attrib_identifiers (const_tree attr1 > /* Compare two constructor-element-type constants. Return 1 if the lists > are known to be equal; otherwise return 0. */ > > -static bool > +bool > simple_cst_list_equal (const_tree l1, const_tree l2) > { >while (l1 != NULL_TREE && l2 != NULL_TREE) > --- gcc/config/i386/i386-options.c.jj 2021-11-15 13:19:07.347900863 +0100 > +++ gcc/config/i386/i386-options.c 2021-11-20 00:27:32.919505947 +0100 > @@ -1403,6 +1403,8 @@ ix86_valid_target_attribute_tree (tree f >return t; > } > > +static GTY(()) tree target_attribute_cache[3]; > + > /* Hook to validate attribute((target("string"))). */ > > bool > @@ -1423,6 +1425,19 @@ ix86_valid_target_attribute_p (tree fnde >&& strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0) > return true; > > + if ((DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == target_attribute_cache[1] > + || DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE) > + && (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) > + == target_attribute_cache[2] > + || DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE) > + && simple_cst_list_equal (args, target_attribute_cache[0])) > +{ > + DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_attribute_cache[1]; > + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) > + = target_attribute_cache[2]; > + return true; > +} > + >tree old_optimize = build_optimization_node (&global_options, >&global_options_set); > > @@ -1456,8 +1471,17 @@ ix86_valid_target_attribute_p (tree fnde >if (new_target == error_mark_node) > ret = false; > > - else if (fndecl && new_target) > + else if (new_target) > { > + if (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE > + && DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE) > + { > + target_attribute_cache[0] = copy_list (args); > + target_attribute_cache[1] = new_target; > + target_attribute_cache[2] > + = old_optimize != new_optimize ? new_optimize : NULL_TREE; > + } > + >DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target; > >if (old_optimize != new_optimize) > > Jakub
Re: [PATCH] Loop unswitching: support gswitch statements.
On 12/1/21 15:34, Richard Biener wrote: On Wed, Dec 1, 2021 at 3:25 PM Martin Liška wrote: On 12/1/21 15:19, Richard Biener wrote: which is compute the range of 'lhs' on edge_true into predicate->true_range, assign that same range to ->false_range and then invert it to get the range on the false_edge. What I am saying is that for better precision you should do ranger->range_on_edge (predicate->false_range, edge_false, lhs); rather than prematurely optimize this to the inversion of the true range since yes, ranger is CFG sensitive and only the_last_ predicate on a long CFG path is actually inverted. What am I missing? I might be misunderstood, but I think it's the problem defined here: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html where I used the ranger->range_on_edge on the false_edge. Ah, OK. But then even the true_edge range is possibly wrong, no? You are of course correct, I've just proved that in debugger :// Consider for (;;) { if (a < 100) if (a > 50) // unswitch on this /* .. */ if (a < 120) /* ... */ } then you record [51, 99] for true_range of the a > 50 predicate and thus simplification will simplify the if (a < 120) check, no? Yep. You can only record the range from the (CFG independent) a > 50 check, thus [51, +INF] but of course at simplification time you can also use the CFG context at each simplification location. @Andrew: How can I easily get irange based just on a stmt? Something like fold_range with int_range_max as the 3rd argument? Thanks, Martin Richard. Martin
Re: [PATCH] Loop unswitching: support gswitch statements.
On Wed, Dec 1, 2021 at 3:25 PM Martin Liška wrote: > > On 12/1/21 15:19, Richard Biener wrote: > > which is compute the range of 'lhs' on edge_true into predicate->true_range, > > assign that same range to ->false_range and then invert it to get the > > range on the false_edge. What I am saying is that for better precision > > you should do > > > > ranger->range_on_edge (predicate->false_range, edge_false, lhs); > > > > rather than prematurely optimize this to the inversion of the true range > > since yes, ranger is CFG sensitive and only the_last_ predicate on a > > long CFG path is actually inverted. > > > > What am I missing? > > I might be misunderstood, but I think it's the problem defined here: > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html > > where I used the ranger->range_on_edge on the false_edge. Ah, OK. But then even the true_edge range is possibly wrong, no? Consider for (;;) { if (a < 100) if (a > 50) // unswitch on this /* .. */ if (a < 120) /* ... */ } then you record [51, 99] for true_range of the a > 50 predicate and thus simplification will simplify the if (a < 120) check, no? You can only record the range from the (CFG independent) a > 50 check, thus [51, +INF] but of course at simplification time you can also use the CFG context at each simplification location. Richard. > Martin
[PATCH v4 1/6] tree-object-size: Use trees and support negative offsets
Transform tree-object-size to operate on tree objects instead of host wide integers. This makes it easier to extend to dynamic expressions for object sizes. The compute_builtin_object_size interface also now returns a tree expression instead of HOST_WIDE_INT, so callers have been adjusted to account for that. The trees in object_sizes are each an object_size object with members size (the bytes from the pointer to the end of the object) and wholesize (the size of the whole object). This allows analysis of negative offsets, which can now be allowed to the extent of the object bounds. Tests have been added to verify that it actually works. gcc/ChangeLog: * tree-object-size.h (compute_builtin_object_size): Return tree instead of HOST_WIDE_INT. * builtins.c (fold_builtin_object_size): Adjust. * gimple-fold.c (gimple_fold_builtin_strncat): Likewise. * ubsan.c (instrument_object_size): Likewise. * tree-object-size.c (object_size): New structure. (object_sizes): Change type to vec. (initval): New function. (unknown): Use it. (size_unknown_p, size_initval, size_unknown): New functions. (object_sizes_unknown_p): Use it. (object_sizes_get): Return tree. (object_sizes_initialize): Rename from object_sizes_set_force and set VAL parameter type as tree. Add new parameter WHOLEVAL. (object_sizes_set): Set VAL parameter type as tree and adjust implementation. Add new parameter WHOLEVAL. (size_for_offset): New function. (decl_init_size): Adjust comment. (addr_object_size): Change PSIZE parameter to tree and adjust implementation. Add new parameter PWHOLESIZE. (alloc_object_size): Return tree. (compute_builtin_object_size): Return tree in PSIZE. (expr_object_size, call_object_size, unknown_object_size): Adjust for object_sizes_set change. (merge_object_sizes): Drop OFFSET parameter and adjust implementation for tree change. (plus_stmt_object_size): Call collect_object_sizes_for directly instead of merge_object_size and call size_for_offset to get net size. (cond_expr_object_size, collect_object_sizes_for, object_sizes_execute): Adjust for change of type from HOST_WIDE_INT to tree. (check_for_plus_in_loops_1): Likewise and skip non-positive offsets. gcc/testsuite/ChangeLog: * gcc.dg/builtin-object-size-1.c (test9): New test. (main): Call it. * gcc.dg/builtin-object-size-2.c (test8): New test. (main): Call it. * gcc.dg/builtin-object-size-3.c (test9): New test. (main): Call it. * gcc.dg/builtin-object-size-4.c (test8): New test. (main): Call it. * gcc.dg/builtin-object-size-5.c (test5, test6, test7): New tests. Signed-off-by: Siddhesh Poyarekar --- gcc/builtins.c | 10 +- gcc/gimple-fold.c| 11 +- gcc/testsuite/gcc.dg/builtin-object-size-1.c | 30 ++ gcc/testsuite/gcc.dg/builtin-object-size-2.c | 30 ++ gcc/testsuite/gcc.dg/builtin-object-size-3.c | 31 ++ gcc/testsuite/gcc.dg/builtin-object-size-4.c | 30 ++ gcc/testsuite/gcc.dg/builtin-object-size-5.c | 25 ++ gcc/tree-object-size.c | 393 --- gcc/tree-object-size.h | 2 +- gcc/ubsan.c | 5 +- 10 files changed, 408 insertions(+), 159 deletions(-) diff --git a/gcc/builtins.c b/gcc/builtins.c index 03829c03a5a..309c3010de3 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -10254,7 +10254,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum built_in_function fcode) static tree fold_builtin_object_size (tree ptr, tree ost) { - unsigned HOST_WIDE_INT bytes; + tree bytes; int object_size_type; if (!validate_arg (ptr, POINTER_TYPE) @@ -10279,8 +10279,8 @@ fold_builtin_object_size (tree ptr, tree ost) if (TREE_CODE (ptr) == ADDR_EXPR) { compute_builtin_object_size (ptr, object_size_type, &bytes); - if (wi::fits_to_tree_p (bytes, size_type_node)) - return build_int_cstu (size_type_node, bytes); + if (int_fits_type_p (bytes, size_type_node)) + return fold_convert (size_type_node, bytes); } else if (TREE_CODE (ptr) == SSA_NAME) { @@ -10288,8 +10288,8 @@ fold_builtin_object_size (tree ptr, tree ost) later. Maybe subsequent passes will help determining it. */ if (compute_builtin_object_size (ptr, object_size_type, &bytes) - && wi::fits_to_tree_p (bytes, size_type_node)) - return build_int_cstu (size_type_node, bytes); + && int_fits_type_p (bytes, size_type_node)) + return fold_convert (size_type_node, bytes); } return NULL_TREE; diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c index 1d8fd74f72c..64515aabc04 100644 --- a/gcc/gimple-fold.c +++
[PATCH v4 5/6] tree-object-size: Handle GIMPLE_CALL
Handle non-constant expressions in GIMPLE_CALL arguments. Also handle alloca. gcc/ChangeLog: * tree-object-size.c (alloc_object_size): Make and return non-constant size expression. (call_object_size): Return expression or unknown based on whether dynamic object size is requested. gcc/testsuite/ChangeLog: * gcc.dg/builtin-dynamic-object-size-0.c: Add new tests. * gcc.dg/builtin-object-size-1.c (test1) [__builtin_object_size]: Alter expected result for dynamic object size. * gcc.dg/builtin-object-size-2.c (test1) [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-3.c (test1) [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-4.c (test1) [__builtin_object_size]: Likewise. Signed-off-by: Siddhesh Poyarekar --- .../gcc.dg/builtin-dynamic-object-size-0.c| 227 +- gcc/testsuite/gcc.dg/builtin-object-size-1.c | 7 + gcc/testsuite/gcc.dg/builtin-object-size-2.c | 14 ++ gcc/testsuite/gcc.dg/builtin-object-size-3.c | 7 + gcc/testsuite/gcc.dg/builtin-object-size-4.c | 14 ++ gcc/tree-object-size.c| 22 +- 6 files changed, 282 insertions(+), 9 deletions(-) diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c index ce0f4eb17f3..2db0e0d1aa2 100644 --- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c @@ -4,12 +4,71 @@ typedef __SIZE_TYPE__ size_t; #define abort __builtin_abort +void * +__attribute__ ((alloc_size (1))) +__attribute__ ((__nothrow__ , __leaf__)) +__attribute__ ((noinline)) +alloc_func (size_t sz) +{ + return __builtin_malloc (sz); +} + +void * +__attribute__ ((alloc_size (1, 2))) +__attribute__ ((__nothrow__ , __leaf__)) +__attribute__ ((noinline)) +calloc_func (size_t cnt, size_t sz) +{ + return __builtin_calloc (cnt, sz); +} + +void * +__attribute__ ((noinline)) +unknown_allocator (size_t cnt, size_t sz) +{ + return __builtin_calloc (cnt, sz); +} + +size_t +__attribute__ ((noinline)) +test_unknown (size_t cnt, size_t sz) +{ + void *ret = unknown_allocator (cnt, sz); + return __builtin_dynamic_object_size (ret, 0); +} + +/* Malloc-like allocator. */ + +size_t +__attribute__ ((noinline)) +test_malloc (size_t sz) +{ + void *ret = alloc_func (sz); + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_malloc (size_t sz) +{ + void *ret = __builtin_malloc (sz); + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_malloc_cond (int cond) +{ + void *ret = __builtin_malloc (cond ? 32 : 64); + return __builtin_dynamic_object_size (ret, 0); +} + size_t __attribute__ ((noinline)) test_builtin_malloc_condphi (int cond) { void *ret; - + if (cond) ret = __builtin_malloc (32); else @@ -18,6 +77,79 @@ test_builtin_malloc_condphi (int cond) return __builtin_dynamic_object_size (ret, 0); } +size_t +__attribute__ ((noinline)) +test_builtin_malloc_condphi2 (int cond, size_t in) +{ + void *ret; + + if (cond) +ret = __builtin_malloc (in); + else +ret = __builtin_malloc (64); + + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_malloc_condphi3 (int cond, size_t in, size_t in2) +{ + void *ret; + + if (cond) +ret = __builtin_malloc (in); + else +ret = __builtin_malloc (in2); + + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_malloc_condphi4 (size_t sz, int cond) +{ + char *a = __builtin_malloc (sz); + char b[sz / 2]; + + return __builtin_dynamic_object_size (cond ? b : (void *) &a, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_malloc_condphi5 (size_t sz, int cond, char *c) +{ + char *a = __builtin_malloc (sz); + + return __builtin_dynamic_object_size (cond ? c : (void *) &a, 0); +} + +/* Calloc-like allocator. */ + +size_t +__attribute__ ((noinline)) +test_calloc (size_t cnt, size_t sz) +{ + void *ret = calloc_func (cnt, sz); + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_calloc (size_t cnt, size_t sz) +{ + void *ret = __builtin_calloc (cnt, sz); + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_calloc_cond (int cond1, int cond2) +{ + void *ret = __builtin_calloc (cond1 ? 32 : 64, cond2 ? 1024 : 16); + return __builtin_dynamic_object_size (ret, 0); +} + size_t __attribute__ ((noinline)) test_builtin_calloc_condphi (size_t cnt, size_t sz, int cond) @@ -33,6 +165,47 @@ test_builtin_calloc_condphi (size_t cnt, size_t sz, int cond) return __builtin_dynamic_object_size (cond ? ch : (void *) &bin, 0); } +/* Passthrough functions. */ + +size_t +__attribute__ ((noinline)) +test_passthrough (size_t
[PATCH v4 6/6] tree-object-size: Dynamic sizes for ADDR_EXPR
Allow returning dynamic expressions from ADDR_EXPR for __builtin_dynamic_object_size and also allow offsets to be dynamic. gcc/ChangeLog: * tree-object-size.c (size_valid_p): New function. (size_for_offset): Remove OFFSET constness assertion. (addr_object_size): Build dynamic expressions for object sizes and use size_valid_p to decide if it is valid for the given OBJECT_SIZE_TYPE. (compute_builtin_object_size): Allow dynamic offsets when computing size at O0. (call_object_size): Call size_valid_p. (plus_stmt_object_size): Allow non-constant offset and use size_valid_p to decide if it is valid for the given OBJECT_SIZE_TYPE. gcc/testsuite/ChangeLog: * gcc.dg/builtin-dynamic-object-size-0.c: Add new tests. * gcc.dg/builtin-object-size-1.c (test1) [__builtin_object_size]: Adjust expected output for dynamic object sizes. * gcc.dg/builtin-object-size-2.c (test1) [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-3.c (test1) [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-4.c (test1) [__builtin_object_size]: Likewise. Signed-off-by: Siddhesh Poyarekar --- .../gcc.dg/builtin-dynamic-object-size-0.c| 158 ++ gcc/testsuite/gcc.dg/builtin-object-size-1.c | 30 +++- gcc/testsuite/gcc.dg/builtin-object-size-2.c | 43 - gcc/testsuite/gcc.dg/builtin-object-size-3.c | 25 ++- gcc/testsuite/gcc.dg/builtin-object-size-4.c | 17 +- gcc/tree-object-size.c| 91 +- 6 files changed, 300 insertions(+), 64 deletions(-) diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c index 2db0e0d1aa2..4a1f4965ebd 100644 --- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c @@ -219,6 +219,79 @@ test_deploop (size_t sz, size_t cond) return __builtin_dynamic_object_size (bin, 0); } +/* Address expressions. */ + +struct dynarray_struct +{ + long a; + char c[16]; + int b; +}; + +size_t +__attribute__ ((noinline)) +test_dynarray_struct (size_t sz, size_t off) +{ + struct dynarray_struct bin[sz]; + + return __builtin_dynamic_object_size (&bin[off].c, 0); +} + +size_t +__attribute__ ((noinline)) +test_dynarray_struct_subobj (size_t sz, size_t off) +{ + struct dynarray_struct bin[sz]; + + return __builtin_dynamic_object_size (&bin[off].c[4], 1); +} + +size_t +__attribute__ ((noinline)) +test_dynarray_struct_subobj2 (size_t sz, size_t off, size_t *objsz) +{ + struct dynarray_struct2 +{ + long a; + int b; + char c[sz]; +}; + + struct dynarray_struct2 bin; + + *objsz = sizeof (bin); + + return __builtin_dynamic_object_size (&bin.c[off], 1); +} + +size_t +__attribute__ ((noinline)) +test_substring (size_t sz, size_t off) +{ + char str[sz]; + + return __builtin_dynamic_object_size (&str[off], 0); +} + +size_t +__attribute__ ((noinline)) +test_substring_ptrplus (size_t sz, size_t off) +{ + int str[sz]; + + return __builtin_dynamic_object_size (str + off, 0); +} + +size_t +__attribute__ ((noinline)) +test_substring_ptrplus2 (size_t sz, size_t off, size_t off2) +{ + int str[sz]; + int *ptr = &str[off]; + + return __builtin_dynamic_object_size (ptr + off2, 0); +} + size_t __attribute__ ((access (__read_write__, 1, 2))) __attribute__ ((noinline)) @@ -227,6 +300,40 @@ test_parmsz_simple (void *obj, size_t sz) return __builtin_dynamic_object_size (obj, 0); } +size_t +__attribute__ ((noinline)) +__attribute__ ((access (__read_write__, 1, 2))) +test_parmsz (void *obj, size_t sz, size_t off) +{ + return __builtin_dynamic_object_size (obj + off, 0); +} + +size_t +__attribute__ ((noinline)) +__attribute__ ((access (__read_write__, 1, 2))) +test_parmsz_scale (int *obj, size_t sz, size_t off) +{ + return __builtin_dynamic_object_size (obj + off, 0); +} + +size_t +__attribute__ ((noinline)) +__attribute__ ((access (__read_write__, 1, 2))) +test_loop (int *obj, size_t sz, size_t start, size_t end, int incr) +{ + int *ptr = obj + start; + + for (int i = start; i != end; i = i + incr) +{ + ptr = ptr + incr; + if (__builtin_dynamic_object_size (ptr, 0) == 0) + return 0; +} + + return __builtin_dynamic_object_size (ptr, 0); +} + + unsigned nfails = 0; #define FAIL() ({ \ @@ -287,6 +394,31 @@ main (int argc, char **argv) FAIL (); if (test_dynarray (__builtin_strlen (argv[0])) != __builtin_strlen (argv[0])) FAIL (); + if (test_dynarray_struct (42, 4) != + ((42 - 4) * sizeof (struct dynarray_struct) + - __builtin_offsetof (struct dynarray_struct, c))) +FAIL (); + if (test_dynarray_struct (42, 48) != 0) +FAIL (); + if (test_substring (128, 4) != 128 - 4) +FAIL (); + if (test_substring (128, 142) != 0) +FAIL (); + if (test_dynarray_struct_subobj (42, 4
[PATCH v4 3/6] tree-object-size: Support dynamic sizes in conditions
Handle GIMPLE_PHI and conditionals specially for dynamic objects, returning PHI/conditional expressions instead of just a MIN/MAX estimate. This makes the returned object size variable for loops and conditionals, so tests need to be adjusted to look for precise size in some cases. builtin-dynamic-object-size-5.c had to be modified to only look for success in maximum object size case and skip over the minimum object size tests because the result is no longer a compile time constant. I also added some simple tests to exercise conditionals with dynamic object sizes. gcc/ChangeLog: * builtins.c (fold_builtin_object_size): Adjust for dynamic size expressions. * tree-object-size.c: Include gimplify-me.h. (struct object_size_info): New member UNKNOWNS. (size_initval_p, object_sizes_get_raw): New functions. (object_sizes_get): Return suitable gimple variable for object size. (bundle_sizes): New function. (object_sizes_set): Use it and handle dynamic object size expressions. (object_sizes_set_temp): New function. (size_for_offset): Adjust for dynamic size expressions. (emit_phi_nodes, propagate_unknowns, gimplify_size_expressions): New functions. (compute_builtin_object_size): Call gimplify_size_expressions for OST_DYNAMIC. (dynamic_object_size): New function. (cond_expr_object_size): Use it. (phi_dynamic_object_size): New function. (collect_object_sizes_for): Call it for OST_DYNAMIC. Adjust to accommodate dynamic object sizes. gcc/testsuite/ChangeLog: * gcc.dg/builtin-dynamic-object-size-0.c: New tests. * gcc.dg/builtin-dynamic-object-size-10.c: Add comment. * gcc.dg/builtin-object-size-1.c [__builtin_object_size]: Expect exact size expressions for __builtin_dynamic_object_size. * gcc.dg/builtin-object-size-2.c [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-3.c [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-4.c [__builtin_object_size]: Likewise. * gcc.dg/builtin-object-size-5.c [__builtin_object_size]: Likewise. Signed-off-by: Siddhesh Poyarekar --- gcc/builtins.c| 6 +- .../gcc.dg/builtin-dynamic-object-size-0.c| 72 +++ .../gcc.dg/builtin-dynamic-object-size-10.c | 2 + gcc/testsuite/gcc.dg/builtin-object-size-1.c | 119 - gcc/testsuite/gcc.dg/builtin-object-size-2.c | 92 gcc/testsuite/gcc.dg/builtin-object-size-3.c | 121 + gcc/testsuite/gcc.dg/builtin-object-size-4.c | 78 +++ gcc/testsuite/gcc.dg/builtin-object-size-5.c | 12 + gcc/tree-object-size.c| 489 +- 9 files changed, 962 insertions(+), 29 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c diff --git a/gcc/builtins.c b/gcc/builtins.c index 3b815a6e42a..52b62f32b44 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -10284,7 +10284,8 @@ fold_builtin_object_size (tree ptr, tree ost, enum built_in_function fcode) if (TREE_CODE (ptr) == ADDR_EXPR) { compute_builtin_object_size (ptr, object_size_type, &bytes); - if (int_fits_type_p (bytes, size_type_node)) + if ((object_size_type & OST_DYNAMIC) + || int_fits_type_p (bytes, size_type_node)) return fold_convert (size_type_node, bytes); } else if (TREE_CODE (ptr) == SSA_NAME) @@ -10293,7 +10294,8 @@ fold_builtin_object_size (tree ptr, tree ost, enum built_in_function fcode) later. Maybe subsequent passes will help determining it. */ if (compute_builtin_object_size (ptr, object_size_type, &bytes) - && int_fits_type_p (bytes, size_type_node)) + && ((object_size_type & OST_DYNAMIC) + || int_fits_type_p (bytes, size_type_node))) return fold_convert (size_type_node, bytes); } diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c new file mode 100644 index 000..ddedf6a49bd --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c @@ -0,0 +1,72 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +typedef __SIZE_TYPE__ size_t; +#define abort __builtin_abort + +size_t +__attribute__ ((noinline)) +test_builtin_malloc_condphi (int cond) +{ + void *ret; + + if (cond) +ret = __builtin_malloc (32); + else +ret = __builtin_malloc (64); + + return __builtin_dynamic_object_size (ret, 0); +} + +size_t +__attribute__ ((noinline)) +test_builtin_calloc_condphi (size_t cnt, size_t sz, int cond) +{ + struct +{ + int a; + char b; +} bin[cnt]; + + char *ch = __builtin_calloc (cnt, sz); + + return __builtin_dynamic_object_size (cond ? ch : (void *) &bin, 0); +} + +size_t +__attribute__ ((noinline)) +test_deploop (size_t sz, size_t cond) +{ + char *bin
[PATCH v4 2/6] __builtin_dynamic_object_size: Recognize builtin
Recognize the __builtin_dynamic_object_size builtin and add paths in the object size path to deal with it, but treat it like __builtin_object_size for now. Also add tests to provide the same testing coverage for the new builtin name. gcc/ChangeLog: * builtins.def (BUILT_IN_DYNAMIC_OBJECT_SIZE): New builtin. * tree-object-size.h: Move object size type bits enum from tree-object-size.c and add new value OST_DYNAMIC. * builtins.c (expand_builtin, fold_builtin_2): Handle it. (fold_builtin_object_size): Handle new builtin and adjust for change to compute_builtin_object_size. * tree-object-size.c: Include builtins.h. (compute_builtin_object_size): Adjust. (early_object_sizes_execute_one, dynamic_object_sizes_execute_one): New functions. (object_sizes_execute): Rename insert_min_max_p argument to early. Handle BUILT_IN_DYNAMIC_OBJECT_SIZE and call the new functions. doc/extend.texi (__builtin_dynamic_object_size): Document new builtin. gcc/testsuite/ChangeLog: * g++.dg/ext/builtin-dynamic-object-size1.C: New test. * g++.dg/ext/builtin-dynamic-object-size2.C: Likewise. * gcc.dg/builtin-dynamic-alloc-size.c: Likewise. * gcc.dg/builtin-dynamic-object-size-1.c: Likewise. * gcc.dg/builtin-dynamic-object-size-10.c: Likewise. * gcc.dg/builtin-dynamic-object-size-11.c: Likewise. * gcc.dg/builtin-dynamic-object-size-12.c: Likewise. * gcc.dg/builtin-dynamic-object-size-13.c: Likewise. * gcc.dg/builtin-dynamic-object-size-14.c: Likewise. * gcc.dg/builtin-dynamic-object-size-15.c: Likewise. * gcc.dg/builtin-dynamic-object-size-16.c: Likewise. * gcc.dg/builtin-dynamic-object-size-17.c: Likewise. * gcc.dg/builtin-dynamic-object-size-18.c: Likewise. * gcc.dg/builtin-dynamic-object-size-19.c: Likewise. * gcc.dg/builtin-dynamic-object-size-2.c: Likewise. * gcc.dg/builtin-dynamic-object-size-3.c: Likewise. * gcc.dg/builtin-dynamic-object-size-4.c: Likewise. * gcc.dg/builtin-dynamic-object-size-5.c: Likewise. * gcc.dg/builtin-dynamic-object-size-6.c: Likewise. * gcc.dg/builtin-dynamic-object-size-7.c: Likewise. * gcc.dg/builtin-dynamic-object-size-8.c: Likewise. * gcc.dg/builtin-dynamic-object-size-9.c: Likewise. * gcc.dg/builtin-object-size-16.c: Adjust to allow inclusion from builtin-dynamic-object-size-16.c. * gcc.dg/builtin-object-size-17.c: Likewise. Signed-off-by: Siddhesh Poyarekar --- gcc/builtins.c| 11 +- gcc/builtins.def | 1 + gcc/doc/extend.texi | 13 ++ .../g++.dg/ext/builtin-dynamic-object-size1.C | 5 + .../g++.dg/ext/builtin-dynamic-object-size2.C | 5 + .../gcc.dg/builtin-dynamic-alloc-size.c | 7 + .../gcc.dg/builtin-dynamic-object-size-1.c| 6 + .../gcc.dg/builtin-dynamic-object-size-10.c | 9 ++ .../gcc.dg/builtin-dynamic-object-size-11.c | 7 + .../gcc.dg/builtin-dynamic-object-size-12.c | 5 + .../gcc.dg/builtin-dynamic-object-size-13.c | 5 + .../gcc.dg/builtin-dynamic-object-size-14.c | 5 + .../gcc.dg/builtin-dynamic-object-size-15.c | 5 + .../gcc.dg/builtin-dynamic-object-size-16.c | 6 + .../gcc.dg/builtin-dynamic-object-size-17.c | 7 + .../gcc.dg/builtin-dynamic-object-size-18.c | 8 + .../gcc.dg/builtin-dynamic-object-size-19.c | 104 .../gcc.dg/builtin-dynamic-object-size-2.c| 6 + .../gcc.dg/builtin-dynamic-object-size-3.c| 6 + .../gcc.dg/builtin-dynamic-object-size-4.c| 6 + .../gcc.dg/builtin-dynamic-object-size-5.c| 7 + .../gcc.dg/builtin-dynamic-object-size-6.c| 5 + .../gcc.dg/builtin-dynamic-object-size-7.c| 5 + .../gcc.dg/builtin-dynamic-object-size-8.c| 5 + .../gcc.dg/builtin-dynamic-object-size-9.c| 5 + gcc/testsuite/gcc.dg/builtin-object-size-16.c | 2 + gcc/testsuite/gcc.dg/builtin-object-size-17.c | 2 + gcc/tree-object-size.c| 152 +- gcc/tree-object-size.h| 10 ++ 29 files changed, 378 insertions(+), 42 deletions(-) create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size1.C create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size2.C create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-alloc-size.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-1.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-11.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-12.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-13.c create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-14.c create mode 1006
[PATCH v4 0/6] __builtin_dynamic_object_size
This patchset implements the __builtin_dynamic_object_size builtin for gcc. The primary motivation to have this builtin in gcc is to enable _FORTIFY_SOURCE=3 support with gcc, thus allowing greater fortification in use cases where the potential performance tradeoff is acceptable. Semantics: -- __builtin_dynamic_object_size has the same signature as __builtin_object_size; it accepts a pointer and type ranging from 0 to 3 and it returns an object size estimate for the pointer based on an analysis of which objects the pointer could point to. The actual properties of the object size estimate are different: - In the best case __builtin_dynamic_object_size evaluates to an expression that represents a precise size of the object being pointed to. - In case a precise object size expression cannot be evaluated, __builtin_dynamic_object_size attempts to evaluate an estimate size expression based on the object size type. - In what situations the builtin returns an estimate vs a precise expression is an implementation detail and may change in future. Users must always assume, as in the case of __builtin_object_size, that the returned value is the maximum or minimum based on the object size type they have provided. - In the worst case of failure, __builtin_dynamic_object_size returns a constant (size_t)-1 or (size_t)0. Implementation: --- - The __builtin_dynamic_object_size support is implemented in tree-object-size. In most cases the first pass (early_objsz) the builtin is treated like __builtin_object_size to preserve subobject bounds. - Each element of the object_sizes vector is now an object_size struct holding bytes to the end of the object and the full size of the object. This allows proper handling of negative offsets, allowing them to the extent of the whole object bounds. This improves __builtin_object_size usage too with negative offsets, consistently returning valid results for pointer decrementing loops too. - The patchset begins with structural modification of the tree-object-size pass, followed by enhancement to return size expressions. I have split the implementation into one feature per patch (calls, function parameters, PHI, etc.) to hopefully ease review. Performance: Expressions generated by this pass in theory could be arbitrarily complex. I have not made an attempt to limit nesting of objects since it seemed too early to do that. In practice based on the few applications I built, most of the complexity of the expressions got folded away. Even so, the performance overhead is likely to be non-zero. If we find performance degradation to be significant, we could later add nesting limits to bail out if a size expression gets too complex. I have implemented simplification of __*_chk to their normal variants if we can determine at compile time that it is safe. This should limit the performance overhead of the expressions in valid cases. Build time performance doesn't seem to be affected much based on an unscientific check to time `make check-gcc RUNTESTFLAGS="dg.exp=builtin*"`. It only increases by about a couple of seconds when the dynamic tests are added and remains more or less in the same ballpark otherwise. Testing: I have added tests for dynamic object sizes as well as wrappers for all __builtin_object_size tests to provide wide coverage. I have also done a full bootstrap build and test run on x86_64. Further I did a bootstrap build with --with-build-config=bootstrap-ubsan for additional coverage since it managed to find an issue with the earlier cleanup patches. It is however only useful to ensure there are no regressions since ubsan does not yet use dynamic size expressions. I have also built bash, cmake, wpa_supplicant and systemtap with _FORTIFY_SOURCE=2 and _FORTIFY_SOURCE=3 (with a hacked up glibc to make sure it works) and saw no issues in any of those builds. I did some rudimentary analysis of the generated binaries using fortify-metrics[1] to confirm that there was a difference in coverage between the two fortification levels. (Unchanged since v3) Here is a summary of coverage in the above packages: F = number of fortified calls T = Total number of calls to fortifiable functions (fortified as well as unfortified) C = F * 100/ T Package F(2)T(2)F(3)T(3)C(2)C(3) bash428 12201005119635.08% 84.03% wpa_supplicant 163532322350340850.59% 68.96% systemtap 324 1990343 199416.28% 17.20% cmake 830 14181 958 14196 5.85% 6.75% The numbers are slightly lower than the previous patch series because in the interim I pushed an improvement to folding of the _chk builtins so that they can use ranges to simplify the calls to their regular variants. Also note that even _FORTIFY_SOURCE=2 coverage should be improved due to negative offset handling. Additional testing plans (
[PATCH v4 4/6] tree-object-size: Handle function parameters
Handle hints provided by __attribute__ ((access (...))) to compute dynamic sizes for objects. gcc/ChangeLog: * tree-object-size.c: Include tree-dfa.h. (parm_object_size): New function. (collect_object_sizes_for): Call it. gcc/testsuite/ChangeLog: * gcc.dg/builtin-dynamic-object-size-0.c (test_parmsz_simple): New function. (main): Call it. Signed-off-by: Siddhesh Poyarekar --- .../gcc.dg/builtin-dynamic-object-size-0.c| 11 gcc/tree-object-size.c| 50 ++- 2 files changed, 60 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c index ddedf6a49bd..ce0f4eb17f3 100644 --- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c @@ -46,6 +46,14 @@ test_deploop (size_t sz, size_t cond) return __builtin_dynamic_object_size (bin, 0); } +size_t +__attribute__ ((access (__read_write__, 1, 2))) +__attribute__ ((noinline)) +test_parmsz_simple (void *obj, size_t sz) +{ + return __builtin_dynamic_object_size (obj, 0); +} + unsigned nfails = 0; #define FAIL() ({ \ @@ -64,6 +72,9 @@ main (int argc, char **argv) FAIL (); if (test_deploop (128, 129) != 32) FAIL (); + if (test_parmsz_simple (argv[0], __builtin_strlen (argv[0]) + 1) + != __builtin_strlen (argv[0]) + 1) +FAIL (); if (nfails > 0) __builtin_abort (); diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c index dd6c90efebf..bbc708f8ade 100644 --- a/gcc/tree-object-size.c +++ b/gcc/tree-object-size.c @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3. If not see #include "gimple-fold.h" #include "gimple-iterator.h" #include "tree-cfg.h" +#include "tree-dfa.h" #include "stringpool.h" #include "attribs.h" #include "builtins.h" @@ -1449,6 +1450,53 @@ cond_expr_object_size (struct object_size_info *osi, tree var, gimple *stmt) return reexamine; } +/* Find size of an object passed as a parameter to the function. */ + +static void +parm_object_size (struct object_size_info *osi, tree var) +{ + int object_size_type = osi->object_size_type; + tree parm = SSA_NAME_VAR (var); + + if (!(object_size_type & OST_DYNAMIC) || !POINTER_TYPE_P (TREE_TYPE (parm))) +expr_object_size (osi, var, parm); + + /* Look for access attribute. */ + rdwr_map rdwr_idx; + + tree fndecl = cfun->decl; + const attr_access *access = get_parm_access (rdwr_idx, parm, fndecl); + tree typesize = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (parm))); + tree sz = NULL_TREE; + + if (access && access->sizarg != UINT_MAX) +{ + tree fnargs = DECL_ARGUMENTS (fndecl); + tree arg = NULL_TREE; + unsigned argpos = 0; + + /* Walk through the parameters to pick the size parameter and safely +scale it by the type size. */ + for (arg = fnargs; argpos != access->sizarg && arg; + arg = TREE_CHAIN (arg), ++argpos); + + if (arg != NULL_TREE && INTEGRAL_TYPE_P (TREE_TYPE (arg))) + { + sz = get_or_create_ssa_default_def (cfun, arg); + if (sz != NULL_TREE) + { + sz = fold_convert (sizetype, sz); + if (typesize) + sz = size_binop (MULT_EXPR, sz, typesize); + } + } +} + if (!sz) +sz = size_unknown (object_size_type); + + object_sizes_set (osi, SSA_NAME_VERSION (var), sz, sz); +} + /* Compute an object size expression for VAR, which is the result of a PHI node. */ @@ -1606,7 +1654,7 @@ collect_object_sizes_for (struct object_size_info *osi, tree var) case GIMPLE_NOP: if (SSA_NAME_VAR (var) && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL) - expr_object_size (osi, var, SSA_NAME_VAR (var)); + parm_object_size (osi, var); else /* Uninitialized SSA names point nowhere. */ unknown_object_size (osi, var); -- 2.31.1
Re: [PATCH] Loop unswitching: support gswitch statements.
On 12/1/21 15:19, Richard Biener wrote: which is compute the range of 'lhs' on edge_true into predicate->true_range, assign that same range to ->false_range and then invert it to get the range on the false_edge. What I am saying is that for better precision you should do ranger->range_on_edge (predicate->false_range, edge_false, lhs); rather than prematurely optimize this to the inversion of the true range since yes, ranger is CFG sensitive and only the_last_ predicate on a long CFG path is actually inverted. What am I missing? I might be misunderstood, but I think it's the problem defined here: https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html where I used the ranger->range_on_edge on the false_edge. Martin
Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel
On Wed, Dec 1, 2021 at 3:18 PM Rasmus Villemoes wrote: > > On 01/12/2021 14.17, Richard Biener wrote: > > On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes > > wrote: > >> > >> The transposition nolto -> notlo is confusing and it makes the long > >> name even harder to read than it already is - I kept reading it as > >> "not lo" until I realized it was a simply typo. > >> > >> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) > > > > Hmm, if we ever released GCC with that lto-plugin behavior you should > > preserve > > accepting -linker-output-auto-notlo-rel in lto-plugin because a newer > > lto-plugin > > is required to work with older released GCC which continue to provide > > the option with the spelling mistake. > > I thought a given version of gcc would only use the plugin it shipped > with(?). No. It's even older lto-plugins being used by newer compiler versions (just with a possibly reduced feature set) - typing that on a system with a lto-plugin from GCC 7 ;) > And even if in theory one should keep the plugin compatible with older > gcc releases, in this case it is only relevant to vxworks ports, so if > Eric and Olivier are ok with this as-is, perhaps we can avoid having to > accept both spellings forever. I think the cost is negligible - looks like only GCC 11 knows the flag so I would suggest to backport the patch to the GCC 11 branch as well (but there as well keep the compatibility). Richard. > > Rasmus
Re: [PATCH] Loop unswitching: support gswitch statements.
On Wed, Dec 1, 2021 at 3:10 PM Martin Liška wrote: > > On 11/30/21 12:17, Richard Biener wrote: > > On Mon, Nov 29, 2021 at 1:45 PM Martin Liška wrote: > >> > >> On 11/26/21 09:12, Richard Biener wrote: > >>> On Wed, Nov 24, 2021 at 3:32 PM Martin Liška wrote: > > On 11/24/21 15:14, Martin Liška wrote: > > It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that.. > > Fixed that in the updated version. > >>> > >>> Function level comments need updating it seems. > >> > >> I've done that. > >> > >>> > >>> +static unsigned > >>> +evaluate_insns (class loop *loop, basic_block *bbs, > >>> + predicate_vector &predicate_path, > >>> + auto_bb_flag &reachable_flag) > >>> +{ > >>> + auto_vec worklist (loop->num_nodes); > >>> + worklist.quick_push (bbs[0]); > >>> ... > >>> > >>> so when adding gswitch support the easiest way to make > >>> > >>> + FOR_EACH_EDGE (e, ei, bb->succs) > >>> + { > >>> ... > >>> + { > >>> + worklist.safe_push (dest); > >>> + dest->flags |= reachable_flag; > >>> > >>> work is when the gcond/gswitch simplification would mark > >>> outgoing edges as (non-)executable. For gswitch this > >>> could be achieved by iterating over the case labels and > >>> intersecting that with the range while for gcond it's a > >>> matter of setting an edge flag instead of returning true/false. > >> > >> Exactly, it can be quite naturally added to the current patch. > >> > >>> I'd call the common function evaluate_control_stmt_using_entry_checks > >>> or so and invoke it on the last stmt of a block with >= 2 outgoing > >>> edges. > >> > >> Yes, I'll do it for the gswitch support patch. > >> > >>> > >>> We still seem to do the simplification work twice, once for costing > >>> and once for transform, but that's OK for now I guess. > >>> > >>> I think you want to clear_aux_for_blocks at the end of the pass. > >> > >> Called that. > >> > >>> > >>> Otherwise I like it - it seems you have some TODO around cost > >>> modeling. Did you try to do gswitch support ontop as I suggested > >>> to see if the general structure keeps working? > >> > >> I vanished and tested the patch. No, I don't have the gswitch support patch > >> as the current patch was reworked a few times. > >> > >> Can we please progress and have installed the suggested patch? > > > > I'd like to see the gswitch support - that's what was posted before stage3 > > close, this patch on its own doesn't seem worth pushing for. That said, > > I have some comments below (and the already raised ones about how > > things might need to change with gswitch support). Is it so difficult to > > develop gswitch support as a separate change ontop of this? > > I'm going to work on that in the upcoming days. When are you leaving for the > Christmas > holidays :P ? Wednesday next week is my first day off, but I might peek into mails now and then. > > > >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > > > +#include > > > > that's included unconditionally by system.h > > Good. > > > > > +/* The type represents a predicate path leading to a basic block. */ > > +typedef auto_vec> predicate_vector; > > > > +static bool tree_unswitch_single_loop (class loop *, int, > > + predicate_vector &predicate_path, > > > > I think we don't want to pass auto_vec by reference, instead auto_vec should > > decay to vec<> when passed around. > > Ok. > > > > > + unswitch_predicate *predicate = new unswitch_predicate (cond, lhs); > > + if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs)) > > +{ > > + ranger->range_on_edge (predicate->true_range, edge_true, lhs); > > + predicate->false_range = predicate->true_range; > > > > - return cond; > > + if (!predicate->false_range.varying_p () > > + && !predicate->false_range.undefined_p ()) > > + predicate->false_range.invert (); > > +} > > > > is that correct? I would guess range_on_edge, for > > > > if (a > 10) > > if (a < 15) > > /* true */ > > else > > /* false */ > > > > figures [11, 14] on the true edge of if (a < 15) (considered the > > unswitch predicate), > > inverting that yields [0, 10] u [15, +INF] but that's at least > > sub-optimal for the > > else range. I think we want to call range_on_edge again to determine the > > range > > on the else branch? > > No, as shown in the previous emails, Ranger is CFG sensitive and we should > rely > on our irange merging. I don't understand. You do > > + unswitch_predicate *predicate = new unswitch_predicate (cond, lhs); > > + if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs)) > > +{ > > + ranger->range_on_edge (predicate->true_range, edge_true, lhs); > > + predicate->false_range = predicate->true_range; > > > > - return cond; > > + if (!predicate->false_range.varying_p () > > +
Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel
On 01/12/2021 14.17, Richard Biener wrote: > On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes > wrote: >> >> The transposition nolto -> notlo is confusing and it makes the long >> name even harder to read than it already is - I kept reading it as >> "not lo" until I realized it was a simply typo. >> >> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) > > Hmm, if we ever released GCC with that lto-plugin behavior you should preserve > accepting -linker-output-auto-notlo-rel in lto-plugin because a newer > lto-plugin > is required to work with older released GCC which continue to provide > the option with the spelling mistake. I thought a given version of gcc would only use the plugin it shipped with(?). And even if in theory one should keep the plugin compatible with older gcc releases, in this case it is only relevant to vxworks ports, so if Eric and Olivier are ok with this as-is, perhaps we can avoid having to accept both spellings forever. Rasmus
Re: [PING, PATCH] darwin, d: Support outfile substitution for liphobos
On Tue, Nov 30, 2021 at 6:04 PM Iain Buclaw via Gcc-patches wrote: > > Ping. > > Are the common gcc parts OK (also for backporting)? Yes, OK for backporting as well after a short burn in period. Richard, > Iain. > > Excerpts from Iain Buclaw's message of November 26, 2021 1:51 pm: > > Excerpts from Iain Sandoe's message of November 19, 2021 10:21 am: > >> Hi Iain > >> > >>> On 19 Nov 2021, at 08:32, Iain Buclaw wrote: > >> > >>> This patch fixes a stage2 bootstrap failure in the D front-end on > >>> darwin due to libgphobos being dynamically linked despite > >>> -static-libphobos being on the command line. > >>> > >>> In the gdc driver, this takes the previous fix for the Darwin D > >>> bootstrap, and extends it to the -static-libphobos option as well. > >>> Rather than pushing the -static-libphobos option back onto the command > >>> line, the setting of SKIPOPT is instead conditionally removed. The same > >>> change has been repeated for -static-libstdc++ so there is now no need > >>> to call generate_option to re-add it. > >>> > >>> In the gcc driver, -static-libphobos has been added as a common option, > >>> validated, and a new outfile substition added to config/darwin.h to > >>> correctly replace -lgphobos with libgphobos.a. > >>> > >>> Bootstrapped and regression tested on x86_64-linux-gnu and > >>> x86_64-apple-darwin20. > >>> > >>> OK for mainline? This would also be fine for gcc-11 release branch too, > >>> as well as earlier releases with D support. > >> > >> the Darwin parts are fine, thanks > >> > >> The SKIPOPT in d-spec, presumably means “skip removing this opt”? > >> otherwise the #ifndef looks odd (because of the > >> static-libgcc|static-libphobos, > >> darwin.h would do the substitution for -static-libgcc as well, so it’s not > >> a 100% > >> test). > >> > > > > I've only just realised what you meant. Yes you are of course right, > > and it should have been #ifdef, attaching a fixed-up patch. > > > > Iain. > > > > --- > > gcc/ChangeLog: > > > > * common.opt (static-libphobos): Add option. > > * config/darwin.h (LINK_SPEC): Substitute -lgphobos with > > libgphobos.a > > when linking statically. > > * gcc.c (driver_handle_option): Set -static-libphobos as always > > valid. > > > > gcc/d/ChangeLog: > > > > * d-spec.cc (lang_specific_driver): Set SKIPOPT on > > -static-libstdc++ > > and -static-libphobos only when target supports > > LD_STATIC_DYNAMIC. > > Remove generate_option to re-add -static-libstdc++. > > > > libphobos/ChangeLog: > > > > * testsuite/testsuite_flags.in: Add libphobos library directory > > as > > search path to --gdcldflags. > > > > diff --git a/gcc/common.opt b/gcc/common.opt > > index db6010e4e20..73c12d933f3 100644 > > --- a/gcc/common.opt > > +++ b/gcc/common.opt > > @@ -3527,6 +3527,10 @@ static-libgfortran > > Driver > > ; Documented for Fortran, but always accepted by driver. > > > > +static-libphobos > > +Driver > > +; Documented for D, but always accepted by driver. > > + > > static-libstdc++ > > Driver > > > > diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h > > index 7ed01efa694..c4ddd623e8b 100644 > > --- a/gcc/config/darwin.h > > +++ b/gcc/config/darwin.h > > @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct; > > %:replace-outfile(-lobjc libobjc-gnu.a%s); \ > > :%:replace-outfile(-lobjc -lobjc-gnu )}}\ > > %{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran > > libgfortran.a%s)}\ > > + %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos > > libgphobos.a%s)}\ > > > > %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp > > libgomp.a%s)}\ > > %{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ > > libstdc++.a%s)}\ > > %{force_cpusubtype_ALL:-arch %(darwin_arch)} \ > > diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc > > index b12d28f1047..1304126a675 100644 > > --- a/gcc/d/d-spec.cc > > +++ b/gcc/d/d-spec.cc > > @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option > > **in_decoded_options, > > > > case OPT_static_libstdc__: > > saw_static_libcxx = true; > > +#ifdef HAVE_LD_STATIC_DYNAMIC > > + /* Remove -static-libstdc++ from the command only if target supports > > + LD_STATIC_DYNAMIC. When not supported, it is left in so that a > > + back-end target can use outfile substitution. */ > > args[i] |= SKIPOPT; > > +#endif > > break; > > > > case OPT_static_libphobos: > > if (phobos_library != PHOBOS_NOLINK) > > phobos_library = PHOBOS_STATIC; > > +#ifdef HAVE_LD_STATIC_DYNAMIC > > + /* Remove -static-libphobos from the command only if target supports > > + LD_STATIC_DYNAMIC. When not supported, it is left in so that a > > + back-end target can use outfile substi
Re: [PATCH] Loop unswitching: support gswitch statements.
On 11/30/21 12:17, Richard Biener wrote: On Mon, Nov 29, 2021 at 1:45 PM Martin Liška wrote: On 11/26/21 09:12, Richard Biener wrote: On Wed, Nov 24, 2021 at 3:32 PM Martin Liška wrote: On 11/24/21 15:14, Martin Liška wrote: It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that.. Fixed that in the updated version. Function level comments need updating it seems. I've done that. +static unsigned +evaluate_insns (class loop *loop, basic_block *bbs, + predicate_vector &predicate_path, + auto_bb_flag &reachable_flag) +{ + auto_vec worklist (loop->num_nodes); + worklist.quick_push (bbs[0]); ... so when adding gswitch support the easiest way to make + FOR_EACH_EDGE (e, ei, bb->succs) + { ... + { + worklist.safe_push (dest); + dest->flags |= reachable_flag; work is when the gcond/gswitch simplification would mark outgoing edges as (non-)executable. For gswitch this could be achieved by iterating over the case labels and intersecting that with the range while for gcond it's a matter of setting an edge flag instead of returning true/false. Exactly, it can be quite naturally added to the current patch. I'd call the common function evaluate_control_stmt_using_entry_checks or so and invoke it on the last stmt of a block with >= 2 outgoing edges. Yes, I'll do it for the gswitch support patch. We still seem to do the simplification work twice, once for costing and once for transform, but that's OK for now I guess. I think you want to clear_aux_for_blocks at the end of the pass. Called that. Otherwise I like it - it seems you have some TODO around cost modeling. Did you try to do gswitch support ontop as I suggested to see if the general structure keeps working? I vanished and tested the patch. No, I don't have the gswitch support patch as the current patch was reworked a few times. Can we please progress and have installed the suggested patch? I'd like to see the gswitch support - that's what was posted before stage3 close, this patch on its own doesn't seem worth pushing for. That said, I have some comments below (and the already raised ones about how things might need to change with gswitch support). Is it so difficult to develop gswitch support as a separate change ontop of this? I'm going to work on that in the upcoming days. When are you leaving for the Christmas holidays :P ? Patch can bootstrap on x86_64-linux-gnu and survives regression tests. +#include that's included unconditionally by system.h Good. +/* The type represents a predicate path leading to a basic block. */ +typedef auto_vec> predicate_vector; +static bool tree_unswitch_single_loop (class loop *, int, + predicate_vector &predicate_path, I think we don't want to pass auto_vec by reference, instead auto_vec should decay to vec<> when passed around. Ok. + unswitch_predicate *predicate = new unswitch_predicate (cond, lhs); + if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs)) +{ + ranger->range_on_edge (predicate->true_range, edge_true, lhs); + predicate->false_range = predicate->true_range; - return cond; + if (!predicate->false_range.varying_p () + && !predicate->false_range.undefined_p ()) + predicate->false_range.invert (); +} is that correct? I would guess range_on_edge, for if (a > 10) if (a < 15) /* true */ else /* false */ figures [11, 14] on the true edge of if (a < 15) (considered the unswitch predicate), inverting that yields [0, 10] u [15, +INF] but that's at least sub-optimal for the else range. I think we want to call range_on_edge again to determine the range on the else branch? No, as shown in the previous emails, Ranger is CFG sensitive and we should rely on our irange merging. } -/* Simplifies COND using checks in front of the entry of the LOOP. Just very - simplish (sufficient to prevent us from duplicating loop in unswitching - unnecessarily). */ +static void +combine_range (predicate_vector &predicate_path, tree index, irange &path_range) +{ unless I misread the patch combine_range misses a comment. +evaluate_control_stmt_using_entry_checks (gimple *stmt, + predicate_vector &predicate_path) { so this function for ranger does combine all predicates on the predicate_path but for the symbolic evaluation it looks at the last predicate only? Yes. I guess that's because other predicate simplification opportunities are applied already, correct? Exactly. But doesn't that mean that the combine_range could be done once when we build the predicate vector instead of for each stmt? I'm just looking at the difference in treating both cases - if we first analyze the whole unswitching path (including all recursions) then we'd have to simplify all opportunities at once, so iterating over all
[committed] testsuite: Fix typo in comment in aapcs64 test
Committed as obvious. Thanks, Alex gcc/testsuite/ChangeLog: * gcc.target/aarch64/aapcs64/macro-def.h (PTR): Fix typo in comment. diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h b/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h index 72a47067631..e681de60778 100644 --- a/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h +++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h @@ -63,7 +63,7 @@ #define ANON_PROMOTED(type,val,type_promoted, val_promoted, offset,...) \ ANON(type_promoted, val_promoted, offset, __VA_ARGS__) /* Composite larger than 16 bytes is replaced by a pointer to a copy prepared - by the caller, so here we extrat the pointer, deref it and compare the + by the caller, so here we extract the pointer, deref it and compare the content with that of the original one. */ #define PTR(type, val, offset, ...) { \ type * ptr; \
Re: [PATCH] configure, d: Add support for bootstrapping the D front-end
Hi! On 2021-10-29T17:37:44-0400, Eric Gallager via Gcc-patches wrote: > On Thu, Oct 28, 2021 at 2:38 PM Jeff Law via Gcc-patches > wrote: >> On 10/9/2021 7:32 AM, Iain Buclaw via Gcc-patches wrote: >> > The implementation of the D front-end in GCC is based on the original >> > C++ version of the D programming language compiler, which was ported to >> > D itself in version 2.069.0 (released in 2015). [...] >> > It has come to the point now that I'm happy enough with the process to >> > switch out the C++ sources in gcc/d/dmd with D sources. Congratulations: self-hosting compiler! That, of course, makes the GCC/D compilation process more difficult: >> Presumably this means that the only way to build D for the first time on >> a new target is to cross from an existing target that supports D, right? >> >> I think that's not unreasonable and I don't think we want to increase >> the burden of maintaining an old codebase just for the sake of a >> marginally easier bootstrap process for a new target. (Some may argue "burden" vs. "marginally", but yes, I agree.) Plus, the problem that also for non-cross (native) builds, you now have a baseline GCC/D compiler requirement: > There should be some sort of note about this in the documentation, > IMO; both install.texi That has been done. +In order to build GDC, the D compiler, you need a working GDC +compiler (GCC version 9.1 or later), as the D front end is written in D. > and the "Caveats" section of > gcc-12/changes.html (and possibly other places). That not yet, but yes, I agree that should be done, too. So it's now a requirement to build/bootstrap GCC/D with GCC 9.1 (or newer, of course) -- which is quite different from the current GCC 4.8 requirement for all other GCC parts (including self-hosted GCC/Ada, by the way, which also is happy with GCC 4.8). I'm one of those (few, I guess?) doing bootstrap verification builds with actual old GCC 4.8 ("gcc-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4", precisely), and that now obviously doesn't cover GCC/D anymore, for example with '--enable-languages=all' (and without the GDC 4.8 packages installed): [...] +checking for gdc... no [...] +configure: WARNING: GDC is required to build d configure: WARNING: --enable-host-shared required to build jit -The following languages will be built: c,ada,c++,d,fortran,go,lto,objc,obj-c++ +The following languages will be built: c,ada,c++,fortran,go,lto,objc,obj-c++ *** This configuration is not supported in the following subdirectories: - target-liboffloadmic + target-libphobos target-zlib target-liboffloadmic (Any other directories should still work fine.) [...] ..., or error for explicit '--enable-languages=d'. So I built myself a stock GCC 9.1 with '--enable-languages=d', and via '[...]/configure [...] CC=gcc-4.8 CXX=g++-4.8 GDC=[GCC 9.1]/bin/gdc [...]' I'm able to successfully bootstrap GCC, including GCC/D, in a mixed GCC 4.8/9.1 configuration, with subsequent good-looking 'make check-d' results. Per my superficial review of the build log file, the '[GCC 9.1]/bin/gdc' indeed is only used during stage 1 build, as it should be. So that's good enough as far as I'm concerned, and unless anyone sees any reason why such a mixed GCC 4.8/9.1 configuration would be bad, may it be worth putting such information (may 'configure' with 'GDC=[GCC 9.1]/bin/gdc') into the documentation, too? Grüße Thomas - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Re: [PATCH] Allow loop crossing paths in back threader copier.
On Tue, Nov 30, 2021 at 4:48 PM Aldy Hernandez via Gcc-patches wrote: > > We are currently restricting loop crossing paths in the generic copier > used by the back threader, but we should be able to handle them after > loop_done has completed. But this isn't a cost thing but a correctness (as in, can we update loops properly) thing. We are passing 'loop' down to copy_bbs and very much rely on seeing either BBs of that loop or copying complete subloops of it. You will likely see excess loop fixup caused by this or ICEs in case the caller of this function will not always set LOOPS_NEED_FIXUP (not that copy_bbs handling of loops is perfect). > This fixes the PR at -O2, though the problem remains at -O1 because we > have no threaders smart enough to elide the undefined read. DOM3 could > be a candidate when it is converted to either a hybrid threader or > replaced with the backward threader (when ranger can handle floats). > > Tested on x86-64 Linux. > > OK for trunk? > > PR tree-optimization/80548 > > gcc/ChangeLog: > > * attribs.c (sorted_attr_string): Add assert for -Wstringop-overread. > * tree-ssa-threadupdate.c > (back_jt_path_registry::duplicate_thread_path): Allow paths that > cross loops after loop_done. > (back_jt_path_registry::update_cfg): Diagnose dropped threads > after duplicate_thread_path. > > gcc/testsuite/ChangeLog: > > * gcc.dg/pr80548.c: New test. > --- > gcc/attribs.c | 1 + > gcc/testsuite/gcc.dg/pr80548.c | 23 +++ > gcc/tree-ssa-threadupdate.c| 19 +++ > 3 files changed, 35 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/pr80548.c > > diff --git a/gcc/attribs.c b/gcc/attribs.c > index c252f5af07b..9a079b8405a 100644 > --- a/gcc/attribs.c > +++ b/gcc/attribs.c > @@ -1035,6 +1035,7 @@ sorted_attr_string (tree arglist) >attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0'; >str_len_sum += len + 1; > } > + gcc_assert (arglist); > >/* Replace "=,-" with "_". */ >for (i = 0; i < strlen (attr_str); i++) > diff --git a/gcc/testsuite/gcc.dg/pr80548.c b/gcc/testsuite/gcc.dg/pr80548.c > new file mode 100644 > index 000..232743e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr80548.c > @@ -0,0 +1,23 @@ > +// { dg-do compile } > +// { dg-options "-O2 -Wuninitialized" } > + > +int g (void); > +void h (int, int); > + > +void f (int b) > +{ > + int x, y; > + > + if (b) > +{ > + x = g (); > + y = g (); > +} > + > + while (g ()) > +if (b) > + { > +h (x, y); // { dg-bogus "uninit" } > +y = g (); > + } > +} > diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c > index 8aac733ac25..b194c11e23d 100644 > --- a/gcc/tree-ssa-threadupdate.c > +++ b/gcc/tree-ssa-threadupdate.c > @@ -2410,13 +2410,14 @@ back_jt_path_registry::duplicate_thread_path (edge > entry, > missuses of the functions. I.e. if you ask to copy something weird, > it will work, but the state of structures probably will not be > correct. */ > - for (i = 0; i < n_region; i++) > -{ > - /* We do not handle subloops, i.e. all the blocks must belong to the > -same loop. */ > - if (region[i]->loop_father != loop) > - return false; > -} > + if (!(cfun->curr_properties & PROP_loop_opts_done)) > +for (i = 0; i < n_region; i++) > + { > + /* We do not handle subloops, i.e. all the blocks must belong to the > + same loop. */ > + if (region[i]->loop_father != loop) > + return false; > + } > >initialize_original_copy_tables (); > > @@ -2651,9 +2652,11 @@ back_jt_path_registry::update_cfg (bool > /*peel_loop_headers*/) > visited_starting_edges.add (entry); > retval = true; > m_num_threaded_edges++; > + path->release (); > } > + else > + cancel_thread (path, "Failure in duplicate_thread_path"); > > - path->release (); >m_paths.unordered_remove (0); >free (region); > } > -- > 2.31.1 >
Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]
On Wed, Dec 1, 2021 at 2:25 PM Richard Sandiford wrote: > > Richard Biener via Gcc-patches writes: > > On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches > > wrote: > >> > >> When checking for compatible stmts, vect_build_slp_tree_1 did: > >> > >>&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info) > >> && (first_stmt_code == ARRAY_REF > >> || first_stmt_code == BIT_FIELD_REF > >> || first_stmt_code == INDIRECT_REF > >> || first_stmt_code == COMPONENT_REF > >> || first_stmt_code == MEM_REF))) > >> > >> That is, it allowed any rhs_code as long as the first_stmt_code > >> looked valid. This had the effect of allowing IFN_MASK_LOAD > >> to be paired with an earlier non-call code (but didn't allow > >> the reverse). > >> > >> This patch makes the check symmetrical. > >> > >> Still testing on x86_64-linux-gnu. OK if testing passes, or doesn't > >> this seem like the right approach? > > > > It's indeed a too weak comparison of the classification of the first > > and the followup operands, some larger refactoring is probably > > needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS > > of the followup against the tree codes of the first stmt but also later > > compare first_stmt_load_p against load_p). > > > > The proposed patch looks reasonable (but then we could drop > > the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?), > > Yeah, was wondering about that. Seemed safer to keep it, since without > it we might pair non-memory BIT_FIELD_REFs with other things. I guess > the same goes for the first stmt though, and the mismatch ought to be > caught later anyway. Hmm, yeah. I guess some classify_stmt () returning a custom enum and comparing that might clean up this whole thing. But nothing for stage3. Richard. > > so OK. > > Thanks, > Richard > > > Thanks, > > Richard. > > > >> Richard > >> > >> > >> gcc/ > >> PR tree-optimization/103517 > >> * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two > >> different component references, check the codes of both them, > >> rather than just the first. > >> > >> gcc/testsuite/ > >> PR tree-optimization/103517 > >> * gcc.dg/vect/pr103517.c: New test. > >> --- > >> gcc/testsuite/gcc.dg/vect/pr103517.c | 13 + > >> gcc/tree-vect-slp.c | 7 ++- > >> 2 files changed, 19 insertions(+), 1 deletion(-) > >> create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c > >> > >> diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c > >> b/gcc/testsuite/gcc.dg/vect/pr103517.c > >> new file mode 100644 > >> index 000..de87fc48f84 > >> --- /dev/null > >> +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c > >> @@ -0,0 +1,13 @@ > >> +/* { dg-do compile } */ > >> +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* > >> i?86-*-* } } */ > >> + > >> +int a; > >> +short b, c; > >> +extern short d[]; > >> +void e() { > >> + for (short f = 1; f < (short)a; f += 2) > >> +if (d[f + 1]) { > >> + b = d[f]; > >> + c = d[f + 1]; > >> +} > >> +} > >> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c > >> index 7bff5118bd0..bc22ffeed82 100644 > >> --- a/gcc/tree-vect-slp.c > >> +++ b/gcc/tree-vect-slp.c > >> @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned > >> char *swap, > >> || first_stmt_code == BIT_FIELD_REF > >> || first_stmt_code == INDIRECT_REF > >> || first_stmt_code == COMPONENT_REF > >> - || first_stmt_code == MEM_REF))) > >> + || first_stmt_code == MEM_REF) > >> + && (rhs_code == ARRAY_REF > >> + || rhs_code == BIT_FIELD_REF > >> + || rhs_code == INDIRECT_REF > >> + || rhs_code == COMPONENT_REF > >> + || rhs_code == MEM_REF))) > >> || first_stmt_load_p != load_p > >> || first_stmt_phi_p != phi_p) > >> { > >> -- > >> 2.25.1 > >>
Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]
Richard Biener via Gcc-patches writes: > On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches > wrote: >> >> When checking for compatible stmts, vect_build_slp_tree_1 did: >> >>&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info) >> && (first_stmt_code == ARRAY_REF >> || first_stmt_code == BIT_FIELD_REF >> || first_stmt_code == INDIRECT_REF >> || first_stmt_code == COMPONENT_REF >> || first_stmt_code == MEM_REF))) >> >> That is, it allowed any rhs_code as long as the first_stmt_code >> looked valid. This had the effect of allowing IFN_MASK_LOAD >> to be paired with an earlier non-call code (but didn't allow >> the reverse). >> >> This patch makes the check symmetrical. >> >> Still testing on x86_64-linux-gnu. OK if testing passes, or doesn't >> this seem like the right approach? > > It's indeed a too weak comparison of the classification of the first > and the followup operands, some larger refactoring is probably > needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS > of the followup against the tree codes of the first stmt but also later > compare first_stmt_load_p against load_p). > > The proposed patch looks reasonable (but then we could drop > the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?), Yeah, was wondering about that. Seemed safer to keep it, since without it we might pair non-memory BIT_FIELD_REFs with other things. I guess the same goes for the first stmt though, and the mismatch ought to be caught later anyway. > so OK. Thanks, Richard > Thanks, > Richard. > >> Richard >> >> >> gcc/ >> PR tree-optimization/103517 >> * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two >> different component references, check the codes of both them, >> rather than just the first. >> >> gcc/testsuite/ >> PR tree-optimization/103517 >> * gcc.dg/vect/pr103517.c: New test. >> --- >> gcc/testsuite/gcc.dg/vect/pr103517.c | 13 + >> gcc/tree-vect-slp.c | 7 ++- >> 2 files changed, 19 insertions(+), 1 deletion(-) >> create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c >> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c >> b/gcc/testsuite/gcc.dg/vect/pr103517.c >> new file mode 100644 >> index 000..de87fc48f84 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c >> @@ -0,0 +1,13 @@ >> +/* { dg-do compile } */ >> +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* >> i?86-*-* } } */ >> + >> +int a; >> +short b, c; >> +extern short d[]; >> +void e() { >> + for (short f = 1; f < (short)a; f += 2) >> +if (d[f + 1]) { >> + b = d[f]; >> + c = d[f + 1]; >> +} >> +} >> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c >> index 7bff5118bd0..bc22ffeed82 100644 >> --- a/gcc/tree-vect-slp.c >> +++ b/gcc/tree-vect-slp.c >> @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char >> *swap, >> || first_stmt_code == BIT_FIELD_REF >> || first_stmt_code == INDIRECT_REF >> || first_stmt_code == COMPONENT_REF >> - || first_stmt_code == MEM_REF))) >> + || first_stmt_code == MEM_REF) >> + && (rhs_code == ARRAY_REF >> + || rhs_code == BIT_FIELD_REF >> + || rhs_code == INDIRECT_REF >> + || rhs_code == COMPONENT_REF >> + || rhs_code == MEM_REF))) >> || first_stmt_load_p != load_p >> || first_stmt_phi_p != phi_p) >> { >> -- >> 2.25.1 >>
Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel
On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes wrote: > > The transposition nolto -> notlo is confusing and it makes the long > name even harder to read than it already is - I kept reading it as > "not lo" until I realized it was a simply typo. > > Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) Hmm, if we ever released GCC with that lto-plugin behavior you should preserve accepting -linker-output-auto-notlo-rel in lto-plugin because a newer lto-plugin is required to work with older released GCC which continue to provide the option with the spelling mistake. Richard. > lto-plugin/ > * lto-plugin.c: Fix -linker-output-auto-notlo-rel -> > -linker-output-auto-nolto-rel typo. > (process_option): Adjust accordingly. > > gcc/ > * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected > spelling of -linker-output-auto-nolto-rel. > --- > gcc/config/vxworks.h| 2 +- > lto-plugin/lto-plugin.c | 4 ++-- > 2 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h > index e41f16a51e8..bddf2c37f42 100644 > --- a/gcc/config/vxworks.h > +++ b/gcc/config/vxworks.h > @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx > begin, rtx end); > further incremental LTO linking. We do not do repeated incremental > linking > so silence the warning (instead of passing -flinker-output=nolto-rel). */ > #undef LTO_PLUGIN_SPEC > -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}" > +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}" > diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c > index 6ab9822f369..40d7b7b8692 100644 > --- a/lto-plugin/lto-plugin.c > +++ b/lto-plugin/lto-plugin.c > @@ -32,7 +32,7 @@ along with this program; see the file COPYING3. If not see > -nop: Instead of running lto-wrapper, pass the original to the plugin. > This > only works if the input files are hybrid. > -linker-output-known: Do not determine linker output > - -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without > + -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without > warning. This is used on systems like VxWorks (kernel) where the link is > always partial and repeated incremental linking is generally not used. > -sym-style={none,win32,underscore|uscore} > @@ -1321,7 +1321,7 @@ process_option (const char *option) > { >if (strcmp (option, "-linker-output-known") == 0) > linker_output_known = true; > - else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0) > + else if (strcmp (option, "-linker-output-auto-nolto-rel") == 0) > linker_output_auto_nolto_rel = true; >else if (strcmp (option, "-debug") == 0) > debug = true; > -- > 2.31.1 >
Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]
On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches wrote: > > When checking for compatible stmts, vect_build_slp_tree_1 did: > >&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info) > && (first_stmt_code == ARRAY_REF > || first_stmt_code == BIT_FIELD_REF > || first_stmt_code == INDIRECT_REF > || first_stmt_code == COMPONENT_REF > || first_stmt_code == MEM_REF))) > > That is, it allowed any rhs_code as long as the first_stmt_code > looked valid. This had the effect of allowing IFN_MASK_LOAD > to be paired with an earlier non-call code (but didn't allow > the reverse). > > This patch makes the check symmetrical. > > Still testing on x86_64-linux-gnu. OK if testing passes, or doesn't > this seem like the right approach? It's indeed a too weak comparison of the classification of the first and the followup operands, some larger refactoring is probably needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS of the followup against the tree codes of the first stmt but also later compare first_stmt_load_p against load_p). The proposed patch looks reasonable (but then we could drop the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?), so OK. Thanks, Richard. > Richard > > > gcc/ > PR tree-optimization/103517 > * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two > different component references, check the codes of both them, > rather than just the first. > > gcc/testsuite/ > PR tree-optimization/103517 > * gcc.dg/vect/pr103517.c: New test. > --- > gcc/testsuite/gcc.dg/vect/pr103517.c | 13 + > gcc/tree-vect-slp.c | 7 ++- > 2 files changed, 19 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c > > diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c > b/gcc/testsuite/gcc.dg/vect/pr103517.c > new file mode 100644 > index 000..de87fc48f84 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* > i?86-*-* } } */ > + > +int a; > +short b, c; > +extern short d[]; > +void e() { > + for (short f = 1; f < (short)a; f += 2) > +if (d[f + 1]) { > + b = d[f]; > + c = d[f + 1]; > +} > +} > diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c > index 7bff5118bd0..bc22ffeed82 100644 > --- a/gcc/tree-vect-slp.c > +++ b/gcc/tree-vect-slp.c > @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char > *swap, > || first_stmt_code == BIT_FIELD_REF > || first_stmt_code == INDIRECT_REF > || first_stmt_code == COMPONENT_REF > - || first_stmt_code == MEM_REF))) > + || first_stmt_code == MEM_REF) > + && (rhs_code == ARRAY_REF > + || rhs_code == BIT_FIELD_REF > + || rhs_code == INDIRECT_REF > + || rhs_code == COMPONENT_REF > + || rhs_code == MEM_REF))) > || first_stmt_load_p != load_p > || first_stmt_phi_p != phi_p) > { > -- > 2.25.1 >
Re: [PATCH RFA (fold/symtab)] c++: constexpr, fold, weak redecl, fp/0 [PR103310]
On Tue, Nov 30, 2021 at 09:09:41PM -0500, Jason Merrill via Gcc-patches wrote: > For PR61825, honza changed tree_single_nonzero_warnv_p to prevent a later > declaration from marking a function as weak after we've determined that it > wasn't weak before. But we shouldn't do that for speculative folding; we > should only do it when we actually need a constant value. In C++, such a > context is called "manifestly constant-evaluated". In fold, this seems to > correspond to the folding_initializer flag, since in C this situation only > occurs in static initializers. > > This change makes nonzero-1.c well-formed; I've added a nonzero-1a.c to > verify that we delete the null check eventually if there is no weak > redeclaration. > > The varasm.c change is so that if we do get the weak redeclaration error, we > get it at the position of the weak declaration rather than the previous > declaration. > > Using the FOLD_INIT paths also affects floating point arithmetic: notably, > this makes floating point division by zero in a manifestly > constant-evaluated context constant, as in a C static initializer. I've had > some success convincing CWG that this is the right direction; C++ should > follow C's floating point semantics more than we have been doing, and Joseph > says that the C policy is that Annex F overrides other parts of the standard > that say that some operations are undefined. But since we're in stage 3, > I'm only making this change with the new flag -fconstexpr-fp-except. It may > turn on by default in a future release. > > I think this distinction is only relevant for binary operations; arithmetic > for the floating point case, comparison for possibly non-zero addresses. > > Tested x86_64-pc-linux-gnu, OK for trunk? [...] > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 4b8a094b206..5654f044ae4 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -1615,6 +1615,10 @@ fconstexpr-cache-depth= > C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_cache_depth) Init(8) > -fconstexpr-cache-depth= Specify maximum constexpr recursion > cache depth. > > +fconstexpr-fp-except > +C++ ObjC++ Var(flag_constexpr_fp_except) Init(0) > +Allow IEC559 floating point exceptions in constant expressions Sorry about nitpicking but there's a missing period after 'expressions', which will break a test. Marek
[PATCH] fix spelling of -linker-output-auto-nolto-rel
The transposition nolto -> notlo is confusing and it makes the long name even harder to read than it already is - I kept reading it as "not lo" until I realized it was a simply typo. Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks) lto-plugin/ * lto-plugin.c: Fix -linker-output-auto-notlo-rel -> -linker-output-auto-nolto-rel typo. (process_option): Adjust accordingly. gcc/ * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected spelling of -linker-output-auto-nolto-rel. --- gcc/config/vxworks.h| 2 +- lto-plugin/lto-plugin.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h index e41f16a51e8..bddf2c37f42 100644 --- a/gcc/config/vxworks.h +++ b/gcc/config/vxworks.h @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx begin, rtx end); further incremental LTO linking. We do not do repeated incremental linking so silence the warning (instead of passing -flinker-output=nolto-rel). */ #undef LTO_PLUGIN_SPEC -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}" +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}" diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c index 6ab9822f369..40d7b7b8692 100644 --- a/lto-plugin/lto-plugin.c +++ b/lto-plugin/lto-plugin.c @@ -32,7 +32,7 @@ along with this program; see the file COPYING3. If not see -nop: Instead of running lto-wrapper, pass the original to the plugin. This only works if the input files are hybrid. -linker-output-known: Do not determine linker output - -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without + -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without warning. This is used on systems like VxWorks (kernel) where the link is always partial and repeated incremental linking is generally not used. -sym-style={none,win32,underscore|uscore} @@ -1321,7 +1321,7 @@ process_option (const char *option) { if (strcmp (option, "-linker-output-known") == 0) linker_output_known = true; - else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0) + else if (strcmp (option, "-linker-output-auto-nolto-rel") == 0) linker_output_auto_nolto_rel = true; else if (strcmp (option, "-debug") == 0) debug = true; -- 2.31.1
[committed] d: Update documentation of new D language options.
Hi, This patch adds documentation for the following new D options: - New switch that controls what code is generated on a contract failure (throw or abort). - New switch that controls mangling of D types in `extern(C++)` code, as well as setting the compile-time value of `__traits(getTargetInfo "cppStd")` - New switches that generate C++ headers from D source files. - New switch to save expanded mixins to a file. - New switches that now distinguish between D language changes that are either (a) an experimental feature or an upcoming breaking change, (b) a warning or help on an upcoming change, or (c) revert of a change for users who don't want to deal with the breaking change for now. Bootstrapped on x86_64-linux-gnu, and committed to mainline. Regards, Iain --- gcc/d/ChangeLog: * gdc.texi (Runtime Options): Document -fcheckaction=, -fextern-std=, -fpreview=, -frevert=. (Code Generation): Document -fdump-c++-spec=, -fdump-c++-spec-verbose, -fsave-mixins=. (Warnings): Update list of supported -ftransitions=. --- gcc/d/gdc.texi | 114 ++--- 1 file changed, 109 insertions(+), 5 deletions(-) diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi index 095f7ecca41..c98eb1f45d9 100644 --- a/gcc/d/gdc.texi +++ b/gcc/d/gdc.texi @@ -216,6 +216,20 @@ Don't recognize built-in functions unless they begin with the prefix @samp{__builtin_}. By default, the compiler will recognize when a function in the @code{core.stdc} package is a built-in function. +@item -fcheckaction=@var{value} +@cindex @option{-fcheckaction} +This option controls what code is generated on an assertion, bounds check, or +final switch failure. The following values are supported: + +@table @samp +@item context +Throw an @code{AssertError} with extra context information. +@item halt +Halt the program execution. +@item throw +Throw an @code{AssertError} (the default). +@end table + @item -fdebug @item -fdebug=@var{value} @cindex @option{-fdebug} @@ -245,6 +259,25 @@ This is equivalent to compiling with the following options: gdc -nophoboslib -fno-exceptions -fno-moduleinfo -fno-rtti @end example +@item -fextern-std=@var{standard} +@cindex @option{-fextern-std} +Sets the C++ name mangling compatibility to the version identified by +@var{standard}. The following values are supported: + +@table @samp +@item c++98 +@item c++03 +Sets @code{__traits(getTargetInfo "cppStd")} to @code{199711}. +@item c++11 +Sets @code{__traits(getTargetInfo "cppStd")} to @code{201103}. +@item c++14 +Sets @code{__traits(getTargetInfo "cppStd")} to @code{201402}. +@item c++17 +Sets @code{__traits(getTargetInfo "cppStd")} to @code{201703}. +@item c++20 +Sets @code{__traits(getTargetInfo "cppStd")} to @code{202002}. +@end table + @item -fno-invariants @cindex @option{-finvariants} @cindex @option{-fno-invariants} @@ -276,6 +309,48 @@ Turns off code generation for postcondition @code{out} contracts. @cindex @option{-fno-preconditions} Turns off code generation for precondition @code{in} contracts. +@item -fpreview=@var{id} +@cindex @option{-fpreview} +Turns on an upcoming D language change identified by @var{id}. The following +values are supported: + +@table @samp +@item all +Turns on all upcoming D language features. +@item dip1000 +Implements @uref{http://wiki.dlang.org/DIP1000} (Scoped pointers). +@item dip1008 +Implements @uref{http://wiki.dlang.org/DIP1008} (Allow exceptions in +@code{@@nogc} code). +@item dip1021 +Implements @uref{http://wiki.dlang.org/DIP1021} (Mutable function arguments). +@item dip25 +Implements @uref{http://wiki.dlang.org/DIP25} (Sealed references). +@item dtorfields +Turns on generation for destructing fields of partially constructed objects. +@item fieldwise +Turns on generation of struct equality to use field-wise comparisons. +@item fixaliasthis +Implements new lookup rules that check the current scope for @code{alias this} +before searching in upper scopes. +@item in +Implements @code{in} parameters to mean @code{scope const [ref]} and accepts +rvalues. +@item inclusiveincontracts +Implements @code{in} contracts of overridden methods to be a superset of parent +contract. +@item intpromote +Implements C-style integral promotion for unary @code{+}, @code{-} and @code{~} +expressions. +@item nosharedaccess +Turns off and disallows all access to shared memory objects. +@item rvaluerefparam +Implements rvalue arguments to @code{ref} parameters. +@item shortenedmethods +Implements use of @code{=>} for methods and top-level functions in addition to +lambdas. +@end table + @item -frelease @cindex @option{-frelease} @cindex @option{-fno-release} @@ -291,6 +366,22 @@ gdc -fno-assert -fbounds-check=safe -fno-invariants \ -fno-postconditions -fno-preconditions -fno-switch-errors @end example +@item -frevert= +@cindex @option{-frevert} +Turns off a D language feature identified by @var{
[PATCH] vect: Tighten check for SLP memory groups [PR103517]
When checking for compatible stmts, vect_build_slp_tree_1 did: && !(STMT_VINFO_GROUPED_ACCESS (stmt_info) && (first_stmt_code == ARRAY_REF || first_stmt_code == BIT_FIELD_REF || first_stmt_code == INDIRECT_REF || first_stmt_code == COMPONENT_REF || first_stmt_code == MEM_REF))) That is, it allowed any rhs_code as long as the first_stmt_code looked valid. This had the effect of allowing IFN_MASK_LOAD to be paired with an earlier non-call code (but didn't allow the reverse). This patch makes the check symmetrical. Still testing on x86_64-linux-gnu. OK if testing passes, or doesn't this seem like the right approach? Richard gcc/ PR tree-optimization/103517 * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two different component references, check the codes of both them, rather than just the first. gcc/testsuite/ PR tree-optimization/103517 * gcc.dg/vect/pr103517.c: New test. --- gcc/testsuite/gcc.dg/vect/pr103517.c | 13 + gcc/tree-vect-slp.c | 7 ++- 2 files changed, 19 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c b/gcc/testsuite/gcc.dg/vect/pr103517.c new file mode 100644 index 000..de87fc48f84 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* i?86-*-* } } */ + +int a; +short b, c; +extern short d[]; +void e() { + for (short f = 1; f < (short)a; f += 2) +if (d[f + 1]) { + b = d[f]; + c = d[f + 1]; +} +} diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index 7bff5118bd0..bc22ffeed82 100644 --- a/gcc/tree-vect-slp.c +++ b/gcc/tree-vect-slp.c @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap, || first_stmt_code == BIT_FIELD_REF || first_stmt_code == INDIRECT_REF || first_stmt_code == COMPONENT_REF - || first_stmt_code == MEM_REF))) + || first_stmt_code == MEM_REF) + && (rhs_code == ARRAY_REF + || rhs_code == BIT_FIELD_REF + || rhs_code == INDIRECT_REF + || rhs_code == COMPONENT_REF + || rhs_code == MEM_REF))) || first_stmt_load_p != load_p || first_stmt_phi_p != phi_p) { -- 2.25.1
Re: [PATCH] Fix alignment of stack slots for overaligned types [PR103500]
On 30/11/2021 19:38, Florian Weimer wrote: > * Alex Coplan via Gcc-patches: > > > Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and > > arm-linux-gnueabihf: no regressions. > > > > I'd appreciate any feedback. Is it OK for trunk? > > Does this need an ABI warning? That's a good question. I'm not sure exactly what changes merit an ABI warning in GCC. In this case, we aren't changing how the parameter is passed in the sense that the callee can still locate it in the same way (by following the pointer in the appropriate GPR), we're simply increasing the alignment of the stack slot used for the indirected copy. So on this basis I would guess that it doesn't need a warning, but I'm happy to be corrected. Thanks, Alex > > Thanks, > Florian >
[Ada] Fix incorrect fixed-point computation in expression function
This fixes a couple of issues pertaining to (ordinary) fixed-point types declared with a Small aspect whose value is not equal to the default one. The processing of this aspect is delayed until the freeze point of the type, which means that the Small_Value of the entity for the type does not have the right value until after the freeze point is encountered. The first issue is that Resolve_Real_Literal could use the Small_Value of the entity for an unfrozen type, for example during the pre-analysis of a default expression or of an expression function. The second issue is that Freeze_Fixed_Point_Type could still use the old value of Small_Value even after it has set the field to its final value. It could also overwrite a correct Small_Value with garbage in the case of the subtype of a private fixed-point type. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * einfo.ads (E_Decimal_Fixed_Point_Subtype): Fix pasto. * freeze.adb (Freeze_Fixed_Point_Type): Retrieve the underlying type of the first subtype and do not use a stale value of Small_Value. * sem_res.adb (Resolve_Real_Literal): In the case of a fixed-point type, make sure that the base type is frozen and use its Small_Value to compute the corresponding integer value of the literal.diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads --- a/gcc/ada/einfo.ads +++ b/gcc/ada/einfo.ads @@ -5353,7 +5353,7 @@ package Einfo is --Size_Clause (synth) -- E_Decimal_Fixed_Point_Type - -- E_Decimal_Fixed_Subtype$$$no such thing + -- E_Decimal_Fixed_Point_Subtype --Scale_Value --Digits_Value --Scalar_Range diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb --- a/gcc/ada/freeze.adb +++ b/gcc/ada/freeze.adb @@ -8997,8 +8997,9 @@ package body Freeze is Brng : constant Node_Id:= Scalar_Range (Btyp); BLo : constant Node_Id:= Low_Bound (Brng); BHi : constant Node_Id:= High_Bound (Brng); - Par : constant Entity_Id := First_Subtype (Typ); - Small : constant Ureal := Small_Value (Typ); + Ftyp : constant Entity_Id := Underlying_Type (First_Subtype (Typ)); + + Small : Ureal; Loval : Ureal; Hival : Ureal; Atype : Entity_Id; @@ -9037,7 +9038,7 @@ package body Freeze is function Larger (A, B : Ureal) return Boolean is begin - return A > B and then A - Small > B; + return A > B and then A - Small_Value (Typ) > B; end Larger; - @@ -9046,7 +9047,7 @@ package body Freeze is function Smaller (A, B : Ureal) return Boolean is begin - return A < B and then A + Small < B; + return A < B and then A + Small_Value (Typ) < B; end Smaller; -- Start of processing for Freeze_Fixed_Point_Type @@ -9057,9 +9058,15 @@ package body Freeze is -- so that all characteristics of the type (size, bounds) can be -- computed and validated in the call to Minimum_Size that follows. - if Has_Delayed_Aspects (First_Subtype (Typ)) then - Analyze_Aspects_At_Freeze_Point (First_Subtype (Typ)); - Set_Has_Delayed_Aspects (First_Subtype (Typ), False); + if Has_Delayed_Aspects (Ftyp) then + Analyze_Aspects_At_Freeze_Point (Ftyp); + Set_Has_Delayed_Aspects (Ftyp, False); + end if; + + -- Inherit the Small value from the first subtype in any case + + if Typ /= Ftyp then + Set_Small_Value (Typ, Small_Value (Ftyp)); end if; -- If Esize of a subtype has not previously been set, set it now @@ -9074,16 +9081,6 @@ package body Freeze is end if; end if; - -- The 'small attribute may have been specified with an aspect, - -- in which case it is processed after a subtype declaration, so - -- inherit now the specified value. - - if Typ /= Par -and then Present (Find_Aspect (Par, Aspect_Small)) - then - Set_Small_Value (Typ, Small_Value (Par)); - end if; - -- Immediate return if the range is already analyzed. This means that -- the range is already set, and does not need to be computed by this -- routine. @@ -9100,6 +9097,7 @@ package body Freeze is return; end if; + Small := Small_Value (Typ); Loval := Realval (Lo); Hival := Realval (Hi); @@ -9137,7 +9135,6 @@ package body Freeze is Size_Excl_EP : Int; Model_Num : Ureal; -First_Subt: Entity_Id; Actual_Lo : Ureal; Actual_Hi : Ureal; @@ -9279,10 +9276,8 @@ package body Freeze is -- to get a base type whose size is smaller than the specified -- size of the first subtype. - First_Subt := First_Subtype (Typ); - - if Has_Size_Clause (First_Subt) - and then Size_Incl_EP <= Esi
[Ada] Tune whitespace of the bounded lists Aggregate aspect
Whitespace cleanup only. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * libgnat/a-cbdlli.ads (List): Remove extra space in Aggregate aspect.diff --git a/gcc/ada/libgnat/a-cbdlli.ads b/gcc/ada/libgnat/a-cbdlli.ads --- a/gcc/ada/libgnat/a-cbdlli.ads +++ b/gcc/ada/libgnat/a-cbdlli.ads @@ -56,8 +56,8 @@ is Variable_Indexing => Reference, Default_Iterator => Iterate, Iterator_Element => Element_Type, - Aggregate => (Empty=> Empty, -Add_Unnamed => Append), + Aggregate => (Empty => Empty, +Add_Unnamed => Append), Preelaborable_Initialization => Element_Type'Preelaborable_Initialization;
[Ada] Allow formal functions to have a default in the form of an expression function
As a language extension, the declaration of a generic formal function is allowed to have a default given by an expression in parentheses, where the expression is of the function's result type and can refer to formal parameters of the function (in direct analogy with expression functions). For example: generic type T is private; with function Copy (Item : T) return T is (Item); -- Defaults to Item package Stacks is type Stack is limited private; procedure Push (S : in out Stack; X : T); -- Calls Copy on X function Pop (S : in out Stack) return T; -- Calls Copy to return item private ... end Stacks; Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * doc/gnat_rm/implementation_defined_pragmas.rst: Add documentation of the new form of formal subprogram default in the section on language extensions (pragma Extensions_Allowed). * gnat_rm.texi: Regenerate. * gen_il-gen-gen_nodes.adb: Add Expression as a syntactic field of N_Formal_(Abstract|Concrete)_Subprogram_Declaration nodes. * par-ch12.adb (P_Formal_Subprogram_Declaration): Add parsing support for the new default of a parenthesized expression for formal functions. Issue an error when extensions are not allowed, suggesting use of -gnatX. Update comment with extended syntax for SUBPROGRAM_DEFAULT. * sem_ch12.adb (Analyze_Formal_Subprogram_Declaration): Issue an error when an expression default is given for an abstract formal function. When a default expression is present for a formal function, install the function's formals and preanalyze the expression. (Instantiate_Formal_Subprogram): Fix typo in RM paragraph in a comment. When a formal function has a default expression, create a body for the function that will evaluate the expression and will be called when the default applies in an instantiation. The implicit function is marked as inlined and as having convention Intrinsic.diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst --- a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst +++ b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst @@ -2401,6 +2401,30 @@ of GNAT specific extensions are recognized as follows: name, preference is given to the component in a selected_component (as is currently the case for tagged types with such component names). +* Expression defaults for generic formal functions + + The declaration of a generic formal function is allowed to specify + an expression as a default, using the syntax of an expression function. + + Here is an example of this feature: + + .. code-block:: ada + + generic + type T is private; + with function Copy (Item : T) return T is (Item); -- Defaults to Item + package Stacks is + + type Stack is limited private; + + procedure Push (S : in out Stack; X : T); -- Calls Copy on X + + function Pop (S : in out Stack) return T; -- Calls Copy to return item + + private + -- ... + end Stacks; + .. _Pragma-Extensions_Visible: Pragma Extensions_Visible diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb --- a/gcc/ada/gen_il-gen-gen_nodes.adb +++ b/gcc/ada/gen_il-gen-gen_nodes.adb @@ -1136,11 +1136,13 @@ begin -- Gen_IL.Gen.Gen_Nodes Cc (N_Formal_Abstract_Subprogram_Declaration, N_Formal_Subprogram_Declaration, (Sy (Specification, Node_Id), Sy (Default_Name, Node_Id, Default_Empty), +Sy (Expression, Node_Id, Default_Empty), Sy (Box_Present, Flag))); Cc (N_Formal_Concrete_Subprogram_Declaration, N_Formal_Subprogram_Declaration, (Sy (Specification, Node_Id), Sy (Default_Name, Node_Id, Default_Empty), +Sy (Expression, Node_Id, Default_Empty), Sy (Box_Present, Flag))); Ab (N_Push_Pop_xxx_Label, Node_Kind); diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi --- a/gcc/ada/gnat_rm.texi +++ b/gcc/ada/gnat_rm.texi @@ -3853,6 +3853,31 @@ simple name as one of the type’s primitive subprograms, where the component is visible at the point of a selected_component using that name, preference is given to the component in a selected_component (as is currently the case for tagged types with such component names). + +@item +Expression defaults for generic formal functions + +The declaration of a generic formal function is allowed to specify +an expression as a default, using the syntax of an expression function. + +Here is an example of this feature: + +@example +generic + type T is private; + with function Copy (Item : T) return T is (Item); -- Defaults to Item +package Stacks is + + type Stack is limited private; + + procedure Push (S : in out Stack; X : T); -- Calls Copy on X + + function Pop (S : in
[Ada] Do not return freeze nodes for start of early call regions
This makes sure that the Early Call Region Processor in Sem_Elab does not return a region starting with a construct that is not suitable for being included into it, for example a freeze node. Tested on x86_64-pc-linux-gnu, committed on trunk gcc/ada/ * sem_elab.adb (Previous_Suitable_Construct): New function declared in the Early_Call_Region_Processor package. (Find_ECR): Call it to get the previous node at the start. (Include): Call it to get the previous node during the traversal.diff --git a/gcc/ada/sem_elab.adb b/gcc/ada/sem_elab.adb --- a/gcc/ada/sem_elab.adb +++ b/gcc/ada/sem_elab.adb @@ -6965,6 +6965,11 @@ package body Sem_Elab is -- Determine whether arbitrary node N denotes a suitable construct -- for inclusion into the early call region. + function Previous_Suitable_Construct (N : Node_Id) return Node_Id; + pragma Inline (Previous_Suitable_Construct); + -- Return the previous node suitable for inclusion into the early + -- call region. + procedure Transition_Body_Declarations (Bod : Node_Id; Curr : out Node_Id); @@ -7209,7 +7214,7 @@ package body Sem_Elab is begin -- The early call region starts at N -Curr := Prev (N); +Curr := Previous_Suitable_Construct (N); Start := N; -- Inspect each node in reverse declarative order while going in @@ -7286,7 +7291,7 @@ package body Sem_Elab is -- Otherwise the input node is still within some list else - Curr := Prev (Start); + Curr := Previous_Suitable_Construct (Start); end if; end Include; @@ -7378,6 +7383,23 @@ package body Sem_Elab is end case; end Is_Suitable_Construct; + - + -- Previous_Suitable_Construct -- + - + + function Previous_Suitable_Construct (N : Node_Id) return Node_Id is +P : Node_Id; + + begin +P := Prev (N); + +while Present (P) and then not Is_Suitable_Construct (P) loop + Prev (P); +end loop; + +return P; + end Previous_Suitable_Construct; + -- -- Transition_Body_Declarations -- --