Re: [PATCH v2] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Olivier Hainque via Gcc-patches
Hi Rasmus,

The VxWorks part is ok. The plugin part looks good to me
as well, now in line with Richard's comment, but is not for
me to approve.

Thanks, and Thanks Richard for the original review.

Cheers,

Olivier

> On 2 Dec 2021, at 08:22, Rasmus Villemoes  wrote:
> 
> The transposition nolto -> notlo is confusing and it makes the long
> name even harder to read than it already is - I kept reading it as
> "not lo" until I realized it was a simple typo.
> 
> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)
> 
> lto-plugin/
>   * lto-plugin.c: Fix -linker-output-auto-notlo-rel ->
>   -linker-output-auto-nolto-rel typo.
>   (process_option): Adjust accordingly, accepting both old and
>   new spelling.
> 
> gcc/
>   * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected
>   spelling of -linker-output-auto-nolto-rel.
> ---
> v2: still accept -linker-output-auto-notlo-rel in the plugin.
> 
> Like this?
> 
> gcc/config/vxworks.h| 2 +-
> lto-plugin/lto-plugin.c | 6 --
> 2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h
> index e41f16a51e8..bddf2c37f42 100644
> --- a/gcc/config/vxworks.h
> +++ b/gcc/config/vxworks.h
> @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx 
> begin, rtx end);
>further incremental LTO linking.  We do not do repeated incremental linking
>so silence the warning (instead of passing -flinker-output=nolto-rel).  */
> #undef LTO_PLUGIN_SPEC
> -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}"
> +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}"
> diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
> index 6ab9822f369..b73483de994 100644
> --- a/lto-plugin/lto-plugin.c
> +++ b/lto-plugin/lto-plugin.c
> @@ -32,7 +32,7 @@ along with this program; see the file COPYING3.  If not see
>-nop: Instead of running lto-wrapper, pass the original to the plugin. This
>only works if the input files are hybrid. 
>-linker-output-known: Do not determine linker output
> -   -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without
> +   -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without
>warning.  This is used on systems like VxWorks (kernel) where the link is
>always partial and repeated incremental linking is generally not used.
>-sym-style={none,win32,underscore|uscore}
> @@ -1321,7 +1321,9 @@ process_option (const char *option)
> {
>   if (strcmp (option, "-linker-output-known") == 0)
> linker_output_known = true;
> -  else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0)
> +  /* Also accept "notlo" for backwards compatibility.  */
> +  else if ((strcmp (option, "-linker-output-auto-nolto-rel") == 0)
> +   || (strcmp (option, "-linker-output-auto-notlo-rel") == 0))
> linker_output_auto_nolto_rel = true;
>   else if (strcmp (option, "-debug") == 0)
> debug = true;
> -- 
> 2.31.1
> 



[PATCH v2] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Rasmus Villemoes
The transposition nolto -> notlo is confusing and it makes the long
name even harder to read than it already is - I kept reading it as
"not lo" until I realized it was a simple typo.

Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)

lto-plugin/
* lto-plugin.c: Fix -linker-output-auto-notlo-rel ->
-linker-output-auto-nolto-rel typo.
(process_option): Adjust accordingly, accepting both old and
new spelling.

gcc/
* config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected
spelling of -linker-output-auto-nolto-rel.
---
v2: still accept -linker-output-auto-notlo-rel in the plugin.

Like this?

 gcc/config/vxworks.h| 2 +-
 lto-plugin/lto-plugin.c | 6 --
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h
index e41f16a51e8..bddf2c37f42 100644
--- a/gcc/config/vxworks.h
+++ b/gcc/config/vxworks.h
@@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx 
begin, rtx end);
further incremental LTO linking.  We do not do repeated incremental linking
so silence the warning (instead of passing -flinker-output=nolto-rel).  */
 #undef LTO_PLUGIN_SPEC
-#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}"
+#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}"
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 6ab9822f369..b73483de994 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -32,7 +32,7 @@ along with this program; see the file COPYING3.  If not see
-nop: Instead of running lto-wrapper, pass the original to the plugin. This
only works if the input files are hybrid. 
-linker-output-known: Do not determine linker output
-   -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without
+   -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without
warning.  This is used on systems like VxWorks (kernel) where the link is
always partial and repeated incremental linking is generally not used.
-sym-style={none,win32,underscore|uscore}
@@ -1321,7 +1321,9 @@ process_option (const char *option)
 {
   if (strcmp (option, "-linker-output-known") == 0)
 linker_output_known = true;
-  else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0)
+  /* Also accept "notlo" for backwards compatibility.  */
+  else if ((strcmp (option, "-linker-output-auto-nolto-rel") == 0)
+   || (strcmp (option, "-linker-output-auto-notlo-rel") == 0))
 linker_output_auto_nolto_rel = true;
   else if (strcmp (option, "-debug") == 0)
 debug = true;
-- 
2.31.1



Re: [PATCH] c++: Handle auto(x) in parameter-declaration-clause [PR103401]

2021-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/21 10:16, Marek Polacek wrote:

In C++23, auto(x) is valid, so decltype(auto(x)) should also be valid,
so

   void f(decltype(auto(0)));

should be just as

   void f(int);

but currently, everytime we see 'auto' in a parameter-declaration-clause,
we try to synthesize_implicit_template_parm for it, creating a new template
parameter list.  The code above actually has us calling s_i_t_p twice;
once from cp_parser_decltype_expr -> cp_parser_postfix_expression which
fails and then again from cp_parser_decltype_expr -> cp_parser_expression.
So it looks like we have f and we accept ill-formed code.

So we need to be more careful about synthesizing the implicit template
parameter.  cp_parser_postfix_expression looked like a sensible place.


Does this cover other uses of auto in decltype, such as

void f(decltype(new auto{0}));

?  Should we adjust this flag in cp_parser_decltype along with all the 
other flags?



The r11-1913 change is OK: we need to make sure that we see '(auto)' after
decltype to go ahead with 'decltype(auto)'.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/103401

gcc/cp/ChangeLog:

* parser.c (cp_parser_postfix_expression): Set
auto_is_implicit_function_template_parm_p when parsing a postfix
expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast7.C: New test.
* g++.dg/cpp23/auto-fncast8.C: New test.
---
  gcc/cp/parser.c   |  2 ++
  gcc/testsuite/g++.dg/cpp23/auto-fncast7.C |  9 
  gcc/testsuite/g++.dg/cpp23/auto-fncast8.C | 28 +++
  3 files changed, 39 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 55e6a1a8b3a..c43b180f888 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7508,6 +7508,8 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
   looking at a functional cast.  We could also be looking at
   an id-expression.  So, we try the functional cast, and if
   that doesn't work we fall back to the primary-expression.  */
+   auto cleanup = make_temp_override
+ (parser->auto_is_implicit_function_template_parm_p, false);
cp_parser_parse_tentatively (parser);
/* Look for the simple-type-specifier.  */
  ++parser->prevent_constrained_type_specifiers;
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
new file mode 100644
index 000..763164f3e5b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
@@ -0,0 +1,9 @@
+// PR c++/103401
+// { dg-do compile { target c++23 } }
+
+void f(decltype(auto(0))) { }
+
+int main()
+{
+  f(0); // { dg-error "no matching function" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C
new file mode 100644
index 000..760827a5d6e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C
@@ -0,0 +1,28 @@
+// PR c++/103401
+// { dg-do compile { target c++23 } }
+
+void f1 (decltype(auto(0)));
+void f2 (decltype(auto{0}));
+void f3 (int = auto(42));
+void f4 (int = auto{42});
+void f5 (decltype(auto(0)) = auto(42));
+void f6 (auto (x));
+void f7 (int[auto(10)]);
+void f8 (int[auto{10}]);
+
+void
+g ()
+{
+  f1 (1);
+  f2 (1);
+  f3 ();
+  f3 (1);
+  f4 ();
+  f4 (1);
+  f5 ();
+  f5 (1);
+  f6 ('a');
+  int a[10];
+  f7 (&a[0]);
+  f8 (&a[0]);
+}

base-commit: e5440bc08e07fd491dcccd47e1b86a5985ee117c





Re: [PATCH] c++: ICE with unnamed tparm and concept [PR103408]

2021-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/21 10:16, Marek Polacek wrote:

Here we crash when issuing the "constraint C has type T, not bool"
error, because pp_cxx_parameter_mapping wasn't prepared to see an
anonymous template parameter.  With this patch we print

error: constraint 'auto() [with  = 0]' has type '', 
not 'bool'

The "" is what this patch adds.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


PR c++/103408

gcc/cp/ChangeLog:

* cxx-pretty-print.c (pp_cxx_parameter_mapping): Print ""
rather than crash on an unnamed template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/concepts-err1.C: New test.
---
  gcc/cp/cxx-pretty-print.c  | 4 +++-
  gcc/testsuite/g++.dg/cpp23/concepts-err1.C | 6 ++
  2 files changed, 9 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/concepts-err1.C

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index 25cabfee39f..3ea357deb80 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -2891,8 +2891,10 @@ pp_cxx_parameter_mapping (cxx_pretty_printer *pp, tree 
map)
  
if (TYPE_P (parm))

pp->type_id (parm);
+  else if (tree name = DECL_NAME (TEMPLATE_PARM_DECL (parm)))
+   pp_cxx_tree_identifier (pp, name);
else
-   pp_cxx_tree_identifier (pp, DECL_NAME (TEMPLATE_PARM_DECL (parm)));
+   pp->translate_string ("");
  
pp_cxx_whitespace (pp);

pp_equal (pp);
diff --git a/gcc/testsuite/g++.dg/cpp23/concepts-err1.C 
b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C
new file mode 100644
index 000..e5bdc542bad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C
@@ -0,0 +1,6 @@
+// PR c++/103408
+// { dg-do compile { target c++23 } }
+
+template
+concept C = auto([]{}); // { dg-error "constraint" }
+static_assert(C<0>); // { dg-error "non-constant condition for static 
assertion" }

base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca





Re: [PATCH] c++: Fix for decltype(auto) and parenthesized expr [PR103403]

2021-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/21 10:16, Marek Polacek wrote:

In r11-4758, I tried to fix this problem:

   int &&i = 0;
   decltype(auto) j = i; // should behave like int &&j = i; error

wherein do_auto_deduction was getting confused with a REFERENCE_REF_P
and it didn't realize its operand was a name, not an expression, and
deduced the wrong type.

Unfortunately that fix broke this:

   int&& r = 1;
   decltype(auto) rr = (r);

where 'rr' should be 'int &' since '(r)' is an expression, not a name.  But
because I stripped the INDIRECT_REF with the r11-4758 change, we deduced
'rr's type as if decltype had gotten a name, resulting in 'int &&'.

I suspect I thought that the REF_PARENTHESIZED_P check when setting
'bool id' in do_auto_deduction would handle the (r) case, but that's not
the case; while the documentation for REF_PARENTHESIZED_P specifically says
it can be set in INDIRECT_REF, we don't actually do so.

This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that
do_auto_deduction can use it.

It also removes code in maybe_undo_parenthesized_ref that I think is
dead -- and we don't hit it while running dg.exp.  To adduce more data,
it also looks dead here:
https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html


Agreed, that code is dead since r9-1417.


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 11?

PR c++/103403

gcc/cp/ChangeLog:

* cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref
doesn't change its argument.
* pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees.  Also
REF_PARENTHESIZED_P for REFERENCE_REF_P.
* semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on
REFERENCE_REF_P trees too.
(maybe_undo_parenthesized_ref): Remove dead code.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto2.C: New test.
* g++.dg/cpp1y/decltype-auto3.C: New test.
---
  gcc/cp/cp-gimplify.c|  3 ++-
  gcc/cp/pt.c |  5 ++---
  gcc/cp/semantics.c  | 18 --
  gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C | 12 
  gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C | 12 
  5 files changed, 32 insertions(+), 18 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 0a002db14e7..e3ede02a48e 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2421,7 +2421,8 @@ cp_fold (tree x)
if (REF_PARENTHESIZED_P (x))
{
  tree p = maybe_undo_parenthesized_ref (x);
- return cp_fold (p);
+ if (p != x)
+   return cp_fold (p);
}
goto unary;
  
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c

index f4b9d9673fb..c5b41b57028 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29889,11 +29889,10 @@ do_auto_deduction (tree type, tree init, tree 
auto_node,
else if (AUTO_IS_DECLTYPE (auto_node))
  {
tree stripped_init = tree_strip_any_location_wrapper (init);
-  if (REFERENCE_REF_P (stripped_init))
-   stripped_init = TREE_OPERAND (stripped_init, 0);
bool id = (DECL_P (stripped_init)
 || ((TREE_CODE (init) == COMPONENT_REF
- || TREE_CODE (init) == SCOPE_REF)
+ || TREE_CODE (init) == SCOPE_REF
+ || REFERENCE_REF_P (init))
 && !REF_PARENTHESIZED_P (init)));


This change seems wrong; not all references are id-expressions or member 
accesses.  For instance, a call to a function returning a reference is 
not.  I think we still want to depend on the TREE_CODE, but we need to 
look at the TREE_CODE of stripped_init, not init.



tree deduced = finish_decltype_type (init, id, complain);
deduced = canonicalize_type_argument (deduced, complain);
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index cd1956497f8..edba4b60e10 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2049,7 +2049,8 @@ force_paren_expr (tree expr, bool even_uneval)
  return expr;
  
if (TREE_CODE (expr) == COMPONENT_REF

-  || TREE_CODE (expr) == SCOPE_REF)
+  || TREE_CODE (expr) == SCOPE_REF
+  || REFERENCE_REF_P (expr))
  REF_PARENTHESIZED_P (expr) = true;
else if (DECL_P (tree_strip_any_location_wrapper (expr)))
  {
@@ -2072,19 +2073,8 @@ maybe_undo_parenthesized_ref (tree t)
if (cxx_dialect < cxx14)
  return t;
  
-  if (INDIRECT_REF_P (t) && REF_PARENTHESIZED_P (t))

-{
-  t = TREE_OPERAND (t, 0);
-  while (TREE_CODE (t) == NON_LVALUE_EXPR
-|| TREE_CODE (t) == NOP_EXPR)
-   t = TREE_OPERAND (t, 0);
-
-  gcc_assert (TREE_CODE (t) == ADDR_EXPR
- || TREE_CODE (t) == STATIC_CAST_EXPR);
-  t = TREE_OPERAND (t, 0);
-}
-  else if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT

Re: [PATCH] c++: Fix bogus error with __integer_pack [PR94490]

2021-12-01 Thread Jason Merrill via Gcc-patches

On 12/1/21 12:22, Marek Polacek wrote:

Here we issue a bogus:

error: '(0 ? fake_tuple_size_v : fake_tuple_size_v)' is not a 
constant expression

because cxx_constant_value in expand_integer_pack gets

*(0 ? VIEW_CONVERT_EXPR(fake_tuple_size_v) : VIEW_CONVERT_EXPR(fake_tuple_size_v))

which is a REFERENCE_REF_P and we evaluate its operand to 3, so we end
up with *3 and that fails.  Sounds like we need to get rid of the
REFERENCE_REF_P then.  That is what tsubst_copy_and_build/INDIRECT_REF
will do:

 if (REFERENCE_REF_P (t))
   {
 /* A type conversion to reference type will be enclosed in
such an indirect ref, but the substitution of the cast
will have also added such an indirect ref.  */
 r = convert_from_reference (r);
   }

so I think it's reasonable to call instantiate_non_dependent_expr_sfinae.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?


OK.


PR c++/94490

gcc/cp/ChangeLog:

* pt.c (expand_integer_pack): Call
instantiate_non_dependent_expr_sfinae.

gcc/testsuite/ChangeLog:

* g++.dg/ext/integer-pack5.C: New test.
---
  gcc/cp/pt.c  |  1 +
  gcc/testsuite/g++.dg/ext/integer-pack5.C | 29 
  2 files changed, 30 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f4b9d9673fb..6b560952639 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3792,6 +3792,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t 
complain,
  }
else
  {
+  hi = instantiate_non_dependent_expr_sfinae (hi, complain);
hi = cxx_constant_value (hi);
int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1;
  
diff --git a/gcc/testsuite/g++.dg/ext/integer-pack5.C b/gcc/testsuite/g++.dg/ext/integer-pack5.C

new file mode 100644
index 000..84938649d31
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/integer-pack5.C
@@ -0,0 +1,29 @@
+// PR c++/94490
+// { dg-do compile { target c++14 } }
+
+template
+constexpr int fake_tuple_size_v = 3;
+template struct intseq {};
+
+// So that it compiles with clang++.
+#if __has_builtin(__make_integer_seq)
+using size_t = decltype(sizeof(1));
+template
+using _IdxTuple = intseq<_Indices...>;
+
+template using genseq = __make_integer_seq<_IdxTuple, size_t, N>;
+#else
+template using genseq = intseq<__integer_pack(N)...>;
+#endif
+
+template>
+struct arith_result
+{ };
+
+template
+auto Mul(const T&)
+{
+return [](auto) { return arith_result> { }; }(0);
+}
+
+auto x = Mul(0);

base-commit: 54ebec35abec09a24b47b997172622ca0d8e2318





Re: [PATCH] Modify combine pattern by a pseudo AND with its nonzero bits [PR93453]

2021-12-01 Thread Kewen.Lin via Gcc-patches
Hi Haochen,

on 2021/12/1 下午5:01, HAO CHEN GUI via Gcc-patches wrote:
> Hi,
>   This patch modifies the combine pattern with a helper - 
> change_pseudo_and_mask when recog fails.
> The helper converts a single pseudo to the pseudo AND with a mask if the 
> outer operator is IOR/XOR/PLUS
> and the inner operator is ASHIFT/LSHIFTRT/AND. The conversion helps match 
> shift + ior pattern.
> 

Thanks for improving this!

I would expect this patch can make us remove the below existing splitting,
which was added for a similar purpose but only works for more insns
combination.

(define_split
  [(set (match_operand:GPR 0 "gpc_reg_operand")
(plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
  (match_operand:SI 2 "const_int_operand"))
  (match_operand:GPR 3 "gpc_reg_operand")))]
  "nonzero_bits (operands[3], mode)
   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
  [(set (match_dup 0)
(ior:GPR (and:GPR (match_dup 3)
  (match_dup 4))
 (ashift:GPR (match_dup 1)
 (match_dup 2]
{
  operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
})


As the description, this seems to check some special operations and mainly
for those targets having Power style rotate* operations?  If the understanding
is correct, not sure it's a good idea to make this under a target hook/macro?


>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk?
> Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2021-12-01 Haochen Gui 
> 
> gcc/
>   * combine.c (change_pseudo_and_mask): New.
>   (recog_for_combine): If recog fails, try again with the pattern
>   modified by change_pseudo_and_mask.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/20050603-3.c: Modify dump check conditions.
>   * gcc.target/powerpc/rlwimi-2.c: Likewise.
> 
> 
> patch.diff
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 03e9a780919..82ee3b2e9db 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -11539,6 +11539,37 @@ change_zero_ext (rtx pat)
>return changed;
>  }
> 
> +/* When the outer code of set_src is IOR/XOR/PLUS and the inner code is
> +   ASHIFT/LSHIFTRT/AND, convert a pseudo to psuedo AND with a mask if its
> +   nonzero_bits is less than its mode mask. The nonzero_bits in other pass
> +   doesn't return the same value as it does in combine pass.  */
> +static bool
> +change_pseudo_and_mask (rtx pat)
> +{
> +  bool changed = false;

Maybe we can remove this variable ...

> +
> +  rtx src = SET_SRC (pat);
> +  if ((GET_CODE (src) == IOR
> +   || GET_CODE (src) == XOR
> +   || GET_CODE (src) == PLUS)
> +  && (((GET_CODE (XEXP (src, 0)) == ASHIFT
> + || GET_CODE (XEXP (src, 0)) == LSHIFTRT
> + || GET_CODE (XEXP (src, 0)) == AND)
> +&& REG_P (XEXP (src, 1)
> +{
> +  rtx *reg = &XEXP (SET_SRC (pat), 1);
> +  machine_mode mode = GET_MODE (*reg);
> +  unsigned HOST_WIDE_INT nonzero = nonzero_bits (*reg, mode);
> +  if (nonzero < GET_MODE_MASK (mode))
> + {
> +   rtx x = gen_rtx_AND (mode, *reg, GEN_INT (nonzero));
> +   SUBST (*reg, x);
> +   changed = true;

..., directly return true here ...

> + }
> + }
> +  return changed;

and return false here.

> +}
> +
>  /* Like recog, but we receive the address of a pointer to a new pattern.
> We try to match the rtx that the pointer points to.
> If that fails, we may try to modify or replace the pattern,
> @@ -11586,7 +11617,14 @@ recog_for_combine (rtx *pnewpat, rtx_insn *insn, rtx 
> *pnotes)
>   }
>   }
>else
> - changed = change_zero_ext (pat);
> + {
> +   if (change_pseudo_and_mask (pat))
> + {
> +   maybe_swap_commutative_operands (SET_SRC (pat));

maybe we can call maybe_swap_commutative_operands directly
in function change_pseudo_and_mask and make the code here compact.

BR,
Kewen

> +   changed = true;
> + }
> +   changed |= change_zero_ext (pat);
> + }
>  }
>else if (GET_CODE (pat) == PARALLEL)
>  {
> diff --git a/gcc/testsuite/gcc.target/powerpc/20050603-3.c 
> b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> index 4017d34f429..e628be11532 100644
> --- a/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/20050603-3.c
> @@ -12,7 +12,7 @@ void rotins (unsigned int x)
>b.y = (x<<12) | (x>>20);
>  }
> 
> -/* { dg-final { scan-assembler-not {\mrlwinm} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm} { target ilp32 } } } */
>  /* { dg-final { scan-assembler-not {\mrldic} } } */
>  /* { dg-final { scan-assembler-not {\mrot[lr]} } } */
>  /* { dg-final { scan-assembler-not {\ms[lr][wd]} } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c 
> b/gcc/testsuite/gcc.target/powerpc/rlwimi-2.c
> index bafa371db73..ffb5f9e450f 100644
> --- a/

[PR103149] introduce asmnesia internal function

2021-12-01 Thread Alexandre Oliva via Gcc-patches


This is now used by -fharden-conditional-branches and
-fharden-compares to detach the values used in redundant hardening
compares from their original locations.  It eventually still expands
to an asm statement, as before, but the temporary is only introduced
at expand time, preventing gimple optimizations from diverging input
and output, which may prevent their unification in asm match operands.

Additional logic to check asm operands matches, and to keep them
matchable, which may improve some cases of "+m" and "+X" in asm that
are currently rejected because the single operand, split into input
and output, ends up diverging in ways that can't be merged back by
reload.

While investigating regressions hit by an earlier iteration of this
patch, I came across pr93027.c, and noticed that, with optimization
(unlike the test), reload overwrites one of the inputs with another.
This preexisted this patch, and I have not attempted to fix it.


The changes above paved the way to fix PR103149: a vectorized loop
ends up using vector booleans on aarch64, and the hardening pass
attempts to "copy-and-forget" them with an asm statement (thus
ASMNESIA) for the redundant hardening compare.

The problem there is that vector booleans, on the affected platform,
can't go in GENERAL_REGS, but the asm statement we used to emit could
only use those registers or memory.

The work-around introduced in the newly-introduced ASMNESIA expander
is to check whether the mode is suitable for GENERAL_REGS, and force
the asm operands to use MEM otherwise.  This is not ideal, but
functionality to map a mode to a constraint string that selects a
suitable register class doesn't seem readily available.  Maybe in a
future patch...

ASMNESIA currently works only with gimple registers of types with
scalar modes.  I have not attempted to make it more general than that,
though I envision future uses that might end up requiring it.  I'll
cross that bridge if/when I get to it.


Regstrapped on x86_64-linux-gnu.  Also bootstrapped along with a patch
that enables both compare hardening passes.  Built crosses to aarch64-
and riscv64-, and verified that the mentioned aarch64 PR is fixed.  Ok
to install?


(Manual inspection of the asm output for riscv PR103302 suggests that
the reported problem may be gone with the given testcase, but I have not
built enough of the target environment to enable me to run that.)



for  gcc/ChangeLog

PR middle-end/103149
* cfgexpand.c (expand_asm_stmt): Pass original input
constraint, and constraint array, to asm_operand_ok.
* gimple-hander-conditionals.cc (detach_value): Call
ASMNESIA internal function.
* internal-fn.c (expand_ASMNESIA): New.
* internal-fn.def (ASMNESIA): New.
* recog.c (asm_operand_ok): Turn into wrapper of...
(asm_operand_ok_match): ... renamed.  Skip non-constraint
characters faster.  Don't require register_operand for matched
operands.  Combine results with...
(check_match): New.
(check_asm_operands): Enable match checking.

for  gcc/testsuite/ChangeLog

PR middle-end/103149
* gcc.target/aarch64/pr103149.c: New.
* gcc.target/i386/pr85030.c: Also accept impossible constraint
error.
---
 gcc/cfgexpand.c |6 +
 gcc/gimple-harden-conditionals.cc   |   17 
 gcc/internal-fn.c   |   81 +++
 gcc/internal-fn.def |4 +
 gcc/recog.c |  114 +++
 gcc/testsuite/gcc.target/aarch64/pr103149.c |   13 +++
 gcc/testsuite/gcc.target/i386/pr85030.c |2 
 7 files changed, 200 insertions(+), 37 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr103149.c

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index fb84d469f1e77..f0272ba1ac9ba 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -3374,10 +3374,10 @@ expand_asm_stmt (gasm *stmt)
   tree val = input_tvec[i];
   tree type = TREE_TYPE (val);
   bool allows_reg, allows_mem, ok;
-  const char *constraint;
+  const char *constraint, *orig_constraint;
   rtx op;
 
-  constraint = constraints[i + noutputs];
+  orig_constraint = constraint = constraints[i + noutputs];
   ok = parse_input_constraint (&constraint, i, ninputs, noutputs, 0,
   constraints.address (),
   &allows_mem, &allows_reg);
@@ -3397,7 +3397,7 @@ expand_asm_stmt (gasm *stmt)
   else if (MEM_P (op))
op = validize_mem (op);
 
-  if (asm_operand_ok (op, constraint, NULL) <= 0)
+  if (asm_operand_ok (op, orig_constraint, constraints.address ()) <= 0)
{
  if (allows_reg && TYPE_MODE (type) != BLKmode)
op = force_reg (TYPE_MODE (type), op);
diff --git a/gcc/gimple-harden-conditionals.cc 
b/gcc/gimple-harden-conditionals.cc
index cfa2361d65be0..3768

Re: [PATCH] i386: Improve V8HI and V8HF inserts [PR102811]

2021-12-01 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 2, 2021 at 6:07 AM Uros Bizjak  wrote:
>
> Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar
> element 0 inserts to from a GP register, SSE register or memory.  Also
> add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is
> split after reload to a sequence of PBROADCASTW and PBLENDW.
>
> The V8HF inserts from memory improve from:
>
> -   vpbroadcastw4(%esp), %xmm1
> -   vpblendw$16, %xmm1, %xmm0, %xmm0
> +   vpinsrw $4, 4(%esp), %xmm0, %xmm0
>
> and V8HF inserts from SSE register to element 0 improve from:
>
> vpxor   %xmm2, %xmm2, %xmm2
> -   vpbroadcastw%xmm0, %xmm0
> vpblendw$1, %xmm0, %xmm2, %xmm0
>
> Based on the above improvements, the register allocator is able to determine
> the optimal instruction (or instruction sequence) based on the register set
> of the input value, so there is no need to manually expand V8HI and V8HF
> inserts to the sequence of VEC_DUPLICATE and VEC_MERGE RTXes.
>
Thanks.
> 2021-12-01  Uroš Bizjak  
>
> gcc/ChangeLog:
>
> PR target/102811
> * config/i386/sse.md (VI2F): Remove mode iterator.
> (VI2F_256_512): New mode iterator.
> (vec_set_0): New insn pattern.
> (vec_set_0>): Rename from vec_setmode.
> Use VI2F_256_512 mode iterator instead of VI2F.
> (*axv512fp16_movsh): Remove.
> (_pinsr): Add (x,x,x) AVX2 alternative.
> Do not disable V8HF mode insn on AVX2 targets.
> (pinsrw -> pbroadcast + pblendw peephole2): New peephole.
> (pinsrw -> pbroadcast + pblendw splitter): New post-reload splitter.
> * config/i386/i386.md (extendhfsf): Call gen_vec_setv8hf_0.
> * config/i386/i386-expand.c (ix86_expand_vector_set)
> : Use vec_merge path for TARGET_AVX2.
>
> gcc/testsuite/ChangeLog:
>
> PR target/102881
> * gcc.target/i386/pr102811-1.c: New test.
> * gcc.target/i386/avx512fp16-1c.c (dg-final): Update
> scan-assembler-times scan strings for ia32 targets.
> * gcc.target/i386/pr102327-1.c (dg-final): Ditto.
> * gcc.target/i386/pr102811.c: Rename from ...
> * gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: ... this.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and
> w/o -mf16c.
>
> Pushed to master.
>
> Uros.



-- 
BR,
Hongtao


[committed] libphobos: Add missing ControlState variable for AArch64

2021-12-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes a typo that occurred during the splitting of the
std.math module into a package.

Bootstrapped and regression tested on aarch64-linux-gnu, and committed
to mainline.

Regards,
Iain.

---
libphobos/ChangeLog:

* src/std/math/hardware.d (FloatingPointControl.getControlState): Add
missing ControlState variable for AArch64.
---
 libphobos/src/std/math/hardware.d | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libphobos/src/std/math/hardware.d 
b/libphobos/src/std/math/hardware.d
index 90bc96df148..b768969d4cf 100644
--- a/libphobos/src/std/math/hardware.d
+++ b/libphobos/src/std/math/hardware.d
@@ -912,6 +912,7 @@ private:
 }
 else version (AArch64)
 {
+ControlState cont;
 asm pure nothrow @nogc
 {
 "mrs %0, FPCR;" : "=r" (cont);
-- 
2.30.2



[committed] d: Disable the D runtime garbage collector after initializing (PR103520)

2021-12-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch disables the D runtime garbage collector after initializing.

Not all targets that support building libdruntime have a stable garbage
collector, so to avoid running into problems where live memory allocated
by the D GC is freed, disable all in-flight collections until a time
when scanning is more reliably implemented everywhere.

Bootstrapped and regression tested on aarch64-linux-gnu, and committed
to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/103520
* d-frontend.h (gc_disable): Declare.
* d-lang.cc (d_init_options): Disable the D runtime garbage collector
after initializing.
---
 gcc/d/d-frontend.h | 2 +-
 gcc/d/d-lang.cc| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/d/d-frontend.h b/gcc/d/d-frontend.h
index 3edf812212a..e7695d5718b 100644
--- a/gcc/d/d-frontend.h
+++ b/gcc/d/d-frontend.h
@@ -21,7 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 /* These functions are defined in D runtime.  */
 extern "C" int rt_init (void);
 extern "C" int rt_term (void);
-//extern "C" void gc_disable (void);
+extern "C" void gc_disable (void);
 extern "C" void *gc_malloc (size_t sz, unsigned ba = 0, const void *ti = NULL);
 extern "C" void gc_free (void *);
 extern "C" void gc_collect (void);
diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index d20370e5d8a..dbf7a8b60b9 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -288,7 +288,7 @@ d_init_options (unsigned int, cl_decoded_option 
*decoded_options)
 {
   /* Initialize the D runtime.  */
   rt_init ();
-//  gc_disable ();
+  gc_disable ();
 
   /* Set default values.  */
   global._init ();
-- 
2.30.2



[committed] d: Prefix object files from the root package with 'root-'

2021-12-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch prefixes D object files from the root package with 'root-'.

None of the front-end module names in either the dmd or root package
collide just yet, but that does not mean they won't in the future.

Bootstrapped and regression tested on x86_64-linux-gnu, and committed to
mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

* Make-lang.in (D_FRONTEND_OBJS): Prefix object files from the root
package with root-.
(d/root-%.o): New recipe.
---
 gcc/d/Make-lang.in | 34 +-
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in
index 4c0a0321eba..4ce11e3cada 100644
--- a/gcc/d/Make-lang.in
+++ b/gcc/d/Make-lang.in
@@ -74,19 +74,16 @@ endif
 
 # D Frontend object files.
 D_FRONTEND_OBJS = \
-   d/aav.o \
d/access.o \
d/aggregate.o \
d/aliasthis.o \
d/apply.o \
-   d/array.o \
d/arrayop.o \
d/arraytypes.o \
d/attrib.o \
d/ast_node.o \
d/astcodegen.o \
d/astenums.o \
-   d/bitarray.o \
d/blockexit.o \
d/builtin.o \
d/canthrow.o \
@@ -99,7 +96,6 @@ D_FRONTEND_OBJS = \
d/cparse.o \
d/cppmangle.o \
d/ctfeexpr.o \
-   d/ctfloat.o \
d/ctorflow.o \
d/dcast.o \
d/dclass.o \
@@ -124,13 +120,10 @@ D_FRONTEND_OBJS = \
d/escape.o \
d/expression.o \
d/expressionsem.o \
-   d/file.o \
-   d/filename.o \
d/foreachvar.o \
d/func.o \
d/globals.o \
d/gluelayer.o \
-   d/hash.o \
d/hdrgen.o \
d/iasm.o \
d/iasmgcc.o \
@@ -145,7 +138,6 @@ D_FRONTEND_OBJS = \
d/json.o \
d/lambdacomp.o \
d/lexer.o \
-   d/longdouble.o \
d/mtype.o \
d/nogc.o \
d/nspace.o \
@@ -153,29 +145,37 @@ D_FRONTEND_OBJS = \
d/objc.o \
d/opover.o \
d/optimize.o \
-   d/outbuffer.o \
d/parse.o \
d/parsetimevisitor.o \
d/permissivevisitor.o \
-   d/port.o \
d/printast.o \
-   d/region.o \
-   d/rmem.o \
-   d/rootobject.o \
+   d/root-aav.o \
+   d/root-array.o \
+   d/root-bitarray.o \
+   d/root-ctfloat.o \
+   d/root-file.o \
+   d/root-filename.o \
+   d/root-hash.o \
+   d/root-longdouble.o \
+   d/root-outbuffer.o \
+   d/root-port.o \
+   d/root-region.o \
+   d/root-rmem.o \
+   d/root-rootobject.o \
+   d/root-speller.o \
+   d/root-string.o \
+   d/root-stringtable.o \
d/safe.o \
d/sapply.o \
d/semantic2.o \
d/semantic3.o \
d/sideeffect.o \
-   d/speller.o \
d/statement.o \
d/statement_rewrite_walker.o \
d/statementsem.o \
d/staticassert.o \
d/staticcond.o \
d/stmtstate.o \
-   d/string.o \
-   d/stringtable.o \
d/target.o \
d/templateparamsem.o \
d/tokens.o \
@@ -393,6 +393,6 @@ d/%.o: d/dmd/%.d
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)
 
-d/%.o: d/dmd/root/%.d
+d/root-%.o: d/dmd/root/%.d
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)
-- 
2.30.2



[PATCH v2] c++: ICE with auto(0) in requires-expression [PR103408]

2021-12-01 Thread Marek Polacek via Gcc-patches
On Wed, Dec 01, 2021 at 12:17:47PM -0500, Patrick Palka wrote:
> On Wed, Dec 1, 2021 at 10:26 AM Marek Polacek via Gcc-patches
>  wrote:
> >
> > Here we ICE on
> >
> >   int f() requires (auto(0));
> >
> > in do_auto_deduction when handling the auto: we're in a non-templated
> > requires-expression which are parsed under processing_template_decl == 1
> > and empty current_template_parms, so 'current_template_args ()' will
> > crash.  This code is invalid as per "C++20 CA378: Remove non-templated
> > constrained functions", but of course we shouldn't crash.
> 
> FWIW it looks like we can trip over the same bug with valid code:
> 
>   static_assert(requires { auto(0); });

Right, I hadn't thought of this at all.  So it's not just an ICE-on-invalid!

> > Since in the scenario above it's expected that current_template_parms is
> > null, I've just added a check, and let grokfndecl issue an error.
> > I guess another approch would be to fake up a template parameter list
> > before calling do_auto_deduction.
> >
> > For good measure, I've added several well-formed cases with auto(x) in
> > a requires-expression.
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
> > PR c++/103408
> >
> > gcc/cp/ChangeLog:
> >
> > * pt.c (do_auto_deduction): Check current_template_parms before
> > current_template_args ().
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/cpp23/auto-fncast9.C: New test.
> > ---
> >  gcc/cp/pt.c   |  2 +-
> >  gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++
> >  2 files changed, 28 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
> >
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index f4b9d9673fb..012ca5d06c0 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree 
> > auto_node,
> > (but we still may have used them for constraint checking above).  
> > */;
> >else if (context == adc_unify)
> >  targs = add_to_template_args (outer_targs, targs);
> > -  else if (processing_template_decl)
> > +  else if (processing_template_decl && current_template_parms)
> >  targs = add_to_template_args (current_template_args (), targs);
> >return tsubst (type, targs, complain, NULL_TREE);
> 
> Won't this mean the call to tsubst here will end up lowering the level
> of the auto from 2 to 1 rather than replacing it with the actual
> deduced type?

Yes, exactly -- targs has depth 1 but the level of auto is 2.  :(
 
> It also looks like this approach doesn't handle static_assert(requires
> { auto(auto(0)); }), probably due to this substitution issue.  I guess
> we could add a dummy level to 'targs' to work around this..

Thanks for pointing this out.  Here's a v2 which uses a dummy level.  The
description hopefully explains what's going on here...

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Here we ICE on

  int f() requires (auto(0));

in do_auto_deduction when handling the auto: we're in a non-templated
requires-expression which are parsed under processing_template_decl == 1
and empty current_template_parms, so 'current_template_args ()' will
crash.  This code is invalid as per "C++20 CA378: Remove non-templated
constrained functions", but of course we shouldn't crash.  However, as
Patrick pointed out, this is valid:

static_assert(requires { auto(0); }); // #1
static_assert(requires { auto(auto(0)); }); // #2

To make #1 and #2 work, I'm playing games with faking up a dummy template
parameter level to current_template_parms, which need elaborating a bit.
autos are created with the level processing_template_decl + 1.  When we
call do_auto_deduction for the inner auto(0) in #2, the auto will therefore
have level 2.  type_unification_real in do_auto_deduction deduces targs to
 (TMPL_ARGS_DEPTH = 1), so the

  return tsubst (type, targs, complain, NULL_TREE);

will only reduce the level of the auto to 1.  So it remains undeduced,
causing the "invalid use of auto" error.  If I create the dummy template
parameter level, then this at the end of do_auto_deduction

  else if (processing_template_decl)
targs = add_to_template_args (current_template_args (), targs);

transforms  into <, int>, which has TMPL_ARGS_DEPTH = 2 and so the
following tsubst actually replaces the auto with int, and things work.

For good measure, I've added several well-formed cases with auto(x) in
a requires-expression.

PR c++/103408

gcc/cp/ChangeLog:

* cp-tree.h (add_dummy_template_parameter_level): Declare.
* pt.c (add_dummy_template_parameter_level): New function.
(begin_template_parm_list): Use it.
* semantics.c (finish_compound_literal): Call
add_dummy_template_parameter_level if there are no template parameters.
* typeck2.c (build_functional_cast_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-

[RFC]PR103271 ICE in assign_stack_temp_for_type with -ftrivial-auto-var-init=pattern and VLAs and -mno-strict-align on riscv64

2021-12-01 Thread Qing Zhao via Gcc-patches
Hi, 

As I studied, the issue with this bug is similar as PR102285:

For the testing case:

int
qy (void)
{
  int tw = 4;
  int fb[tw];
  return fb[2]; /* { dg-warning "uninitialized" } */
}

 if compiled with 
/home/opc/Install/cross/bin/riscv64-unknown-linux-gnu-gcc -c -O1  
-mno-strict-align -ftrivial-auto-var-init=pattern -fdump-tree-all t.c

**t.c.032t.cpp1:
int qy ()
{
  unsigned char fb.3[16];
  int tw;
  int[0:D.1494] * fb.1;
  sizetype D.1494;
  void * saved_stack.2_16;
  int _27;

   :
  saved_stack.2_16 = __builtin_stack_save ();
  fb.1_25 = &fb.3;
  (*fb.1_25) = .DEFERRED_INIT (16, 1, 1);
  _27 = (*fb.1_25)[2];
  fb.3 ={v} {CLOBBER};
  __builtin_stack_restore (saved_stack.2_16);
  return _27;

}

**t.c.033t.forwprop1:
int qy ()
{
  unsigned char fb.3[16];
  int tw;
  int[0:D.1494] * fb.1;
  sizetype D.1494;
  void * saved_stack.2_16;
  int _27;

   :
  saved_stack.2_16 = __builtin_stack_save ();
  MEM[(int[0:D.1494] *)&fb.3] = .DEFERRED_INIT (16, 1, 1);
  _27 = MEM[(int[0:D.1494] *)&fb.3][2];
  fb.3 ={v} {CLOBBER};
  __builtin_stack_restore (saved_stack.2_16);
  return _27;

}

the problem with the above IR for .DEFERRED_INIT expansion is:

for MEM[(int[0:D.1494] *)&fb.3] = .DEFERRED_INIT (16, 1, 1)

after the “forwprop1” phase, the original address that point to VLA fb.1 is 
replaced with the the new address that point to fixed array fb.3, however, the 
TREE_TYPE of the MEM still is kept the old VLA type int[0:D.1984]. Such 
inconsistency causes the ICE during expanding of call to the above 
.DEFERRED_INIT.

If during the forwprop, when replacing “fb.1” with “&fb.3”, the TREE_TYPE of 
the MEM_REF can be updated accordingly, we should fix this inconsistency of the 
IR, and therefore fix the issue in DEFERRED_INIT expansion. 

So, I think that the fix should be in “forwprop”, and update the TREE_TYPE of 
the MEM_REF accordingly.

What do you think on this?

thanks.

Qing




[committed] analyzer: fix false leak seen in Juliet 1.3 [PR102471]

2021-12-01 Thread David Malcolm via Gcc-patches
Juliet 1.3's CWE415_Double_Free__malloc_free_*_67a.c
were showing leak false positives in non-LTO builds; fixed thusly.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-5701-g860c56b5bc356960a4d0445dadc43ceddbe3c7e2.

gcc/analyzer/ChangeLog:
PR analyzer/102471
* region-model-reachability.cc (reachable_regions::handle_parm):
Treat all svalues within a compound parm has reachable, and those
wrapped in a cast.

gcc/testsuite/ChangeLog:
PR analyzer/102471
* gcc.dg/analyzer/leak-3.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model-reachability.cc | 13 +++
 gcc/testsuite/gcc.dg/analyzer/leak-3.c| 41 +++
 2 files changed, 54 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/leak-3.c

diff --git a/gcc/analyzer/region-model-reachability.cc 
b/gcc/analyzer/region-model-reachability.cc
index b5ae787cac9..f82f7e6556d 100644
--- a/gcc/analyzer/region-model-reachability.cc
+++ b/gcc/analyzer/region-model-reachability.cc
@@ -258,6 +258,19 @@ reachable_regions::handle_parm (const svalue *sval, tree 
param_type)
   const region *pointee_reg = parm_ptr->get_pointee ();
   add (pointee_reg, is_mutable);
 }
+  /* Treat all svalues within a compound_svalue as reachable.  */
+  if (const compound_svalue *compound_sval
+  = sval->dyn_cast_compound_svalue ())
+{
+  for (compound_svalue::iterator_t iter = compound_sval->begin ();
+  iter != compound_sval->end (); ++iter)
+   {
+ const svalue *iter_sval = (*iter).second;
+ handle_sval (iter_sval);
+   }
+}
+  if (const svalue *cast = sval->maybe_undo_cast ())
+handle_sval (cast);
 }
 
 /* Update the store to mark the clusters that were found to be mutable
diff --git a/gcc/testsuite/gcc.dg/analyzer/leak-3.c 
b/gcc/testsuite/gcc.dg/analyzer/leak-3.c
new file mode 100644
index 000..d11cc03b36f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/leak-3.c
@@ -0,0 +1,41 @@
+#include 
+
+/* Reduced from Juliet 1.3's CWE415_Double_Free__malloc_free_char_67a.c
+   goodG2B which was showing a false leak report in a non-LTO build.  */
+
+struct s1
+{
+char * structFirst;
+};
+void external_fn_1(struct s1 myStruct);
+void test_1()
+{
+char * data;
+struct s1 myStruct;
+data = (char *)malloc(100*sizeof(char));
+if (data == NULL)
+  exit(-1);
+myStruct.structFirst = data;
+external_fn_1(myStruct);
+} /* { dg-bogus "leak of 'data'" } */
+
+/* As above, but with padding before the field.  */
+
+struct s2
+{
+  void *padding;
+  char *ptr;
+};
+void external_fn_2(struct s2 myStruct);
+void test_2()
+{
+char * data;
+struct s2 myStruct;
+data = (char *)malloc(100*sizeof(char));
+if (data == NULL)
+  exit(-1);
+myStruct.padding = NULL;
+myStruct.ptr = data;
+external_fn_2(myStruct);
+} /* { dg-bogus "leak of 'data'" } */
+
-- 
2.26.3



Re: [PATCH] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 10:36:52AM -0600, Bill Schmidt wrote:
> Hi!  This is the last patch broken out of the previous test suite patch
> for the new builtins support.

Whew :-)

> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
> 
> All such error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
> 
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.  In
> particular, some messages are improved because previously they did not
> admit the possibility that an argument could hold a variable.

Same comment as on the previous patch.  But, okay for trunk.  Thanks!


Segher


Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/1/21 4:29 PM, Segher Boessenkool wrote:
> On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote:
>> Hi!  This patch is broken out from the test case patch for the new
>> builtins support.
>>
>> One advantage of the new builtins support is uniform error messages for
>> arguments with restricted values.  Previously this was done in many places
>> in an ad hoc manner, with little uniformity.  This patch adjusts the
>> expected error messages accordingly.
>>
>> All error messages are now one of the following:
>>   "argument %d must be a %d-bit unsigned literal"
>>   "argument %d must be a literal between %d and %d, inclusive"
>>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>>   "argument %d must be either a literal %d or a literal %d"
>>
>> These messages were chosen to require the fewest changes from previous
>> messages while still introducing uniformity.  This patch adjusts error
>> messages for some cases where this produces changed messages.
>>
>> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
>> no regressions.  is this okay for trunk?
> We should have opnly the middle two of those messages.  But, okay for
> trunk if you put this on some to-do list.  Thanks!

The last one is actually needed also, because we have at least one case 
where the two supported values aren't contiguous.  We can do without the
first one, but that will affect quite a number of test cases, so agree
that this should be done later.  (We already had a whole lot of tests
of this form.)

Thanks for the review!
Bill

>
> Segher


Re: [PATCH] rs6000: Builtins test changes for pragma_misc9.c

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 10:18:06AM -0600, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test suite patch for the new
> builtins support.  This one is just a minor adjustment for the error
> message wording.
> 
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
> with no regressions.  Is this okay for trunk?

Okay.  Thanks!


Segher


Re: [PATCH] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 10:15:21AM -0600, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test case patch for the new
> builtins support.
> 
> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
> 
> All error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
> 
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.
> 
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> no regressions.  is this okay for trunk?

We should have opnly the middle two of those messages.  But, okay for
trunk if you put this on some to-do list.  Thanks!


Segher


Re: [PATCH] rs6000: Builtins test changes for compare-bytes tests

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 07:47:38AM -0600, Bill Schmidt wrote:
> Hi!  This patch is broken out from the patch with test suite changes for the
> new builtins support.
> 
> With the old builtins support, cmpb-2.c produces:
>   warning: implicit declaration of function '__builtin_cmpb; did you mean 
> '__builtin_bcmp'?
> 
> With the new support, it produces:
>   error: '__builtin_p6_cmpb requires the '-mcpu=power6' option and either the 
> '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb'

I am still not happy with this at all, it is clearly worse than what we
had.  But, okay for trunk, and hopefully we can fix it before GCC 12
release.  Thanks!


Segher


[PATCH] i386: Improve V8HI and V8HF inserts [PR102811]

2021-12-01 Thread Uros Bizjak via Gcc-patches
Introduce vec_set_0 pattern for V8HI and V8HF modes to implement scalar
element 0 inserts to from a GP register, SSE register or memory.  Also
add V8HI and V8HF AVX2 (x,x,x) alternative to PINSR insn pattern, which is
split after reload to a sequence of PBROADCASTW and PBLENDW.

The V8HF inserts from memory improve from:

-   vpbroadcastw4(%esp), %xmm1
-   vpblendw$16, %xmm1, %xmm0, %xmm0
+   vpinsrw $4, 4(%esp), %xmm0, %xmm0

and V8HF inserts from SSE register to element 0 improve from:

vpxor   %xmm2, %xmm2, %xmm2
-   vpbroadcastw%xmm0, %xmm0
vpblendw$1, %xmm0, %xmm2, %xmm0

Based on the above improvements, the register allocator is able to determine
the optimal instruction (or instruction sequence) based on the register set
of the input value, so there is no need to manually expand V8HI and V8HF
inserts to the sequence of VEC_DUPLICATE and VEC_MERGE RTXes.

2021-12-01  Uroš Bizjak  

gcc/ChangeLog:

PR target/102811
* config/i386/sse.md (VI2F): Remove mode iterator.
(VI2F_256_512): New mode iterator.
(vec_set_0): New insn pattern.
(vec_set_0>): Rename from vec_setmode.
Use VI2F_256_512 mode iterator instead of VI2F.
(*axv512fp16_movsh): Remove.
(_pinsr): Add (x,x,x) AVX2 alternative.
Do not disable V8HF mode insn on AVX2 targets.
(pinsrw -> pbroadcast + pblendw peephole2): New peephole.
(pinsrw -> pbroadcast + pblendw splitter): New post-reload splitter.
* config/i386/i386.md (extendhfsf): Call gen_vec_setv8hf_0.
* config/i386/i386-expand.c (ix86_expand_vector_set)
: Use vec_merge path for TARGET_AVX2.

gcc/testsuite/ChangeLog:

PR target/102881
* gcc.target/i386/pr102811-1.c: New test.
* gcc.target/i386/avx512fp16-1c.c (dg-final): Update
scan-assembler-times scan strings for ia32 targets.
* gcc.target/i386/pr102327-1.c (dg-final): Ditto.
* gcc.target/i386/pr102811.c: Rename from ...
* gcc.target/i386/avx512vl-vcvtps2ph-pr102811.c: ... this.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} w/ and
w/o -mf16c.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 354a9a713b7..cba2880dc8d 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -16204,18 +16204,8 @@ ix86_expand_vector_set (bool mmx_ok, rtx target, rtx 
val, int elt)
}
   return;
 
-case E_V8HFmode:
-  if (TARGET_AVX2)
-   {
- mmode = SImode;
- gen_blendm = gen_sse4_1_pblendph;
- blendm_const = true;
-   }
-  else
-   use_vec_merge = true;
-  break;
-
 case E_V8HImode:
+case E_V8HFmode:
 case E_V2HImode:
   use_vec_merge = TARGET_SSE2;
   break;
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 4e9fae80479..9d7d1161d15 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4656,15 +4656,7 @@
   rtx tmp = gen_reg_rtx (V8HFmode);
   rtx zero = force_reg (V8HFmode, CONST0_RTX (V8HFmode));
 
-  if (TARGET_AVX2)
-   {
- rtx dup = gen_reg_rtx (V8HFmode);
- emit_move_insn (dup, gen_rtx_VEC_DUPLICATE (V8HFmode, operands[1]));
- emit_move_insn (tmp, gen_rtx_VEC_MERGE (V8HFmode, dup,
- zero, const1_rtx));
-   }
-  else
-   emit_insn (gen_sse2_pinsrph (tmp, zero, operands[1], const1_rtx));
+  emit_insn (gen_vec_setv8hf_0 (tmp, zero, operands[1]));
   emit_insn (gen_vcvtph2ps (res, gen_lowpart (V8HImode, tmp)));
   emit_move_insn (operands[0], gen_lowpart (SFmode, res));
   DONE;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 08bdcddc111..f8b34a15cc6 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -827,7 +827,7 @@
   (V32HF "TARGET_AVX512BW")])
 
 ;; Int-float size matches
-(define_mode_iterator VI2F [V8HI V16HI V32HI V8HF V16HF V32HF])
+(define_mode_iterator VI2F_256_512 [V16HI V32HI V16HF V32HF])
 (define_mode_iterator VI4F_128 [V4SI V4SF])
 (define_mode_iterator VI8F_128 [V2DI V2DF])
 (define_mode_iterator VI4F_256 [V8SI V8SF])
@@ -10170,13 +10170,84 @@
   ]
   (symbol_ref "true")))])
 
+(define_insn "vec_set_0"
+  [(set (match_operand:V8_128 0 "register_operand"
+ "=v,v,v,x,x,Yr,*x,x,x,x,v,v")
+   (vec_merge:V8_128
+ (vec_duplicate:V8_128
+   (match_operand: 2 "nonimmediate_operand"
+ " r,m,v,r,m,Yr,*x,r,m,x,r,m"))
+ (match_operand:V8_128 1 "reg_or_0_operand"
+ " C,C,v,0,0,0 ,0 ,x,x,x,v,v")
+ (const_int 1)))]
+  "TARGET_SSE2"
+  "@
+   vmovw\t{%k2, %0|%0, %k2}
+   vmovw\t{%2, %0|%0, %2}
+   vmovsh\t{%2, %1, %0|%0, %1, %2}
+   pinsrw\t{$0, %k2, %0|%0, %k2, 0}
+   pinsrw\t{$0, %2, %0|%0, %2, 0}
+   pblendw\t{$1, %2, %0|%0, %2, 1}
+   pblendw\t{$1, %2, %0|%0, %2, 1}
+   vpinsrw\t{$0, %k2, %1, %0|%0, %1, %k2, 0}
+   vpinsrw\t{$0, %2, %1, %0|%0, %1, %2, 0}
+   vpblendw\t{$1, %2, %

Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

On 12/1/21 3:08 PM, Segher Boessenkool wrote:
> On Tue, Nov 16, 2021 at 12:56:52PM -0600, Bill Schmidt wrote:
>> Hi!  I previously posted [1] to correct some problems with the new builtins
>> support targeting 32-bit code gen.  Based on the discussion, I've made some
>> adjustments and would like to submit this for consideration.
>>
>> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for 
>> certain
>> HTM builtins should be removed.  All of the registers TEXASR, TEXASRU, TFHAR,
>> and TFIAR are now accessed using the unsigned long data type in all 
>> configurations.
>> gcc/
>>  * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>>  (BPERMD): Flag as 32bit (needing special handling for 32-bit).
>>  (UNPACK_TD): Return unsigned long long instead of unsigned long.
>>  (GET_TEXASR): Return unsigned long instead of unsigned long long.
>>  (GET_TEXASRU): Likewise.
>>  (GET_TFHAR): Likewise.
>>  (GET_TFIAR): Likewise.
>>  (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>>  (SET_TEXASRU): Likewise.
>>  (SET_TFHAR): Likewise.
>>  (SET_TFIAR): Likewise.
>>  (TABORTDC): Likewise.
>>  (TABORTDCI): Likewise.
>>  * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>>  handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
>>
>> gcc/testsuite/
>>  * gcc.target/powerpc/cmpb-3.c: Adjust error message.
> Okay for trunk.  Thanks!
>
>
> Could you put some short blurb about the changed prototype of the HTM
> reg builtins in the release notes please?  Thanks x2 :-)

Already done!

Bill

>
> Segher


Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Segher Boessenkool
On Wed, Nov 17, 2021 at 02:58:54PM -0600, Bill Schmidt wrote:
> Hi!  This patch is broken out of the previous patch for all the builtins test
> suite adjustments.  Here we have some slight changes in error messages due to
> how the internals have changed between the old and new builtins methods.
> 
> For scalar-extract-exp-2.c we change:
>   error: '__builtin_vec_scalar_extract_exp is not supported in this compiler 
> configuration'
> 
> to:
>   error: '__builtin_vsx_scalar_extract_exp' requires the '-mcpu=power9' 
> option and either the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_vec_scalar_extract_exp' requires builtin 
> '__builtin_vsx_scalar_extract_exp'

So, okay for trunk.  Thanks!

One thing though:
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
>   message.

Don't wrap unnecessarily please.

>   * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.


Segher


Re: [PATCH] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Segher Boessenkool
On Thu, Nov 18, 2021 at 03:59:41PM -0600, Bill Schmidt wrote:
> On 11/18/21 3:32 PM, Segher Boessenkool wrote:
> > On Thu, Nov 18, 2021 at 03:30:48PM -0600, Bill Schmidt wrote:
> >> On 11/18/21 3:16 PM, Segher Boessenkool wrote:
> >>> On Wed, Nov 17, 2021 at 05:06:05PM -0600, Bill Schmidt wrote:
> > I don't like that at all.  The user didn't write the _vsx thing, and it
> > isn't documented either (neither is the _vec one, but that is a separate
> > issue, specific to this builtin).
>  I feel like I haven't explained this well.  This kind of thing has been 
>  in
>  existence forever even in the old builtins code.  The combination of the
>  error showing the internal builtin name, and the note tying the overload
>  name to the internal builtin name, has been there all along.  The name of
>  the internal builtin is pretty meaningless.  The only thing that's 
>  interesting
>  in this case is that we previously didn't get this *for this specific 
>  case*
>  because the old code went to a generic fallback.  But in many other cases
>  you get exactly this same kind of error message for the old code.
> >>> Yes.  And it still is a regression (in *this* case).
> >> Sorry, I don't understand.  Why specifically is this a regression?
> > It is wrong now, in ways that it wasn't wrong before.  That is the
> > definition of regression!
> 
> I'm sorry, I disagree.  With clarification of the note, I don't see how
> this can be considered a regression.  It is providing information in a
> different way, but it is still clear that the user has misused the builtin
> in the context, and, unlike before, it now tells them *what* is wrong with
> the options that were used (not just "unavailable in this configuration").
> The fact that an internal builtin name is *also* disclosed as part of
> this does not make it wrong.

It is claiming something is wrong with something the user didn't write.
The blow is softened a lot by the inform(), but it still is a
regression.  But you said you'll look into it, and it will be okay for
now, it isn't like this is super important.  Just an ugly wart :-)

> The way that overloads work, we can only tell whether a builtin is
> enabled with the current set of options by looking at the true builtin
> that the overload maps to.  The enablement checking code doesn't have
> any knowledge that an overloaded function maps to it.  So that error
> message is produced without knowledge of the overloading.  The note
> diagnostic is added by the overload processing code that *is* aware
> that the mapping exists.

Yeah, but you can keep track of what the original name was, somewhere.
Hopefully, anyway :-)

> The enablement checking code (rs6000_invalid_builtin in the old code,
> rs6000_invalid_new_builtin in the new code) is called from multiple
> places, and not always in an overload context, so we can't assume
> this is the case.  Not all builtins are mapped to by overloads, but
> they still need enablement checking.

A direct call would not get the "this comes from  overload" field
set, so that is easy to see in the error message code.

> Would it be possible to change things so that we pass in the overload
> name to be used in the error message when appropriate?  Yes.  But
> this would have a much larger impact on the test suite than the
> current method, because all error tests for overloads would now
> have to change.  That is, there are many existing tests that are
> already "wrong" in the sense that they report the internal builtin
> name.
> 
> I suggest that we add that to the list of future cleanups due to
> the size of the impact.  I agree that it has never been ideal to
> mention the internal builtin name on these messages.  It's just
> not unique to the changes I've made here; it's a pre-existing
> condition that needs work to cleanse it everywhere.
> 
> Can we move forward this way?

We can yes :-)  And thanks in advance!  It would be nice if we could
see this in GCC 12 still.  It is fine for stage 4 still, if it is error
message only (and a lot of testsuite :-) )


Segher


[committed] wwwdocs: git: Go has a new home

2021-12-01 Thread Gerald Pfeifer
Pushed.

---
 htdocs/git.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/git.html b/htdocs/git.html
index de8f0584..493b5734 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -329,7 +329,7 @@ in Git.
   gccgo
   This branch is for the Go front end to gcc.  For more information
 about the Go programming language,
-see https://golang.org/";>https://golang.org.  The
+see https://go.dev/";>go.dev.  The
 branch is maintained by Ian Lance Taylor.  Patches should be
 marked with the tag [gccgo] wwwdocs: in the Subject line.
   
-- 
2.34.0


[committed] wwwdocs: gcc-12: Update link to our Bugzilla

2021-12-01 Thread Gerald Pfeifer
On the way tweak language a bit.

Pushed.
---
 htdocs/gcc-12/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index 45a8d99a..000501fb 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -399,7 +399,7 @@ a work-in-progress.
 has been rewritten to be easier and less error-prone to maintain.  Every
 attempt has been made to ensure that the new behavior matches the old
 behavior, but inevitably some bugs can be expected.  Please report any
-problems using https://gcc.gnu.org/bugzilla";>GCC Bugzilla.
+problems via https://gcc.gnu.org/bugzilla/";>GCC Bugzilla.
   
   
 The built-in functions __builtin_get_texasr,
-- 
2.34.0


Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Segher Boessenkool
On Tue, Nov 16, 2021 at 12:56:52PM -0600, Bill Schmidt wrote:
> Hi!  I previously posted [1] to correct some problems with the new builtins
> support targeting 32-bit code gen.  Based on the discussion, I've made some
> adjustments and would like to submit this for consideration.
> 
> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for 
> certain
> HTM builtins should be removed.  All of the registers TEXASR, TEXASRU, TFHAR,
> and TFIAR are now accessed using the unsigned long data type in all 
> configurations.

> gcc/
>   * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>   (BPERMD): Flag as 32bit (needing special handling for 32-bit).
>   (UNPACK_TD): Return unsigned long long instead of unsigned long.
>   (GET_TEXASR): Return unsigned long instead of unsigned long long.
>   (GET_TEXASRU): Likewise.
>   (GET_TFHAR): Likewise.
>   (GET_TFIAR): Likewise.
>   (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>   (SET_TEXASRU): Likewise.
>   (SET_TFHAR): Likewise.
>   (SET_TFIAR): Likewise.
>   (TABORTDC): Likewise.
>   (TABORTDCI): Likewise.
>   * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>   handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-3.c: Adjust error message.

Okay for trunk.  Thanks!


Could you put some short blurb about the changed prototype of the HTM
reg builtins in the release notes please?  Thanks x2 :-)


Segher


RE: [PATCH] Final value replacement improvements for until-wrap loops.

2021-12-01 Thread Roger Sayle


Hi Richard,
Many thanks for the review.  Here's the final version that I've committed, 
including 
your suggested improvements, following another make bootstrap and make -k check
on x86_64-pc-linux-gnu with no new failures.  Thanks again.


2021-12-01  Roger Sayle  
Richard Biener  

gcc/ChangeLog
* tree-ssa-loop-niter.c (number_of_iterations_until_wrap):
Check if simplify_using_initial_conditions allows us to
simplify the expression for may_be_zero.
* match.pd (X != C ? -X : -C -> -X): New transform.
(X != C ? ~X : ~C -> ~X): Likewise.
((X+1) > Y ? -X : 1 -> X >= Y ? -X : 1): Likewise.

gcc/testsuite/ChangeLog
* gcc.dg/fold-condneg-1.c: New test case.
* gcc.dg/fold-condneg-2.c: New test case.
* gcc.dg/fold-condnot-1.c: New test case.
* gcc.dg/pr101145-1.c: New test case.
* gcc.dg/pr101145-2.c: New test case.

Roger
--

> -Original Message-
> From: Richard Biener 
> Sent: 30 November 2021 10:00
> To: Roger Sayle 
> Cc: GCC Patches 
> Subject: Re: [PATCH] Final value replacement improvements for until-wrap
> loops.
> 
> On Mon, Nov 29, 2021 at 10:07 AM Roger Sayle
>  wrote:
> >
> >
> > This middle-end patch is inspired by Richard Biener's until-wrap loop
> > example in PR tree-optimization/101145.
> >
> > unsigned foo(unsigned val, unsigned start) {
> >   unsigned cnt = 0;
> >   for (unsigned i = start; i > val; ++i)
> > cnt++;
> >   return cnt;
> > }
> >
> > For this loop, the tree optimizers currently generate:
> >
> > unsigned int foo (unsigned int val, unsigned int start) {
> >   unsigned int cnt;
> >   unsigned int _1;
> >   unsigned int _5;
> >
> >[local count: 118111600]:
> >   if (start_3(D) > val_4(D))
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >   _1 = start_3(D) + 1;
> >   _5 = -start_3(D);
> >   cnt_2 = _1 > val_4(D) ? _5 : 1;
> >
> >[local count: 118111600]:
> >   # cnt_11 = PHI 
> >   return cnt_11;
> > }
> >
> > or perhaps slightly easier to read:
> >
> >   if (start > val) {
> > cnt = (start+1) > val ? -start : 1;
> >   } else cnt = 0;
> >
> > In this snippet, if we know start > val, then (start+1) > val unless
> > start+1 overflows, i.e. (start+1) == 0 and start == ~0.
> > We can use this (loop header) context to simplify the ternary
> > expression to "(start != -1) ? -start : 1", which with a little help
> > from match.pd can be folded to -start.  Hence the optimal final value
> > replacement should be:
> >
> >   cnt = (start > val) ? -start : 0;
> >
> > Or as now generated by this patch:
> >
> > unsigned int foo (unsigned int val, unsigned int start) {
> >   unsigned int cnt;
> >
> >[local count: 118111600]:
> >   if (start_3(D) > val_4(D))
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >   cnt_2 = -start_3(D);
> >
> >[local count: 118111600]:
> >   # cnt_11 = PHI 
> >   return cnt_11;
> > }
> >
> >
> > We can also improve until-wrap loops that don't have a (suitable) loop
> > header, as determined by simplify_using_initial_conditions.
> >
> > unsigned bar(unsigned val, unsigned start) {
> >   unsigned cnt = 0;
> >   unsigned i = start;
> >   do {
> > cnt++;
> > i++;
> >   } while (i > val);
> >   return cnt;
> > }
> >
> > which is currently optimized to:
> >
> > unsigned int foo (unsigned int val, unsigned int start) {
> >   unsigned int cnt;
> >   unsigned int _9;
> >   unsigned int _10;
> >
> >[local count: 118111600]:
> >   _9 = start_4(D) + 1;
> >   _10 = -start_4(D);
> >   cnt_3 = val_7(D) < _9 ? _10 : 1;
> >   return cnt_3;
> > }
> >
> > Here we have "val < (start+1) ? -start : 1", which again with the help
> > of match.pd can be slightly simplified to "val <= start ? -start : 1"
> > when dealing with unsigned types, because at the complicating value
> > where start == ~0, we fortunately have -start == 1, hence it doesn't
> > matter whether the second or third operand of the ternary operator is
> returned.
> >
> > To summarize, this patch (in addition to tweaking may_be_zero in
> > number_of_iterations_until_wrap) adds three new constant folding
> > transforms to match.pd.
> >
> > X != C1 ? -X : C2 simplifies to -X when -C1 == C2.
> > which is the generalized form of the simplification above.
> >
> > X != C1 ? ~X : C2 simplifies to ~X when ~C1 == C2.
> > which is the BIT_NOT_EXPR analog of the NEGATE_EXPR case.
> >
> > and the "until-wrap final value replacement without context":
> >
> > (X + 1) > Y ? -X : 1 simplifies to X >= Y ? -X : 1 when X is unsigned,
> > as when X + 1 overflows, X is -1, so -X == 1.
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check with no new failures.  Ok for mainline?
> 
> +/* X != C1 ? -X : C2 simplifies to -X when -C1 == C2.  */ (simplify
> +(cond (ne @0 INTEGER_CST@1) (negate@3 @0) INTEGER_CST@2)  (if
> +((!wi::only_sign_bit_p (wi::to_wide (@1))
> +   

Re: [RFC/PATCH] C++ constexpr vs. floating point exceptions.

2021-12-01 Thread Jason Merrill via Gcc-patches
On Tue, Sep 21, 2021 at 5:19 PM Jason Merrill  wrote:

> On Tue, Sep 21, 2021 at 4:30 PM Joseph Myers 
> wrote:
>
>> On Tue, 21 Sep 2021, Jakub Jelinek via Gcc-patches wrote:
>>
>> > On Tue, Sep 21, 2021 at 02:15:59PM +0100, Roger Sayle wrote:
>> > > Can you double check?  Integer division by zero is undefined, but
>> isn't floating point
>> > > division by zero defined by the appropriate IEEE standards?
>> >
>> > https://eel.is/c++draft/expr.mul#4 doesn't make the division by zero
>> > behavior conditional on integral types.
>> > C has similar wording.
>>
>> The position for C is that Annex F semantics take precedence over all the
>> ways in which floating-point arithmetic has undefined behavior in the
>> main
>> body of the standard.  So floating-point overflow and division by zero
>> are
>> fully defined in the presence of Annex F support, while out-of-range
>> conversions from floating point to integer types produce an unspecified
>> value (not necessarily the same unspecified value for different
>> executions
>> of the conversion in the abstract machine - as discussed in bug 93806,
>> GCC
>> can get that wrong and act as if a single execution of such a conversion
>> in the abstract machine produces more than one result).
>>
>> In C, as specified in Annex F, initializers for floating-point objects
>> with static or thread storage duration are evaluated with exceptions
>> discarded and the default rounding mode in effect; 7.0 / 0.0 is a fully
>> valid initializer for such an object to initialize it to positive
>> infinity.  As I understand it, the question for this thread is whether
>> C++
>> constexpr should have a similar rule to C static initializers (which
>> ought
>> to apply to 1.0 / 3.0, raising inexact, just as much as to 7.0 / 0.0).
>>
>
> The C rules seem to be
>
> F.8.2 Translation
> During translation the IEC 60559 default modes are in effect:
>  — The rounding direction mode is rounding to nearest.
>  — The rounding precision mode (if supported) is set so that results are
> not shortened.
>  — Trapping or stopping (if supported) is disabled on all floating-point
> exceptions.
> Recommended practice:
> The implementation should produce a diagnostic message for each
> translation-time floating-point exception, other than “inexact”; the
> implementation should then proceed with the translation of the program.
>
> I think following the same rules for C++ would be appropriate in a
> diagnosing context: warn and continue.  In a template argument deduction
> (SFINAE) context, where errors become silent substitution failures, it's
> probably better to treat them as non-constant.
>

I've now added -fconstexpr-fp-except to treat manifestly-constant-evaluated
expressions the same as C static initializers for floating point
arithmetic.  This does not distinguish between diagnosing and SFINAE
context as I was suggesting above, and indeed neither front end currently
warns about compile-time exceptions as Annex F recommends.

Jason


Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 1:07 PM, Richard Biener wrote:
> On December 1, 2021 6:42:21 PM GMT+01:00, Peter Bergner 
>  wrote:
>> On 12/1/21 3:01 AM, Richard Biener wrote:
>>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
>>> instead of fixing up things at expansion time.
>>
>> The following fixes the ICE.  The bootstrap/regtesting is still running 
>> though.
> 
> OK. 

Great, pushed.  For posterity, below is what I committed.

Peter


middle-end: Skip initialization of opaque type variables [PR103127]

For -ftrivial-auto-var-init=*, skip initializing the variable if it is an
opaque type, because CONST0_RTX(mode) is not defined for opaque modes.

2021-12-01  Peter Bergner  

gcc/
PR middle-end/103127
* gimplify.c (is_var_need_auto_init): Handle opaque types.

gcc/testsuite/
PR middle-end/103127
* gcc.target/powerpc/pr103127.c: New test.

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8624f8221fd..326476f0238 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl)
  || !DECL_HARD_REGISTER (decl))
   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
   && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
+  && !OPAQUE_TYPE_P (TREE_TYPE (decl))
   && !is_empty_type (TREE_TYPE (decl)))
 return true;
   return false;
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103127.c 
b/gcc/testsuite/gcc.target/powerpc/pr103127.c
new file mode 100644
index 000..801fc0a4620
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103127.c
@@ -0,0 +1,19 @@
+/* PR target/103127 */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -ftrivial-auto-var-init=zero" } */
+
+/* Verify we do not ICE on the following tests.  */
+
+void
+foo (__vector_quad *dst)
+{
+  __vector_quad acc;
+  *dst = acc;
+}
+
+void
+bar (__vector_pair *dst)
+{
+  __vector_pair pair;
+  *dst = pair;
+}



Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Richard Biener via Gcc-patches
On December 1, 2021 6:42:21 PM GMT+01:00, Peter Bergner  
wrote:
>On 12/1/21 3:01 AM, Richard Biener wrote:
>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
>> instead of fixing up things at expansion time.
>
>The following fixes the ICE.  The bootstrap/regtesting is still running though.

OK. 

Thanks, 
Richard. 

>Peter
>
>
>diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>index 8624f8221fd..326476f0238 100644
>--- a/gcc/gimplify.c
>+++ b/gcc/gimplify.c
>@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl)
>  || !DECL_HARD_REGISTER (decl))
>   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
>   && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
>+  && !OPAQUE_TYPE_P (TREE_TYPE (decl))
>   && !is_empty_type (TREE_TYPE (decl)))
> return true;
>   return false;



Re: [committed] libstdc++: Optimize ref-count updates in COW std::string

2021-12-01 Thread Jonathan Wakely via Gcc-patches
On Wed, 1 Dec 2021 at 18:16, Florian Weimer wrote:
>
> * Jonathan Wakely via Libstdc:
>
> > diff --git a/libstdc++-v3/include/bits/cow_string.h 
> > b/libstdc++-v3/include/bits/cow_string.h
> > index ced395b80b8..4fae1d02981 100644
> > --- a/libstdc++-v3/include/bits/cow_string.h
> > +++ b/libstdc++-v3/include/bits/cow_string.h
> > @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > *  destroy the empty-string _Rep object.
> > *
> > *  All but the last paragraph is considered pretty conventional
> > -   *  for a C++ string implementation.
> > +   *  for a Copy-On-Write C++ string implementation.
> >*/
> >// 21.3  Template class basic_string
> >template
> > @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > // so we need to use an atomic load. However, _M_is_leaked
> > // predicate does not change concurrently (i.e. the string is either
> > // leaked or not), so a relaxed load is enough.
> > -   return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> > -#else
> > -   return this->_M_refcount < 0;
> > +   if (!__gnu_cxx::__is_single_threaded())
> > + return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> >  #endif
> > +   return this->_M_refcount < 0;
> >   }
>
> Relaxed MO loads of word-size values on all current architectures only
> have a compiler barrier, so I think the optimization makes things worse?

Hmm, yes.

> (I doubt the conditional lack of a compiler barrier leads to
> optimization improvements elsewhere.)

Probably not. I'll revert the change to _M_is_leaked() and just keep
it for _M_is_shared().

Thanks for pointing that out.



Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Andrew MacLeod via Gcc-patches

On 12/1/21 09:48, Martin Liška wrote:

On 12/1/21 15:34, Richard Biener wrote:

On Wed, Dec 1, 2021 at 3:25 PM Martin Liška  wrote:


On 12/1/21 15:19, Richard Biener wrote:
which is compute the range of 'lhs' on edge_true into 
predicate->true_range,

assign that same range to ->false_range and then invert it to get the
range on the false_edge.  What I am saying is that for better 
precision

you should do

   ranger->range_on_edge (predicate->false_range, edge_false, 
lhs);


rather than prematurely optimize this to the inversion of the true 
range

since yes, ranger is CFG sensitive and only the_last_ predicate on a
long CFG path is actually inverted.

What am I missing?


I might be misunderstood, but I think it's the problem defined here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html

where I used the ranger->range_on_edge on the false_edge.


Ah, OK.  But then even the true_edge range is possibly wrong, no?


You are of course correct, I've just proved that in debugger ://


Consider

   for (;;)
  {
  if (a < 100)
    if (a > 50)  // unswitch on this
  /* .. */
  if (a < 120)
  /* ... */
  }

then you record [51, 99] for true_range of the a > 50 predicate and thus
simplification will simplify the if (a < 120) check, no?


Yep.



You can only record the range from the (CFG independent) a > 50 check,
thus [51, +INF] but of course at simplification time you can also use
the CFG context at each simplification location.


@Andrew: How can I easily get irange based just on a stmt? Something 
like fold_range

with int_range_max as the 3rd argument?


Sorry, I miss these things if I'm not directly CC'd a lot :-)

So you just want to know the basic range the stmt generates without 
context?    Sure, what you say would be fine, but your want to 
initialize it to the type range:


int_range_max range (TREE_TYPE (name));

you can also simply trigger it using the current SSA_NAME_RANGE_INFO 
global  values query instead of the default current contextual one... 
which , if there isnt a global range, will automatically use the range 
of the type of the argument.. so maybe just try


fold_range (r, stmt, get_global_range_query ())

Andrew






Re: [committed] libstdc++: Optimize ref-count updates in COW std::string

2021-12-01 Thread Florian Weimer via Gcc-patches
* Jonathan Wakely via Libstdc:

> diff --git a/libstdc++-v3/include/bits/cow_string.h 
> b/libstdc++-v3/include/bits/cow_string.h
> index ced395b80b8..4fae1d02981 100644
> --- a/libstdc++-v3/include/bits/cow_string.h
> +++ b/libstdc++-v3/include/bits/cow_string.h
> @@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> *  destroy the empty-string _Rep object.
> *
> *  All but the last paragraph is considered pretty conventional
> -   *  for a C++ string implementation.
> +   *  for a Copy-On-Write C++ string implementation.
>*/
>// 21.3  Template class basic_string
>template
> @@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> // so we need to use an atomic load. However, _M_is_leaked
> // predicate does not change concurrently (i.e. the string is either
> // leaked or not), so a relaxed load is enough.
> -   return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
> -#else
> -   return this->_M_refcount < 0;
> +   if (!__gnu_cxx::__is_single_threaded())
> + return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
>  #endif
> +   return this->_M_refcount < 0;
>   }

Relaxed MO loads of word-size values on all current architectures only
have a compiler barrier, so I think the optimization makes things worse?
(I doubt the conditional lack of a compiler barrier leads to
optimization improvements elsewhere.)

Thanks,
Florian



Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Bill Schmidt via Gcc-patches
On 12/1/21 11:21 AM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Dec 01, 2021 at 09:29:42AM -0600, Bill Schmidt wrote:
>> Recently Kewen fixed a problem in the old builtins support where
>> rs6000_builtin_decl prematurely indicated that a target builtin is
>> unavailable.  This also needs to be done for the new builtins support, but in
>> this case we have to ensure the error message is still produced from the
>> overload support in rs6000-c.c.  Unfortunately, this is less straightforward
>> than it could be, because header file includes need to be adjusted to make 
>> this
>> happen.  Someday we'll consolidate all the builtin code in one file and this
>> won't have to be so ugly.
>>
>> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is 
>> this
>> okay for trunk?
> This is okay for trunk.  Thanks!
>
> Is there some place we can store what original builtin was used when
> some overload is resolved?  Just in the new builtin code, don't spend
> time on the old stuff :-)

I think we can do much better, with a little work.  The macros are a little
problematic, but I have ideas about how to make this better in a future
patch.  (There's a fair amount of test suite fallout, so it should wait.)

Thanks for the review!

Bill
>
>
> Segher


Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 3:01 AM, Richard Biener wrote:
> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
> instead of fixing up things at expansion time.

The following fixes the ICE.  The bootstrap/regtesting is still running though.

Peter


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 8624f8221fd..326476f0238 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1829,6 +1829,7 @@ is_var_need_auto_init (tree decl)
  || !DECL_HARD_REGISTER (decl))
   && (flag_auto_var_init > AUTO_INIT_UNINITIALIZED)
   && (!lookup_attribute ("uninitialized", DECL_ATTRIBUTES (decl)))
+  && !OPAQUE_TYPE_P (TREE_TYPE (decl))
   && !is_empty_type (TREE_TYPE (decl)))
 return true;
   return false;


Re: [PATCH] ipa-sra: Check also ECF_LOOPING_CONST_OR_PURE when evaluating calls

2021-12-01 Thread Martin Jambor
On Tue, Nov 30 2021, Richard Biener wrote:
> On Tue, Nov 30, 2021 at 3:24 PM Martin Jambor  wrote:
>>
>> Hi,
>>
>> in PR 103267 Honza found out that IPA-SRA does not look at
>> ECF_LOOPING_CONST_OR_PURE when evaluating if a call can have side
>> effects.  Fixed with this patch.  The testcase infinitely loops in a
>> const function, so it would not make a good addition to the testsuite.
>>
>> Bootstrapped and tested on x86_64-linux.  OK for trunk?
>
> OK.
>

Thank you.  For reference, I am also about to commit (after having
bootstrapped and tested it on x86_64-linx) the following backport to the
old IPA-SRA we have in GCC 9.

Thanks,

Martin



gcc/ChangeLog:

2021-12-01  Martin Jambor  

PR ipa/103267
* tree-sra.c (scan_function): Also check ECF_LOOPING_CONST_OR_PURE
flag.
---
 gcc/tree-sra.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 909b4fef9a8..252953b7512 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -1530,7 +1530,8 @@ scan_function (void)
}
 
  if (final_bbs
- && (flags & (ECF_CONST | ECF_PURE)) == 0)
+ && ((flags & (ECF_CONST | ECF_PURE)) == 0
+ || (flags & ECF_LOOPING_CONST_OR_PURE)))
bitmap_set_bit (final_bbs, bb->index);
}
 
-- 
2.33.1



Re: [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html

2021-12-01 Thread Eric Gallager via Gcc-patches
On Wed, Dec 1, 2021 at 10:54 AM Gerald Pfeifer  wrote:
>
> Hi Eric,
>
> On Wed, 24 Nov 2021, Eric Gallager wrote:
> > This next patch does more than just removing old stuff: it adds an
> > extra sentence to describe a shell command used to generate a list, so
> > to verify that I've got the shell command right, I'm asking for a
> > review.
>
> -There are several other files in this size range, which I have left
> +There are several other files in this size range, which are left
>  out because touching them at all is unwise (reload, the Fortran front
>  end).  You can try, but I am not responsible for any damage to your
>  sanity which may result.
>
> while we are touching this, how about toning the second half of this
> paragraph down and moving away from first person?
>
> How about something like "...breaking them up likely may prove rather
> difficult" (or similar, happy to leave this to you as native spaker)?
>

OK yeah I did remove the first instance of the first person in that
paragraph, but couldn't think of a better wording for the second one,
so that suggestion is helpful; thanks.

>
> +Note that the list of large files in this section is generated with the
> +following shell command, run from the gcc subdirectory:
> +
> +
> +   du -sh *.{c,h,cc} */*.{c,h,cc} | sort -hr | grep -v fortran | head -n 
> 14
> +
>
> This looks like it does what you want it to do.
>
> (Pulling up the database courses I have given in a previous life, I'd pull
> in the grep before the sort - O(n) filtering before O(n·log n) processing
> - but even on my notebook both are instantenous. ;-) )
>
>
> Please consider the two suggestions above and commit the result; just
> share the final patch here (no review required).
>
> Thanks,
> Gerald

OK thanks, I committed the attached patch as f4b4d0f:
https://gcc.gnu.org/git/?p=gcc-wwwdocs.git;a=commitdiff;h=f4b4d0f783246dd6f58944cdb542446d5e7589d3


patch-beginner-projects-02.diff
Description: Binary data


[PATCH] c++: Fix bogus error with __integer_pack [PR94490]

2021-12-01 Thread Marek Polacek via Gcc-patches
Here we issue a bogus:

error: '(0 ? fake_tuple_size_v : fake_tuple_size_v)' is not a 
constant expression

because cxx_constant_value in expand_integer_pack gets

*(0 ? VIEW_CONVERT_EXPR(fake_tuple_size_v) : VIEW_CONVERT_EXPR(fake_tuple_size_v))

which is a REFERENCE_REF_P and we evaluate its operand to 3, so we end
up with *3 and that fails.  Sounds like we need to get rid of the
REFERENCE_REF_P then.  That is what tsubst_copy_and_build/INDIRECT_REF
will do:

if (REFERENCE_REF_P (t))
  {
/* A type conversion to reference type will be enclosed in
   such an indirect ref, but the substitution of the cast
   will have also added such an indirect ref.  */
r = convert_from_reference (r);
  }

so I think it's reasonable to call instantiate_non_dependent_expr_sfinae.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/11?

PR c++/94490

gcc/cp/ChangeLog:

* pt.c (expand_integer_pack): Call
instantiate_non_dependent_expr_sfinae.

gcc/testsuite/ChangeLog:

* g++.dg/ext/integer-pack5.C: New test.
---
 gcc/cp/pt.c  |  1 +
 gcc/testsuite/g++.dg/ext/integer-pack5.C | 29 
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/integer-pack5.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f4b9d9673fb..6b560952639 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -3792,6 +3792,7 @@ expand_integer_pack (tree call, tree args, tsubst_flags_t 
complain,
 }
   else
 {
+  hi = instantiate_non_dependent_expr_sfinae (hi, complain);
   hi = cxx_constant_value (hi);
   int len = valid_constant_size_p (hi) ? tree_to_shwi (hi) : -1;
 
diff --git a/gcc/testsuite/g++.dg/ext/integer-pack5.C 
b/gcc/testsuite/g++.dg/ext/integer-pack5.C
new file mode 100644
index 000..84938649d31
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/integer-pack5.C
@@ -0,0 +1,29 @@
+// PR c++/94490
+// { dg-do compile { target c++14 } }
+
+template
+constexpr int fake_tuple_size_v = 3;
+template struct intseq {};
+
+// So that it compiles with clang++.
+#if __has_builtin(__make_integer_seq)
+using size_t = decltype(sizeof(1));
+template
+using _IdxTuple = intseq<_Indices...>;
+
+template using genseq = __make_integer_seq<_IdxTuple, size_t, N>;
+#else
+template using genseq = intseq<__integer_pack(N)...>;
+#endif
+
+template>
+struct arith_result
+{ };
+
+template
+auto Mul(const T&)
+{
+return [](auto) { return arith_result> { }; }(0);
+}
+
+auto x = Mul(0);

base-commit: 54ebec35abec09a24b47b997172622ca0d8e2318
-- 
2.33.1



Re: [PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Segher Boessenkool
Hi!

On Wed, Dec 01, 2021 at 09:29:42AM -0600, Bill Schmidt wrote:
> Recently Kewen fixed a problem in the old builtins support where
> rs6000_builtin_decl prematurely indicated that a target builtin is
> unavailable.  This also needs to be done for the new builtins support, but in
> this case we have to ensure the error message is still produced from the
> overload support in rs6000-c.c.  Unfortunately, this is less straightforward
> than it could be, because header file includes need to be adjusted to make 
> this
> happen.  Someday we'll consolidate all the builtin code in one file and this
> won't have to be so ugly.
> 
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
> okay for trunk?

This is okay for trunk.  Thanks!

Is there some place we can store what original builtin was used when
some overload is resolved?  Just in the new builtin code, don't spend
time on the old stuff :-)


Segher


Re: [PATCH] c++: ICE with auto(0) in requires-expression [PR103408]

2021-12-01 Thread Patrick Palka via Gcc-patches
On Wed, Dec 1, 2021 at 10:26 AM Marek Polacek via Gcc-patches
 wrote:
>
> Here we ICE on
>
>   int f() requires (auto(0));
>
> in do_auto_deduction when handling the auto: we're in a non-templated
> requires-expression which are parsed under processing_template_decl == 1
> and empty current_template_parms, so 'current_template_args ()' will
> crash.  This code is invalid as per "C++20 CA378: Remove non-templated
> constrained functions", but of course we shouldn't crash.

FWIW it looks like we can trip over the same bug with valid code:

  static_assert(requires { auto(0); });

>
> Since in the scenario above it's expected that current_template_parms is
> null, I've just added a check, and let grokfndecl issue an error.
> I guess another approch would be to fake up a template parameter list
> before calling do_auto_deduction.
>
> For good measure, I've added several well-formed cases with auto(x) in
> a requires-expression.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> PR c++/103408
>
> gcc/cp/ChangeLog:
>
> * pt.c (do_auto_deduction): Check current_template_parms before
> current_template_args ().
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp23/auto-fncast9.C: New test.
> ---
>  gcc/cp/pt.c   |  2 +-
>  gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++
>  2 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
>
> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index f4b9d9673fb..012ca5d06c0 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree 
> auto_node,
> (but we still may have used them for constraint checking above).  */;
>else if (context == adc_unify)
>  targs = add_to_template_args (outer_targs, targs);
> -  else if (processing_template_decl)
> +  else if (processing_template_decl && current_template_parms)
>  targs = add_to_template_args (current_template_args (), targs);
>return tsubst (type, targs, complain, NULL_TREE);

Won't this mean the call to tsubst here will end up lowering the level
of the auto from 2 to 1 rather than replacing it with the actual
deduced type?

It also looks like this approach doesn't handle static_assert(requires
{ auto(auto(0)); }), probably due to this substitution issue.  I guess
we could add a dummy level to 'targs' to work around this..

>  }
> diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C 
> b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
> new file mode 100644
> index 000..45a0ff9b460
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
> @@ -0,0 +1,27 @@
> +// PR c++/103408
> +// { dg-do compile { target c++23 } }
> +
> +int bad1() requires (auto(true)); // { dg-error "constraints on a 
> non-templated function" }
> +
> +template
> +int f1() requires (auto(B));
> +
> +template
> +struct S { T t; constexpr operator bool() { return true; } };
> +
> +int bad2() requires (bool(S{1})); // { dg-error "constraints on a 
> non-templated function" }
> +int bad3() requires (bool(S(1))); // { dg-error "constraints on a 
> non-templated function" }
> +
> +template
> +int f2() requires (bool(S{N}));
> +
> +template
> +int f3() requires (bool(S(N)));
> +
> +void
> +g ()
> +{
> +  f1();
> +  f2<42>();
> +  f3<42>();
> +}
>
> base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca
> --
> 2.33.1
>



Re: [PATCH] path solver: Minimize exported ranges to subsequent blocks.

2021-12-01 Thread Jeff Law via Gcc-patches




On 12/1/2021 9:18 AM, Aldy Hernandez via Gcc-patches wrote:

I'm going to hold off pushing this until I hear from the global
maintainers and/or release managers.

This patch fixes a pathological performance case, but it may also be
handled by Andrew's fixes in this area.  It's up to y'all to decide if
it's something we want in this release.  An alternative would be to
keep track of any pathological issues going forward, and see if this
fixes the issue.  I'm going to be taking leave any day now :).
I'm leaving this to you and Andrew to decide.  Y'all know this code 
better than anyone.


And yes, I keep expecting you to disappear without notice :-)

Jeff




[PATCH, PING] rs6000: Builtins test changes for test_fpscr_[d]rn_builtin_error.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:36 AM, Bill Schmidt wrote:
> Hi!  This is the last patch broken out of the previous test suite patch
> for the new builtins support.
>
> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
>
> All such error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
>
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.  In
> particular, some messages are improved because previously they did not
> admit the possibility that an argument could hold a variable.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
> with no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/test_fpscr_drn_builtin_error.c: Adjust error
>   messages.
>   * gcc.target/powerpc/test_fpscr_rn_builtin_error.c: Likewise.
> ---
>  .../powerpc/test_fpscr_drn_builtin_error.c   |  4 ++--
>  .../gcc.target/powerpc/test_fpscr_rn_builtin_error.c | 12 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c 
> b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> index 028ab0b6d66..4f9d9e08e8a 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_drn_builtin_error.c
> @@ -9,8 +9,8 @@ int main ()
>   __builtin_set_fpscr_drn() also support a variable as an argument but
>   can't test variable value at compile time.  */
>  
> -  __builtin_set_fpscr_drn(-1);  /* { dg-error "Argument must be a value 
> between 0 and 7" } */ 
> -  __builtin_set_fpscr_drn(8);   /* { dg-error "Argument must be a value 
> between 0 and 7" } */ 
> +  __builtin_set_fpscr_drn(-1);  /* { dg-error "argument 1 must be a variable 
> or a literal between 0 and 7, inclusive" } */ 
> +  __builtin_set_fpscr_drn(8);   /* { dg-error "argument 1 must be a variable 
> or a literal between 0 and 7, inclusive" } */ 
>  
>  }
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c 
> b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> index aea65091b0c..10391b71008 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_error.c
> @@ -8,13 +8,13 @@ int main ()
>   int arguments.  The builtins __builtin_set_fpscr_rn() also supports a
>   variable as an argument but can't test variable value at compile time.  
> */
>  
> -  __builtin_mtfsb0(-1);  /* { dg-error "Argument must be a constant between 
> 0 and 31" } */
> -  __builtin_mtfsb0(32);  /* { dg-error "Argument must be a constant between 
> 0 and 31" } */
> +  __builtin_mtfsb0(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned 
> literal" } */
> +  __builtin_mtfsb0(32);  /* { dg-error "argument 1 must be a 5-bit unsigned 
> literal" } */
>  
> -  __builtin_mtfsb1(-1);  /* { dg-error "Argument must be a constant between 
> 0 and 31" } */
> -  __builtin_mtfsb1(32);  /* { dg-error "Argument must be a constant between 
> 0 and 31" } */ 
> +  __builtin_mtfsb1(-1);  /* { dg-error "argument 1 must be a 5-bit unsigned 
> literal" } */
> +  __builtin_mtfsb1(32);  /* { dg-error "argument 1 must be a 5-bit unsigned 
> literal" } */ 
>  
> -  __builtin_set_fpscr_rn(-1);  /* { dg-error "Argument must be a value 
> between 0 and 3" } */ 
> -  __builtin_set_fpscr_rn(4);   /* { dg-error "Argument must be a value 
> between 0 and 3" } */ 
> +  __builtin_set_fpscr_rn(-1);  /* { dg-error "argument 1 must be a variable 
> or a literal between 0 and 3, inclusive" } */ 
> +  __builtin_set_fpscr_rn(4);   /* { dg-error "argument 1 must be a variable 
> or a literal between 0 and 3, inclusive" } */ 
>  }
>  


[PATCH, PING] rs6000: Builtins test changes for pragma_misc9.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:18 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test suite patch for the new
> builtins support.  This one is just a minor adjustment for the error
> message wording.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64)
> with no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/pragma_misc9.c: Adjust error message.
> ---
>  gcc/testsuite/gcc.target/powerpc/pragma_misc9.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c 
> b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> index e03099bd084..c1667d9f7db 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pragma_misc9.c
> @@ -20,7 +20,7 @@ vector bool int
>  test2 (vector signed int a, vector signed int b)
>  {
>return vec_cmpnez (a, b);
> -  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' 
> option" "" { target *-*-* } .-1 } */
> +  /* { dg-error "'__builtin_altivec_vcmpnezw' requires the '-mcpu=power9' 
> and '-mvsx' options" "" { target *-*-* } .-1 } */
>  }
>  
>  #pragma GCC target ("cpu=power7")
> @@ -28,7 +28,7 @@ vector signed int
>  test3 (vector signed int a, vector signed int b)
>  {
>return vec_mergee (a, b);
> -  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the 
> '-mpower8-vector' option" "" { target *-*-* } .-1 } */
> +  /* { dg-error "'__builtin_altivec_vmrgew_v4si' requires the '-mcpu=power8' 
> and '-mvsx' options" "" { target *-*-* } .-1 } */
>  }
>  
>  #pragma GCC target ("cpu=power6")


[PATCH, PING] rs6000: Builtins test changes for pr80315-*.c, pr88100.c

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 10:15 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the test case patch for the new
> builtins support.
>
> One advantage of the new builtins support is uniform error messages for
> arguments with restricted values.  Previously this was done in many places
> in an ad hoc manner, with little uniformity.  This patch adjusts the
> expected error messages accordingly.
>
> All error messages are now one of the following:
>   "argument %d must be a %d-bit unsigned literal"
>   "argument %d must be a literal between %d and %d, inclusive"
>   "argument %d must be a variable or a literal between %d and %d, inclusive"
>   "argument %d must be either a literal %d or a literal %d"
>
> These messages were chosen to require the fewest changes from previous
> messages while still introducing uniformity.  This patch adjusts error
> messages for some cases where this produces changed messages.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> no regressions.  is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/pr80315-1.c: Adjust error message.
>   * gcc.target/powerpc/pr80315-2.c: Likewise.
>   * gcc.target/powerpc/pr80315-3.c: Likewise.
>   * gcc.target/powerpc/pr80315-4.c: Likewise.
>   * gcc.target/powerpc/pr88100.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/pr80315-1.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-2.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-3.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr80315-4.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr88100.c   | 12 ++--
>  5 files changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> index e2db0ff4b5f..f37f1f169a2 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-1.c
> @@ -10,6 +10,6 @@ main()
>int mask;
>  
>/* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 
> 3 must be in the range \[0, 15\]} } */
> +  res = __builtin_crypto_vshasigmaw (test, 1, 0xff); /* { dg-error {argument 
> 3 must be a 4-bit unsigned literal} } */
>return 0;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> index 144b705c012..0819a0511b7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-2.c
> @@ -10,6 +10,6 @@ main ()
>int mask;
>  
>/* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 
> 3 must be in the range \[0, 15\]} } */
> +  res = __builtin_crypto_vshasigmad (test, 1, 0xff); /* { dg-error {argument 
> 3 must be a 4-bit unsigned literal} } */
>return 0;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c 
> b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> index 99a3e24eadd..cc2e46cf5cb 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-3.c
> @@ -12,6 +12,6 @@ main ()
>int mask;
>  
>/* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be 
> in the range \[0, 15\]} } */
> +  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 
> 4-bit unsigned literal} } */
>return res;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c 
> b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> index 7f5f6f75029..ac12910741b 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr80315-4.c
> @@ -12,6 +12,6 @@ main ()
>int mask;
>  
>/* Argument 2 must be 0 or 1.  Argument 3 must be in range 0..15.  */
> -  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be 
> in the range \[0, 15\]} } */
> +  res = vec_shasigma_be (test, 1, 0xff); /* { dg-error {argument 3 must be a 
> 4-bit unsigned literal} } */
>return res;
>  }
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr88100.c 
> b/gcc/testsuite/gcc.target/powerpc/pr88100.c
> index 4452145ce95..764c897a497 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr88100.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr88100.c
> @@ -10,35 +10,35 @@
>  vector unsigned char
>  splatu1 (void)
>  {
> -  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a 5-bit 
> signed literal" } */
> +  return vec_splat_u8(0x100);/* { dg-error "argument 1 must be a literal 
> between -16 and 15, inclusive" } */
>  }
>  
>  vector unsigned short
>  splatu2 (void)
>  {
> -  return vec_splat_u16(0x1);/* { dg-error "argument 1 must be a 5-bit 
> signed literal" } */
>

Patch ping related to OpenMP

2021-12-01 Thread Tobias Burnus

Patch ping - lightly sorted by priority, simplicity and dependency:

* [PATCH, v5, OpenMP 5.0] Improve OpenMP target support for C++ [PR92120 v5]
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584602.html

* [PATCH, v2, OpenMP 5.0] Remove array section base-pointer mapping semantics, 
and other front-end adjustments (mainline trunk)
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584994.html

* [PATCH] C, C++, Fortran, OpenMP: Add 'has_device_addr' clause to 'target' 
construct
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585361.html

* libgomp.texi: Update OMP_PLACES
  https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582121.html

* [patch] Fortran/OpenMP: Support most of 5.1 atomic extensions
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584450.html

* [PATCH 00/16] OpenMP: lvalues in "map" clauses and struct handling rework
  https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585439.html

Thanks,

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH, PING] rs6000: Builtins test changes for compare-bytes tests

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.

Thanks!
Bill

On 11/18/21 7:47 AM, Bill Schmidt wrote:
> Hi!  This patch is broken out from the patch with test suite changes for the
> new builtins support.
>
> With the old builtins support, cmpb-2.c produces:
>   warning: implicit declaration of function '__builtin_cmpb; did you mean 
> '__builtin_bcmp'?
>
> With the new support, it produces:
>   error: '__builtin_p6_cmpb requires the '-mcpu=power6' option and either the 
> '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_cmpb' requires builtin '__builtin_p6_cmpb'
>
> The reason for this is that this builtin wasn't even initialized in the
> old support.  This reflects a difference in philosophy between the old and
> new methods.  The old support often doesn't initialize builtins for which
> the conditions don't apply based on compile options, but this can backfire
> in general when such constructs as "#pragma target" are used.  The new
> support initializes all builtins, and waits until expand time to determine
> whether or not they are enabled.  Besides added flexibility, we also get
> better error messages as a result.
>
> The case for cmpb32-2.c is similar.
>
> Tested on powerpc64le-linux-gnu and powerpc64-linux-gnu (-m32/-m64) with
> no regressions.  Is this okay for trunk?
>
> Thanks!
> Bill
>
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-2.c: Adjust error message.
>   * gcc.target/powerpc/cmpb32-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/cmpb-2.c   | 2 +-
>  gcc/testsuite/gcc.target/powerpc/cmpb32-2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c 
> b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> index 113ab6a5f99..02b84d0731d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb-2.c
> @@ -8,7 +8,7 @@ void abort ();
>  unsigned long long int
>  do_compare (unsigned long long int a, unsigned long long int b)
>  {
> -  return __builtin_cmpb (a, b);  /* { dg-warning "implicit declaration 
> of function '__builtin_cmpb'" } */
> +  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb' 
> requires the '-mcpu=power6' option" } */
>  }
>  
>  void
> diff --git a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c 
> b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> index 37b54745e0e..d4264ab6e7d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/cmpb32-2.c
> @@ -7,7 +7,7 @@ void abort ();
>  unsigned int
>  do_compare (unsigned int a, unsigned int b)
>  {
> -  return __builtin_cmpb (a, b);  /* { dg-warning "implicit declaration of 
> function '__builtin_cmpb'" } */
> +  return __builtin_cmpb (a, b);  /* { dg-error "'__builtin_p6_cmpb_32' 
> requires the '-mcpu=power6' option" } */
>  }
>  
>  void


[PATCH, PING] rs6000: Builtins test changes for BFP scalar tests

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!  I'd like to ping this patch.  Segher had objected to the change in 
diagnostics,
but I hope we've solved that now with the better informational message [1].

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html

On 11/17/21 2:58 PM, Bill Schmidt wrote:
> Hi!  This patch is broken out of the previous patch for all the builtins test
> suite adjustments.  Here we have some slight changes in error messages due to
> how the internals have changed between the old and new builtins methods.
>
> For scalar-extract-exp-2.c we change:
>   error: '__builtin_vec_scalar_extract_exp is not supported in this compiler 
> configuration'
>
> to:
>   error: '__builtin_vsx_scalar_extract_exp' requires the '-mcpu=power9' 
> option and either the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_vec_scalar_extract_exp' requires builtin 
> '__builtin_vsx_scalar_extract_exp'
>
> The new message provides more information.  In both cases, it is less than
> ideal that we don't refer to scalar_extract_exp, which is referenced in
> the source line, but this is because scalar_extract_exp is #define'd to
> __builtin_vec_scalar_extract_exp, so it's unavoidable.  Certainly this is no
> worse than before, and arguably better.
>
> The cases for:
>   scalar-insert-exp-2.c
>   scalar-insert-exp-5.c
>   scalar-insert-exp-8.c
> are all similar.
>
> For scalar-extract-sig-2.c we again change:
>   error: '__builtin_vec_scalar_extract_sig' is not supported in this compiler 
> configuration'
>
> to:
>   error: '__builtin_vsx_scalar_extract_sig' requires the '-mcpu=power9' 
> option and either the '-m64' or '-mpowerpc64' option
>   note: builtin '__builtin_vec_scalar_extract_sig' requires builtin 
> '__builtin_vsx_scalar_extract_sig'
>
> Here it is clearer because there is no #define to muddy things up, and
> again the new message is arguably better than the old.
>
> For scalar-test-neg-{2,3,5}.c, we actually change the test case.  This is
> because we deliberately removed some undocumented and pointless   
> overloads,
> where each overload mapped to a single builtin.  These were:
>   __builtin_vec_scalar_test_neg_sp
>   __builtin_vec_scalar_test_neg_dp
>   __builtin_vec_scalar_test_neg_qp
> which are redundant with the "real" overload:
>   __builtin_vec_scalar_test_neg
> The latter maps to three builtins of the appropriate type.
>
> The revised test case uses the "real" overload instead, and otherwise the
> changes to the error messages are the same as for all the other cases.
>
> 2021-11-17  Bill Schmidt  
>
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Adjust error
>   message.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-8.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-2.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-3.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-test-neg-5.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-2.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-5.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-8.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-2.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-3.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/bfp/scalar-test-neg-5.c| 2 +-
>  8 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c 
> b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> index 922180675fc..53b67c95cf9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-2.c
> @@ -14,7 +14,7 @@ get_exponent (double *p)
>  {
>double source = *p;
>  
> -  return scalar_extract_exp (source);/* { dg-error 
> "'__builtin_vec_scalar_extract_exp' is not supported in this compiler 
> configuration" } */
> +  return scalar_extract_exp (source);/* { dg-error 
> "'__builtin_vsx_scalar_extract_exp' requires the" } */
>  }
>  
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c 
> b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> index e24d4bd23fe..39ee74c94dc 100644
> --- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-2.c
> @@ -12,5 +12,5 @@ get_significand (double *p)
>  {
>double source = *p;
>  
> -  return __builtin_vec_scalar_extract_sig (source); /* { dg-error 
> "'__builtin_vec_scalar_extract_sig' is not supported in this compiler 

Re: [PATCH v2] rs6000: Fix a handful of 32-bit built-in function problems

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

I'd like to ping this patch.  By the way, the diagnostics are improved [1] 
since I
sent it, so that we now inform the user that the overloaded function is 
implemented
by the instantiated function.

Thanks!
Bill

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585250.html

On 11/16/21 12:56 PM, Bill Schmidt wrote:
> Hi!  I previously posted [1] to correct some problems with the new builtins
> support targeting 32-bit code gen.  Based on the discussion, I've made some
> adjustments and would like to submit this for consideration.
>
> We eventually agreed that the strange behavior for -m32 -mpowerpc64 for 
> certain
> HTM builtins should be removed.  All of the registers TEXASR, TEXASRU, TFHAR,
> and TFIAR are now accessed using the unsigned long data type in all 
> configurations.
>
> Segher didn't like the change in the error message for the cmpb-3.c test case,
> but I think this should be fine.  The test case just tests for the error 
> message,
> but there is also a "note" message that provides additional information.  The
> diagnostics that the user sees will look like this:
>
> cmpb-3.c:11:3: error: '__builtin_p6_cmpb' requires the '-mcpu=power6' option 
> and either the '-m64' or '-mpowerpc64' option
> cmpb-3.c:11:3: note: builtin '__builtin_cmpb' requires builtin 
> '__builtin_p6_cmpb'
>
> So it's clear to the user that their use of __builtin_cmpb at line 11 
> triggered
> the error.
>
> Bootstrapped and tested on powerpc64le-linux-gnu, and on powerpc64-linux-gnu
> using -m32/-m64.  Is this okay for trunk?
>
> Thanks!
> Bill
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583905.html
>
>
> 2021-11-16  Bill Schmidt  
>
> gcc/
>   * config/rs6000/rs6000-builtin-new.def (CMPB): Flag as no32bit.
>   (BPERMD): Flag as 32bit (needing special handling for 32-bit).
>   (UNPACK_TD): Return unsigned long long instead of unsigned long.
>   (GET_TEXASR): Return unsigned long instead of unsigned long long.
>   (GET_TEXASRU): Likewise.
>   (GET_TFHAR): Likewise.
>   (GET_TFIAR): Likewise.
>   (SET_TEXASR): Pass unsigned long instead of unsigned long long.
>   (SET_TEXASRU): Likewise.
>   (SET_TFHAR): Likewise.
>   (SET_TFIAR): Likewise.
>   (TABORTDC): Likewise.
>   (TABORTDCI): Likewise.
>   * config/rs6000/rs6000-call.c (rs6000_expand_new_builtin): Fix error
>   handling for no32bit.  Add 32bit handling for RS6000_BIF_BPERMD.
>
> gcc/testsuite/
>   * gcc.target/powerpc/cmpb-3.c: Adjust error message.
> ---
>  gcc/config/rs6000/rs6000-builtin-new.def  | 30 +++
>  gcc/config/rs6000/rs6000-call.c   |  9 ---
>  gcc/testsuite/gcc.target/powerpc/cmpb-3.c |  2 +-
>  3 files changed, 22 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/rs6000/rs6000-builtin-new.def 
> b/gcc/config/rs6000/rs6000-builtin-new.def
> index 58dfce1ca37..30556e5c7f2 100644
> --- a/gcc/config/rs6000/rs6000-builtin-new.def
> +++ b/gcc/config/rs6000/rs6000-builtin-new.def
> @@ -273,7 +273,7 @@
>  ; Power6 builtins requiring 64-bit GPRs (even with 32-bit addressing).
>  [power6-64]
>const signed long __builtin_p6_cmpb (signed long, signed long);
> -CMPB cmpbdi3 {}
> +CMPB cmpbdi3 {no32bit}
>  
>  
>  ; AltiVec builtins.
> @@ -2018,7 +2018,7 @@
>  ADDG6S addg6s {}
>  
>const signed long __builtin_bpermd (signed long, signed long);
> -BPERMD bpermd_di {}
> +BPERMD bpermd_di {32bit}
>  
>const unsigned int __builtin_cbcdtd (unsigned int);
>  CBCDTD cbcdtd {}
> @@ -2971,7 +2971,7 @@
>void __builtin_set_fpscr_drn (const int[0,7]);
>  SET_FPSCR_DRN rs6000_set_fpscr_drn {}
>  
> -  const unsigned long __builtin_unpack_dec128 (_Decimal128, const int<1>);
> +  const unsigned long long __builtin_unpack_dec128 (_Decimal128, const 
> int<1>);
>  UNPACK_TD unpacktd {}
>  
>  
> @@ -3014,39 +3014,39 @@
>  
>  
>  [htm]
> -  unsigned long long __builtin_get_texasr ();
> +  unsigned long __builtin_get_texasr ();
>  GET_TEXASR nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_texasru ();
> +  unsigned long __builtin_get_texasru ();
>  GET_TEXASRU nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_tfhar ();
> +  unsigned long __builtin_get_tfhar ();
>  GET_TFHAR nothing {htm,htmspr}
>  
> -  unsigned long long __builtin_get_tfiar ();
> +  unsigned long __builtin_get_tfiar ();
>  GET_TFIAR nothing {htm,htmspr}
>  
> -  void __builtin_set_texasr (unsigned long long);
> +  void __builtin_set_texasr (unsigned long);
>  SET_TEXASR nothing {htm,htmspr}
>  
> -  void __builtin_set_texasru (unsigned long long);
> +  void __builtin_set_texasru (unsigned long);
>  SET_TEXASRU nothing {htm,htmspr}
>  
> -  void __builtin_set_tfhar (unsigned long long);
> +  void __builtin_set_tfhar (unsigned long);
>  SET_TFHAR nothing {htm,htmspr}
>  
> -  void __builtin_set_tfiar (unsigned long long);
> +  void __builtin_set_tfiar (unsigned long);
>  

Re: [PATCH] path solver: Minimize exported ranges to subsequent blocks.

2021-12-01 Thread Aldy Hernandez via Gcc-patches
I'm going to hold off pushing this until I hear from the global
maintainers and/or release managers.

This patch fixes a pathological performance case, but it may also be
handled by Andrew's fixes in this area.  It's up to y'all to decide if
it's something we want in this release.  An alternative would be to
keep track of any pathological issues going forward, and see if this
fixes the issue.  I'm going to be taking leave any day now :).

Aldy

On Sat, Nov 27, 2021 at 9:32 AM Aldy Hernandez  wrote:
>
> Currently we have a set of interesting names in the path that we solve
> for.  However, solving *all* of them on exit from each block is
> unnecessary, even if an edge range is available.  We can reduce the
> queried edges by only solving SSA names from the current block
> that will be needed in subsequent blocks.
>
> With this patch we avoid the explosion in PR103254, without the need for
> Andrew's patch (which is still independently needed).
>
> In my testbed we loose 6 threads in the non-resolving threading passes,
> but the overall thread count is the same, as we pick them up later.  We
> could probably try harder to get them, but it's probably not worth the
> effort.
>
> Benchmarking this patch shows no difference in overall compilation speed
> and a tiny slowdown (< 0.80%) for some of the threading passes.
> Benchmarked for both -O2, -O3, and -O2 --param ranger-logical-depth=10.
>
> I'm unsure what the review requirements are in stage3, so...
>
> OK for trunk?
>
> p.s. I saw some cleanups that could be done (intersecting more bitmaps,
> putting the rest of the bitmaps in the new obstack, etc etc), but I
> avoided doing them to minimize the churn in stage3.
>
> Regstrapped on x86-64 & ppc64le Linux.
>
> PR tree-optimization/103254
>
> gcc/ChangeLog:
>
> * gimple-range-path.cc (path_range_query::path_range_query):
> Initialize obstack.
> (path_range_query::~path_range_query): Release obstack.
> (path_range_query::needs_on_entry_for_block): New.
> (path_range_query::compute_needs_on_entry_set): New.
> (path_range_query::compute_ranges_in_block): Only calculate
> outgoing edges for m_needs_on_entry set.
> (path_range_query::compute_ranges): Call
> compute_needs_on_entry_set.
> * gimple-range-path.h: Add needs_on_entry_for_block,
> compute_needs_on_entry_set, m_bitmaps, and m_needs_on_entry.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr103254.c: Add -fdump-tree-vrp2-details.
> * gcc.dg/pr103254-2.c: New test.
> ---
>  gcc/gimple-range-path.cc  | 67 +--
>  gcc/gimple-range-path.h   |  4 ++
>  gcc/testsuite/gcc.dg/pr103254-2.c | 29 +
>  gcc/testsuite/gcc.dg/pr103254.c   |  6 ++-
>  4 files changed, 102 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr103254-2.c
>
> diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
> index b9c71226c1c..32226df05b5 100644
> --- a/gcc/gimple-range-path.cc
> +++ b/gcc/gimple-range-path.cc
> @@ -48,6 +48,7 @@ path_range_query::path_range_query (bool resolve, 
> gimple_ranger *ranger)
>  m_ranger = ranger;
>
>m_oracle = new path_oracle (m_ranger->oracle ());
> +  bitmap_obstack_initialize (&m_bitmaps);
>  }
>
>  path_range_query::~path_range_query ()
> @@ -57,6 +58,7 @@ path_range_query::~path_range_query ()
>  delete m_ranger;
>BITMAP_FREE (m_has_cache_entry);
>delete m_cache;
> +  bitmap_obstack_release (&m_bitmaps);
>  }
>
>  // Return TRUE if NAME is in the import bitmap.
> @@ -401,6 +403,62 @@ path_range_query::compute_ranges_in_phis (basic_block bb)
>  }
>  }
>
> +// Return the set of SSA names that would be useful to have resolved
> +// on entry to this block.  These are the names that will ultimately
> +// help resolve the final conditional on the path.
> +
> +void
> +path_range_query::needs_on_entry_for_block (bitmap on_entry, unsigned bbn)
> +{
> +  gori_compute &g = m_ranger->gori ();
> +  basic_block next = m_path[bbn];
> +  bitmap_copy (on_entry, g.imports (next));
> +
> +  // Any incoming PHI args are needed on entry, but are not in the
> +  // g.imports() above, so add them manually.
> +  for (auto iter = gsi_start_phis (next); !gsi_end_p (iter); gsi_next 
> (&iter))
> +{
> +  gphi *phi = iter.phi ();
> +  tree result = gimple_phi_result (phi);
> +
> +  if (bitmap_bit_p (m_imports, SSA_NAME_VERSION (result)))
> +   for (size_t i = 0; i < gimple_phi_num_args (phi); ++i)
> + {
> +   tree arg = gimple_phi_arg (phi, i)->def;
> +
> +   if (TREE_CODE (arg) == SSA_NAME
> +   && m_path.contains (gimple_phi_arg_edge (phi, i)->src))
> + bitmap_set_bit (on_entry, SSA_NAME_VERSION (arg));
> + }
> +}
> +  bitmap_and_into (on_entry, m_imports);
> +}
> +
> +// Compute the set of needed on-entry names for all path blocks.
> +
> +void
> +path_range_query::compute_needs_on_entr

Re: [PATCH] path solver: Use only one ssa_global_cache.

2021-12-01 Thread Aldy Hernandez via Gcc-patches
No one has said anything in 48 hours.  By the power vested in me, I
declare it approved.

Pushed.

On Mon, Nov 29, 2021 at 5:21 PM Aldy Hernandez  wrote:
>
> We're using a temporary range cache while computing ranges for PHIs to
> make sure the real cache doesn't get set until all PHIs are computed.
> With the ltrans beast in LTO mode this causes undue overhead.
>
> Since we already have a bitmap to indicate whether there's a cache
> entry, we can avoid the extra cache object by clearing it while PHIs
> are being calculated.
>
> Tested on x86-64 Linux.
>
> OK for trunk?
>
> gcc/ChangeLog:
>
> PR tree-optimization/103409
> * gimple-range-path.cc (path_range_query::compute_ranges_in_phis):
> Do all the work with just one ssa_global_cache.
> * gimple-range-path.h: Remove m_tmp_phi_cache.
> ---
>  gcc/gimple-range-path.cc | 23 +++
>  gcc/gimple-range-path.h  |  2 --
>  2 files changed, 11 insertions(+), 14 deletions(-)
>
> diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
> index b9c71226c1c..15ef72fd492 100644
> --- a/gcc/gimple-range-path.cc
> +++ b/gcc/gimple-range-path.cc
> @@ -375,30 +375,29 @@ void
>  path_range_query::compute_ranges_in_phis (basic_block bb)
>  {
>int_range_max r;
> -  gphi_iterator iter;
> +  auto_bitmap phi_set;
>
>// PHIs must be resolved simultaneously on entry to the block
>// because any dependencies must be satistifed with values on entry.
>// Thus, we calculate all PHIs first, and then update the cache at
>// the end.
>
> -  m_tmp_phi_cache.clear ();
> -  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter))
> +  for (auto iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter))
>  {
>gphi *phi = iter.phi ();
>tree name = gimple_phi_result (phi);
>
>if (import_p (name) && range_defined_in_block (r, name, bb))
> -   m_tmp_phi_cache.set_global_range (name, r);
> -}
> -  for (iter = gsi_start_phis (bb); !gsi_end_p (iter); gsi_next (&iter))
> -{
> -  gphi *phi = iter.phi ();
> -  tree name = gimple_phi_result (phi);
> -
> -  if (m_tmp_phi_cache.get_global_range (r, name))
> -   set_cache (r, name);
> +   {
> + unsigned v = SSA_NAME_VERSION (name);
> + set_cache (r, name);
> + bitmap_set_bit (phi_set, v);
> + // Pretend we don't have a cache entry for this name until
> + // we're done with all PHIs.
> + bitmap_clear_bit (m_has_cache_entry, v);
> +   }
>  }
> +  bitmap_ior_into (m_has_cache_entry, phi_set);
>  }
>
>  // Compute ranges defined in the current block, or exported to the
> diff --git a/gcc/gimple-range-path.h b/gcc/gimple-range-path.h
> index 57a9ae9bdcd..77c92c07bc1 100644
> --- a/gcc/gimple-range-path.h
> +++ b/gcc/gimple-range-path.h
> @@ -82,8 +82,6 @@ private:
>// Range cache for SSA names.
>ssa_global_cache *m_cache;
>
> -  ssa_global_cache m_tmp_phi_cache;
> -
>// Set for each SSA that has an active entry in the cache.
>bitmap m_has_cache_entry;
>
> --
> 2.31.1
>



Re: [PATCH, PR90030] Fortran OpenMP/OpenACC array mapping alignment fix

2021-12-01 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 04, 2021 at 04:23:43PM +0800, Chung-Lin Tang wrote:
> 2021-11-04  Chung-Lin Tang  
>   Thomas Schwinge 
> 
>   PR fortran/90030
> 
> gcc/fortran/ChangeLog:
> 
>   * trans-openmp.c (gfc_omp_finish_clause): Remove fold_convert to pointer
>   to char_type_node, add gcc_assert of POINTER_TYPE_P.
>   (gfc_trans_omp_array_section): Likewise.
>   (gfc_trans_omp_clauses): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/goacc/finalize-1.f: Adjust scan test.
>   * gfortran.dg/gomp/affinity-clause-1.f90: Likewise.
>   * gfortran.dg/gomp/affinity-clause-5.f90: Likewise.
>   * gfortran.dg/gomp/defaultmap-4.f90: Likewise.
>   * gfortran.dg/gomp/defaultmap-5.f90: Likewise.
>   * gfortran.dg/gomp/defaultmap-6.f90: Likewise.
>   * gfortran.dg/gomp/map-3.f90: Likewise.
>   * gfortran.dg/gomp/pr78260-2.f90: Likewise.
>   * gfortran.dg/gomp/pr78260-3.f90: Likewise.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.oacc-fortran/pr90030.f90: New test.
>   * testsuite/libgomp.fortran/pr90030.f90: New test.

Ok, thanks.

Jakub



Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Peter Bergner via Gcc-patches
On 12/1/21 9:06 AM, Qing Zhao wrote:
>> On Dec 1, 2021, at 3:01 AM, Richard Biener  
>> wrote:
>> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
>> instead of fixing up things at expansion time.
> 
> Agreed. 
> 
> So, Peter, will you update the routine “is_var_need_auto_init” in gimplify.c
> to exclude OPAQUE_TYPE to fix this issue?

Sure, I can give that a try and verify it fixes the ICE too.

Peter




Re: [PATCH] Allow loop crossing paths in back threader copier.

2021-12-01 Thread Aldy Hernandez via Gcc-patches
On Wed, Dec 1, 2021 at 2:36 PM Richard Biener
 wrote:
>
> On Tue, Nov 30, 2021 at 4:48 PM Aldy Hernandez via Gcc-patches
>  wrote:
> >
> > We are currently restricting loop crossing paths in the generic copier
> > used by the back threader, but we should be able to handle them after
> > loop_done has completed.
>
> But this isn't a cost thing but a correctness (as in, can we update
> loops properly)
> thing.  We are passing 'loop' down to copy_bbs and very much rely on
> seeing either BBs of that loop or copying complete subloops of it.
>
> You will likely see excess loop fixup caused by this or ICEs in case the
> caller of this function will not always set LOOPS_NEED_FIXUP
> (not that copy_bbs handling of loops is perfect).

Ughh, fair enough.

In which case at least I've commented in the PR what the problem is
here.  Perhaps we could benefit from a BB copier that could handle
loops, which is way above my circle of competence ;-).

Aldy



Re: [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html

2021-12-01 Thread Gerald Pfeifer
Hi Eric,

On Wed, 24 Nov 2021, Eric Gallager wrote:
> This next patch does more than just removing old stuff: it adds an
> extra sentence to describe a shell command used to generate a list, so
> to verify that I've got the shell command right, I'm asking for a
> review.

-There are several other files in this size range, which I have left
+There are several other files in this size range, which are left
 out because touching them at all is unwise (reload, the Fortran front
 end).  You can try, but I am not responsible for any damage to your
 sanity which may result.

while we are touching this, how about toning the second half of this
paragraph down and moving away from first person?

How about something like "...breaking them up likely may prove rather
difficult" (or similar, happy to leave this to you as native spaker)?


+Note that the list of large files in this section is generated with the
+following shell command, run from the gcc subdirectory:
+
+
+   du -sh *.{c,h,cc} */*.{c,h,cc} | sort -hr | grep -v fortran | head -n 14
+

This looks like it does what you want it to do.

(Pulling up the database courses I have given in a previous life, I'd pull 
in the grep before the sort - O(n) filtering before O(n·log n) processing
- but even on my notebook both are instantenous. ;-) )


Please consider the two suggestions above and commit the result; just
share the final patch here (no review required).

Thanks,
Gerald


Re: [PATCH] dwarf: Multi-register CFI address support.

2021-12-01 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 11, 2021 at 06:12:50PM +, Hafiz Abid Qadeer wrote:
>   * dwarf2cfi.c (dw_stack_pointer_regnum): Change type to struct cfa_reg.
>   (dw_frame_pointer_regnum): Likewise.
>   (new_cfi_row): Use set_by_dwreg.
>   (get_cfa_from_loc_descr): Use set_by_dwreg.  Support register spans.
>   handle DW_OP_bregx with DW_OP_breg{0-31}. Support DW_OP_lit*,

2 spaces instead of 1 before Support above.

> +   cfa->reg.set_by_dwreg ((op == DW_OP_bregx)
> + ? (ptr->dw_loc_oprnd1.v.val_int) : (op - DW_OP_breg0));

Formatting.  All those 3 () inner pairs are unnecessary, and it would be
nicer to use a temporary, like:
  unsigned regno
= (op == DW_OP_bregx
   ? ptr->dw_loc_oprnd1.v.val_int : op - DW_OP_breg0);
  cfa->reg.set_by_dwreg (regno);

> +   unsigned int regno = (op == DW_OP_bregx)
> + ? (ptr->dw_loc_oprnd1.v.val_int) : (op - DW_OP_breg0);

With the ()s similarly, and also ? should be below op.

> +   gcc_assert (regno == (cfa->reg.reg - 1));

Again, the inner () pair is unnecessary.
> +   cfa->reg.span_width = (cfa->offset.to_constant () / 8);

And here the outer () pair.

> +
> +#if CHECKING_P

Please use if (CHECKING_P) instead of #if.

> +  /* Ensure that the above assumption is accurate.  */
> +  for (unsigned int i = 0; i < result.span; i++)
> + {
> +   gcc_assert (known_eq (GET_MODE_SIZE (GET_MODE (XVECEXP (span,
> +   0, i))),
> + result.span_width));
> +   gcc_assert (REG_P (XVECEXP (span, 0, i)));
> +   gcc_assert (dwf_regno (XVECEXP (span, 0, i)) == result.reg + i);
> + }
> +#endif
> +}
> +
> +  return result;

> @@ -3135,7 +3267,8 @@ static unsigned int
>  execute_dwarf2_frame (void)
>  {
>/* Different HARD_FRAME_POINTER_REGNUM might coexist in the same file.  */
> -  dw_frame_pointer_regnum = DWARF_FRAME_REGNUM (HARD_FRAME_POINTER_REGNUM);
> +  dw_frame_pointer_regnum = dwf_cfa_reg (gen_rtx_REG
> + (Pmode, HARD_FRAME_POINTER_REGNUM));

If at all possible, avoid function name on one line and ( with first
argument on another one.  In the above case it can be easily avoided by
  dw_frame_pointer_regnum
= dwf_cfa_reg (gen_rtx_REG (Pmode, HARD_FRAME_POINTER_REGNUM));

> +static void
> +build_breg_loc (struct dw_loc_descr_node **head, unsigned int regno)
> +{
> +  if (regno <= 31)
> +add_loc_descr (head, new_loc_descr ((enum dwarf_location_atom)
> +   (DW_OP_breg0 + regno),  0, 0));

Bad formatting, (DW_OP_breg0 should be below (enum
> +
> +  /* deal with the remaining registers in the span.  */

Capital letter on Deal

> +  for (int i = (reg.span - 2); i >= 0; i--)

Please remove the redundant inner () pair.
> --- a/gcc/dwarf2out.h
> +++ b/gcc/dwarf2out.h
> @@ -119,6 +119,38 @@ struct GTY(()) dw_fde_node {
>  };
>  
>  
> +/* This represents a register, in DWARF_FRAME_REGNUM space, for use in CFA
> +   definitions and expressions.
> +   Most architectures only need a single register number, but some (amdgcn)
> +   have pointers that span multiple registers.  DWARF permits arbitrary
> +   register sets but existing use-cases only require contiguous register
> +   sets, as represented here.  */
> +struct GTY(()) cfa_reg {
> +  unsigned int reg;
> +  unsigned short span;
> +  unsigned short span_width;  /* A.K.A. register mode size.  */
> +
> +  cfa_reg& set_by_dwreg (unsigned int r)
> +{
> +  reg = r;
> +  span = 1;
> +  span_width = 0;  /* Unknown size (permitted when span == 1).  */
> +  return *this;
> +}
> +
> +  bool operator== (const cfa_reg other) const

The normal C++ way would be (const cfa_reg &other), wouldn't it?
Or otherwise the const keyword doesn't make much sense.

> +{
> +  return (reg == other.reg && span == other.span
> +   && (span_width == other.span_width
> +   || (span == 1
> +   && (span_width == 0 || other.span_width == 0;
> +}
> +  bool operator!= (const cfa_reg other) const

Likewise.
Please add an empty line above operator!=

> +{
> +  return !(*this == other);
> +}
> +};
> +

Otherwise LGTM.

Jakub



[PATCH] rs6000: Mirror fix for PR102347 into the new builtins support

2021-12-01 Thread Bill Schmidt via Gcc-patches
Hi!

Recently Kewen fixed a problem in the old builtins support where
rs6000_builtin_decl prematurely indicated that a target builtin is
unavailable.  This also needs to be done for the new builtins support, but in
this case we have to ensure the error message is still produced from the
overload support in rs6000-c.c.  Unfortunately, this is less straightforward
than it could be, because header file includes need to be adjusted to make this
happen.  Someday we'll consolidate all the builtin code in one file and this
won't have to be so ugly.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is this
okay for trunk?

Thanks!
Bill


2021-12-01  Bill Schmidt  

gcc/
PR target/102347
* config/rs6000/rs6000-c.c (rs6000-builtins.h): Stop including.
(rs6000-internal.h): Include.
(altivec_resolve_new_overloaded_builtin): Move call to
rs6000_invalid_new_builtin here from rs6000_new_builtin_decl.
* config/rs6000/rs6000-call.c (rs6000-builtins.h): Stop including.
(rs6000_invalid_new_builtin): Remove static qualifier.
(rs6000_new_builtin_decl): Remove test for supported builtin.
* config/rs6000/rs6000-internal.h (rs6000-builtins.h): Include.
(rs6000_invalid_new_builtin): Declare.
* config/rs6000/rs6000.c (rs6000-builtins.h): Don't include.
---
 gcc/config/rs6000/rs6000-c.c| 11 +++
 gcc/config/rs6000/rs6000-call.c |  9 +
 gcc/config/rs6000/rs6000-internal.h |  3 +++
 gcc/config/rs6000/rs6000.c  |  1 -
 4 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 5eeac9d4c06..8e83d97e72f 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -35,7 +35,7 @@
 #include "langhooks.h"
 #include "c/c-tree.h"
 
-#include "rs6000-builtins.h"
+#include "rs6000-internal.h"
 
 static tree altivec_resolve_new_overloaded_builtin (location_t, tree, void *);
 
@@ -2987,11 +2987,14 @@ altivec_resolve_new_overloaded_builtin (location_t loc, 
tree fndecl,
const char *name = rs6000_overload_info[adj_fcode].ovld_name;
if (!supported)
  {
+   /* Indicate that the instantiation of the overloaded builtin
+  name is not available with the target flags in effect.  */
+   rs6000_gen_builtins fcode = (rs6000_gen_builtins) instance->bifid;
+   rs6000_invalid_new_builtin (fcode);
+   /* Provide clarity of the relationship between the overload
+  and the instantiation.  */
const char *internal_name
  = rs6000_builtin_info_x[instance->bifid].bifname;
-   /* An error message making reference to the name of the
-  non-overloaded function has already been issued.  Add
-  clarification of the previous message.  */
rich_location richloc (line_table, input_location);
inform (&richloc,
"overloaded builtin %qs is implemented by builtin %qs",
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index cd477fa4876..01688c4169d 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -69,7 +69,6 @@
 #include "opts.h"
 
 #include "rs6000-internal.h"
-#include "rs6000-builtins.h"
 
 #if TARGET_MACHO
 #include "gstab.h"  /* for N_SLINE */
@@ -11905,7 +11904,7 @@ rs6000_invalid_builtin (enum rs6000_builtins fncode)
 /* Raise an error message for a builtin function that is called without the
appropriate target options being set.  */
 
-static void
+void
 rs6000_invalid_new_builtin (enum rs6000_gen_builtins fncode)
 {
   size_t j = (size_t) fncode;
@@ -16624,12 +16623,6 @@ rs6000_new_builtin_decl (unsigned code, bool /* 
initialize_p */)
   if (fcode >= RS6000_OVLD_MAX)
 return error_mark_node;
 
-  if (!rs6000_new_builtin_is_supported (fcode))
-{
-  rs6000_invalid_new_builtin (fcode);
-  return error_mark_node;
-}
-
   return rs6000_builtin_decls_x[code];
 }
 
diff --git a/gcc/config/rs6000/rs6000-internal.h 
b/gcc/config/rs6000/rs6000-internal.h
index 88cf9bd5692..a880fd37618 100644
--- a/gcc/config/rs6000/rs6000-internal.h
+++ b/gcc/config/rs6000/rs6000-internal.h
@@ -22,6 +22,8 @@
 #ifndef GCC_RS6000_INTERNAL_H
 #define GCC_RS6000_INTERNAL_H
 
+#include "rs6000-builtins.h"
+
 /* Structure used to define the rs6000 stack */
 typedef struct rs6000_stack {
   int reload_completed;/* stack info won't change from here on 
*/
@@ -140,6 +142,7 @@ extern void rs6000_output_mi_thunk (FILE *file,
 extern bool rs6000_output_addr_const_extra (FILE *file, rtx x);
 extern bool rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi);
 extern tree rs6000_build_builtin_va_list (void);
+extern void rs6000_invalid_new_builtin (rs6000_gen_builtins fncode);
 extern void rs6000_va_start (tree valist, rtx nextarg);
 extern tree rs6000_gimplify_va_arg (tree valist, tree type, gimple_

[PATCH] c++: Fix for decltype(auto) and parenthesized expr [PR103403]

2021-12-01 Thread Marek Polacek via Gcc-patches
In r11-4758, I tried to fix this problem:

  int &&i = 0;
  decltype(auto) j = i; // should behave like int &&j = i; error

wherein do_auto_deduction was getting confused with a REFERENCE_REF_P
and it didn't realize its operand was a name, not an expression, and
deduced the wrong type.

Unfortunately that fix broke this:

  int&& r = 1;
  decltype(auto) rr = (r);

where 'rr' should be 'int &' since '(r)' is an expression, not a name.  But
because I stripped the INDIRECT_REF with the r11-4758 change, we deduced
'rr's type as if decltype had gotten a name, resulting in 'int &&'.

I suspect I thought that the REF_PARENTHESIZED_P check when setting
'bool id' in do_auto_deduction would handle the (r) case, but that's not
the case; while the documentation for REF_PARENTHESIZED_P specifically says
it can be set in INDIRECT_REF, we don't actually do so.

This patch sets REF_PARENTHESIZED_P even on REFERENCE_REF_P, so that
do_auto_deduction can use it.

It also removes code in maybe_undo_parenthesized_ref that I think is
dead -- and we don't hit it while running dg.exp.  To adduce more data,
it also looks dead here:
https://splichal.eu/lcov/gcc/cp/semantics.c.gcov.html

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk and 11?

PR c++/103403

gcc/cp/ChangeLog:

* cp-gimplify.c (cp_fold): Don't recurse if maybe_undo_parenthesized_ref
doesn't change its argument.
* pt.c (do_auto_deduction): Don't strip REFERENCE_REF_P trees.  Also
REF_PARENTHESIZED_P for REFERENCE_REF_P.
* semantics.c (force_paren_expr): Set REF_PARENTHESIZED_P on
REFERENCE_REF_P trees too.
(maybe_undo_parenthesized_ref): Remove dead code.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/decltype-auto2.C: New test.
* g++.dg/cpp1y/decltype-auto3.C: New test.
---
 gcc/cp/cp-gimplify.c|  3 ++-
 gcc/cp/pt.c |  5 ++---
 gcc/cp/semantics.c  | 18 --
 gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C | 12 
 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C | 12 
 5 files changed, 32 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto3.C

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 0a002db14e7..e3ede02a48e 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -2421,7 +2421,8 @@ cp_fold (tree x)
   if (REF_PARENTHESIZED_P (x))
{
  tree p = maybe_undo_parenthesized_ref (x);
- return cp_fold (p);
+ if (p != x)
+   return cp_fold (p);
}
   goto unary;
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f4b9d9673fb..c5b41b57028 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29889,11 +29889,10 @@ do_auto_deduction (tree type, tree init, tree 
auto_node,
   else if (AUTO_IS_DECLTYPE (auto_node))
 {
   tree stripped_init = tree_strip_any_location_wrapper (init);
-  if (REFERENCE_REF_P (stripped_init))
-   stripped_init = TREE_OPERAND (stripped_init, 0);
   bool id = (DECL_P (stripped_init)
 || ((TREE_CODE (init) == COMPONENT_REF
- || TREE_CODE (init) == SCOPE_REF)
+ || TREE_CODE (init) == SCOPE_REF
+ || REFERENCE_REF_P (init))
 && !REF_PARENTHESIZED_P (init)));
   tree deduced = finish_decltype_type (init, id, complain);
   deduced = canonicalize_type_argument (deduced, complain);
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index cd1956497f8..edba4b60e10 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -2049,7 +2049,8 @@ force_paren_expr (tree expr, bool even_uneval)
 return expr;
 
   if (TREE_CODE (expr) == COMPONENT_REF
-  || TREE_CODE (expr) == SCOPE_REF)
+  || TREE_CODE (expr) == SCOPE_REF
+  || REFERENCE_REF_P (expr))
 REF_PARENTHESIZED_P (expr) = true;
   else if (DECL_P (tree_strip_any_location_wrapper (expr)))
 {
@@ -2072,19 +2073,8 @@ maybe_undo_parenthesized_ref (tree t)
   if (cxx_dialect < cxx14)
 return t;
 
-  if (INDIRECT_REF_P (t) && REF_PARENTHESIZED_P (t))
-{
-  t = TREE_OPERAND (t, 0);
-  while (TREE_CODE (t) == NON_LVALUE_EXPR
-|| TREE_CODE (t) == NOP_EXPR)
-   t = TREE_OPERAND (t, 0);
-
-  gcc_assert (TREE_CODE (t) == ADDR_EXPR
- || TREE_CODE (t) == STATIC_CAST_EXPR);
-  t = TREE_OPERAND (t, 0);
-}
-  else if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT_EXPR)
-&& REF_PARENTHESIZED_P (t))
+  if ((TREE_CODE (t) == PAREN_EXPR || TREE_CODE (t) == VIEW_CONVERT_EXPR)
+  && REF_PARENTHESIZED_P (t))
 t = TREE_OPERAND (t, 0);
 
   return t;
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C 
b/gcc/testsuite/g++.dg/cpp1y/decltype-auto2.C
new file mode 100644
index 000..56e011e36f4
--- /dev/null
+++ b/gcc/testsuit

[PATCH] c++: ICE with unnamed tparm and concept [PR103408]

2021-12-01 Thread Marek Polacek via Gcc-patches
Here we crash when issuing the "constraint C has type T, not bool"
error, because pp_cxx_parameter_mapping wasn't prepared to see an
anonymous template parameter.  With this patch we print

error: constraint 'auto() [with  = 0]' has type '', 
not 'bool'

The "" is what this patch adds.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/103408

gcc/cp/ChangeLog:

* cxx-pretty-print.c (pp_cxx_parameter_mapping): Print ""
rather than crash on an unnamed template parameter.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/concepts-err1.C: New test.
---
 gcc/cp/cxx-pretty-print.c  | 4 +++-
 gcc/testsuite/g++.dg/cpp23/concepts-err1.C | 6 ++
 2 files changed, 9 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/concepts-err1.C

diff --git a/gcc/cp/cxx-pretty-print.c b/gcc/cp/cxx-pretty-print.c
index 25cabfee39f..3ea357deb80 100644
--- a/gcc/cp/cxx-pretty-print.c
+++ b/gcc/cp/cxx-pretty-print.c
@@ -2891,8 +2891,10 @@ pp_cxx_parameter_mapping (cxx_pretty_printer *pp, tree 
map)
 
   if (TYPE_P (parm))
pp->type_id (parm);
+  else if (tree name = DECL_NAME (TEMPLATE_PARM_DECL (parm)))
+   pp_cxx_tree_identifier (pp, name);
   else
-   pp_cxx_tree_identifier (pp, DECL_NAME (TEMPLATE_PARM_DECL (parm)));
+   pp->translate_string ("");
 
   pp_cxx_whitespace (pp);
   pp_equal (pp);
diff --git a/gcc/testsuite/g++.dg/cpp23/concepts-err1.C 
b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C
new file mode 100644
index 000..e5bdc542bad
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/concepts-err1.C
@@ -0,0 +1,6 @@
+// PR c++/103408
+// { dg-do compile { target c++23 } }
+
+template
+concept C = auto([]{}); // { dg-error "constraint" }
+static_assert(C<0>); // { dg-error "non-constant condition for static 
assertion" }

base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca
-- 
2.33.1



[PATCH] c++: ICE with auto(0) in requires-expression [PR103408]

2021-12-01 Thread Marek Polacek via Gcc-patches
Here we ICE on

  int f() requires (auto(0));

in do_auto_deduction when handling the auto: we're in a non-templated
requires-expression which are parsed under processing_template_decl == 1
and empty current_template_parms, so 'current_template_args ()' will
crash.  This code is invalid as per "C++20 CA378: Remove non-templated
constrained functions", but of course we shouldn't crash.

Since in the scenario above it's expected that current_template_parms is
null, I've just added a check, and let grokfndecl issue an error.
I guess another approch would be to fake up a template parameter list
before calling do_auto_deduction.

For good measure, I've added several well-formed cases with auto(x) in
a requires-expression.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/103408

gcc/cp/ChangeLog:

* pt.c (do_auto_deduction): Check current_template_parms before
current_template_args ().

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast9.C: New test.
---
 gcc/cp/pt.c   |  2 +-
 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C | 27 +++
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast9.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f4b9d9673fb..012ca5d06c0 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -30041,7 +30041,7 @@ do_auto_deduction (tree type, tree init, tree auto_node,
(but we still may have used them for constraint checking above).  */;
   else if (context == adc_unify)
 targs = add_to_template_args (outer_targs, targs);
-  else if (processing_template_decl)
+  else if (processing_template_decl && current_template_parms)
 targs = add_to_template_args (current_template_args (), targs);
   return tsubst (type, targs, complain, NULL_TREE);
 }
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
new file mode 100644
index 000..45a0ff9b460
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast9.C
@@ -0,0 +1,27 @@
+// PR c++/103408
+// { dg-do compile { target c++23 } }
+
+int bad1() requires (auto(true)); // { dg-error "constraints on a 
non-templated function" }
+
+template
+int f1() requires (auto(B));
+
+template
+struct S { T t; constexpr operator bool() { return true; } };
+
+int bad2() requires (bool(S{1})); // { dg-error "constraints on a 
non-templated function" }
+int bad3() requires (bool(S(1))); // { dg-error "constraints on a 
non-templated function" }
+
+template
+int f2() requires (bool(S{N}));
+
+template
+int f3() requires (bool(S(N)));
+
+void
+g ()
+{
+  f1();
+  f2<42>();
+  f3<42>();
+}

base-commit: 6b8ecbc6d6652d061d7c72c64352d51eca2df6ca
-- 
2.33.1



[PATCH] c++: Handle auto(x) in parameter-declaration-clause [PR103401]

2021-12-01 Thread Marek Polacek via Gcc-patches
In C++23, auto(x) is valid, so decltype(auto(x)) should also be valid,
so

  void f(decltype(auto(0)));

should be just as

  void f(int);

but currently, everytime we see 'auto' in a parameter-declaration-clause,
we try to synthesize_implicit_template_parm for it, creating a new template
parameter list.  The code above actually has us calling s_i_t_p twice;
once from cp_parser_decltype_expr -> cp_parser_postfix_expression which
fails and then again from cp_parser_decltype_expr -> cp_parser_expression.
So it looks like we have f and we accept ill-formed code.

So we need to be more careful about synthesizing the implicit template
parameter.  cp_parser_postfix_expression looked like a sensible place.

The r11-1913 change is OK: we need to make sure that we see '(auto)' after
decltype to go ahead with 'decltype(auto)'.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/103401

gcc/cp/ChangeLog:

* parser.c (cp_parser_postfix_expression): Set
auto_is_implicit_function_template_parm_p when parsing a postfix
expression.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-fncast7.C: New test.
* g++.dg/cpp23/auto-fncast8.C: New test.
---
 gcc/cp/parser.c   |  2 ++
 gcc/testsuite/g++.dg/cpp23/auto-fncast7.C |  9 
 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C | 28 +++
 3 files changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-fncast8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 55e6a1a8b3a..c43b180f888 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7508,6 +7508,8 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
   looking at a functional cast.  We could also be looking at
   an id-expression.  So, we try the functional cast, and if
   that doesn't work we fall back to the primary-expression.  */
+   auto cleanup = make_temp_override
+ (parser->auto_is_implicit_function_template_parm_p, false);
cp_parser_parse_tentatively (parser);
/* Look for the simple-type-specifier.  */
 ++parser->prevent_constrained_type_specifiers;
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
new file mode 100644
index 000..763164f3e5b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast7.C
@@ -0,0 +1,9 @@
+// PR c++/103401
+// { dg-do compile { target c++23 } }
+
+void f(decltype(auto(0))) { }
+
+int main()
+{
+  f(0); // { dg-error "no matching function" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C 
b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C
new file mode 100644
index 000..760827a5d6e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-fncast8.C
@@ -0,0 +1,28 @@
+// PR c++/103401
+// { dg-do compile { target c++23 } }
+
+void f1 (decltype(auto(0)));
+void f2 (decltype(auto{0}));
+void f3 (int = auto(42));
+void f4 (int = auto{42});
+void f5 (decltype(auto(0)) = auto(42));
+void f6 (auto (x));
+void f7 (int[auto(10)]);
+void f8 (int[auto{10}]);
+
+void
+g ()
+{
+  f1 (1);
+  f2 (1);
+  f3 ();
+  f3 (1);
+  f4 ();
+  f4 (1);
+  f5 ();
+  f5 (1);
+  f6 ('a');
+  int a[10];
+  f7 (&a[0]);
+  f8 (&a[0]);
+}

base-commit: e5440bc08e07fd491dcccd47e1b86a5985ee117c
-- 
2.33.1



Patch ping

2021-12-01 Thread Jakub Jelinek via Gcc-patches
Hi!

I'd like to ping following patches:

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583289.html
  c++, dyninit: Optimize C++ dynamic initialization by constants into 
DECL_INITIAL adjustment [PR102876]

  While Jason has added -fimplicit-constexpr which can fix up some cases,
  it still doesn't handle non-inline functions, or functions defined after
  the dynamic initialization, or functions defined in other TUs, so I think
  trying to optimize those after IPA is useful and as has been said, other
  compilers do it late as well.

https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584802.html
  pch, v2: Add support for PCH for relocatable executables

  I do support Iain's patches to make PCH configure time disableable,
  but if people do want it and can't use it because of PIEs, it seems
  a pitty not to support it when it is so easy.

Thanks

Jakub



PING [PATCH][wwwdocs] Update section on enormous source files in htdocs/projects/beginner.html

2021-12-01 Thread Eric Gallager via Gcc-patches
Hi, I'd like to ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585294.html

On Wed, Nov 24, 2021 at 2:11 AM Eric Gallager  wrote:
>
> On Tue, Nov 23, 2021 at 6:27 PM Eric Gallager  wrote:
> >
> > On Fri, Nov 19, 2021 at 8:14 AM Eric Gallager  wrote:
> > >
> > > On Fri, Nov 19, 2021 at 1:48 AM Gerald Pfeifer  wrote:
> > > >
> > > > Cool, thank you!
> > > >
> > > > Please feel free to commit patches like this without asking for
> > > > approval (though I'm happy to review and approve).
> > > >
> > > > Gerald
> > > >
> > >
> > > OK thanks; committed as dbaebcd
> >
> > I've also committed one to remove the section on traditional C now:
> > https://gcc.gnu.org/git/?p=gcc-wwwdocs.git;a=commitdiff;h=ca83c13ad6bf0d351220dafa36264ebc7a6b7816
>
> This next patch does more than just removing old stuff: it adds an
> extra sentence to describe a shell command used to generate a list, so
> to verify that I've got the shell command right, I'm asking for a
> review.
>
> Thanks,
> Eric Gallager


[committed] libstdc++: Clear RB tree after moving elements [PR103501]

2021-12-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


If the allocator-extended move constructor move-constructs each element
into the new container, the contents of the old container are left in
moved-from states. We cannot know if those states preserve the
container's ordering and uniqueness guarantees, so just erase all
moved-from elements.

libstdc++-v3/ChangeLog:

PR libstdc++/103501
* include/bits/stl_tree.h (_Rb_tree(_Rb_tree&&, false_type)):
Clear container if elements have been moved-from.
* testsuite/23_containers/map/allocator/move_cons.cc: Expect
moved-from container to be empty.
* testsuite/23_containers/multimap/allocator/move_cons.cc:
Likewise.
* testsuite/23_containers/multiset/allocator/103501.cc: New test.
* testsuite/23_containers/set/allocator/103501.cc: New test.
---
 libstdc++-v3/include/bits/stl_tree.h  |  6 ++--
 .../23_containers/map/allocator/move_cons.cc  |  2 +-
 .../multimap/allocator/move_cons.cc   |  2 +-
 .../multiset/allocator/103501.cc  | 32 +++
 .../23_containers/set/allocator/103501.cc | 32 +++
 5 files changed, 70 insertions(+), 4 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc

diff --git a/libstdc++-v3/include/bits/stl_tree.h 
b/libstdc++-v3/include/bits/stl_tree.h
index 0692525be57..55b8c9c7cb2 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -1644,9 +1644,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
_M_move_data(__x, true_type());
   else
{
+ constexpr bool __move = !__move_if_noexcept_cond::value;
  _Alloc_node __an(*this);
- _M_root() =
-   _M_copy::value>(__x, __an);
+ _M_root() = _M_copy<__move>(__x, __an);
+ if _GLIBCXX17_CONSTEXPR (__move)
+   __x.clear();
}
 }
 
diff --git a/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc 
b/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc
index 5beb26276b0..b82d3532135 100644
--- a/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/allocator/move_cons.cc
@@ -41,7 +41,7 @@ void test01()
   VERIFY(1 == v1.get_allocator().get_personality());
   VERIFY(2 == v2.get_allocator().get_personality());
 
-  VERIFY( v1[1].empty() );
+  VERIFY( v1.empty() );
   VERIFY( v2[1] == str );
 }
 
diff --git 
a/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc 
b/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc
index 91f2d0be508..37db0f005d1 100644
--- a/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc
+++ b/libstdc++-v3/testsuite/23_containers/multimap/allocator/move_cons.cc
@@ -41,7 +41,7 @@ void test01()
   VERIFY(1 == v1.get_allocator().get_personality());
   VERIFY(2 == v2.get_allocator().get_personality());
 
-  VERIFY( v1.begin()->second.empty() );
+  VERIFY( v1.empty() );
   VERIFY( v2.begin()->second == str );
 }
 
diff --git a/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc 
b/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc
new file mode 100644
index 000..24f657eceba
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/multiset/allocator/103501.cc
@@ -0,0 +1,32 @@
+// { dg-do run { target c++11 } }
+
+// PR libstdc++/103501
+
+#include 
+#include 
+#include 
+
+struct Y
+{
+  int i;
+
+  Y(int i) : i(i) { }
+  Y(const Y& y) noexcept : i(y.i) { }
+  Y(Y&& y) noexcept : i(y.i) { y.i = -y.i; }
+
+  bool operator<(const Y& rhs) const { return i < rhs.i; }
+};
+
+int main()
+{
+  using Alloc = __gnu_test::uneq_allocator;
+  std::multiset, Alloc> s1{ {1, 2, 3}, Alloc(1)};
+  std::multiset, Alloc> s2{ std::move(s1), Alloc(2) };
+  const Y* prev = nullptr;
+  for (const Y& y : s1)
+  {
+if (prev)
+  VERIFY( !(y < *prev) );
+prev = &y;
+  }
+}
diff --git a/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc 
b/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc
new file mode 100644
index 000..7267cf9663f
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/set/allocator/103501.cc
@@ -0,0 +1,32 @@
+// { dg-do run { target c++11 } }
+
+// PR libstdc++/103501
+
+#include 
+#include 
+#include 
+
+struct X
+{
+  int i;
+
+  X(int i) : i(i) { }
+  X(const X& x) noexcept : i(x.i) { }
+  X(X&& x) noexcept : i(x.i) { x.i = -1; }
+
+  bool operator<(const X& rhs) const { return i < rhs.i; }
+};
+
+int main()
+{
+  using Alloc = __gnu_test::uneq_allocator;
+  std::set, Alloc> s1{ {1, 2, 3}, Alloc(1)};
+  std::set, Alloc> s2{ std::move(s1), Alloc(2) };
+  const X* prev = nullptr;
+  for (const X& x : s1)
+  {
+if (prev)
+  VERIFY( *prev < x );
+prev = &x;
+  }
+}
-- 
2.31.1



[committed] libstdc++: Define std::__is_constant_evaluated() for internal use

2021-12-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


This adds std::__is_constant_evaluated() as a C++11 wrapper for
__builtin_is_constant_evaluated, but just returning false if the
built-in isn't supported by the compiler. This allows us to use it
throughout the library without checking __has_builtin every time.

Some uses in std::vector and std::string can only be constexpr when the
std::is_constant_evaluated() function actually works, so we might as
well guard them with a relevant macro and call that function directly,
rather than the built-in or std::__is_constant_evaluated().

The remaining checks of the __cpp_lib_is_constant_evaluated macro could
now be replaced by checking __cplusplus >= 202002 instead, but there's
no practical difference. We still need some kind of preprocessor check
there anyway.

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (PREDEFINED): Change macro name.
* include/bits/allocator.h (allocate, deallocate): Use
std::__is_constant_evaluated() unconditionally, instead of
checking whether std::is_constant_evaluated() (or the built-in)
can be used.
* include/bits/basic_string.h: Check new macro. call
std::is_constant_evaluated() directly in C++20-only code that is
guarded by a suitable macro.
* include/bits/basic_string.tcc: Likewise.
* include/bits/c++config (__is_constant_evaluated): Define.
(_GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED): Replace with ...
(_GLIBCXX_HAVE_IS_CONSTANT_EVALUATED): New macro.
* include/bits/char_traits.h (char_traits): Replace conditional
calls to std::is_constant_evaluated with unconditional calls to
std::__is_constant_evaluated.
* include/bits/cow_string.h: Use new macro.
* include/bits/ranges_algobase.h (__copy_or_move): Replace
conditional calls to std::is_constant_evaluated with unconditional
calls to std::__is_constant_evaluated.
(__copy_or_move_backward, __fill_n_fn): Likewise.
* include/bits/ranges_cmp.h (ranges::less): Likewise.
* include/bits/stl_algobase.h (lexicographical_compare_three_way):
Likewise.
* include/bits/stl_bvector.h: Call std::is_constant_evaluated
directly in C++20-only code that is guarded by a suitable macro.
* include/bits/stl_construct.h (_Construct, _Destroy, _Destroy_n):
Replace is_constant_evaluated with __is_constant_evaluated.
* include/bits/stl_function.h (greater, less, greater_equal)
(less_equal): Replace __builtin_is_constant_evaluated and
__builtin_constant_p with __is_constant_evaluated.
* include/bits/stl_vector.h: Call std::is_constant_evaluated()
in C++20-only code.
* include/debug/helper_functions.h (__check_singular): Use
__is_constant_evaluated instead of built-in, or remove check
entirely.
* include/std/array (operator<=>): Use __is_constant_evaluated
unconditionally.
* include/std/bit (__bit_ceil): Likewise.
* include/std/type_traits (is_constant_evaluated): Define using
'if consteval' if possible.
* include/std/version: Use new macro.
* libsupc++/compare: Use __is_constant_evaluated instead of
__builtin_is_constant_evaluated.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error lines.
---
 libstdc++-v3/doc/doxygen/user.cfg.in  |   2 +-
 libstdc++-v3/include/bits/allocator.h |   8 +-
 libstdc++-v3/include/bits/basic_string.h  |   6 +-
 libstdc++-v3/include/bits/basic_string.tcc|   4 +-
 libstdc++-v3/include/bits/c++config   |  30 +++-
 libstdc++-v3/include/bits/char_traits.h   | 129 +-
 libstdc++-v3/include/bits/cow_string.h|   2 +-
 libstdc++-v3/include/bits/ranges_algobase.h   |  12 +-
 libstdc++-v3/include/bits/ranges_cmp.h|   5 +-
 libstdc++-v3/include/bits/stl_algobase.h  |   5 +-
 libstdc++-v3/include/bits/stl_bvector.h   |   8 +-
 libstdc++-v3/include/bits/stl_construct.h |  12 +-
 libstdc++-v3/include/bits/stl_function.h  |  24 +---
 libstdc++-v3/include/bits/stl_vector.h|   5 +-
 libstdc++-v3/include/debug/helper_functions.h |  15 +-
 libstdc++-v3/include/std/array|   4 +-
 libstdc++-v3/include/std/bit  |   5 +-
 libstdc++-v3/include/std/type_traits  |  11 +-
 libstdc++-v3/include/std/version  |   4 +-
 libstdc++-v3/libsupc++/compare|   2 +-
 .../array/tuple_interface/get_neg.cc  |   6 +-
 21 files changed, 142 insertions(+), 157 deletions(-)

diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in 
b/libstdc++-v3/doc/doxygen/user.cfg.in
index 17cd6fc1c0e..2f15f2c1b82 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -2399,7 +2399,7 @@ PREDEFINED = __cplusplus=202002L \
  "__has_builtin(x)

[committed] libstdc++: Optimize ref-count updates in COW std::string

2021-12-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


Most ref-count updates in the COW string are done via the functions in
, which will use non-atomic ops when the program is
known to be single-threaded. The _M_is_leaked() and _M_is_shared()
functions use __atomic_load_n directly, because 
doesn't provide a load operation. Those functions can check the
__is_single_threaded() predicate to avoid using __atomic_load_n when not
needed.

The move constructor for the fully-dynamic-string increments the
ref-count by either 2 or 1, for leaked or non-leaked strings
respectively. That can be changed to use a non-atomic store of 1 for all
non-shared strings. It can be non-atomic because even if the program is
multi-threaded, conflicting access to the rvalue object while it's being
moved from would be data race anyway. It can store 1 directly for all
non-shared strings because it doesn't matter whether the initial
refcount was -1 or 0, it should be 1 after the move constructor creates
a second owner.

libstdc++-v3/ChangeLog:

* include/bits/cow_string.h (basic_string::_M_is_leaked): Use
non-atomic load when __is_single_threaded() is true.
(basic_string::_M_is_shared): Likewise.
(basic_string::(basic_string&&)) [_GLIBCXX_FULLY_DYNAMIC_STRING]:
Use non-atomic store when rvalue is not shared.
---
 libstdc++-v3/include/bits/cow_string.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/bits/cow_string.h 
b/libstdc++-v3/include/bits/cow_string.h
index ced395b80b8..4fae1d02981 100644
--- a/libstdc++-v3/include/bits/cow_string.h
+++ b/libstdc++-v3/include/bits/cow_string.h
@@ -105,7 +105,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*  destroy the empty-string _Rep object.
*
*  All but the last paragraph is considered pretty conventional
-   *  for a C++ string implementation.
+   *  for a Copy-On-Write C++ string implementation.
   */
   // 21.3  Template class basic_string
   template
@@ -207,10 +207,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // so we need to use an atomic load. However, _M_is_leaked
  // predicate does not change concurrently (i.e. the string is either
  // leaked or not), so a relaxed load is enough.
- return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
-#else
- return this->_M_refcount < 0;
+ if (!__gnu_cxx::__is_single_threaded())
+   return __atomic_load_n(&this->_M_refcount, __ATOMIC_RELAXED) < 0;
 #endif
+ return this->_M_refcount < 0;
}
 
bool
@@ -222,10 +222,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  // but one reference concurrently with this check, so we need this
  // load to be acquire to synchronize with release fetch_and_add in
  // _M_dispose.
- return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0;
-#else
- return this->_M_refcount > 0;
+ if (!__gnu_cxx::__is_single_threaded())
+   return __atomic_load_n(&this->_M_refcount, __ATOMIC_ACQUIRE) > 0;
 #endif
+ return this->_M_refcount > 0;
}
 
void
@@ -629,12 +629,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 #else
// Rather than allocate an empty string for the rvalue string,
// just share ownership with it by incrementing the reference count.
-   // If the rvalue string was "leaked" then it was the unique owner,
-   // so need an extra increment to indicate shared ownership.
-   if (_M_rep()->_M_is_leaked())
- __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 2);
-   else
+   // If the rvalue string was the unique owner then there are exactly
+   // two owners now.
+   if (_M_rep()->_M_is_shared())
  __gnu_cxx::__atomic_add_dispatch(&_M_rep()->_M_refcount, 1);
+   else
+ _M_rep()->_M_refcount = 1;
 #endif
   }
 
-- 
2.31.1



[committed] libstdc++: Avoid unwanted allocations in filesystem::path

2021-12-01 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk.


When using COW strings, accessing _M_pathname[0] and similar non-const
accessors can cause the string to "leak", meaning it reallocates itself
if it shares ownership with another string object.

This causes test failures for --enable-fully-dynamic-string builds:
/home/jwakely/src/gcc/libstdc++-v3/testsuite/experimental/filesystem/path/construct/90634.cc:62:
 void test01(): Assertion 'bytes_allocated == 0' failed.
FAIL: experimental/filesystem/path/construct/90634.cc execution test

This FAIL happens because the fully-dynamic move constructor results in
shared ownership, so for path(std::move(std::string("foo"))) the
_M_pathname member shares ownership with the temporary, and the
non-const accesses in _M_split_cmpts() cause a new copy of the string to
be allocated. This un-sharing is wasteful, and entirely unnecessary when
sharing ownership with an rvalue that is about to release its ownership
anyway. Even for lvalues, sharing ownership is not a problem and
reallocating a unique copy of the string is wasteful.

This removes non-const accesses of _M_pathname in the
path::_M_split_cmpts() members.

libstdc++-v3/ChangeLog:

* src/c++17/fs_path.cc (path::_M_split_cmpts()): Remove
micro-optimization for "/" path.
* src/filesystem/path.cc (path::_M_split_cmpts()): Only access
the contents of _M_pathname using const member functions.
---
 libstdc++-v3/src/c++17/fs_path.cc   |  5 -
 libstdc++-v3/src/filesystem/path.cc | 31 -
 2 files changed, 17 insertions(+), 19 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_path.cc 
b/libstdc++-v3/src/c++17/fs_path.cc
index 506ff25f9a6..967d9471134 100644
--- a/libstdc++-v3/src/c++17/fs_path.cc
+++ b/libstdc++-v3/src/c++17/fs_path.cc
@@ -1872,11 +1872,6 @@ path::_M_split_cmpts()
   _M_cmpts.type(_Type::_Filename);
   return;
 }
-  if (_M_pathname.length() == 1 && _M_pathname[0] == preferred_separator)
-{
-  _M_cmpts.type(_Type::_Root_dir);
-  return;
-}
 
   _Parser parser(_M_pathname);
 
diff --git a/libstdc++-v3/src/filesystem/path.cc 
b/libstdc++-v3/src/filesystem/path.cc
index a935573740f..8e8806a953f 100644
--- a/libstdc++-v3/src/filesystem/path.cc
+++ b/libstdc++-v3/src/filesystem/path.cc
@@ -337,15 +337,18 @@ path::_M_split_cmpts()
   _M_type = _Type::_Multi;
   _M_cmpts.clear();
 
-  if (_M_pathname.empty())
+  // Use const-reference to access _M_pathname, to avoid "leaking" COW string.
+  const auto& pathname = _M_pathname;
+
+  if (pathname.empty())
 return;
 
   {
 // Approximate count of components, to reserve space in _M_cmpts vector:
 int count = 1;
-bool saw_sep_last = _S_is_dir_sep(_M_pathname[0]);
+bool saw_sep_last = _S_is_dir_sep(pathname[0]);
 bool saw_non_sep = !saw_sep_last;
-for (value_type c : _M_pathname)
+for (value_type c : pathname)
   {
if (_S_is_dir_sep(c))
  saw_sep_last = true;
@@ -363,13 +366,13 @@ path::_M_split_cmpts()
   }
 
   size_t pos = 0;
-  const size_t len = _M_pathname.size();
+  const size_t len = pathname.size();
 
   // look for root name or root directory
-  if (_S_is_dir_sep(_M_pathname[0]))
+  if (_S_is_dir_sep(pathname[0]))
 {
   // look for root name, such as "//" or "//foo"
-  if (len > 1 && _M_pathname[1] == _M_pathname[0])
+  if (len > 1 && pathname[1] == pathname[0])
{
  if (len == 2)
{
@@ -378,11 +381,11 @@ path::_M_split_cmpts()
  return;
}
 
- if (!_S_is_dir_sep(_M_pathname[2]))
+ if (!_S_is_dir_sep(pathname[2]))
{
  // got root name, find its end
  pos = 3;
- while (pos < len && !_S_is_dir_sep(_M_pathname[pos]))
+ while (pos < len && !_S_is_dir_sep(pathname[pos]))
++pos;
  if (pos == len)
{
@@ -409,7 +412,7 @@ path::_M_split_cmpts()
   ++pos;
 }
 #ifdef _GLIBCXX_FILESYSTEM_IS_WINDOWS
-  else if (len > 1 && _M_pathname[1] == L':')
+  else if (len > 1 && pathname[1] == L':')
 {
   // got disk designator
   if (len == 2)
@@ -418,7 +421,7 @@ path::_M_split_cmpts()
  return;
}
   _M_add_root_name(2);
-  if (len > 2 && _S_is_dir_sep(_M_pathname[2]))
+  if (len > 2 && _S_is_dir_sep(pathname[2]))
_M_add_root_dir(2);
   pos = 2;
 }
@@ -426,9 +429,9 @@ path::_M_split_cmpts()
   else
 {
   size_t n = 1;
-  for (; n < _M_pathname.size() && !_S_is_dir_sep(_M_pathname[n]); ++n)
+  for (; n < pathname.size() && !_S_is_dir_sep(pathname[n]); ++n)
{ }
-  if (n == _M_pathname.size())
+  if (n == pathname.size())
{
  _M_type = _Type::_Filename;
  return;
@@ -438,7 +441,7 @@ path::_M_split_cmpts()
   size_t back = pos;
   while (pos < len)
 {
-  if (_S_is_dir_sep(_M_pathname[pos]))
+  if (_S_is_dir_sep(pathname[pos]))
{
  if 

Re: [PATCH] middle-end: Skip initialization of opaque type register variables [PR103127]

2021-12-01 Thread Qing Zhao via Gcc-patches


> On Dec 1, 2021, at 3:01 AM, Richard Biener  wrote:
> 
> On Tue, Nov 30, 2021 at 11:35 PM Peter Bergner  wrote:
>> 
>> On 11/30/21 2:44 PM, Qing Zhao via Gcc-patches wrote:
>>> Sorry for the confusing…
>>> My major question is:
>>> 
>>> for a variable of type __vector_pair,  could it be in a register?
>> 
>> Yes.  To be pedantic, it will live in a vector register pair.
>> 
>> 
>>> If it could be in a register, can we initialize this register with some 
>>> constant value?
>> 
>> For a __vector_pair, no, not as it is setup now.  We also do not have a
>> use case where we would want to initialize a __vector_pair to a constant.
>> Our normal (only?) use case with a __vector_pair is to load it up with
>> some actual data from memory that represents a (partial) row of a matrix.
>> 
>> For __vector_quad, it too lives in a register (accumulator register) and
>> represents a small matrix.  We have the __builtin_mma_xxsetaccz (&acc)
>> builtin to initialize it to a zero constant.
> 
> Given all this I suggest to exempt OPAQUE_TYPE from is_var_need_auto_init
> instead of fixing up things at expansion time.

Agreed. 

So, Peter, will you update the routine “is_var_need_auto_init” in gimplify.c to 
exclude OPAQUE_TYPE to fix this issue?

thanks.

Qing
> 
> Richard.
> 
>> Peter



Re: [PATCH] x86: Speed up target attribute handling by using a cache

2021-12-01 Thread Uros Bizjak via Gcc-patches
On Mon, Nov 22, 2021 at 10:36 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The target attribute handling is very expensive and for the common case
> from x86intrin.h where many functions get implicitly the same target
> attribute, we can speed up compilation a lot by caching it.
>
> The following patches both create a single entry cache, where they cache
> for a particular target attribute argument list the resulting
> DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> values from ix86_valid_target_attribute_p and use the cache if the
> args are the same as last time and we start either from NULL values
> of those, or from the recorded values for those from last time.
>
> Compiling a simple:
>  #include 
>
>  int i;
> testcase with ./cc1 -quiet -O2 -isystem include/ test.c
> takes on my WS without the patches ~0.392s and with either of the
> patches ~0.182s, i.e. roughly half the time as before.
> For ./cc1plus -quiet -O2 -isystem include/ test.c
> it is slightly worse, the speed up is from ~0.613s to ~0.403s.
>
> The difference between the 2 patches is that the first one uses copy_list
> while the second one uses a vec, so I think the second one has the advantage
> of creating less GC garbage.
> I've verified both patches achieve the same content of those
> DECL_FUNCTION_SPECIFIC_TARGET and DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> nodes as before on x86intrin.h by doing debug_tree on those and comparing
> the stderr from without these patches to with these patches.
>
> Both patches were bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk (and which one)?
>
> 2021-11-22  Jakub Jelinek  
>
> * attribs.h (simple_cst_list_equal): Declare.
> * attribs.c (simple_cst_list_equal): No longer static.
> * config/i386/i386-options.c (target_attribute_cache): New variable.
> (ix86_valid_target_attribute_p): Cache DECL_FUNCTION_SPECIFIC_TARGET
> and DECL_FUNCTION_SPECIFIC_OPTIMIZATION based on args.

LGTM, but really I'll just rubber-stamp the patch.

Uros.

> --- gcc/attribs.h.jj2021-11-11 14:35:37.442350841 +0100
> +++ gcc/attribs.h   2021-11-19 11:52:08.843252645 +0100
> @@ -60,6 +60,7 @@ extern tree build_type_attribute_variant
>  extern tree build_decl_attribute_variant (tree, tree);
>  extern tree build_type_attribute_qual_variant (tree, tree, int);
>
> +extern bool simple_cst_list_equal (const_tree, const_tree);
>  extern bool attribute_value_equal (const_tree, const_tree);
>
>  /* Return 0 if the attributes for two types are incompatible, 1 if they
> --- gcc/attribs.c.jj2021-11-11 14:35:37.442350841 +0100
> +++ gcc/attribs.c   2021-11-19 11:51:43.473615692 +0100
> @@ -1290,7 +1290,7 @@ cmp_attrib_identifiers (const_tree attr1
>  /* Compare two constructor-element-type constants.  Return 1 if the lists
> are known to be equal; otherwise return 0.  */
>
> -static bool
> +bool
>  simple_cst_list_equal (const_tree l1, const_tree l2)
>  {
>while (l1 != NULL_TREE && l2 != NULL_TREE)
> --- gcc/config/i386/i386-options.c.jj   2021-11-15 13:19:07.347900863 +0100
> +++ gcc/config/i386/i386-options.c  2021-11-20 00:27:32.919505947 +0100
> @@ -1403,6 +1403,8 @@ ix86_valid_target_attribute_tree (tree f
>return t;
>  }
>
> +static GTY(()) tree target_attribute_cache[3];
> +
>  /* Hook to validate attribute((target("string"))).  */
>
>  bool
> @@ -1423,6 +1425,19 @@ ix86_valid_target_attribute_p (tree fnde
>&& strcmp (TREE_STRING_POINTER (TREE_VALUE (args)), "default") == 0)
>  return true;
>
> +  if ((DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == target_attribute_cache[1]
> +   || DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE)
> +  && (DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
> + == target_attribute_cache[2]
> + || DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
> +  && simple_cst_list_equal (args, target_attribute_cache[0]))
> +{
> +  DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = target_attribute_cache[1];
> +  DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl)
> +   = target_attribute_cache[2];
> +  return true;
> +}
> +
>tree old_optimize = build_optimization_node (&global_options,
>&global_options_set);
>
> @@ -1456,8 +1471,17 @@ ix86_valid_target_attribute_p (tree fnde
>if (new_target == error_mark_node)
>  ret = false;
>
> -  else if (fndecl && new_target)
> +  else if (new_target)
>  {
> +  if (DECL_FUNCTION_SPECIFIC_TARGET (fndecl) == NULL_TREE
> + && DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl) == NULL_TREE)
> +   {
> + target_attribute_cache[0] = copy_list (args);
> + target_attribute_cache[1] = new_target;
> + target_attribute_cache[2]
> +   = old_optimize != new_optimize ? new_optimize : NULL_TREE;
> +   }
> +
>DECL_FUNCTION_SPECIFIC_TARGET (fndecl) = new_target;
>
>if (old_optimize != new_optimize)
>
> Jakub

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Martin Liška

On 12/1/21 15:34, Richard Biener wrote:

On Wed, Dec 1, 2021 at 3:25 PM Martin Liška  wrote:


On 12/1/21 15:19, Richard Biener wrote:

which is compute the range of 'lhs' on edge_true into predicate->true_range,
assign that same range to ->false_range and then invert it to get the
range on the false_edge.  What I am saying is that for better precision
you should do

   ranger->range_on_edge (predicate->false_range, edge_false, lhs);

rather than prematurely optimize this to the inversion of the true range
since yes, ranger is CFG sensitive and only the_last_  predicate on a
long CFG path is actually inverted.

What am I missing?


I might be misunderstood, but I think it's the problem defined here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html

where I used the ranger->range_on_edge on the false_edge.


Ah, OK.  But then even the true_edge range is possibly wrong, no?


You are of course correct, I've just proved that in debugger ://


Consider

   for (;;)
  {
  if (a < 100)
if (a > 50)  // unswitch on this
  /* .. */
  if (a < 120)
  /* ... */
  }

then you record [51, 99] for true_range of the a > 50 predicate and thus
simplification will simplify the if (a < 120) check, no?


Yep.



You can only record the range from the (CFG independent) a > 50 check,
thus [51, +INF] but of course at simplification time you can also use
the CFG context at each simplification location.


@Andrew: How can I easily get irange based just on a stmt? Something like 
fold_range
with int_range_max as the 3rd argument?

Thanks,
Martin



Richard.


Martin




Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 3:25 PM Martin Liška  wrote:
>
> On 12/1/21 15:19, Richard Biener wrote:
> > which is compute the range of 'lhs' on edge_true into predicate->true_range,
> > assign that same range to ->false_range and then invert it to get the
> > range on the false_edge.  What I am saying is that for better precision
> > you should do
> >
> >   ranger->range_on_edge (predicate->false_range, edge_false, lhs);
> >
> > rather than prematurely optimize this to the inversion of the true range
> > since yes, ranger is CFG sensitive and only the_last_  predicate on a
> > long CFG path is actually inverted.
> >
> > What am I missing?
>
> I might be misunderstood, but I think it's the problem defined here:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html
>
> where I used the ranger->range_on_edge on the false_edge.

Ah, OK.  But then even the true_edge range is possibly wrong, no?
Consider

  for (;;)
 {
 if (a < 100)
   if (a > 50)  // unswitch on this
 /* .. */
 if (a < 120)
 /* ... */
 }

then you record [51, 99] for true_range of the a > 50 predicate and thus
simplification will simplify the if (a < 120) check, no?

You can only record the range from the (CFG independent) a > 50 check,
thus [51, +INF] but of course at simplification time you can also use
the CFG context at each simplification location.

Richard.

> Martin


[PATCH v4 1/6] tree-object-size: Use trees and support negative offsets

2021-12-01 Thread Siddhesh Poyarekar
Transform tree-object-size to operate on tree objects instead of host
wide integers.  This makes it easier to extend to dynamic expressions
for object sizes.

The compute_builtin_object_size interface also now returns a tree
expression instead of HOST_WIDE_INT, so callers have been adjusted to
account for that.

The trees in object_sizes are each an object_size object with members
size (the bytes from the pointer to the end of the object) and wholesize
(the size of the whole object).  This allows analysis of negative
offsets, which can now be allowed to the extent of the object bounds.
Tests have been added to verify that it actually works.

gcc/ChangeLog:

* tree-object-size.h (compute_builtin_object_size): Return tree
instead of HOST_WIDE_INT.
* builtins.c (fold_builtin_object_size): Adjust.
* gimple-fold.c (gimple_fold_builtin_strncat): Likewise.
* ubsan.c (instrument_object_size): Likewise.
* tree-object-size.c (object_size): New structure.
(object_sizes): Change type to vec.
(initval): New function.
(unknown): Use it.
(size_unknown_p, size_initval, size_unknown): New functions.
(object_sizes_unknown_p): Use it.
(object_sizes_get): Return tree.
(object_sizes_initialize): Rename from object_sizes_set_force
and set VAL parameter type as tree.  Add new parameter WHOLEVAL.
(object_sizes_set): Set VAL parameter type as tree and adjust
implementation.  Add new parameter WHOLEVAL.
(size_for_offset): New function.
(decl_init_size): Adjust comment.
(addr_object_size): Change PSIZE parameter to tree and adjust
implementation.  Add new parameter PWHOLESIZE.
(alloc_object_size): Return tree.
(compute_builtin_object_size): Return tree in PSIZE.
(expr_object_size, call_object_size, unknown_object_size):
Adjust for object_sizes_set change.
(merge_object_sizes): Drop OFFSET parameter and adjust
implementation for tree change.
(plus_stmt_object_size): Call collect_object_sizes_for directly
instead of merge_object_size and call size_for_offset to get net
size.
(cond_expr_object_size, collect_object_sizes_for,
object_sizes_execute): Adjust for change of type from
HOST_WIDE_INT to tree.
(check_for_plus_in_loops_1): Likewise and skip non-positive
offsets.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-object-size-1.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-2.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-3.c (test9): New test.
(main): Call it.
* gcc.dg/builtin-object-size-4.c (test8): New test.
(main): Call it.
* gcc.dg/builtin-object-size-5.c (test5, test6, test7): New
tests.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/builtins.c   |  10 +-
 gcc/gimple-fold.c|  11 +-
 gcc/testsuite/gcc.dg/builtin-object-size-1.c |  30 ++
 gcc/testsuite/gcc.dg/builtin-object-size-2.c |  30 ++
 gcc/testsuite/gcc.dg/builtin-object-size-3.c |  31 ++
 gcc/testsuite/gcc.dg/builtin-object-size-4.c |  30 ++
 gcc/testsuite/gcc.dg/builtin-object-size-5.c |  25 ++
 gcc/tree-object-size.c   | 393 ---
 gcc/tree-object-size.h   |   2 +-
 gcc/ubsan.c  |   5 +-
 10 files changed, 408 insertions(+), 159 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 03829c03a5a..309c3010de3 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -10254,7 +10254,7 @@ maybe_emit_sprintf_chk_warning (tree exp, enum 
built_in_function fcode)
 static tree
 fold_builtin_object_size (tree ptr, tree ost)
 {
-  unsigned HOST_WIDE_INT bytes;
+  tree bytes;
   int object_size_type;
 
   if (!validate_arg (ptr, POINTER_TYPE)
@@ -10279,8 +10279,8 @@ fold_builtin_object_size (tree ptr, tree ost)
   if (TREE_CODE (ptr) == ADDR_EXPR)
 {
   compute_builtin_object_size (ptr, object_size_type, &bytes);
-  if (wi::fits_to_tree_p (bytes, size_type_node))
-   return build_int_cstu (size_type_node, bytes);
+  if (int_fits_type_p (bytes, size_type_node))
+   return fold_convert (size_type_node, bytes);
 }
   else if (TREE_CODE (ptr) == SSA_NAME)
 {
@@ -10288,8 +10288,8 @@ fold_builtin_object_size (tree ptr, tree ost)
later.  Maybe subsequent passes will help determining
it.  */
   if (compute_builtin_object_size (ptr, object_size_type, &bytes)
- && wi::fits_to_tree_p (bytes, size_type_node))
-   return build_int_cstu (size_type_node, bytes);
+ && int_fits_type_p (bytes, size_type_node))
+   return fold_convert (size_type_node, bytes);
 }
 
   return NULL_TREE;
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 1d8fd74f72c..64515aabc04 100644
--- a/gcc/gimple-fold.c
+++

[PATCH v4 5/6] tree-object-size: Handle GIMPLE_CALL

2021-12-01 Thread Siddhesh Poyarekar
Handle non-constant expressions in GIMPLE_CALL arguments.  Also handle
alloca.

gcc/ChangeLog:

* tree-object-size.c (alloc_object_size): Make and return
non-constant size expression.
(call_object_size): Return expression or unknown based on
whether dynamic object size is requested.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c: Add new tests.
* gcc.dg/builtin-object-size-1.c (test1)
[__builtin_object_size]: Alter expected result for dynamic
object size.
* gcc.dg/builtin-object-size-2.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-3.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-4.c (test1)
[__builtin_object_size]: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 .../gcc.dg/builtin-dynamic-object-size-0.c| 227 +-
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  |   7 +
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  |  14 ++
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  |   7 +
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  |  14 ++
 gcc/tree-object-size.c|  22 +-
 6 files changed, 282 insertions(+), 9 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c 
b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
index ce0f4eb17f3..2db0e0d1aa2 100644
--- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
+++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
@@ -4,12 +4,71 @@
 typedef __SIZE_TYPE__ size_t;
 #define abort __builtin_abort
 
+void *
+__attribute__ ((alloc_size (1)))
+__attribute__ ((__nothrow__ , __leaf__))
+__attribute__ ((noinline))
+alloc_func (size_t sz)
+{
+  return __builtin_malloc (sz);
+}
+
+void *
+__attribute__ ((alloc_size (1, 2)))
+__attribute__ ((__nothrow__ , __leaf__))
+__attribute__ ((noinline))
+calloc_func (size_t cnt, size_t sz)
+{
+  return __builtin_calloc (cnt, sz);
+}
+
+void *
+__attribute__ ((noinline))
+unknown_allocator (size_t cnt, size_t sz)
+{
+  return __builtin_calloc (cnt, sz);
+}
+
+size_t
+__attribute__ ((noinline))
+test_unknown (size_t cnt, size_t sz)
+{
+  void *ret = unknown_allocator (cnt, sz);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+/* Malloc-like allocator.  */
+
+size_t
+__attribute__ ((noinline))
+test_malloc (size_t sz)
+{
+  void *ret = alloc_func (sz);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc (size_t sz)
+{
+  void *ret = __builtin_malloc (sz);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_cond (int cond)
+{
+  void *ret = __builtin_malloc (cond ? 32 : 64);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
 size_t
 __attribute__ ((noinline))
 test_builtin_malloc_condphi (int cond)
 {
   void *ret;
- 
+
   if (cond)
 ret = __builtin_malloc (32);
   else
@@ -18,6 +77,79 @@ test_builtin_malloc_condphi (int cond)
   return __builtin_dynamic_object_size (ret, 0);
 }
 
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_condphi2 (int cond, size_t in)
+{
+  void *ret;
+
+  if (cond)
+ret = __builtin_malloc (in);
+  else
+ret = __builtin_malloc (64);
+
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_condphi3 (int cond, size_t in, size_t in2)
+{
+  void *ret;
+
+  if (cond)
+ret = __builtin_malloc (in);
+  else
+ret = __builtin_malloc (in2);
+
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_condphi4 (size_t sz, int cond)
+{
+  char *a = __builtin_malloc (sz);
+  char b[sz / 2];
+
+  return __builtin_dynamic_object_size (cond ? b : (void *) &a, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_condphi5 (size_t sz, int cond, char *c)
+{
+  char *a = __builtin_malloc (sz);
+
+  return __builtin_dynamic_object_size (cond ? c : (void *) &a, 0);
+}
+
+/* Calloc-like allocator.  */
+
+size_t
+__attribute__ ((noinline))
+test_calloc (size_t cnt, size_t sz)
+{
+  void *ret = calloc_func (cnt, sz);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_calloc (size_t cnt, size_t sz)
+{
+  void *ret = __builtin_calloc (cnt, sz);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_calloc_cond (int cond1, int cond2)
+{
+  void *ret = __builtin_calloc (cond1 ? 32 : 64, cond2 ? 1024 : 16);
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
 size_t
 __attribute__ ((noinline))
 test_builtin_calloc_condphi (size_t cnt, size_t sz, int cond)
@@ -33,6 +165,47 @@ test_builtin_calloc_condphi (size_t cnt, size_t sz, int 
cond)
   return __builtin_dynamic_object_size (cond ? ch : (void *) &bin, 0);
 }
 
+/* Passthrough functions.  */
+
+size_t
+__attribute__ ((noinline))
+test_passthrough (size_t

[PATCH v4 6/6] tree-object-size: Dynamic sizes for ADDR_EXPR

2021-12-01 Thread Siddhesh Poyarekar
Allow returning dynamic expressions from ADDR_EXPR for
__builtin_dynamic_object_size and also allow offsets to be dynamic.

gcc/ChangeLog:

* tree-object-size.c (size_valid_p): New function.
(size_for_offset): Remove OFFSET constness assertion.
(addr_object_size): Build dynamic expressions for object
sizes and use size_valid_p to decide if it is valid for the
given OBJECT_SIZE_TYPE.
(compute_builtin_object_size): Allow dynamic offsets when
computing size at O0.
(call_object_size): Call size_valid_p.
(plus_stmt_object_size): Allow non-constant offset and use
size_valid_p to decide if it is valid for the given
OBJECT_SIZE_TYPE.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c: Add new tests.
* gcc.dg/builtin-object-size-1.c (test1)
[__builtin_object_size]: Adjust expected output for dynamic
object sizes.
* gcc.dg/builtin-object-size-2.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-3.c (test1)
[__builtin_object_size]: Likewise.
* gcc.dg/builtin-object-size-4.c (test1)
[__builtin_object_size]: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 .../gcc.dg/builtin-dynamic-object-size-0.c| 158 ++
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  |  30 +++-
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  |  43 -
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  |  25 ++-
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  |  17 +-
 gcc/tree-object-size.c|  91 +-
 6 files changed, 300 insertions(+), 64 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c 
b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
index 2db0e0d1aa2..4a1f4965ebd 100644
--- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
+++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
@@ -219,6 +219,79 @@ test_deploop (size_t sz, size_t cond)
   return __builtin_dynamic_object_size (bin, 0);
 }
 
+/* Address expressions.  */
+
+struct dynarray_struct
+{
+  long a;
+  char c[16];
+  int b;
+};
+
+size_t
+__attribute__ ((noinline))
+test_dynarray_struct (size_t sz, size_t off)
+{
+  struct dynarray_struct bin[sz];
+
+  return __builtin_dynamic_object_size (&bin[off].c, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_dynarray_struct_subobj (size_t sz, size_t off)
+{
+  struct dynarray_struct bin[sz];
+
+  return __builtin_dynamic_object_size (&bin[off].c[4], 1);
+}
+
+size_t
+__attribute__ ((noinline))
+test_dynarray_struct_subobj2 (size_t sz, size_t off, size_t *objsz)
+{
+  struct dynarray_struct2
+{
+  long a;
+  int b;
+  char c[sz];
+};
+
+  struct dynarray_struct2 bin;
+
+  *objsz = sizeof (bin);
+
+  return __builtin_dynamic_object_size (&bin.c[off], 1);
+}
+
+size_t
+__attribute__ ((noinline))
+test_substring (size_t sz, size_t off)
+{
+  char str[sz];
+
+  return __builtin_dynamic_object_size (&str[off], 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_substring_ptrplus (size_t sz, size_t off)
+{
+  int str[sz];
+
+  return __builtin_dynamic_object_size (str + off, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_substring_ptrplus2 (size_t sz, size_t off, size_t off2)
+{
+  int str[sz];
+  int *ptr = &str[off];
+
+  return __builtin_dynamic_object_size (ptr + off2, 0);
+}
+
 size_t
 __attribute__ ((access (__read_write__, 1, 2)))
 __attribute__ ((noinline))
@@ -227,6 +300,40 @@ test_parmsz_simple (void *obj, size_t sz)
   return __builtin_dynamic_object_size (obj, 0);
 }
 
+size_t
+__attribute__ ((noinline))
+__attribute__ ((access (__read_write__, 1, 2)))
+test_parmsz (void *obj, size_t sz, size_t off)
+{
+  return __builtin_dynamic_object_size (obj + off, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+__attribute__ ((access (__read_write__, 1, 2)))
+test_parmsz_scale (int *obj, size_t sz, size_t off)
+{
+  return __builtin_dynamic_object_size (obj + off, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+__attribute__ ((access (__read_write__, 1, 2)))
+test_loop (int *obj, size_t sz, size_t start, size_t end, int incr)
+{
+  int *ptr = obj + start;
+
+  for (int i = start; i != end; i = i + incr)
+{
+  ptr = ptr + incr;
+  if (__builtin_dynamic_object_size (ptr, 0) == 0)
+   return 0;
+}
+
+  return __builtin_dynamic_object_size (ptr, 0);
+}
+
+
 unsigned nfails = 0;
 
 #define FAIL() ({ \
@@ -287,6 +394,31 @@ main (int argc, char **argv)
 FAIL ();
   if (test_dynarray (__builtin_strlen (argv[0])) != __builtin_strlen (argv[0]))
 FAIL ();
+  if (test_dynarray_struct (42, 4) !=
+  ((42 - 4) * sizeof (struct dynarray_struct)
+   - __builtin_offsetof (struct dynarray_struct, c)))
+FAIL ();
+  if (test_dynarray_struct (42, 48) != 0)
+FAIL ();
+  if (test_substring (128, 4) != 128 - 4)
+FAIL ();
+  if (test_substring (128, 142) != 0)
+FAIL ();
+  if (test_dynarray_struct_subobj (42, 4

[PATCH v4 3/6] tree-object-size: Support dynamic sizes in conditions

2021-12-01 Thread Siddhesh Poyarekar
Handle GIMPLE_PHI and conditionals specially for dynamic objects,
returning PHI/conditional expressions instead of just a MIN/MAX
estimate.

This makes the returned object size variable for loops and conditionals,
so tests need to be adjusted to look for precise size in some cases.
builtin-dynamic-object-size-5.c had to be modified to only look for
success in maximum object size case and skip over the minimum object
size tests because the result is no longer a compile time constant.

I also added some simple tests to exercise conditionals with dynamic
object sizes.

gcc/ChangeLog:

* builtins.c (fold_builtin_object_size): Adjust for dynamic size
expressions.
* tree-object-size.c: Include gimplify-me.h.
(struct object_size_info): New member UNKNOWNS.
(size_initval_p, object_sizes_get_raw): New functions.
(object_sizes_get): Return suitable gimple variable for
object size.
(bundle_sizes): New function.
(object_sizes_set): Use it and handle dynamic object size
expressions.
(object_sizes_set_temp): New function.
(size_for_offset): Adjust for dynamic size expressions.
(emit_phi_nodes, propagate_unknowns, gimplify_size_expressions):
New functions.
(compute_builtin_object_size): Call gimplify_size_expressions
for OST_DYNAMIC.
(dynamic_object_size): New function.
(cond_expr_object_size): Use it.
(phi_dynamic_object_size): New function.
(collect_object_sizes_for): Call it for OST_DYNAMIC.  Adjust to
accommodate dynamic object sizes.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c: New tests.
* gcc.dg/builtin-dynamic-object-size-10.c: Add comment.
* gcc.dg/builtin-object-size-1.c [__builtin_object_size]: Expect
exact size expressions for __builtin_dynamic_object_size.
* gcc.dg/builtin-object-size-2.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-3.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-4.c [__builtin_object_size]:
Likewise.
* gcc.dg/builtin-object-size-5.c [__builtin_object_size]:
Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/builtins.c|   6 +-
 .../gcc.dg/builtin-dynamic-object-size-0.c|  72 +++
 .../gcc.dg/builtin-dynamic-object-size-10.c   |   2 +
 gcc/testsuite/gcc.dg/builtin-object-size-1.c  | 119 -
 gcc/testsuite/gcc.dg/builtin-object-size-2.c  |  92 
 gcc/testsuite/gcc.dg/builtin-object-size-3.c  | 121 +
 gcc/testsuite/gcc.dg/builtin-object-size-4.c  |  78 +++
 gcc/testsuite/gcc.dg/builtin-object-size-5.c  |  12 +
 gcc/tree-object-size.c| 489 +-
 9 files changed, 962 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3b815a6e42a..52b62f32b44 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -10284,7 +10284,8 @@ fold_builtin_object_size (tree ptr, tree ost, enum 
built_in_function fcode)
   if (TREE_CODE (ptr) == ADDR_EXPR)
 {
   compute_builtin_object_size (ptr, object_size_type, &bytes);
-  if (int_fits_type_p (bytes, size_type_node))
+  if ((object_size_type & OST_DYNAMIC)
+ || int_fits_type_p (bytes, size_type_node))
return fold_convert (size_type_node, bytes);
 }
   else if (TREE_CODE (ptr) == SSA_NAME)
@@ -10293,7 +10294,8 @@ fold_builtin_object_size (tree ptr, tree ost, enum 
built_in_function fcode)
later.  Maybe subsequent passes will help determining
it.  */
   if (compute_builtin_object_size (ptr, object_size_type, &bytes)
- && int_fits_type_p (bytes, size_type_node))
+ && ((object_size_type & OST_DYNAMIC)
+ || int_fits_type_p (bytes, size_type_node)))
return fold_convert (size_type_node, bytes);
 }
 
diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c 
b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
new file mode 100644
index 000..ddedf6a49bd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
@@ -0,0 +1,72 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+typedef __SIZE_TYPE__ size_t;
+#define abort __builtin_abort
+
+size_t
+__attribute__ ((noinline))
+test_builtin_malloc_condphi (int cond)
+{
+  void *ret;
+ 
+  if (cond)
+ret = __builtin_malloc (32);
+  else
+ret = __builtin_malloc (64);
+
+  return __builtin_dynamic_object_size (ret, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_builtin_calloc_condphi (size_t cnt, size_t sz, int cond)
+{
+  struct
+{
+  int a;
+  char b;
+} bin[cnt];
+
+  char *ch = __builtin_calloc (cnt, sz);
+
+  return __builtin_dynamic_object_size (cond ? ch : (void *) &bin, 0);
+}
+
+size_t
+__attribute__ ((noinline))
+test_deploop (size_t sz, size_t cond)
+{
+  char *bin 

[PATCH v4 2/6] __builtin_dynamic_object_size: Recognize builtin

2021-12-01 Thread Siddhesh Poyarekar
Recognize the __builtin_dynamic_object_size builtin and add paths in the
object size path to deal with it, but treat it like
__builtin_object_size for now.  Also add tests to provide the same
testing coverage for the new builtin name.

gcc/ChangeLog:

* builtins.def (BUILT_IN_DYNAMIC_OBJECT_SIZE): New builtin.
* tree-object-size.h: Move object size type bits enum from
tree-object-size.c and add new value OST_DYNAMIC.
* builtins.c (expand_builtin, fold_builtin_2): Handle it.
(fold_builtin_object_size): Handle new builtin and adjust for
change to compute_builtin_object_size.
* tree-object-size.c: Include builtins.h.
(compute_builtin_object_size): Adjust.
(early_object_sizes_execute_one,
dynamic_object_sizes_execute_one): New functions.
(object_sizes_execute): Rename insert_min_max_p argument to
early.  Handle BUILT_IN_DYNAMIC_OBJECT_SIZE and call the new
functions.
doc/extend.texi (__builtin_dynamic_object_size): Document new
builtin.

gcc/testsuite/ChangeLog:

* g++.dg/ext/builtin-dynamic-object-size1.C: New test.
* g++.dg/ext/builtin-dynamic-object-size2.C: Likewise.
* gcc.dg/builtin-dynamic-alloc-size.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-1.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-10.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-11.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-12.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-13.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-14.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-15.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-16.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-17.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-18.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-19.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-2.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-3.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-4.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-5.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-6.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-7.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-8.c: Likewise.
* gcc.dg/builtin-dynamic-object-size-9.c: Likewise.
* gcc.dg/builtin-object-size-16.c: Adjust to allow inclusion
from builtin-dynamic-object-size-16.c.
* gcc.dg/builtin-object-size-17.c: Likewise.

Signed-off-by: Siddhesh Poyarekar 
---
 gcc/builtins.c|  11 +-
 gcc/builtins.def  |   1 +
 gcc/doc/extend.texi   |  13 ++
 .../g++.dg/ext/builtin-dynamic-object-size1.C |   5 +
 .../g++.dg/ext/builtin-dynamic-object-size2.C |   5 +
 .../gcc.dg/builtin-dynamic-alloc-size.c   |   7 +
 .../gcc.dg/builtin-dynamic-object-size-1.c|   6 +
 .../gcc.dg/builtin-dynamic-object-size-10.c   |   9 ++
 .../gcc.dg/builtin-dynamic-object-size-11.c   |   7 +
 .../gcc.dg/builtin-dynamic-object-size-12.c   |   5 +
 .../gcc.dg/builtin-dynamic-object-size-13.c   |   5 +
 .../gcc.dg/builtin-dynamic-object-size-14.c   |   5 +
 .../gcc.dg/builtin-dynamic-object-size-15.c   |   5 +
 .../gcc.dg/builtin-dynamic-object-size-16.c   |   6 +
 .../gcc.dg/builtin-dynamic-object-size-17.c   |   7 +
 .../gcc.dg/builtin-dynamic-object-size-18.c   |   8 +
 .../gcc.dg/builtin-dynamic-object-size-19.c   | 104 
 .../gcc.dg/builtin-dynamic-object-size-2.c|   6 +
 .../gcc.dg/builtin-dynamic-object-size-3.c|   6 +
 .../gcc.dg/builtin-dynamic-object-size-4.c|   6 +
 .../gcc.dg/builtin-dynamic-object-size-5.c|   7 +
 .../gcc.dg/builtin-dynamic-object-size-6.c|   5 +
 .../gcc.dg/builtin-dynamic-object-size-7.c|   5 +
 .../gcc.dg/builtin-dynamic-object-size-8.c|   5 +
 .../gcc.dg/builtin-dynamic-object-size-9.c|   5 +
 gcc/testsuite/gcc.dg/builtin-object-size-16.c |   2 +
 gcc/testsuite/gcc.dg/builtin-object-size-17.c |   2 +
 gcc/tree-object-size.c| 152 +-
 gcc/tree-object-size.h|  10 ++
 29 files changed, 378 insertions(+), 42 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size1.C
 create mode 100644 gcc/testsuite/g++.dg/ext/builtin-dynamic-object-size2.C
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-alloc-size.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-1.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-10.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-11.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-12.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-13.c
 create mode 100644 gcc/testsuite/gcc.dg/builtin-dynamic-object-size-14.c
 create mode 1006

[PATCH v4 0/6] __builtin_dynamic_object_size

2021-12-01 Thread Siddhesh Poyarekar
This patchset implements the __builtin_dynamic_object_size builtin for
gcc.  The primary motivation to have this builtin in gcc is to enable
_FORTIFY_SOURCE=3 support with gcc, thus allowing greater fortification
in use cases where the potential performance tradeoff is acceptable.

Semantics:
--

__builtin_dynamic_object_size has the same signature as
__builtin_object_size; it accepts a pointer and type ranging from 0 to 3
and it returns an object size estimate for the pointer based on an
analysis of which objects the pointer could point to.  The actual
properties of the object size estimate are different:

- In the best case __builtin_dynamic_object_size evaluates to an
  expression that represents a precise size of the object being pointed
  to.

- In case a precise object size expression cannot be evaluated,
  __builtin_dynamic_object_size attempts to evaluate an estimate size
  expression based on the object size type.

- In what situations the builtin returns an estimate vs a precise
  expression is an implementation detail and may change in future.
  Users must always assume, as in the case of __builtin_object_size, that
  the returned value is the maximum or minimum based on the object size
  type they have provided.

- In the worst case of failure, __builtin_dynamic_object_size returns a
  constant (size_t)-1 or (size_t)0.

Implementation:
---

- The __builtin_dynamic_object_size support is implemented in
  tree-object-size.  In most cases the first pass (early_objsz) the
  builtin is treated like __builtin_object_size to preserve subobject
  bounds.

- Each element of the object_sizes vector is now an object_size struct
  holding bytes to the end of the object and the full size of the
  object.  This allows proper handling of negative offsets, allowing
  them to the extent of the whole object bounds.  This improves
  __builtin_object_size usage too with negative offsets, consistently
  returning valid results for pointer decrementing loops too.

- The patchset begins with structural modification of the
  tree-object-size pass, followed by enhancement to return size
  expressions.  I have split the implementation into one feature per
  patch (calls, function parameters, PHI, etc.) to hopefully ease
  review.

Performance:


Expressions generated by this pass in theory could be arbitrarily
complex.  I have not made an attempt to limit nesting of objects since
it seemed too early to do that.  In practice based on the few
applications I built, most of the complexity of the expressions got
folded away.  Even so, the performance overhead is likely to be
non-zero.  If we find performance degradation to be significant, we
could later add nesting limits to bail out if a size expression gets too
complex.

I have implemented simplification of __*_chk to their normal
variants if we can determine at compile time that it is safe.  This
should limit the performance overhead of the expressions in valid cases.

Build time performance doesn't seem to be affected much based on an
unscientific check to time
`make check-gcc RUNTESTFLAGS="dg.exp=builtin*"`.  It only increases by
about a couple of seconds when the dynamic tests are added and remains
more or less in the same ballpark otherwise.

Testing:


I have added tests for dynamic object sizes as well as wrappers for all
__builtin_object_size tests to provide wide coverage.  I have also done
a full bootstrap build and test run on x86_64.  Further I did a
bootstrap build with --with-build-config=bootstrap-ubsan for additional
coverage since it managed to find an issue with the earlier cleanup
patches.  It is however only useful to ensure there are no regressions
since ubsan does not yet use dynamic size expressions.

I have also built bash, cmake, wpa_supplicant and systemtap with
_FORTIFY_SOURCE=2 and _FORTIFY_SOURCE=3 (with a hacked up glibc to make
sure it works) and saw no issues in any of those builds.  I did some
rudimentary analysis of the generated binaries using fortify-metrics[1]
to confirm that there was a difference in coverage between the two
fortification levels.

(Unchanged since v3) Here is a summary of coverage in the above packages:

F = number of fortified calls
T = Total number of calls to fortifiable functions (fortified as well as
unfortified)
C = F * 100/ T

Package F(2)T(2)F(3)T(3)C(2)C(3)
bash428 12201005119635.08%  84.03%
wpa_supplicant  163532322350340850.59%  68.96%
systemtap   324 1990343 199416.28%  17.20%
cmake   830 14181   958 14196   5.85%   6.75%

The numbers are slightly lower than the previous patch series because in
the interim I pushed an improvement to folding of the _chk builtins so
that they can use ranges to simplify the calls to their regular
variants.  Also note that even _FORTIFY_SOURCE=2 coverage should be
improved due to negative offset handling.

Additional testing plans (

[PATCH v4 4/6] tree-object-size: Handle function parameters

2021-12-01 Thread Siddhesh Poyarekar
Handle hints provided by __attribute__ ((access (...))) to compute
dynamic sizes for objects.

gcc/ChangeLog:

* tree-object-size.c: Include tree-dfa.h.
(parm_object_size): New function.
(collect_object_sizes_for): Call it.

gcc/testsuite/ChangeLog:

* gcc.dg/builtin-dynamic-object-size-0.c (test_parmsz_simple):
New function.
(main): Call it.

Signed-off-by: Siddhesh Poyarekar 
---
 .../gcc.dg/builtin-dynamic-object-size-0.c| 11 
 gcc/tree-object-size.c| 50 ++-
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c 
b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
index ddedf6a49bd..ce0f4eb17f3 100644
--- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
+++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-0.c
@@ -46,6 +46,14 @@ test_deploop (size_t sz, size_t cond)
   return __builtin_dynamic_object_size (bin, 0);
 }
 
+size_t
+__attribute__ ((access (__read_write__, 1, 2)))
+__attribute__ ((noinline))
+test_parmsz_simple (void *obj, size_t sz)
+{
+  return __builtin_dynamic_object_size (obj, 0);
+}
+
 unsigned nfails = 0;
 
 #define FAIL() ({ \
@@ -64,6 +72,9 @@ main (int argc, char **argv)
 FAIL ();
   if (test_deploop (128, 129) != 32)
 FAIL ();
+  if (test_parmsz_simple (argv[0], __builtin_strlen (argv[0]) + 1)
+  != __builtin_strlen (argv[0]) + 1)
+FAIL ();
 
   if (nfails > 0)
 __builtin_abort ();
diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index dd6c90efebf..bbc708f8ade 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-fold.h"
 #include "gimple-iterator.h"
 #include "tree-cfg.h"
+#include "tree-dfa.h"
 #include "stringpool.h"
 #include "attribs.h"
 #include "builtins.h"
@@ -1449,6 +1450,53 @@ cond_expr_object_size (struct object_size_info *osi, 
tree var, gimple *stmt)
   return reexamine;
 }
 
+/* Find size of an object passed as a parameter to the function.  */
+
+static void
+parm_object_size (struct object_size_info *osi, tree var)
+{
+  int object_size_type = osi->object_size_type;
+  tree parm = SSA_NAME_VAR (var);
+
+  if (!(object_size_type & OST_DYNAMIC) || !POINTER_TYPE_P (TREE_TYPE (parm)))
+expr_object_size (osi, var, parm);
+
+  /* Look for access attribute.  */
+  rdwr_map rdwr_idx;
+
+  tree fndecl = cfun->decl;
+  const attr_access *access = get_parm_access (rdwr_idx, parm, fndecl);
+  tree typesize = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (parm)));
+  tree sz = NULL_TREE;
+
+  if (access && access->sizarg != UINT_MAX)
+{
+  tree fnargs = DECL_ARGUMENTS (fndecl);
+  tree arg = NULL_TREE;
+  unsigned argpos = 0;
+
+  /* Walk through the parameters to pick the size parameter and safely
+scale it by the type size.  */
+  for (arg = fnargs; argpos != access->sizarg && arg;
+  arg = TREE_CHAIN (arg), ++argpos);
+
+  if (arg != NULL_TREE && INTEGRAL_TYPE_P (TREE_TYPE (arg)))
+   {
+ sz = get_or_create_ssa_default_def (cfun, arg);
+ if (sz != NULL_TREE)
+   {
+ sz = fold_convert (sizetype, sz);
+ if (typesize)
+   sz = size_binop (MULT_EXPR, sz, typesize);
+   }
+   }
+}
+  if (!sz)
+sz = size_unknown (object_size_type);
+
+  object_sizes_set (osi, SSA_NAME_VERSION (var), sz, sz);
+}
+
 /* Compute an object size expression for VAR, which is the result of a PHI
node.  */
 
@@ -1606,7 +1654,7 @@ collect_object_sizes_for (struct object_size_info *osi, 
tree var)
 case GIMPLE_NOP:
   if (SSA_NAME_VAR (var)
  && TREE_CODE (SSA_NAME_VAR (var)) == PARM_DECL)
-   expr_object_size (osi, var, SSA_NAME_VAR (var));
+   parm_object_size (osi, var);
   else
/* Uninitialized SSA names point nowhere.  */
unknown_object_size (osi, var);
-- 
2.31.1



Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Martin Liška

On 12/1/21 15:19, Richard Biener wrote:

which is compute the range of 'lhs' on edge_true into predicate->true_range,
assign that same range to ->false_range and then invert it to get the
range on the false_edge.  What I am saying is that for better precision
you should do

  ranger->range_on_edge (predicate->false_range, edge_false, lhs);

rather than prematurely optimize this to the inversion of the true range
since yes, ranger is CFG sensitive and only the_last_  predicate on a
long CFG path is actually inverted.

What am I missing?


I might be misunderstood, but I think it's the problem defined here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584605.html

where I used the ranger->range_on_edge on the false_edge.

Martin


Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 3:18 PM Rasmus Villemoes
 wrote:
>
> On 01/12/2021 14.17, Richard Biener wrote:
> > On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes  
> > wrote:
> >>
> >> The transposition nolto -> notlo is confusing and it makes the long
> >> name even harder to read than it already is - I kept reading it as
> >> "not lo" until I realized it was a simply typo.
> >>
> >> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)
> >
> > Hmm, if we ever released GCC with that lto-plugin behavior you should 
> > preserve
> > accepting -linker-output-auto-notlo-rel in lto-plugin because a newer 
> > lto-plugin
> > is required to work with older released GCC which continue to provide
> > the option with the spelling mistake.
>
> I thought a given version of gcc would only use the plugin it shipped
> with(?).

No.  It's even older lto-plugins being used by newer compiler versions
(just with a possibly reduced feature set) - typing that on a system with
a lto-plugin from GCC 7 ;)

> And even if in theory one should keep the plugin compatible with older
> gcc releases, in this case it is only relevant to vxworks ports, so if
> Eric and Olivier are ok with this as-is, perhaps we can avoid having to
> accept both spellings forever.

I think the cost is negligible - looks like only GCC 11 knows the flag
so I would suggest to backport the patch to the GCC 11 branch as well
(but there as well keep the compatibility).

Richard.

>
> Rasmus


Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 3:10 PM Martin Liška  wrote:
>
> On 11/30/21 12:17, Richard Biener wrote:
> > On Mon, Nov 29, 2021 at 1:45 PM Martin Liška  wrote:
> >>
> >> On 11/26/21 09:12, Richard Biener wrote:
> >>> On Wed, Nov 24, 2021 at 3:32 PM Martin Liška  wrote:
> 
>  On 11/24/21 15:14, Martin Liška wrote:
> > It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that..
> 
>  Fixed that in the updated version.
> >>>
> >>> Function level comments need updating it seems.
> >>
> >> I've done that.
> >>
> >>>
> >>> +static unsigned
> >>> +evaluate_insns (class loop *loop,  basic_block *bbs,
> >>> +   predicate_vector &predicate_path,
> >>> +   auto_bb_flag &reachable_flag)
> >>> +{
> >>> +  auto_vec worklist (loop->num_nodes);
> >>> +  worklist.quick_push (bbs[0]);
> >>> ...
> >>>
> >>> so when adding gswitch support the easiest way to make
> >>>
> >>> +  FOR_EACH_EDGE (e, ei, bb->succs)
> >>> +   {
> >>> ...
> >>> +   {
> >>> + worklist.safe_push (dest);
> >>> + dest->flags |= reachable_flag;
> >>>
> >>> work is when the gcond/gswitch simplification would mark
> >>> outgoing edges as (non-)executable.  For gswitch this
> >>> could be achieved by iterating over the case labels and
> >>> intersecting that with the range while for gcond it's a
> >>> matter of setting an edge flag instead of returning true/false.
> >>
> >> Exactly, it can be quite naturally added to the current patch.
> >>
> >>> I'd call the common function evaluate_control_stmt_using_entry_checks
> >>> or so and invoke it on the last stmt of a block with >= 2 outgoing
> >>> edges.
> >>
> >> Yes, I'll do it for the gswitch support patch.
> >>
> >>>
> >>> We still seem to do the simplification work twice, once for costing
> >>> and once for transform, but that's OK for now I guess.
> >>>
> >>> I think you want to clear_aux_for_blocks at the end of the pass.
> >>
> >> Called that.
> >>
> >>>
> >>> Otherwise I like it - it seems you have some TODO around cost
> >>> modeling.  Did you try to do gswitch support ontop as I suggested
> >>> to see if the general structure keeps working?
> >>
> >> I vanished and tested the patch. No, I don't have the gswitch support patch
> >> as the current patch was reworked a few times.
> >>
> >> Can we please progress and have installed the suggested patch?
> >
> > I'd like to see the gswitch support - that's what was posted before stage3
> > close, this patch on its own doesn't seem worth pushing for.  That said,
> > I have some comments below (and the already raised ones about how
> > things might need to change with gswitch support).  Is it so difficult to
> > develop gswitch support as a separate change ontop of this?
>
> I'm going to work on that in the upcoming days. When are you leaving for the 
> Christmas
> holidays :P ?

Wednesday next week is my first day off, but I might peek into mails
now and then.

> >
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >
> > +#include 
> >
> > that's included unconditionally by system.h
>
> Good.
>
> >
> > +/* The type represents a predicate path leading to a basic block.  */
> > +typedef auto_vec> predicate_vector;
> >
> > +static bool tree_unswitch_single_loop (class loop *, int,
> > +  predicate_vector &predicate_path,
> >
> > I think we don't want to pass auto_vec by reference, instead auto_vec should
> > decay to vec<> when passed around.
>
> Ok.
>
> >
> > +  unswitch_predicate *predicate = new unswitch_predicate (cond, lhs);
> > +  if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs))
> > +{
> > +  ranger->range_on_edge (predicate->true_range, edge_true, lhs);
> > +  predicate->false_range = predicate->true_range;
> >
> > -  return cond;
> > +  if (!predicate->false_range.varying_p ()
> > + && !predicate->false_range.undefined_p ())
> > +   predicate->false_range.invert ();
> > +}
> >
> > is that correct?  I would guess range_on_edge, for
> >
> > if (a > 10)
> >   if (a < 15)
> >  /* true */
> >   else
> >  /* false */
> >
> > figures [11, 14] on the true edge of if (a < 15) (considered the
> > unswitch predicate),
> > inverting that yields [0, 10] u [15, +INF] but that's at least
> > sub-optimal for the
> > else range.  I think we want to call range_on_edge again to determine the 
> > range
> > on the else branch?
>
> No, as shown in the previous emails, Ranger is CFG sensitive and we should 
> rely
> on our irange merging.

I don't understand.  You do

> > +  unswitch_predicate *predicate = new unswitch_predicate (cond, lhs);
> > +  if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs))
> > +{
> > +  ranger->range_on_edge (predicate->true_range, edge_true, lhs);
> > +  predicate->false_range = predicate->true_range;
> >
> > -  return cond;
> > +  if (!predicate->false_range.varying_p ()
> > + 

Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Rasmus Villemoes via Gcc-patches
On 01/12/2021 14.17, Richard Biener wrote:
> On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes  
> wrote:
>>
>> The transposition nolto -> notlo is confusing and it makes the long
>> name even harder to read than it already is - I kept reading it as
>> "not lo" until I realized it was a simply typo.
>>
>> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)
> 
> Hmm, if we ever released GCC with that lto-plugin behavior you should preserve
> accepting -linker-output-auto-notlo-rel in lto-plugin because a newer 
> lto-plugin
> is required to work with older released GCC which continue to provide
> the option with the spelling mistake.

I thought a given version of gcc would only use the plugin it shipped
with(?).

And even if in theory one should keep the plugin compatible with older
gcc releases, in this case it is only relevant to vxworks ports, so if
Eric and Olivier are ok with this as-is, perhaps we can avoid having to
accept both spellings forever.

Rasmus


Re: [PING, PATCH] darwin, d: Support outfile substitution for liphobos

2021-12-01 Thread Richard Biener via Gcc-patches
On Tue, Nov 30, 2021 at 6:04 PM Iain Buclaw via Gcc-patches
 wrote:
>
> Ping.
>
> Are the common gcc parts OK (also for backporting)?

Yes, OK for backporting as well after a short burn in period.

Richard,

> Iain.
>
> Excerpts from Iain Buclaw's message of November 26, 2021 1:51 pm:
> > Excerpts from Iain Sandoe's message of November 19, 2021 10:21 am:
> >> Hi Iain
> >>
> >>> On 19 Nov 2021, at 08:32, Iain Buclaw  wrote:
> >>
> >>> This patch fixes a stage2 bootstrap failure in the D front-end on
> >>> darwin due to libgphobos being dynamically linked despite
> >>> -static-libphobos being on the command line.
> >>>
> >>> In the gdc driver, this takes the previous fix for the Darwin D
> >>> bootstrap, and extends it to the -static-libphobos option as well.
> >>> Rather than pushing the -static-libphobos option back onto the command
> >>> line, the setting of SKIPOPT is instead conditionally removed.  The same
> >>> change has been repeated for -static-libstdc++ so there is now no need
> >>> to call generate_option to re-add it.
> >>>
> >>> In the gcc driver, -static-libphobos has been added as a common option,
> >>> validated, and a new outfile substition added to config/darwin.h to
> >>> correctly replace -lgphobos with libgphobos.a.
> >>>
> >>> Bootstrapped and regression tested on x86_64-linux-gnu and
> >>> x86_64-apple-darwin20.
> >>>
> >>> OK for mainline?  This would also be fine for gcc-11 release branch too,
> >>> as well as earlier releases with D support.
> >>
> >> the Darwin parts are fine, thanks
> >>
> >> The SKIPOPT in d-spec, presumably means “skip removing this opt”?
> >> otherwise the #ifndef looks odd (because of the 
> >> static-libgcc|static-libphobos,
> >> darwin.h would do the substitution for -static-libgcc as well, so it’s not 
> >> a 100%
> >> test).
> >>
> >
> > I've only just realised what you meant.  Yes you are of course right,
> > and it should have been #ifdef, attaching a fixed-up patch.
> >
> > Iain.
> >
> > ---
> > gcc/ChangeLog:
> >
> > * common.opt (static-libphobos): Add option.
> > * config/darwin.h (LINK_SPEC): Substitute -lgphobos with 
> > libgphobos.a
> > when linking statically.
> > * gcc.c (driver_handle_option): Set -static-libphobos as always 
> > valid.
> >
> > gcc/d/ChangeLog:
> >
> > * d-spec.cc (lang_specific_driver): Set SKIPOPT on 
> > -static-libstdc++
> > and -static-libphobos only when target supports 
> > LD_STATIC_DYNAMIC.
> > Remove generate_option to re-add -static-libstdc++.
> >
> > libphobos/ChangeLog:
> >
> > * testsuite/testsuite_flags.in: Add libphobos library directory 
> > as
> > search path to --gdcldflags.
> >
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index db6010e4e20..73c12d933f3 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -3527,6 +3527,10 @@ static-libgfortran
> >  Driver
> >  ; Documented for Fortran, but always accepted by driver.
> >
> > +static-libphobos
> > +Driver
> > +; Documented for D, but always accepted by driver.
> > +
> >  static-libstdc++
> >  Driver
> >
> > diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
> > index 7ed01efa694..c4ddd623e8b 100644
> > --- a/gcc/config/darwin.h
> > +++ b/gcc/config/darwin.h
> > @@ -443,6 +443,7 @@ extern GTY(()) int darwin_ms_struct;
> >   %:replace-outfile(-lobjc libobjc-gnu.a%s); \
> >  :%:replace-outfile(-lobjc -lobjc-gnu )}}\
> > %{static|static-libgcc|static-libgfortran:%:replace-outfile(-lgfortran 
> > libgfortran.a%s)}\
> > +   %{static|static-libgcc|static-libphobos:%:replace-outfile(-lgphobos 
> > libgphobos.a%s)}\
> > 
> > %{static|static-libgcc|static-libstdc++|static-libgfortran:%:replace-outfile(-lgomp
> >  libgomp.a%s)}\
> > %{static|static-libgcc|static-libstdc++:%:replace-outfile(-lstdc++ 
> > libstdc++.a%s)}\
> > %{force_cpusubtype_ALL:-arch %(darwin_arch)} \
> > diff --git a/gcc/d/d-spec.cc b/gcc/d/d-spec.cc
> > index b12d28f1047..1304126a675 100644
> > --- a/gcc/d/d-spec.cc
> > +++ b/gcc/d/d-spec.cc
> > @@ -253,13 +253,23 @@ lang_specific_driver (cl_decoded_option 
> > **in_decoded_options,
> >
> >   case OPT_static_libstdc__:
> > saw_static_libcxx = true;
> > +#ifdef HAVE_LD_STATIC_DYNAMIC
> > +   /* Remove -static-libstdc++ from the command only if target supports
> > +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> > +  back-end target can use outfile substitution.  */
> > args[i] |= SKIPOPT;
> > +#endif
> > break;
> >
> >   case OPT_static_libphobos:
> > if (phobos_library != PHOBOS_NOLINK)
> >   phobos_library = PHOBOS_STATIC;
> > +#ifdef HAVE_LD_STATIC_DYNAMIC
> > +   /* Remove -static-libphobos from the command only if target supports
> > +  LD_STATIC_DYNAMIC.  When not supported, it is left in so that a
> > +  back-end target can use outfile substi

Re: [PATCH] Loop unswitching: support gswitch statements.

2021-12-01 Thread Martin Liška

On 11/30/21 12:17, Richard Biener wrote:

On Mon, Nov 29, 2021 at 1:45 PM Martin Liška  wrote:


On 11/26/21 09:12, Richard Biener wrote:

On Wed, Nov 24, 2021 at 3:32 PM Martin Liška  wrote:


On 11/24/21 15:14, Martin Liška wrote:

It likely miscompiles gcc.dg/loop-unswitch-5.c, working on that..


Fixed that in the updated version.


Function level comments need updating it seems.


I've done that.



+static unsigned
+evaluate_insns (class loop *loop,  basic_block *bbs,
+   predicate_vector &predicate_path,
+   auto_bb_flag &reachable_flag)
+{
+  auto_vec worklist (loop->num_nodes);
+  worklist.quick_push (bbs[0]);
...

so when adding gswitch support the easiest way to make

+  FOR_EACH_EDGE (e, ei, bb->succs)
+   {
...
+   {
+ worklist.safe_push (dest);
+ dest->flags |= reachable_flag;

work is when the gcond/gswitch simplification would mark
outgoing edges as (non-)executable.  For gswitch this
could be achieved by iterating over the case labels and
intersecting that with the range while for gcond it's a
matter of setting an edge flag instead of returning true/false.


Exactly, it can be quite naturally added to the current patch.


I'd call the common function evaluate_control_stmt_using_entry_checks
or so and invoke it on the last stmt of a block with >= 2 outgoing
edges.


Yes, I'll do it for the gswitch support patch.



We still seem to do the simplification work twice, once for costing
and once for transform, but that's OK for now I guess.

I think you want to clear_aux_for_blocks at the end of the pass.


Called that.



Otherwise I like it - it seems you have some TODO around cost
modeling.  Did you try to do gswitch support ontop as I suggested
to see if the general structure keeps working?


I vanished and tested the patch. No, I don't have the gswitch support patch
as the current patch was reworked a few times.

Can we please progress and have installed the suggested patch?


I'd like to see the gswitch support - that's what was posted before stage3
close, this patch on its own doesn't seem worth pushing for.  That said,
I have some comments below (and the already raised ones about how
things might need to change with gswitch support).  Is it so difficult to
develop gswitch support as a separate change ontop of this?


I'm going to work on that in the upcoming days. When are you leaving for the 
Christmas
holidays :P ?




Patch can bootstrap on x86_64-linux-gnu and survives regression tests.


+#include 

that's included unconditionally by system.h


Good.



+/* The type represents a predicate path leading to a basic block.  */
+typedef auto_vec> predicate_vector;

+static bool tree_unswitch_single_loop (class loop *, int,
+  predicate_vector &predicate_path,

I think we don't want to pass auto_vec by reference, instead auto_vec should
decay to vec<> when passed around.


Ok.



+  unswitch_predicate *predicate = new unswitch_predicate (cond, lhs);
+  if (irange::supports_type_p (TREE_TYPE (lhs)) && CONSTANT_CLASS_P (rhs))
+{
+  ranger->range_on_edge (predicate->true_range, edge_true, lhs);
+  predicate->false_range = predicate->true_range;

-  return cond;
+  if (!predicate->false_range.varying_p ()
+ && !predicate->false_range.undefined_p ())
+   predicate->false_range.invert ();
+}

is that correct?  I would guess range_on_edge, for

if (a > 10)
  if (a < 15)
 /* true */
  else
 /* false */

figures [11, 14] on the true edge of if (a < 15) (considered the
unswitch predicate),
inverting that yields [0, 10] u [15, +INF] but that's at least
sub-optimal for the
else range.  I think we want to call range_on_edge again to determine the range
on the else branch?


No, as shown in the previous emails, Ranger is CFG sensitive and we should rely
on our irange merging.



  }

-/* Simplifies COND using checks in front of the entry of the LOOP.  Just very
-   simplish (sufficient to prevent us from duplicating loop in unswitching
-   unnecessarily).  */
+static void
+combine_range (predicate_vector &predicate_path, tree index, irange
&path_range)
+{

unless I misread the patch combine_range misses a comment.

+evaluate_control_stmt_using_entry_checks (gimple *stmt,
+ predicate_vector &predicate_path)
  {

so this function for ranger does combine all predicates on the predicate_path
but for the symbolic evaluation it looks at the last predicate only?


Yes.


 I guess
that's because other predicate simplification opportunities are applied already,
correct?


Exactly.


 But doesn't that mean that the combine_range could be done once
when we build the predicate vector instead of for each stmt?  I'm just
looking at the difference in treating both cases - if we first analyze the whole
unswitching path (including all recursions) then we'd have to simplify all
opportunities at once, so iterating over all 

[committed] testsuite: Fix typo in comment in aapcs64 test

2021-12-01 Thread Alex Coplan via Gcc-patches
Committed as obvious.

Thanks,
Alex

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/aapcs64/macro-def.h (PTR): Fix typo in
comment.

diff --git a/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h 
b/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h
index 72a47067631..e681de60778 100644
--- a/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h
+++ b/gcc/testsuite/gcc.target/aarch64/aapcs64/macro-def.h
@@ -63,7 +63,7 @@
 #define ANON_PROMOTED(type,val,type_promoted, val_promoted, offset,...)
\
   ANON(type_promoted, val_promoted, offset, __VA_ARGS__)
 /* Composite larger than 16 bytes is replaced by a pointer to a copy prepared
-   by the caller, so here we extrat the pointer, deref it and compare the
+   by the caller, so here we extract the pointer, deref it and compare the
content with that of the original one.  */
 #define PTR(type, val, offset, ...) {  \
   type * ptr;  \


Re: [PATCH] configure, d: Add support for bootstrapping the D front-end

2021-12-01 Thread Thomas Schwinge
Hi!

On 2021-10-29T17:37:44-0400, Eric Gallager via Gcc-patches 
 wrote:
> On Thu, Oct 28, 2021 at 2:38 PM Jeff Law via Gcc-patches
>  wrote:
>> On 10/9/2021 7:32 AM, Iain Buclaw via Gcc-patches wrote:
>> > The implementation of the D front-end in GCC is based on the original
>> > C++ version of the D programming language compiler, which was ported to
>> > D itself in version 2.069.0 (released in 2015).  [...]
>> > It has come to the point now that I'm happy enough with the process to
>> > switch out the C++ sources in gcc/d/dmd with D sources.

Congratulations: self-hosting compiler!

That, of course, makes the GCC/D compilation process more difficult:

>> Presumably this means that the only way to build D for the first time on
>> a new target is to cross from an existing target that supports D, right?
>>
>> I think that's not unreasonable and I don't think we want to increase
>> the burden of maintaining an old codebase just for the sake of a
>> marginally easier bootstrap process for a new target.

(Some may argue "burden" vs. "marginally", but yes, I agree.)  Plus, the
problem that also for non-cross (native) builds, you now have a baseline
GCC/D compiler requirement:

> There should be some sort of note about this in the documentation,
> IMO; both install.texi

That has been done.

+In order to build GDC, the D compiler, you need a working GDC
+compiler (GCC version 9.1 or later), as the D front end is written in D.

> and the "Caveats" section of
> gcc-12/changes.html (and possibly other places).

That not yet, but yes, I agree that should be done, too.


So it's now a requirement to build/bootstrap GCC/D with GCC 9.1 (or
newer, of course) -- which is quite different from the current GCC 4.8
requirement for all other GCC parts (including self-hosted GCC/Ada, by
the way, which also is happy with GCC 4.8).

I'm one of those (few, I guess?) doing bootstrap verification builds with
actual old GCC 4.8 ("gcc-4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4",
precisely), and that now obviously doesn't cover GCC/D anymore, for
example with '--enable-languages=all' (and without the GDC 4.8 packages
installed):

[...]
+checking for gdc... no
[...]
+configure: WARNING: GDC is required to build d
 configure: WARNING: --enable-host-shared required to build jit
-The following languages will be built: 
c,ada,c++,d,fortran,go,lto,objc,obj-c++
+The following languages will be built: 
c,ada,c++,fortran,go,lto,objc,obj-c++
 *** This configuration is not supported in the following subdirectories:
- target-liboffloadmic
+ target-libphobos target-zlib target-liboffloadmic
 (Any other directories should still work fine.)
[...]

..., or error for explicit '--enable-languages=d'.

So I built myself a stock GCC 9.1 with '--enable-languages=d', and via
'[...]/configure [...] CC=gcc-4.8 CXX=g++-4.8 GDC=[GCC 9.1]/bin/gdc [...]'
I'm able to successfully bootstrap GCC, including GCC/D, in a mixed
GCC 4.8/9.1 configuration, with subsequent good-looking 'make check-d'
results.  Per my superficial review of the build log file, the
'[GCC 9.1]/bin/gdc' indeed is only used during stage 1 build, as it
should be.  So that's good enough as far as I'm concerned, and unless
anyone sees any reason why such a mixed GCC 4.8/9.1 configuration would
be bad, may it be worth putting such information (may 'configure' with
'GDC=[GCC 9.1]/bin/gdc') into the documentation, too?


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] Allow loop crossing paths in back threader copier.

2021-12-01 Thread Richard Biener via Gcc-patches
On Tue, Nov 30, 2021 at 4:48 PM Aldy Hernandez via Gcc-patches
 wrote:
>
> We are currently restricting loop crossing paths in the generic copier
> used by the back threader, but we should be able to handle them after
> loop_done has completed.

But this isn't a cost thing but a correctness (as in, can we update
loops properly)
thing.  We are passing 'loop' down to copy_bbs and very much rely on
seeing either BBs of that loop or copying complete subloops of it.

You will likely see excess loop fixup caused by this or ICEs in case the
caller of this function will not always set LOOPS_NEED_FIXUP
(not that copy_bbs handling of loops is perfect).

> This fixes the PR at -O2, though the problem remains at -O1 because we
> have no threaders smart enough to elide the undefined read.  DOM3 could
> be a candidate when it is converted to either a hybrid threader or
> replaced with the backward threader (when ranger can handle floats).
>
> Tested on x86-64 Linux.
>
> OK for trunk?
>
> PR tree-optimization/80548
>
> gcc/ChangeLog:
>
> * attribs.c (sorted_attr_string): Add assert for -Wstringop-overread.
> * tree-ssa-threadupdate.c
> (back_jt_path_registry::duplicate_thread_path): Allow paths that
> cross loops after loop_done.
> (back_jt_path_registry::update_cfg): Diagnose dropped threads
> after duplicate_thread_path.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr80548.c: New test.
> ---
>  gcc/attribs.c  |  1 +
>  gcc/testsuite/gcc.dg/pr80548.c | 23 +++
>  gcc/tree-ssa-threadupdate.c| 19 +++
>  3 files changed, 35 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr80548.c
>
> diff --git a/gcc/attribs.c b/gcc/attribs.c
> index c252f5af07b..9a079b8405a 100644
> --- a/gcc/attribs.c
> +++ b/gcc/attribs.c
> @@ -1035,6 +1035,7 @@ sorted_attr_string (tree arglist)
>attr_str[str_len_sum + len] = TREE_CHAIN (arg) ? ',' : '\0';
>str_len_sum += len + 1;
>  }
> +  gcc_assert (arglist);
>
>/* Replace "=,-" with "_".  */
>for (i = 0; i < strlen (attr_str); i++)
> diff --git a/gcc/testsuite/gcc.dg/pr80548.c b/gcc/testsuite/gcc.dg/pr80548.c
> new file mode 100644
> index 000..232743e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr80548.c
> @@ -0,0 +1,23 @@
> +// { dg-do compile }
> +// { dg-options "-O2 -Wuninitialized" }
> +
> +int g (void);
> +void h (int, int);
> +
> +void f (int b)
> +{
> +  int x, y;
> +
> +  if (b)
> +{
> +  x = g ();
> +  y = g ();
> +}
> +
> +  while (g ())
> +if (b)
> +  {
> +h (x, y); // { dg-bogus "uninit" }
> +y = g ();
> +  }
> +}
> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
> index 8aac733ac25..b194c11e23d 100644
> --- a/gcc/tree-ssa-threadupdate.c
> +++ b/gcc/tree-ssa-threadupdate.c
> @@ -2410,13 +2410,14 @@ back_jt_path_registry::duplicate_thread_path (edge 
> entry,
>   missuses of the functions.  I.e. if you ask to copy something weird,
>   it will work, but the state of structures probably will not be
>   correct.  */
> -  for (i = 0; i < n_region; i++)
> -{
> -  /* We do not handle subloops, i.e. all the blocks must belong to the
> -same loop.  */
> -  if (region[i]->loop_father != loop)
> -   return false;
> -}
> +  if (!(cfun->curr_properties & PROP_loop_opts_done))
> +for (i = 0; i < n_region; i++)
> +  {
> +   /* We do not handle subloops, i.e. all the blocks must belong to the
> +  same loop.  */
> +   if (region[i]->loop_father != loop)
> + return false;
> +  }
>
>initialize_original_copy_tables ();
>
> @@ -2651,9 +2652,11 @@ back_jt_path_registry::update_cfg (bool 
> /*peel_loop_headers*/)
>   visited_starting_edges.add (entry);
>   retval = true;
>   m_num_threaded_edges++;
> + path->release ();
> }
> +  else
> +   cancel_thread (path, "Failure in duplicate_thread_path");
>
> -  path->release ();
>m_paths.unordered_remove (0);
>free (region);
>  }
> --
> 2.31.1
>


Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 2:25 PM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
> >  wrote:
> >>
> >> When checking for compatible stmts, vect_build_slp_tree_1 did:
> >>
> >>&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
> >> && (first_stmt_code == ARRAY_REF
> >> || first_stmt_code == BIT_FIELD_REF
> >> || first_stmt_code == INDIRECT_REF
> >> || first_stmt_code == COMPONENT_REF
> >> || first_stmt_code == MEM_REF)))
> >>
> >> That is, it allowed any rhs_code as long as the first_stmt_code
> >> looked valid.  This had the effect of allowing IFN_MASK_LOAD
> >> to be paired with an earlier non-call code (but didn't allow
> >> the reverse).
> >>
> >> This patch makes the check symmetrical.
> >>
> >> Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
> >> this seem like the right approach?
> >
> > It's indeed a too weak comparison of the classification of the first
> > and the followup operands, some larger refactoring is probably
> > needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS
> > of the followup against the tree codes of the first stmt but also later
> > compare first_stmt_load_p against load_p).
> >
> > The proposed patch looks reasonable (but then we could drop
> > the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),
>
> Yeah, was wondering about that.  Seemed safer to keep it, since without
> it we might pair non-memory BIT_FIELD_REFs with other things.  I guess
> the same goes for the first stmt though, and the mismatch ought to be
> caught later anyway.

Hmm, yeah.  I guess some classify_stmt () returning a custom enum
and comparing that might clean up this whole thing.  But nothing for stage3.

Richard.

> > so OK.
>
> Thanks,
> Richard
>
> > Thanks,
> > Richard.
> >
> >> Richard
> >>
> >>
> >> gcc/
> >> PR tree-optimization/103517
> >> * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two
> >> different component references, check the codes of both them,
> >> rather than just the first.
> >>
> >> gcc/testsuite/
> >> PR tree-optimization/103517
> >> * gcc.dg/vect/pr103517.c: New test.
> >> ---
> >>  gcc/testsuite/gcc.dg/vect/pr103517.c | 13 +
> >>  gcc/tree-vect-slp.c  |  7 ++-
> >>  2 files changed, 19 insertions(+), 1 deletion(-)
> >>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c
> >>
> >> diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c 
> >> b/gcc/testsuite/gcc.dg/vect/pr103517.c
> >> new file mode 100644
> >> index 000..de87fc48f84
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c
> >> @@ -0,0 +1,13 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
> >> i?86-*-* } } */
> >> +
> >> +int a;
> >> +short b, c;
> >> +extern short d[];
> >> +void e() {
> >> +  for (short f = 1; f < (short)a; f += 2)
> >> +if (d[f + 1]) {
> >> +  b = d[f];
> >> +  c = d[f + 1];
> >> +}
> >> +}
> >> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> >> index 7bff5118bd0..bc22ffeed82 100644
> >> --- a/gcc/tree-vect-slp.c
> >> +++ b/gcc/tree-vect-slp.c
> >> @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned 
> >> char *swap,
> >> || first_stmt_code == BIT_FIELD_REF
> >> || first_stmt_code == INDIRECT_REF
> >> || first_stmt_code == COMPONENT_REF
> >> -   || first_stmt_code == MEM_REF)))
> >> +   || first_stmt_code == MEM_REF)
> >> +   && (rhs_code == ARRAY_REF
> >> +   || rhs_code == BIT_FIELD_REF
> >> +   || rhs_code == INDIRECT_REF
> >> +   || rhs_code == COMPONENT_REF
> >> +   || rhs_code == MEM_REF)))
> >>   || first_stmt_load_p != load_p
> >>   || first_stmt_phi_p != phi_p)
> >> {
> >> --
> >> 2.25.1
> >>


Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]

2021-12-01 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> When checking for compatible stmts, vect_build_slp_tree_1 did:
>>
>>&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
>> && (first_stmt_code == ARRAY_REF
>> || first_stmt_code == BIT_FIELD_REF
>> || first_stmt_code == INDIRECT_REF
>> || first_stmt_code == COMPONENT_REF
>> || first_stmt_code == MEM_REF)))
>>
>> That is, it allowed any rhs_code as long as the first_stmt_code
>> looked valid.  This had the effect of allowing IFN_MASK_LOAD
>> to be paired with an earlier non-call code (but didn't allow
>> the reverse).
>>
>> This patch makes the check symmetrical.
>>
>> Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
>> this seem like the right approach?
>
> It's indeed a too weak comparison of the classification of the first
> and the followup operands, some larger refactoring is probably
> needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS
> of the followup against the tree codes of the first stmt but also later
> compare first_stmt_load_p against load_p).
>
> The proposed patch looks reasonable (but then we could drop
> the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),

Yeah, was wondering about that.  Seemed safer to keep it, since without
it we might pair non-memory BIT_FIELD_REFs with other things.  I guess
the same goes for the first stmt though, and the mismatch ought to be
caught later anyway.

> so OK.

Thanks,
Richard

> Thanks,
> Richard.
>
>> Richard
>>
>>
>> gcc/
>> PR tree-optimization/103517
>> * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two
>> different component references, check the codes of both them,
>> rather than just the first.
>>
>> gcc/testsuite/
>> PR tree-optimization/103517
>> * gcc.dg/vect/pr103517.c: New test.
>> ---
>>  gcc/testsuite/gcc.dg/vect/pr103517.c | 13 +
>>  gcc/tree-vect-slp.c  |  7 ++-
>>  2 files changed, 19 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c 
>> b/gcc/testsuite/gcc.dg/vect/pr103517.c
>> new file mode 100644
>> index 000..de87fc48f84
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c
>> @@ -0,0 +1,13 @@
>> +/* { dg-do compile } */
>> +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
>> i?86-*-* } } */
>> +
>> +int a;
>> +short b, c;
>> +extern short d[];
>> +void e() {
>> +  for (short f = 1; f < (short)a; f += 2)
>> +if (d[f + 1]) {
>> +  b = d[f];
>> +  c = d[f + 1];
>> +}
>> +}
>> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
>> index 7bff5118bd0..bc22ffeed82 100644
>> --- a/gcc/tree-vect-slp.c
>> +++ b/gcc/tree-vect-slp.c
>> @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
>> *swap,
>> || first_stmt_code == BIT_FIELD_REF
>> || first_stmt_code == INDIRECT_REF
>> || first_stmt_code == COMPONENT_REF
>> -   || first_stmt_code == MEM_REF)))
>> +   || first_stmt_code == MEM_REF)
>> +   && (rhs_code == ARRAY_REF
>> +   || rhs_code == BIT_FIELD_REF
>> +   || rhs_code == INDIRECT_REF
>> +   || rhs_code == COMPONENT_REF
>> +   || rhs_code == MEM_REF)))
>>   || first_stmt_load_p != load_p
>>   || first_stmt_phi_p != phi_p)
>> {
>> --
>> 2.25.1
>>


Re: [PATCH] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 12:08 PM Rasmus Villemoes  
wrote:
>
> The transposition nolto -> notlo is confusing and it makes the long
> name even harder to read than it already is - I kept reading it as
> "not lo" until I realized it was a simply typo.
>
> Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)

Hmm, if we ever released GCC with that lto-plugin behavior you should preserve
accepting -linker-output-auto-notlo-rel in lto-plugin because a newer lto-plugin
is required to work with older released GCC which continue to provide
the option with the spelling mistake.

Richard.

> lto-plugin/
> * lto-plugin.c: Fix -linker-output-auto-notlo-rel ->
> -linker-output-auto-nolto-rel typo.
> (process_option): Adjust accordingly.
>
> gcc/
> * config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected
> spelling of -linker-output-auto-nolto-rel.
> ---
>  gcc/config/vxworks.h| 2 +-
>  lto-plugin/lto-plugin.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h
> index e41f16a51e8..bddf2c37f42 100644
> --- a/gcc/config/vxworks.h
> +++ b/gcc/config/vxworks.h
> @@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx 
> begin, rtx end);
> further incremental LTO linking.  We do not do repeated incremental 
> linking
> so silence the warning (instead of passing -flinker-output=nolto-rel).  */
>  #undef LTO_PLUGIN_SPEC
> -#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}"
> +#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}"
> diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
> index 6ab9822f369..40d7b7b8692 100644
> --- a/lto-plugin/lto-plugin.c
> +++ b/lto-plugin/lto-plugin.c
> @@ -32,7 +32,7 @@ along with this program; see the file COPYING3.  If not see
> -nop: Instead of running lto-wrapper, pass the original to the plugin. 
> This
> only works if the input files are hybrid.
> -linker-output-known: Do not determine linker output
> -   -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without
> +   -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without
> warning.  This is used on systems like VxWorks (kernel) where the link is
> always partial and repeated incremental linking is generally not used.
> -sym-style={none,win32,underscore|uscore}
> @@ -1321,7 +1321,7 @@ process_option (const char *option)
>  {
>if (strcmp (option, "-linker-output-known") == 0)
>  linker_output_known = true;
> -  else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0)
> +  else if (strcmp (option, "-linker-output-auto-nolto-rel") == 0)
>  linker_output_auto_nolto_rel = true;
>else if (strcmp (option, "-debug") == 0)
>  debug = true;
> --
> 2.31.1
>


Re: [PATCH] vect: Tighten check for SLP memory groups [PR103517]

2021-12-01 Thread Richard Biener via Gcc-patches
On Wed, Dec 1, 2021 at 11:56 AM Richard Sandiford via Gcc-patches
 wrote:
>
> When checking for compatible stmts, vect_build_slp_tree_1 did:
>
>&& !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
> && (first_stmt_code == ARRAY_REF
> || first_stmt_code == BIT_FIELD_REF
> || first_stmt_code == INDIRECT_REF
> || first_stmt_code == COMPONENT_REF
> || first_stmt_code == MEM_REF)))
>
> That is, it allowed any rhs_code as long as the first_stmt_code
> looked valid.  This had the effect of allowing IFN_MASK_LOAD
> to be paired with an earlier non-call code (but didn't allow
> the reverse).
>
> This patch makes the check symmetrical.
>
> Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
> this seem like the right approach?

It's indeed a too weak comparison of the classification of the first
and the followup operands, some larger refactoring is probably
needed to improve here (note how we compare STMT_VINFO_GROUPED_ACCESS
of the followup against the tree codes of the first stmt but also later
compare first_stmt_load_p against load_p).

The proposed patch looks reasonable (but then we could drop
the STMT_VINFO_GROUPED_ACCESS (stmt_info) part of the check?),
so OK.

Thanks,
Richard.

> Richard
>
>
> gcc/
> PR tree-optimization/103517
> * tree-vect-slp.c (vect_build_slp_tree_1): When allowing two
> different component references, check the codes of both them,
> rather than just the first.
>
> gcc/testsuite/
> PR tree-optimization/103517
> * gcc.dg/vect/pr103517.c: New test.
> ---
>  gcc/testsuite/gcc.dg/vect/pr103517.c | 13 +
>  gcc/tree-vect-slp.c  |  7 ++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c 
> b/gcc/testsuite/gcc.dg/vect/pr103517.c
> new file mode 100644
> index 000..de87fc48f84
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr103517.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
> i?86-*-* } } */
> +
> +int a;
> +short b, c;
> +extern short d[];
> +void e() {
> +  for (short f = 1; f < (short)a; f += 2)
> +if (d[f + 1]) {
> +  b = d[f];
> +  c = d[f + 1];
> +}
> +}
> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
> index 7bff5118bd0..bc22ffeed82 100644
> --- a/gcc/tree-vect-slp.c
> +++ b/gcc/tree-vect-slp.c
> @@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
> *swap,
> || first_stmt_code == BIT_FIELD_REF
> || first_stmt_code == INDIRECT_REF
> || first_stmt_code == COMPONENT_REF
> -   || first_stmt_code == MEM_REF)))
> +   || first_stmt_code == MEM_REF)
> +   && (rhs_code == ARRAY_REF
> +   || rhs_code == BIT_FIELD_REF
> +   || rhs_code == INDIRECT_REF
> +   || rhs_code == COMPONENT_REF
> +   || rhs_code == MEM_REF)))
>   || first_stmt_load_p != load_p
>   || first_stmt_phi_p != phi_p)
> {
> --
> 2.25.1
>


Re: [PATCH RFA (fold/symtab)] c++: constexpr, fold, weak redecl, fp/0 [PR103310]

2021-12-01 Thread Marek Polacek via Gcc-patches
On Tue, Nov 30, 2021 at 09:09:41PM -0500, Jason Merrill via Gcc-patches wrote:
> For PR61825, honza changed tree_single_nonzero_warnv_p to prevent a later
> declaration from marking a function as weak after we've determined that it
> wasn't weak before.  But we shouldn't do that for speculative folding; we
> should only do it when we actually need a constant value.  In C++, such a
> context is called "manifestly constant-evaluated".  In fold, this seems to
> correspond to the folding_initializer flag, since in C this situation only
> occurs in static initializers.
> 
> This change makes nonzero-1.c well-formed; I've added a nonzero-1a.c to
> verify that we delete the null check eventually if there is no weak
> redeclaration.
> 
> The varasm.c change is so that if we do get the weak redeclaration error, we
> get it at the position of the weak declaration rather than the previous
> declaration.
> 
> Using the FOLD_INIT paths also affects floating point arithmetic: notably,
> this makes floating point division by zero in a manifestly
> constant-evaluated context constant, as in a C static initializer.  I've had
> some success convincing CWG that this is the right direction; C++ should
> follow C's floating point semantics more than we have been doing, and Joseph
> says that the C policy is that Annex F overrides other parts of the standard
> that say that some operations are undefined.  But since we're in stage 3,
> I'm only making this change with the new flag -fconstexpr-fp-except.  It may
> turn on by default in a future release.
> 
> I think this distinction is only relevant for binary operations; arithmetic
> for the floating point case, comparison for possibly non-zero addresses.
> 
> Tested x86_64-pc-linux-gnu, OK for trunk?

[...]

> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 4b8a094b206..5654f044ae4 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -1615,6 +1615,10 @@ fconstexpr-cache-depth=
>  C++ ObjC++ Joined RejectNegative UInteger Var(constexpr_cache_depth) Init(8)
>  -fconstexpr-cache-depth= Specify maximum constexpr recursion 
> cache depth.
>  
> +fconstexpr-fp-except
> +C++ ObjC++ Var(flag_constexpr_fp_except) Init(0)
> +Allow IEC559 floating point exceptions in constant expressions

Sorry about nitpicking but there's a missing period after 'expressions', which
will break a test.

Marek



[PATCH] fix spelling of -linker-output-auto-nolto-rel

2021-12-01 Thread Rasmus Villemoes
The transposition nolto -> notlo is confusing and it makes the long
name even harder to read than it already is - I kept reading it as
"not lo" until I realized it was a simply typo.

Fixes: 5269b24605b1 (Silence warning in LTO mode on VxWorks)

lto-plugin/
* lto-plugin.c: Fix -linker-output-auto-notlo-rel ->
-linker-output-auto-nolto-rel typo.
(process_option): Adjust accordingly.

gcc/
* config/vxworks.h (LTO_PLUGIN_SPEC): Adapt to corrected
spelling of -linker-output-auto-nolto-rel.
---
 gcc/config/vxworks.h| 2 +-
 lto-plugin/lto-plugin.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/vxworks.h b/gcc/config/vxworks.h
index e41f16a51e8..bddf2c37f42 100644
--- a/gcc/config/vxworks.h
+++ b/gcc/config/vxworks.h
@@ -306,4 +306,4 @@ extern void vxworks_emit_call_builtin___clear_cache (rtx 
begin, rtx end);
further incremental LTO linking.  We do not do repeated incremental linking
so silence the warning (instead of passing -flinker-output=nolto-rel).  */
 #undef LTO_PLUGIN_SPEC
-#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-notlo-rel}"
+#define LTO_PLUGIN_SPEC "%{!mrtp:-plugin-opt=-linker-output-auto-nolto-rel}"
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 6ab9822f369..40d7b7b8692 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -32,7 +32,7 @@ along with this program; see the file COPYING3.  If not see
-nop: Instead of running lto-wrapper, pass the original to the plugin. This
only works if the input files are hybrid. 
-linker-output-known: Do not determine linker output
-   -linker-output-auto-notlo-rel: Switch from rel to nolto-rel mode without
+   -linker-output-auto-nolto-rel: Switch from rel to nolto-rel mode without
warning.  This is used on systems like VxWorks (kernel) where the link is
always partial and repeated incremental linking is generally not used.
-sym-style={none,win32,underscore|uscore}
@@ -1321,7 +1321,7 @@ process_option (const char *option)
 {
   if (strcmp (option, "-linker-output-known") == 0)
 linker_output_known = true;
-  else if (strcmp (option, "-linker-output-auto-notlo-rel") == 0)
+  else if (strcmp (option, "-linker-output-auto-nolto-rel") == 0)
 linker_output_auto_nolto_rel = true;
   else if (strcmp (option, "-debug") == 0)
 debug = true;
-- 
2.31.1



[committed] d: Update documentation of new D language options.

2021-12-01 Thread Iain Buclaw via Gcc-patches
Hi,

This patch adds documentation for the following new D options:

- New switch that controls what code is generated on a contract
  failure (throw or abort).
- New switch that controls mangling of D types in `extern(C++)`
  code, as well as setting the compile-time value of
  `__traits(getTargetInfo "cppStd")`
- New switches that generate C++ headers from D source files.
- New switch to save expanded mixins to a file.
- New switches that now distinguish between D language changes that
  are either (a) an experimental feature or an upcoming breaking
  change, (b) a warning or help on an upcoming change, or (c) revert
  of a change for users who don't want to deal with the breaking
  change for now.

Bootstrapped on x86_64-linux-gnu, and committed to mainline.

Regards,
Iain

---
gcc/d/ChangeLog:

* gdc.texi (Runtime Options): Document -fcheckaction=, -fextern-std=,
-fpreview=, -frevert=.
(Code Generation): Document -fdump-c++-spec=, -fdump-c++-spec-verbose,
-fsave-mixins=.
(Warnings): Update list of supported -ftransitions=.
---
 gcc/d/gdc.texi | 114 ++---
 1 file changed, 109 insertions(+), 5 deletions(-)

diff --git a/gcc/d/gdc.texi b/gcc/d/gdc.texi
index 095f7ecca41..c98eb1f45d9 100644
--- a/gcc/d/gdc.texi
+++ b/gcc/d/gdc.texi
@@ -216,6 +216,20 @@ Don't recognize built-in functions unless they begin with 
the prefix
 @samp{__builtin_}.  By default, the compiler will recognize when a
 function in the @code{core.stdc} package is a built-in function.
 
+@item -fcheckaction=@var{value}
+@cindex @option{-fcheckaction}
+This option controls what code is generated on an assertion, bounds check, or
+final switch failure.  The following values are supported:
+
+@table @samp
+@item context
+Throw an @code{AssertError} with extra context information.
+@item halt
+Halt the program execution.
+@item throw
+Throw an @code{AssertError} (the default).
+@end table
+
 @item -fdebug
 @item -fdebug=@var{value}
 @cindex @option{-fdebug}
@@ -245,6 +259,25 @@ This is equivalent to compiling with the following options:
 gdc -nophoboslib -fno-exceptions -fno-moduleinfo -fno-rtti
 @end example
 
+@item -fextern-std=@var{standard}
+@cindex @option{-fextern-std}
+Sets the C++ name mangling compatibility to the version identified by
+@var{standard}.  The following values are supported:
+
+@table @samp
+@item c++98
+@item c++03
+Sets @code{__traits(getTargetInfo "cppStd")} to @code{199711}.
+@item c++11
+Sets @code{__traits(getTargetInfo "cppStd")} to @code{201103}.
+@item c++14
+Sets @code{__traits(getTargetInfo "cppStd")} to @code{201402}.
+@item c++17
+Sets @code{__traits(getTargetInfo "cppStd")} to @code{201703}.
+@item c++20
+Sets @code{__traits(getTargetInfo "cppStd")} to @code{202002}.
+@end table
+
 @item -fno-invariants
 @cindex @option{-finvariants}
 @cindex @option{-fno-invariants}
@@ -276,6 +309,48 @@ Turns off code generation for postcondition @code{out} 
contracts.
 @cindex @option{-fno-preconditions}
 Turns off code generation for precondition @code{in} contracts.
 
+@item -fpreview=@var{id}
+@cindex @option{-fpreview}
+Turns on an upcoming D language change identified by @var{id}.  The following
+values are supported:
+
+@table @samp
+@item all
+Turns on all upcoming D language features.
+@item dip1000
+Implements @uref{http://wiki.dlang.org/DIP1000} (Scoped pointers).
+@item dip1008
+Implements @uref{http://wiki.dlang.org/DIP1008} (Allow exceptions in
+@code{@@nogc} code).
+@item dip1021
+Implements @uref{http://wiki.dlang.org/DIP1021} (Mutable function arguments).
+@item dip25
+Implements @uref{http://wiki.dlang.org/DIP25} (Sealed references).
+@item dtorfields
+Turns on generation for destructing fields of partially constructed objects.
+@item fieldwise
+Turns on generation of struct equality to use field-wise comparisons.
+@item fixaliasthis
+Implements new lookup rules that check the current scope for @code{alias this}
+before searching in upper scopes.
+@item in
+Implements @code{in} parameters to mean @code{scope const [ref]} and accepts
+rvalues.
+@item inclusiveincontracts
+Implements @code{in} contracts of overridden methods to be a superset of parent
+contract.
+@item intpromote
+Implements C-style integral promotion for unary @code{+}, @code{-} and @code{~}
+expressions.
+@item nosharedaccess
+Turns off and disallows all access to shared memory objects.
+@item rvaluerefparam
+Implements rvalue arguments to @code{ref} parameters.
+@item shortenedmethods
+Implements use of @code{=>} for methods and top-level functions in addition to
+lambdas.
+@end table
+
 @item -frelease
 @cindex @option{-frelease}
 @cindex @option{-fno-release}
@@ -291,6 +366,22 @@ gdc -fno-assert -fbounds-check=safe -fno-invariants \
 -fno-postconditions -fno-preconditions -fno-switch-errors
 @end example
 
+@item -frevert=
+@cindex @option{-frevert}
+Turns off a D language feature identified by @var{

[PATCH] vect: Tighten check for SLP memory groups [PR103517]

2021-12-01 Thread Richard Sandiford via Gcc-patches
When checking for compatible stmts, vect_build_slp_tree_1 did:

   && !(STMT_VINFO_GROUPED_ACCESS (stmt_info)
&& (first_stmt_code == ARRAY_REF
|| first_stmt_code == BIT_FIELD_REF
|| first_stmt_code == INDIRECT_REF
|| first_stmt_code == COMPONENT_REF
|| first_stmt_code == MEM_REF)))

That is, it allowed any rhs_code as long as the first_stmt_code
looked valid.  This had the effect of allowing IFN_MASK_LOAD
to be paired with an earlier non-call code (but didn't allow
the reverse).

This patch makes the check symmetrical.

Still testing on x86_64-linux-gnu.  OK if testing passes, or doesn't
this seem like the right approach?

Richard


gcc/
PR tree-optimization/103517
* tree-vect-slp.c (vect_build_slp_tree_1): When allowing two
different component references, check the codes of both them,
rather than just the first.

gcc/testsuite/
PR tree-optimization/103517
* gcc.dg/vect/pr103517.c: New test.
---
 gcc/testsuite/gcc.dg/vect/pr103517.c | 13 +
 gcc/tree-vect-slp.c  |  7 ++-
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr103517.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr103517.c 
b/gcc/testsuite/gcc.dg/vect/pr103517.c
new file mode 100644
index 000..de87fc48f84
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr103517.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
i?86-*-* } } */
+
+int a;
+short b, c;
+extern short d[];
+void e() {
+  for (short f = 1; f < (short)a; f += 2)
+if (d[f + 1]) {
+  b = d[f];
+  c = d[f + 1];
+}
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 7bff5118bd0..bc22ffeed82 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -1121,7 +1121,12 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char 
*swap,
|| first_stmt_code == BIT_FIELD_REF
|| first_stmt_code == INDIRECT_REF
|| first_stmt_code == COMPONENT_REF
-   || first_stmt_code == MEM_REF)))
+   || first_stmt_code == MEM_REF)
+   && (rhs_code == ARRAY_REF
+   || rhs_code == BIT_FIELD_REF
+   || rhs_code == INDIRECT_REF
+   || rhs_code == COMPONENT_REF
+   || rhs_code == MEM_REF)))
  || first_stmt_load_p != load_p
  || first_stmt_phi_p != phi_p)
{
-- 
2.25.1



Re: [PATCH] Fix alignment of stack slots for overaligned types [PR103500]

2021-12-01 Thread Alex Coplan via Gcc-patches
On 30/11/2021 19:38, Florian Weimer wrote:
> * Alex Coplan via Gcc-patches:
> 
> > Bootstrapped and regtested on aarch64-linux-gnu, x86_64-linux-gnu, and
> > arm-linux-gnueabihf: no regressions.
> >
> > I'd appreciate any feedback. Is it OK for trunk?
> 
> Does this need an ABI warning?

That's a good question. I'm not sure exactly what changes merit an ABI
warning in GCC.

In this case, we aren't changing how the parameter is passed in the
sense that the callee can still locate it in the same way (by following
the pointer in the appropriate GPR), we're simply increasing the
alignment of the stack slot used for the indirected copy.

So on this basis I would guess that it doesn't need a warning, but I'm
happy to be corrected.

Thanks,
Alex

> 
> Thanks,
> Florian
> 


[Ada] Fix incorrect fixed-point computation in expression function

2021-12-01 Thread Pierre-Marie de Rodat via Gcc-patches
This fixes a couple of issues pertaining to (ordinary) fixed-point types
declared with a Small aspect whose value is not equal to the default one.

The processing of this aspect is delayed until the freeze point of the type,
which means that the Small_Value of the entity for the type does not have
the right value until after the freeze point is encountered.

The first issue is that Resolve_Real_Literal could use the Small_Value of
the entity for an unfrozen type, for example during the pre-analysis of a
default expression or of an expression function.

The second issue is that Freeze_Fixed_Point_Type could still use the old
value of Small_Value even after it has set the field to its final value.
It could also overwrite a correct Small_Value with garbage in the case of
the subtype of a private fixed-point type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo.ads (E_Decimal_Fixed_Point_Subtype): Fix pasto.
* freeze.adb (Freeze_Fixed_Point_Type): Retrieve the underlying type
of the first subtype and do not use a stale value of Small_Value.
* sem_res.adb (Resolve_Real_Literal): In the case of a fixed-point
type, make sure that the base type is frozen and use its Small_Value
to compute the corresponding integer value of the literal.diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -5353,7 +5353,7 @@ package Einfo is
--Size_Clause   (synth)
 
--  E_Decimal_Fixed_Point_Type
-   --  E_Decimal_Fixed_Subtype$$$no such thing
+   --  E_Decimal_Fixed_Point_Subtype
--Scale_Value
--Digits_Value
--Scalar_Range


diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -8997,8 +8997,9 @@ package body Freeze is
   Brng  : constant Node_Id:= Scalar_Range (Btyp);
   BLo   : constant Node_Id:= Low_Bound (Brng);
   BHi   : constant Node_Id:= High_Bound (Brng);
-  Par   : constant Entity_Id  := First_Subtype (Typ);
-  Small : constant Ureal  := Small_Value (Typ);
+  Ftyp  : constant Entity_Id  := Underlying_Type (First_Subtype (Typ));
+
+  Small : Ureal;
   Loval : Ureal;
   Hival : Ureal;
   Atype : Entity_Id;
@@ -9037,7 +9038,7 @@ package body Freeze is
 
   function Larger (A, B : Ureal) return Boolean is
   begin
- return A > B and then A - Small > B;
+ return A > B and then A - Small_Value (Typ) > B;
   end Larger;
 
   -
@@ -9046,7 +9047,7 @@ package body Freeze is
 
   function Smaller (A, B : Ureal) return Boolean is
   begin
- return A < B and then A + Small < B;
+ return A < B and then A + Small_Value (Typ) < B;
   end Smaller;
 
--  Start of processing for Freeze_Fixed_Point_Type
@@ -9057,9 +9058,15 @@ package body Freeze is
   --  so that all characteristics of the type (size, bounds) can be
   --  computed and validated in the call to Minimum_Size that follows.
 
-  if Has_Delayed_Aspects (First_Subtype (Typ)) then
- Analyze_Aspects_At_Freeze_Point (First_Subtype (Typ));
- Set_Has_Delayed_Aspects (First_Subtype (Typ), False);
+  if Has_Delayed_Aspects (Ftyp) then
+ Analyze_Aspects_At_Freeze_Point (Ftyp);
+ Set_Has_Delayed_Aspects (Ftyp, False);
+  end if;
+
+  --  Inherit the Small value from the first subtype in any case
+
+  if Typ /= Ftyp then
+ Set_Small_Value (Typ, Small_Value (Ftyp));
   end if;
 
   --  If Esize of a subtype has not previously been set, set it now
@@ -9074,16 +9081,6 @@ package body Freeze is
  end if;
   end if;
 
-  --  The 'small attribute may have been specified with an aspect,
-  --  in which case it is processed after a subtype declaration, so
-  --  inherit now the specified value.
-
-  if Typ /= Par
-and then Present (Find_Aspect (Par, Aspect_Small))
-  then
- Set_Small_Value (Typ, Small_Value (Par));
-  end if;
-
   --  Immediate return if the range is already analyzed. This means that
   --  the range is already set, and does not need to be computed by this
   --  routine.
@@ -9100,6 +9097,7 @@ package body Freeze is
  return;
   end if;
 
+  Small := Small_Value (Typ);
   Loval := Realval (Lo);
   Hival := Realval (Hi);
 
@@ -9137,7 +9135,6 @@ package body Freeze is
 Size_Excl_EP  : Int;
 
 Model_Num : Ureal;
-First_Subt: Entity_Id;
 Actual_Lo : Ureal;
 Actual_Hi : Ureal;
 
@@ -9279,10 +9276,8 @@ package body Freeze is
--  to get a base type whose size is smaller than the specified
--  size of the first subtype.
 
-   First_Subt := First_Subtype (Typ);
-
-   if Has_Size_Clause (First_Subt)
- and then Size_Incl_EP <= Esi

[Ada] Tune whitespace of the bounded lists Aggregate aspect

2021-12-01 Thread Pierre-Marie de Rodat via Gcc-patches
Whitespace cleanup only.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cbdlli.ads (List): Remove extra space in Aggregate
aspect.diff --git a/gcc/ada/libgnat/a-cbdlli.ads b/gcc/ada/libgnat/a-cbdlli.ads
--- a/gcc/ada/libgnat/a-cbdlli.ads
+++ b/gcc/ada/libgnat/a-cbdlli.ads
@@ -56,8 +56,8 @@ is
   Variable_Indexing => Reference,
   Default_Iterator  => Iterate,
   Iterator_Element  => Element_Type,
-  Aggregate => (Empty=> Empty,
-Add_Unnamed  => Append),
+  Aggregate => (Empty   => Empty,
+Add_Unnamed => Append),
   Preelaborable_Initialization
 => Element_Type'Preelaborable_Initialization;
 




[Ada] Allow formal functions to have a default in the form of an expression function

2021-12-01 Thread Pierre-Marie de Rodat via Gcc-patches
As a language extension, the declaration of a generic formal function is
allowed to have a default given by an expression in parentheses, where
the expression is of the function's result type and can refer to formal
parameters of the function (in direct analogy with expression functions).

For example:

   generic
  type T is private;
  with function Copy (Item : T) return T is (Item); -- Defaults to Item
   package Stacks is

  type Stack is limited private;

  procedure Push (S : in out Stack; X : T); -- Calls Copy on X

  function Pop (S : in out Stack) return T; -- Calls Copy to return item

   private
  ...
   end Stacks;

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* doc/gnat_rm/implementation_defined_pragmas.rst: Add
documentation of the new form of formal subprogram default in
the section on language extensions (pragma Extensions_Allowed).
* gnat_rm.texi: Regenerate.
* gen_il-gen-gen_nodes.adb: Add Expression as a syntactic field
of N_Formal_(Abstract|Concrete)_Subprogram_Declaration nodes.
* par-ch12.adb (P_Formal_Subprogram_Declaration): Add parsing
support for the new default of a parenthesized expression for
formal functions. Issue an error when extensions are not
allowed, suggesting use of -gnatX. Update comment with extended
syntax for SUBPROGRAM_DEFAULT.
* sem_ch12.adb (Analyze_Formal_Subprogram_Declaration): Issue an
error when an expression default is given for an abstract formal
function. When a default expression is present for a formal
function, install the function's formals and preanalyze the
expression.
(Instantiate_Formal_Subprogram): Fix typo in RM paragraph in a
comment.  When a formal function has a default expression,
create a body for the function that will evaluate the expression
and will be called when the default applies in an instantiation.
The implicit function is marked as inlined and as having
convention Intrinsic.diff --git a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
--- a/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
+++ b/gcc/ada/doc/gnat_rm/implementation_defined_pragmas.rst
@@ -2401,6 +2401,30 @@ of GNAT specific extensions are recognized as follows:
   name, preference is given to the component in a selected_component
   (as is currently the case for tagged types with such component names).
 
+* Expression defaults for generic formal functions
+
+  The declaration of a generic formal function is allowed to specify
+  an expression as a default, using the syntax of an expression function.
+
+  Here is an example of this feature:
+
+  .. code-block:: ada
+
+  generic
+ type T is private;
+ with function Copy (Item : T) return T is (Item); -- Defaults to Item
+  package Stacks is
+
+ type Stack is limited private;
+
+ procedure Push (S : in out Stack; X : T); -- Calls Copy on X
+
+ function Pop (S : in out Stack) return T; -- Calls Copy to return item
+
+  private
+ -- ...
+  end Stacks;
+
 .. _Pragma-Extensions_Visible:
 
 Pragma Extensions_Visible


diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -1136,11 +1136,13 @@ begin -- Gen_IL.Gen.Gen_Nodes
Cc (N_Formal_Abstract_Subprogram_Declaration, N_Formal_Subprogram_Declaration,
(Sy (Specification, Node_Id),
 Sy (Default_Name, Node_Id, Default_Empty),
+Sy (Expression, Node_Id, Default_Empty),
 Sy (Box_Present, Flag)));
 
Cc (N_Formal_Concrete_Subprogram_Declaration, N_Formal_Subprogram_Declaration,
(Sy (Specification, Node_Id),
 Sy (Default_Name, Node_Id, Default_Empty),
+Sy (Expression, Node_Id, Default_Empty),
 Sy (Box_Present, Flag)));
 
Ab (N_Push_Pop_xxx_Label, Node_Kind);


diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -3853,6 +3853,31 @@ simple name as one of the type’s primitive subprograms, where the
 component is visible at the point of a selected_component using that
 name, preference is given to the component in a selected_component
 (as is currently the case for tagged types with such component names).
+
+@item 
+Expression defaults for generic formal functions
+
+The declaration of a generic formal function is allowed to specify
+an expression as a default, using the syntax of an expression function.
+
+Here is an example of this feature:
+
+@example
+generic
+   type T is private;
+   with function Copy (Item : T) return T is (Item); -- Defaults to Item
+package Stacks is
+
+   type Stack is limited private;
+
+   procedure Push (S : in out Stack; X : T); -- Calls Copy on X
+
+   function Pop (S : in

[Ada] Do not return freeze nodes for start of early call regions

2021-12-01 Thread Pierre-Marie de Rodat via Gcc-patches
This makes sure that the Early Call Region Processor in Sem_Elab does not
return a region starting with a construct that is not suitable for being
included into it, for example a freeze node.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_elab.adb (Previous_Suitable_Construct): New function declared
in the Early_Call_Region_Processor package.
(Find_ECR): Call it to get the previous node at the start.
(Include): Call it to get the previous node during the traversal.diff --git a/gcc/ada/sem_elab.adb b/gcc/ada/sem_elab.adb
--- a/gcc/ada/sem_elab.adb
+++ b/gcc/ada/sem_elab.adb
@@ -6965,6 +6965,11 @@ package body Sem_Elab is
  --  Determine whether arbitrary node N denotes a suitable construct
  --  for inclusion into the early call region.
 
+ function Previous_Suitable_Construct (N : Node_Id) return Node_Id;
+ pragma Inline (Previous_Suitable_Construct);
+ --  Return the previous node suitable for inclusion into the early
+ --  call region.
+
  procedure Transition_Body_Declarations
(Bod  : Node_Id;
 Curr : out Node_Id);
@@ -7209,7 +7214,7 @@ package body Sem_Elab is
  begin
 --  The early call region starts at N
 
-Curr  := Prev (N);
+Curr  := Previous_Suitable_Construct (N);
 Start := N;
 
 --  Inspect each node in reverse declarative order while going in
@@ -7286,7 +7291,7 @@ package body Sem_Elab is
 --  Otherwise the input node is still within some list
 
 else
-   Curr := Prev (Start);
+   Curr := Previous_Suitable_Construct (Start);
 end if;
  end Include;
 
@@ -7378,6 +7383,23 @@ package body Sem_Elab is
 end case;
  end Is_Suitable_Construct;
 
+ -
+ -- Previous_Suitable_Construct --
+ -
+
+ function Previous_Suitable_Construct (N : Node_Id) return Node_Id is
+P : Node_Id;
+
+ begin
+P := Prev (N);
+
+while Present (P) and then not Is_Suitable_Construct (P) loop
+   Prev (P);
+end loop;
+
+return P;
+ end Previous_Suitable_Construct;
+
  --
  -- Transition_Body_Declarations --
  --




  1   2   >