date:20220127

[pushed] c++: pack in enumerator in lambda [PR100198]

2022-01-27 Thread Jason Merrill via Gcc-patches

The GCC 8 lambda overhaul fixed most uses of lambdas in pack expansions, but
local enums and classes within such lambdas that depend on parameter packs
are still broken.  For now, give a sorry instead of an ICE or incorrect
error.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/100198
PR c++/100030
PR c++/100282

gcc/cp/ChangeLog:

* parser.cc (cp_parser_enumerator_definition): Sorry on parameter
pack in lambda.
(cp_parser_class_head): And in class attributes.
* pt.cc (check_for_bare_parameter_packs): Sorry instead of error
in lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-variadic13.C: Accept the sorry
as well as the correct error.
* g++.dg/cpp0x/lambda/lambda-variadic14.C: Likewise.
* g++.dg/cpp0x/lambda/lambda-variadic14a.C: New test.
* g++.dg/cpp0x/lambda/lambda-variadic15.C: New test.
* g++.dg/cpp0x/lambda/lambda-variadic16.C: New test.
---
 gcc/cp/parser.cc  | 19 ++-
 gcc/cp/pt.cc  | 23 +++
 .../g++.dg/cpp0x/lambda/lambda-variadic13.C   |  2 +-
 .../g++.dg/cpp0x/lambda/lambda-variadic14.C   |  2 +-
 .../g++.dg/cpp0x/lambda/lambda-variadic14a.C  |  9 
 .../g++.dg/cpp0x/lambda/lambda-variadic15.C   | 14 +++
 .../g++.dg/cpp0x/lambda/lambda-variadic16.C   | 13 +++
 7 files changed, 75 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic14a.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic15.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic16.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f1947ee3a1c..94a5c64be4c 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -21146,7 +21146,16 @@ cp_parser_enumerator_definition (cp_parser* parser, 
tree type)
 
   /* If we are processing a template, make sure the initializer of the
  enumerator doesn't contain any bare template parameter pack.  */
-  if (check_for_bare_parameter_packs (value))
+  if (current_lambda_expr ())
+{
+  /* In a lambda it should work, but doesn't currently.  */
+  if (uses_parameter_packs (value))
+   {
+ sorry ("unexpanded parameter pack in enumerator in lambda");
+ value = error_mark_node;
+   }
+}
+  else if (check_for_bare_parameter_packs (value))
 value = error_mark_node;
 
   /* Create the enumerator.  */
@@ -26624,6 +26633,14 @@ cp_parser_class_head (cp_parser* parser,
 
   if (type)
 {
+  if (current_lambda_expr ()
+ && uses_parameter_packs (attributes))
+   {
+ /* In a lambda this should work, but doesn't currently.  */
+ sorry ("unexpanded parameter pack in local class in lambda");
+ attributes = NULL_TREE;
+   }
+
   /* Apply attributes now, before any use of the class as a template
 argument in its base list.  */
   cplus_decl_attributes (, attributes, (int)ATTR_FLAG_TYPE_IN_PLACE);
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 19e73b3b77d..f46a7ad6655 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -4273,10 +4273,27 @@ check_for_bare_parameter_packs (tree t, location_t loc 
/* = UNKNOWN_LOCATION */)
   cp_walk_tree (, _parameter_packs_r, , ppd.visited);
   delete ppd.visited;
 
+  if (!parameter_packs)
+return false;
+
+  if (loc == UNKNOWN_LOCATION)
+loc = cp_expr_loc_or_input_loc (t);
+
   /* It's OK for a lambda to have an unexpanded parameter pack from the
  containing context, but do complain about unexpanded capture packs.  */
-  if (current_class_type && LAMBDA_TYPE_P (current_class_type)
-  && CLASSTYPE_TEMPLATE_INFO (current_class_type))
+  tree lam = current_lambda_expr ();
+  if (lam)
+lam = TREE_TYPE (lam);
+
+  if (lam && lam != current_class_type)
+{
+  /* We're in a lambda, but it isn't the innermost class.
+This should work, but currently doesn't.  */
+  sorry_at (loc, "unexpanded parameter pack in local class in lambda");
+  return true;
+}
+
+  if (lam && CLASSTYPE_TEMPLATE_INFO (lam))
 for (; parameter_packs;
 parameter_packs = TREE_CHAIN (parameter_packs))
   {
@@ -4287,8 +4304,6 @@ check_for_bare_parameter_packs (tree t, location_t loc /* 
= UNKNOWN_LOCATION */)
 
   if (parameter_packs)
 {
-  if (loc == UNKNOWN_LOCATION)
-   loc = cp_expr_loc_or_input_loc (t);
   error_at (loc, "parameter packs not expanded with %<...%>:");
   while (parameter_packs)
 {
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic13.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic13.C
index ac4e631ebb5..3df88296726 100644
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic13.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-variadic13.C
@@ -3,7 +3,7 @@
 
 template 
 void f() {
-  [] { struct S : Ts { }; };   // { dg-error "not expanded" }
+  [] { struct S : Ts {

Re: [PATCH] c++: ICE with auto[] and VLA [PR102414]

2022-01-27 Thread Jason Merrill via Gcc-patches


On 1/27/22 20:02, Marek Polacek wrote:

Here we ICE in unify_array_domain when we're trying to deduce the type
of an array, as in

   auto(*p)[i] = (int(*)[i])0;

but unify_array_domain doesn't arbitrarily complex bounds.  Another
test is, e.g.,

   auto (*b)[0/0] = 

where the type of the array is

   <<< Unknown tree: template_type_parm >>>[0:(sizetype) ((ssizetype) (0 / 0) - 
1)]

It seems to me that we need not handle these.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/102414
PR c++/101874

gcc/cp/ChangeLog:

* decl.cc (create_array_type_for_decl): Use template_placeholder_p.
Sorry on a variable-length array of auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-array3.C: New test.
* g++.dg/cpp23/auto-array4.C: New test.
---
  gcc/cp/decl.cc   | 14 +++---
  gcc/testsuite/g++.dg/cpp23/auto-array3.C | 17 +
  gcc/testsuite/g++.dg/cpp23/auto-array4.C | 10 ++
  3 files changed, 38 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array3.C
  create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array4.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 10e6956117e..4ba8bf3e8a9 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11089,7 +11089,7 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
  
/* [dcl.type.class.deduct] prohibits forming an array of placeholder

   for a deduced class type.  */
-  if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type))
+  if (template_placeholder_p (type))
  {
if (name)
error_at (loc, "%qD declared as array of template placeholder "
@@ -11159,8 +11159,16 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
  
/* Figure out the index type for the array.  */

if (size)
-itype = compute_array_index_type_loc (loc, name, size,
- tf_warning_or_error);
+{
+  itype = compute_array_index_type_loc (loc, name, size,
+   tf_warning_or_error);
+  if (type_uses_auto (type)
+ && !TREE_CONSTANT (maybe_constant_value (size)))


Maybe variably_modified_type_p (itype)?


+   {
+ sorry_at (loc, "variable-length array of %");
+ return error_mark_node;
+   }
+}
  
return build_cplus_array_type (type, itype);

  }
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array3.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
new file mode 100644
index 000..e383a17d0ee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
@@ -0,0 +1,17 @@
+// PR c++/102414
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+constexpr int sz () { return 3; }
+
+void f ()
+{
+  int a[3];
+  auto (*a1)[0/0] =  // { dg-message "variable-length array of .auto." }
+// { dg-warning "division by zero" "" { target *-*-* } .-1 }


Let's move the error into the other testcase?


+  const int N = 3;
+  auto (*a2)[N] = 
+  constexpr int M = 3;
+  auto (*a3)[M] = 
+  auto (*a4)[sz()] = 
+}
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array4.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
new file mode 100644
index 000..4819472a9ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
@@ -0,0 +1,10 @@
+// PR c++/101874
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+void
+f (int i)
+{
+  auto x[i] = { 0 }; // { dg-message "variable-length array of .auto." }
+  auto(*p)[i] = (int(*)[i])0; // { dg-message "variable-length array of 
.auto." }
+}

base-commit: 99f17e996f21d0ed64c36ed1e52977b705143522

[PATCH] c++: ICE with auto[] and VLA [PR102414]

2022-01-27 Thread Marek Polacek via Gcc-patches

Here we ICE in unify_array_domain when we're trying to deduce the type
of an array, as in

  auto(*p)[i] = (int(*)[i])0;

but unify_array_domain doesn't arbitrarily complex bounds.  Another
test is, e.g.,

  auto (*b)[0/0] = 

where the type of the array is

  <<< Unknown tree: template_type_parm >>>[0:(sizetype) ((ssizetype) (0 / 0) - 
1)]

It seems to me that we need not handle these.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/102414
PR c++/101874

gcc/cp/ChangeLog:

* decl.cc (create_array_type_for_decl): Use template_placeholder_p.
Sorry on a variable-length array of auto.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/auto-array3.C: New test.
* g++.dg/cpp23/auto-array4.C: New test.
---
 gcc/cp/decl.cc   | 14 +++---
 gcc/testsuite/g++.dg/cpp23/auto-array3.C | 17 +
 gcc/testsuite/g++.dg/cpp23/auto-array4.C | 10 ++
 3 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/auto-array4.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 10e6956117e..4ba8bf3e8a9 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -11089,7 +11089,7 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
 
   /* [dcl.type.class.deduct] prohibits forming an array of placeholder
  for a deduced class type.  */
-  if (is_auto (type) && CLASS_PLACEHOLDER_TEMPLATE (type))
+  if (template_placeholder_p (type))
 {
   if (name)
error_at (loc, "%qD declared as array of template placeholder "
@@ -11159,8 +11159,16 @@ create_array_type_for_decl (tree name, tree type, tree 
size, location_t loc)
 
   /* Figure out the index type for the array.  */
   if (size)
-itype = compute_array_index_type_loc (loc, name, size,
- tf_warning_or_error);
+{
+  itype = compute_array_index_type_loc (loc, name, size,
+   tf_warning_or_error);
+  if (type_uses_auto (type)
+ && !TREE_CONSTANT (maybe_constant_value (size)))
+   {
+ sorry_at (loc, "variable-length array of %");
+ return error_mark_node;
+   }
+}
 
   return build_cplus_array_type (type, itype);
 }
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array3.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
new file mode 100644
index 000..e383a17d0ee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array3.C
@@ -0,0 +1,17 @@
+// PR c++/102414
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+constexpr int sz () { return 3; }
+
+void f ()
+{
+  int a[3];
+  auto (*a1)[0/0] =  // { dg-message "variable-length array of .auto." }
+// { dg-warning "division by zero" "" { target *-*-* } .-1 }
+  const int N = 3;
+  auto (*a2)[N] = 
+  constexpr int M = 3;
+  auto (*a3)[M] = 
+  auto (*a4)[sz()] = 
+}
diff --git a/gcc/testsuite/g++.dg/cpp23/auto-array4.C 
b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
new file mode 100644
index 000..4819472a9ac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/auto-array4.C
@@ -0,0 +1,10 @@
+// PR c++/101874
+// { dg-do compile { target c++11 } }
+// { dg-options "" }
+
+void
+f (int i)
+{
+  auto x[i] = { 0 }; // { dg-message "variable-length array of .auto." }
+  auto(*p)[i] = (int(*)[i])0; // { dg-message "variable-length array of 
.auto." }
+}

base-commit: 99f17e996f21d0ed64c36ed1e52977b705143522
-- 
2.34.1

Re: [PATCH] enable -Winvalid-memory-order for C++ [PR99612]

2022-01-27 Thread Martin Sebor via Gcc-patches


On 1/27/22 16:47, Andrew Pinski wrote:

On Wed, Dec 8, 2021 at 8:50 AM Martin Sebor via Gcc-patches
 wrote:


Even with -Wno-system-headers enabled, the -Winvalid-memory-order
code tries to make sure calls to atomic functions with invalid
memory orders are diagnosed even though the C atomic functions
are defined as macros in the  system header.
The warning triggers at all optimization levels, including -O0.

Independently, the core diagnostic enhancements implemented earlier
this year (the warning group control) enable warnings for functions
defined in system headers that are inlined into user code.  This
was done for similar reason as above: because it's desirable to
diagnose invalid calls made from user code to system functions
(e.g., buffer overflows, invalid or mismatched deallocations,
etc.)

However, the C macro solution interferes with the code diagnostic
changes and prevents the invalid memory model warnings from being
issued for the same problems in C++.  In addition, because C++
atomics are ordinary (inline) functions that call the underlying
__atomic_xxx built-ins, the invalid memory orders can only be
detected with both inlining and constant propagation enabled.

The attached patch removes these limitations and enables
-Winvalid-memory-order to trigger even for C++ std::atomic,
(almost) just like it does in C, at all optimization levels
including -O0.

To make that possible I had to move -Winvalid-memory-order from
builtins.c to a GIMPLE pass where it can use context-sensitive
range info at -O0, instead of relying on constant propagation
(only available at -O1 and above).  Although the same approach
could be used to emit better object code for C++ atomics at -O0
(i.e., use the right memory order instead of dropping to seq_cst),
this patch doesn't do that.)

In addition to enabling the warning I've also enhanced it to
include the memory models involved in the diagnosed call (both
the problem ones and the viable alternatives).

Tested on x86_64-linux.

Jonathan, I CC you for two reasons: a) because this solution
is based on your (as well as my own) preference for handling
C++ system headers, and because of our last week's discussion
of the false positives in std::string resulting from the same
choice there.

I don't anticipate this change to lead to the same fallout
because it's unlikely for GCC to synthesize invalid memory
orders out of thin air; and b) because the current solution
can only detect the problems in calls to atomic functions at
-O0 that are declared with attribute always_inline.  This
includes member functions defined in the enclosing atomic
class but not namespace-scope functions.  To make
the detection possible those would also have to be
always_inline.  If that's a change you'd like to see I can
look into making it happen.


This causes gcc.target/aarch64/atomic-inst-cas.c testcase to fail on
aarch64-linux-gnu. We warn now even for the case where we don't have
undefined behavior.
In the sense the code checks the success and failure memory model
before calling __atomic_compare_exchange_n.
Testcase:
int test_cas_atomic_relaxed_consume_char (char* val, char* foo, char* bar) {
  int model_s = 0;
  int model_f = 1;
  if (model_s < model_f) return 0;
  if (model_f == 3 || model_f == 4) return 0;
  return __atomic_compare_exchange_n (val, foo, bar, 0, model_s, model_f);
}

Do we really want to warn about this case and other cases where we
check the memory model before calling __atomic_compare_exchange_n?
That is, do we want to warn this early in the optimization pipeline
and even without flow control checks?


The warning runs very early intentionally: to detect the same
misuses in C++ as in C, without optimization.  (In C++, constant
memory models become variables because the C++ atomics functions
are wrappers around the built-ins.)

In practice, I'd expect most calls to atomic functions to be made
with constant memory models, and code like in the test case above
to be uncommon, so I think the choice of warning at -O0 was
the right one.

Martin

Re: [PATCH] rs6000: Fix up #include or [PR104239]

2022-01-27 Thread Segher Boessenkool

On Thu, Jan 27, 2022 at 02:55:08PM -0600, Paul A. Clarke wrote:
> Should we add similar compile-only tests for all of the standalone include
> files?

The only reason it wasn't detected earlier is that no one actually uses
this (and tests the development compiler).  That is a much bigger
problem, and nothing that a few scattershot tests will help fix :-(

Segher

Re: [PATCH] enable -Winvalid-memory-order for C++ [PR99612]

2022-01-27 Thread Andrew Pinski via Gcc-patches

On Thu, Jan 27, 2022 at 3:47 PM Andrew Pinski  wrote:
>
> On Wed, Dec 8, 2021 at 8:50 AM Martin Sebor via Gcc-patches
>  wrote:
> >
> > Even with -Wno-system-headers enabled, the -Winvalid-memory-order
> > code tries to make sure calls to atomic functions with invalid
> > memory orders are diagnosed even though the C atomic functions
> > are defined as macros in the  system header.
> > The warning triggers at all optimization levels, including -O0.
> >
> > Independently, the core diagnostic enhancements implemented earlier
> > this year (the warning group control) enable warnings for functions
> > defined in system headers that are inlined into user code.  This
> > was done for similar reason as above: because it's desirable to
> > diagnose invalid calls made from user code to system functions
> > (e.g., buffer overflows, invalid or mismatched deallocations,
> > etc.)
> >
> > However, the C macro solution interferes with the code diagnostic
> > changes and prevents the invalid memory model warnings from being
> > issued for the same problems in C++.  In addition, because C++
> > atomics are ordinary (inline) functions that call the underlying
> > __atomic_xxx built-ins, the invalid memory orders can only be
> > detected with both inlining and constant propagation enabled.
> >
> > The attached patch removes these limitations and enables
> > -Winvalid-memory-order to trigger even for C++ std::atomic,
> > (almost) just like it does in C, at all optimization levels
> > including -O0.
> >
> > To make that possible I had to move -Winvalid-memory-order from
> > builtins.c to a GIMPLE pass where it can use context-sensitive
> > range info at -O0, instead of relying on constant propagation
> > (only available at -O1 and above).  Although the same approach
> > could be used to emit better object code for C++ atomics at -O0
> > (i.e., use the right memory order instead of dropping to seq_cst),
> > this patch doesn't do that.)
> >
> > In addition to enabling the warning I've also enhanced it to
> > include the memory models involved in the diagnosed call (both
> > the problem ones and the viable alternatives).
> >
> > Tested on x86_64-linux.
> >
> > Jonathan, I CC you for two reasons: a) because this solution
> > is based on your (as well as my own) preference for handling
> > C++ system headers, and because of our last week's discussion
> > of the false positives in std::string resulting from the same
> > choice there.
> >
> > I don't anticipate this change to lead to the same fallout
> > because it's unlikely for GCC to synthesize invalid memory
> > orders out of thin air; and b) because the current solution
> > can only detect the problems in calls to atomic functions at
> > -O0 that are declared with attribute always_inline.  This
> > includes member functions defined in the enclosing atomic
> > class but not namespace-scope functions.  To make
> > the detection possible those would also have to be
> > always_inline.  If that's a change you'd like to see I can
> > look into making it happen.
>
> This causes gcc.target/aarch64/atomic-inst-cas.c testcase to fail on
> aarch64-linux-gnu. We warn now even for the case where we don't have
> undefined behavior.
> In the sense the code checks the success and failure memory model
> before calling __atomic_compare_exchange_n.
> Testcase:
> int test_cas_atomic_relaxed_consume_char (char* val, char* foo, char* bar) {
>  int model_s = 0;
>  int model_f = 1;
>  if (model_s < model_f) return 0;
>  if (model_f == 3 || model_f == 4) return 0;
>  return __atomic_compare_exchange_n (val, foo, bar, 0, model_s, model_f);
> }
>
> Do we really want to warn about this case and other cases where we
> check the memory model before calling __atomic_compare_exchange_n?
> That is, do we want to warn this early in the optimization pipeline
> and even without flow control checks?

I should mention I filed this as PR 104200.

Thanks,
Andrew

>
> Thanks,
> Andrew Pinski
>
> >
> > Martin

Re: [PATCH] enable -Winvalid-memory-order for C++ [PR99612]

2022-01-27 Thread Andrew Pinski via Gcc-patches

On Wed, Dec 8, 2021 at 8:50 AM Martin Sebor via Gcc-patches
 wrote:
>
> Even with -Wno-system-headers enabled, the -Winvalid-memory-order
> code tries to make sure calls to atomic functions with invalid
> memory orders are diagnosed even though the C atomic functions
> are defined as macros in the  system header.
> The warning triggers at all optimization levels, including -O0.
>
> Independently, the core diagnostic enhancements implemented earlier
> this year (the warning group control) enable warnings for functions
> defined in system headers that are inlined into user code.  This
> was done for similar reason as above: because it's desirable to
> diagnose invalid calls made from user code to system functions
> (e.g., buffer overflows, invalid or mismatched deallocations,
> etc.)
>
> However, the C macro solution interferes with the code diagnostic
> changes and prevents the invalid memory model warnings from being
> issued for the same problems in C++.  In addition, because C++
> atomics are ordinary (inline) functions that call the underlying
> __atomic_xxx built-ins, the invalid memory orders can only be
> detected with both inlining and constant propagation enabled.
>
> The attached patch removes these limitations and enables
> -Winvalid-memory-order to trigger even for C++ std::atomic,
> (almost) just like it does in C, at all optimization levels
> including -O0.
>
> To make that possible I had to move -Winvalid-memory-order from
> builtins.c to a GIMPLE pass where it can use context-sensitive
> range info at -O0, instead of relying on constant propagation
> (only available at -O1 and above).  Although the same approach
> could be used to emit better object code for C++ atomics at -O0
> (i.e., use the right memory order instead of dropping to seq_cst),
> this patch doesn't do that.)
>
> In addition to enabling the warning I've also enhanced it to
> include the memory models involved in the diagnosed call (both
> the problem ones and the viable alternatives).
>
> Tested on x86_64-linux.
>
> Jonathan, I CC you for two reasons: a) because this solution
> is based on your (as well as my own) preference for handling
> C++ system headers, and because of our last week's discussion
> of the false positives in std::string resulting from the same
> choice there.
>
> I don't anticipate this change to lead to the same fallout
> because it's unlikely for GCC to synthesize invalid memory
> orders out of thin air; and b) because the current solution
> can only detect the problems in calls to atomic functions at
> -O0 that are declared with attribute always_inline.  This
> includes member functions defined in the enclosing atomic
> class but not namespace-scope functions.  To make
> the detection possible those would also have to be
> always_inline.  If that's a change you'd like to see I can
> look into making it happen.

This causes gcc.target/aarch64/atomic-inst-cas.c testcase to fail on
aarch64-linux-gnu. We warn now even for the case where we don't have
undefined behavior.
In the sense the code checks the success and failure memory model
before calling __atomic_compare_exchange_n.
Testcase:
int test_cas_atomic_relaxed_consume_char (char* val, char* foo, char* bar) {
 int model_s = 0;
 int model_f = 1;
 if (model_s < model_f) return 0;
 if (model_f == 3 || model_f == 4) return 0;
 return __atomic_compare_exchange_n (val, foo, bar, 0, model_s, model_f);
}

Do we really want to warn about this case and other cases where we
check the memory model before calling __atomic_compare_exchange_n?
That is, do we want to warn this early in the optimization pipeline
and even without flow control checks?

Thanks,
Andrew Pinski

>
> Martin

[committed] libstdc++: Prevent -Wstringop-overread warning in std::deque [PR100516]

2022-01-27 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk.


The compiler warns about the loop in deque::_M_range_initialize because
it doesn't know that the number of nodes has already been correctly
sized to match the size of the input. Use __builtin_unreachable to tell
it that the loop will never be entered if the number of elements is
smaller than a single node.

libstdc++-v3/ChangeLog:

PR libstdc++/100516
* include/bits/deque.tcc (_M_range_initialize):
Add __builtin_unreachable to loop.
* testsuite/23_containers/deque/100516.cc: New test.
---
 libstdc++-v3/include/bits/deque.tcc|  3 +++
 .../testsuite/23_containers/deque/100516.cc| 14 ++
 2 files changed, 17 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/100516.cc

diff --git a/libstdc++-v3/include/bits/deque.tcc 
b/libstdc++-v3/include/bits/deque.tcc
index 80f1813bc76..03e0a505e14 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -454,6 +454,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 __cur_node < this->_M_impl._M_finish._M_node;
 ++__cur_node)
  {
+   if (__n < _S_buffer_size())
+ __builtin_unreachable(); // See PR 100516
+
_ForwardIterator __mid = __first;
std::advance(__mid, _S_buffer_size());
std::__uninitialized_copy_a(__first, __mid, *__cur_node,
diff --git a/libstdc++-v3/testsuite/23_containers/deque/100516.cc 
b/libstdc++-v3/testsuite/23_containers/deque/100516.cc
new file mode 100644
index 000..ef32ae10545
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/deque/100516.cc
@@ -0,0 +1,14 @@
+// { dg-options "-O2 -Wstringop-overread" }
+// { dg-do compile { target c++11 } }
+
+// Bug 100516
+// Unexpected -Wstringop-overread in deque initialization from empty
+// initializer_list
+
+#include 
+
+void f()
+{
+std::initializer_list il{};
+std::deque{il};
+}
-- 
2.34.1

[committed] analyzer: show region creation events for uninit warnings

2022-01-27 Thread David Malcolm via Gcc-patches

When reviewing the output of -fanalyzer on PR analyzer/104224 I noticed
that despite very verbose paths, the diagnostic paths for
  -Wanalyzer-use-of-uninitialized-value
don't show where the uninitialized memory is allocated.

This patch adapts and simplifies material from
  "[PATCH 3/6] analyzer: implement infoleak detection"
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584377.html
in order to add region creation events for the pertinent region (whether
on the stack or heap).

For example, this patch extends:

malloc-1.c: In function 'test_40':
malloc-1.c:461:5: warning: use of uninitialized value '*p' [CWE-457] 
[-Wanalyzer-use-of-uninitialized-value]
  461 |   i = *p;
  |   ~~^~~~
  'test_40': event 1
|
|  461 |   i = *p;
|  |   ~~^~~~
|  | |
|  | (1) use of uninitialized value '*p' here
|

to:

malloc-1.c: In function 'test_40':
malloc-1.c:461:5: warning: use of uninitialized value '*p' [CWE-457] 
[-Wanalyzer-use-of-uninitialized-value]
  461 |   i = *p;
  |   ~~^~~~
  'test_40': events 1-2
|
|  460 |   int *p = (int*)malloc(sizeof(int*));
|  |  ^~~~
|  |  |
|  |  (1) region created on heap here
|  461 |   i = *p;
|  |   ~~
|  | |
|  | (2) use of uninitialized value '*p' here
|

and this helps readability of the resulting warnings, especially in
more complicated cases.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-6906-g00e7d024afb80e.

gcc/analyzer/ChangeLog:
* checker-path.cc (event_kind_to_string): Handle
EK_REGION_CREATION.
(region_creation_event::region_creation_event): New.
(region_creation_event::get_desc): New.
(checker_path::add_region_creation_event): New.
* checker-path.h (enum event_kind): Add EK_REGION_CREATION.
(class region_creation_event): New subclass.
(checker_path::add_region_creation_event): New decl.
* diagnostic-manager.cc
(diagnostic_manager::emit_saved_diagnostic): Pass NULL for new
param to add_events_for_eedge when handling trailing eedge.
(diagnostic_manager::build_emission_path): Create an interesting_t
instance, allow the pending diagnostic to populate it, and pass it
to the calls to add_events_for_eedge.
(diagnostic_manager::add_events_for_eedge): Add "interest" param.
Use it to add region_creation_events for on-stack regions created
within at function entry, and when pertinent dynamically-sized
regions are created.
(diagnostic_manager::prune_for_sm_diagnostic): Add case for
EK_REGION_CREATION.
* diagnostic-manager.h (diagnostic_manager::add_events_for_eedge):
Add "interest" param.
* pending-diagnostic.cc: Include "selftest.h", "tristate.h",
"analyzer/call-string.h", "analyzer/program-point.h",
"analyzer/store.h", and "analyzer/region-model.h".
(interesting_t::add_region_creation): New.
(interesting_t::dump_to_pp): New.
* pending-diagnostic.h (struct interesting_t): New.
(pending_diagnostic::mark_interesting_stuff): New vfunc.
* region-model.cc
(poisoned_value_diagnostic::poisoned_value_diagnostic): Add
(poisoned_value_diagnostic::operator==): Compare m_pkind and
m_src_region fields.
(poisoned_value_diagnostic::mark_interesting_stuff): New.
(poisoned_value_diagnostic::m_src_region): New.
(region_model::check_for_poison): Call
get_region_for_poisoned_expr for uninit values and pass the resul
to the diagnostic.
(region_model::get_region_for_poisoned_expr): New.
(region_model::deref_rvalue): Pass NULL for
poisoned_value_diagnostic's src_region.
* region-model.h (region_model::get_region_for_poisoned_expr): New
decl.
* region.h (frame_region::get_fndecl): New.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c: Add dg-message directives for
expected region creation events.
* gcc.dg/analyzer/malloc-1.c: Likewise.
* gcc.dg/analyzer/memset-CVE-2017-18549-1.c: Likewise.
* gcc.dg/analyzer/pr101547.c: Likewise.
* gcc.dg/analyzer/pr101875.c: Likewise.
* gcc.dg/analyzer/pr101962.c: Likewise.
* gcc.dg/analyzer/pr104224.c: Likewise.
* gcc.dg/analyzer/pr94047.c: Likewise.
* gcc.dg/analyzer/symbolic-1.c: Likewise.
* gcc.dg/analyzer/uninit-1.c: Likewise.
* gcc.dg/analyzer/uninit-4.c: Likewise.
* gcc.dg/analyzer/uninit-alloca.c: New test.
* gcc.dg/analyzer/uninit-pr94713.c: Add dg-message directive for
expected region creation event.
* gcc.dg/analyzer/uninit-pr94714.c: Likewise.
* gcc.dg/analyzer/zlib-3.c: Likewise.

Signed-off-by: David Malcolm 
---

[committed] libstdc++: Avoid overflow in ranges::advance(i, n, bound)

2022-01-27 Thread Jonathan Wakely via Gcc-patches

Tested powerpc64le-linux, pushed to trunk. Should be backported too.


When (bound - i) or n is the most negative value of its type, the
negative of the value will overflow. Instead of abs(n) >= abs(bound - i)
use n >= (bound - i) when positive and n <= (bound - i) when negative.
The function has a precondition that they must have the same sign, so
this works correctly. The precondition check can be moved into the else
branch, and simplified.

The standard requires calling ranges::advance(i, bound) even if i==bound
is already true, which is technically observable, but that's pointless.
We can just return n in that case. Similarly, for i!=bound but n==0 we
are supposed to call ranges::advance(i, n), but that's pointless. An LWG
issue to allow omitting the pointless calls is expected to be filed.

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h (ranges::advance): Avoid signed
overflow. Do nothing if already equal to desired result.
* testsuite/24_iterators/range_operations/advance_overflow.cc:
New test.
---
 libstdc++-v3/include/bits/ranges_base.h   | 15 +---
 .../range_operations/advance_overflow.cc  | 37 +++
 2 files changed, 46 insertions(+), 6 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/24_iterators/range_operations/advance_overflow.cc

diff --git a/libstdc++-v3/include/bits/ranges_base.h 
b/libstdc++-v3/include/bits/ranges_base.h
index 08e99ebc5d3..3c5f4b1790a 100644
--- a/libstdc++-v3/include/bits/ranges_base.h
+++ b/libstdc++-v3/include/bits/ranges_base.h
@@ -756,20 +756,23 @@ namespace ranges
  {
const auto __diff = __bound - __it;
 
-   // n and bound must not lead in opposite directions:
-   __glibcxx_assert(__n == 0 || __diff == 0 || (__n < 0 == __diff < 
0));
-   const auto __absdiff = __diff < 0 ? -__diff : __diff;
-   const auto __absn = __n < 0 ? -__n : __n;;
-   if (__absn >= __absdiff)
+   if (__diff == 0)
+ return __n;
+   else if (__diff > 0 ? __n >= __diff : __n <= __diff)
  {
(*this)(__it, __bound);
return __n - __diff;
  }
-   else
+   else if (__n != 0) [[likely]]
  {
+   // n and bound must not lead in opposite directions:
+   __glibcxx_assert(__n < 0 == __diff < 0);
+
(*this)(__it, __n);
return 0;
  }
+   else
+ return 0;
  }
else if (__it == __bound || __n == 0)
  return __n;
diff --git 
a/libstdc++-v3/testsuite/24_iterators/range_operations/advance_overflow.cc 
b/libstdc++-v3/testsuite/24_iterators/range_operations/advance_overflow.cc
new file mode 100644
index 000..0fadcd6e99a
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/range_operations/advance_overflow.cc
@@ -0,0 +1,37 @@
+// { dg-options "-std=gnu++20" }
+// { dg-do compile { target c++20 } }
+
+// Public domain testcase from Casey Carter, send to LWG list on 2021-07-24.
+//
+// Here's a compile-only test case for which n is INT_MIN, which will overflow
+// if simply negated to get |n|: https://godbolt.org/z/M7Wz1nW58.
+
+#include 
+#include 
+#include 
+
+struct I {
+using difference_type = int;
+using value_type = int;
+
+int x;
+
+constexpr int operator*() const { return x; }
+constexpr I& operator++() { ++x; return *this; }
+constexpr I operator++(int) { ++x; return {x - 1}; }
+constexpr bool operator==(const I&) const = default;
+
+constexpr int operator-(const I& that) const { return x - that.x; }
+
+constexpr I& operator--() { --x; return *this; }
+constexpr I operator--(int) { --x; return {x - 1}; }
+};
+static_assert(std::bidirectional_iterator);
+static_assert(std::sized_sentinel_for);
+
+constexpr bool test() {
+using L = std::numeric_limits;
+I i{-2};
+return std::ranges::advance(i, L::min(), I{-4}) == L::min() + 2;
+}
+static_assert(test());
-- 
2.34.1

Re: [PATCH] c++: var tmpl w/ dependent constrained auto type [PR103341]

2022-01-27 Thread Jason Merrill via Gcc-patches


On 1/27/22 13:06, Patrick Palka wrote:

On Thu, 27 Jan 2022, Patrick Palka wrote:


When deducing the type of a variable template specialization with a
constrained auto type, we might need the template arguments for
satisfaction in case the constraint is dependent.


It looks like templated static data members need similar treatment.
Here's a patch that handles both kinds of entities, what they have
in common is that they're templated and not at function scope
(bootstrap and regtesting in progress):

-- >8 --

Subject: [PATCH] c++: var tmpl w/ dependent constrained auto type [PR103341]

When deducing the type of a variable template specialization (or a
template static data member) with a constrained auto type, we might
need the template arguments during satisfaction from do_auto_deduction
in case the constraint is dependent.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk and perhaps 11?


OK for trunk.  And for 11, if you make it conditional on flag_concepts.


PR c++/103341

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Pass the template arguments of a
variable template specialization or a templated static data
member to do_auto_deduction when the auto is constrained.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-class4.C: New test.
* g++.dg/cpp2a/concepts-var-templ2.C: New test.
---
  gcc/cp/decl.cc   | 12 +++-
  gcc/testsuite/g++.dg/cpp2a/concepts-class4.C | 12 
  gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C | 13 +
  3 files changed, 36 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 10e6956117e..465f5620f2c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -7958,9 +7958,19 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
enum auto_deduction_context adc = adc_variable_type;
if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
adc = adc_decomp_type;
+  tree outer_targs = NULL_TREE;
+  if (PLACEHOLDER_TYPE_CONSTRAINTS_INFO (auto_node)
+ && VAR_P (decl)
+ && DECL_LANG_SPECIFIC (decl)
+ && DECL_TEMPLATE_INFO (decl)
+ && !DECL_FUNCTION_SCOPE_P (decl))
+   /* The outer template arguments might be needed for satisfaction.
+  For function scope decls, do_auto_deduction obtains the outer
+  template arguments from current_function_decl.  */
+   outer_targs = DECL_TI_ARGS (decl);
type = TREE_TYPE (decl) = do_auto_deduction (type, d_init, auto_node,
   tf_warning_or_error, adc,
-  NULL_TREE, flags);
+  outer_targs, flags);
if (type == error_mark_node)
return;
if (TREE_CODE (type) == FUNCTION_TYPE)
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
new file mode 100644
index 000..9d694e758e0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
@@ -0,0 +1,12 @@
+// PR c++/103341
+// { dg-do compile { target c++20 } }
+
+template concept C = __is_same(T, U);
+
+template
+struct A {
+  static inline C auto value = 0; // { dg-error "constraint" }
+};
+
+template struct A; // { dg-bogus "" }
+template struct A; // { dg-message "required from here" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
new file mode 100644
index 000..e1802aca75f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
@@ -0,0 +1,13 @@
+// PR c++/103341
+// { dg-do compile { target c++20 } }
+
+template concept same_as = __is_same(T, U);
+template same_as auto v1a = 1;
+template same_as auto v1b = T();
+template same_as auto v2a = 1; // { dg-error "constraints" }
+template same_as auto v2b = T(); // { dg-error "constraints" }
+
+template int v1a;
+template int v1b;
+template int v2a; // { dg-message "required from here" }
+template int v2b; // { dg-message "required from here" }

Re: [PATCH] RISC-V: Document `auipc' and `bitmanip' `type' attributes

2022-01-27 Thread Andrew Waterman

LGTM, thanks for correcting this oversight in my patch.


On Thu, Jan 27, 2022 at 2:09 PM Maciej W. Rozycki  wrote:
>
> Document new `auipc' and `bitmanip' `type' attributes added respectively
> with commit 88108b27dda9 ("RISC-V: Add sifive-7 pipeline description.")
> and commit 283b1707f237 ("RISC-V: Implement instruction patterns for ZBA
> extension.") but not listed so far.
>
> gcc/
> * config/riscv/riscv.md: Document `auipc' and `bitmanip' `type'
> attributes.
> ---
> Hi,
>
>  There's also the `rotate' `type' attribute, but it's nowhere used, so it
> might be worth removing.  As not-a-regression however that would be GCC 13
> material I guess.
>
>  OK to apply?
>
>   Maciej
> ---
>  gcc/config/riscv/riscv.md |2 ++
>  1 file changed, 2 insertions(+)
>
> gcc-riscv-auipc-bitmanip-type.diff
> Index: gcc/gcc/config/riscv/riscv.md
> ===
> --- gcc.orig/gcc/config/riscv/riscv.md
> +++ gcc/gcc/config/riscv/riscv.md
> @@ -150,6 +150,7 @@
>  ;; mfc transfer from coprocessor
>  ;; const   load constant
>  ;; arith   integer arithmetic instructions
> +;; auipc   integer addition to PC
>  ;; logical  integer logical instructions
>  ;; shift   integer shift instructions
>  ;; slt set less than instructions
> @@ -167,6 +168,7 @@
>  ;; multi   multiword sequence (or user asm statements)
>  ;; nop no operation
>  ;; ghost   an instruction that produces no real code
> +;; bitmanipbit manipulation instructions
>  (define_attr "type"
>"unknown,branch,jump,call,load,fpload,store,fpstore,
> mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,

[PATCH] RISC-V: Document `auipc' and `bitmanip' `type' attributes

2022-01-27 Thread Maciej W. Rozycki

Document new `auipc' and `bitmanip' `type' attributes added respectively 
with commit 88108b27dda9 ("RISC-V: Add sifive-7 pipeline description.") 
and commit 283b1707f237 ("RISC-V: Implement instruction patterns for ZBA 
extension.") but not listed so far.

gcc/
* config/riscv/riscv.md: Document `auipc' and `bitmanip' `type' 
attributes.
---
Hi,

 There's also the `rotate' `type' attribute, but it's nowhere used, so it 
might be worth removing.  As not-a-regression however that would be GCC 13 
material I guess.

 OK to apply?

  Maciej
---
 gcc/config/riscv/riscv.md |2 ++
 1 file changed, 2 insertions(+)

gcc-riscv-auipc-bitmanip-type.diff
Index: gcc/gcc/config/riscv/riscv.md
===
--- gcc.orig/gcc/config/riscv/riscv.md
+++ gcc/gcc/config/riscv/riscv.md
@@ -150,6 +150,7 @@
 ;; mfc transfer from coprocessor
 ;; const   load constant
 ;; arith   integer arithmetic instructions
+;; auipc   integer addition to PC
 ;; logical  integer logical instructions
 ;; shift   integer shift instructions
 ;; slt set less than instructions
@@ -167,6 +168,7 @@
 ;; multi   multiword sequence (or user asm statements)
 ;; nop no operation
 ;; ghost   an instruction that produces no real code
+;; bitmanipbit manipulation instructions
 (define_attr "type"
   "unknown,branch,jump,call,load,fpload,store,fpstore,
mtc,mfc,const,arith,logical,shift,slt,imul,idiv,move,fmove,fadd,fmul,

[pushed] c++: dependent and non-dependent attributes [PR104245]

2022-01-27 Thread Jason Merrill via Gcc-patches

A flaw in my patch for PR51344 was that cplus_decl_attributes calls
decl_attributes after save_template_attributes, which messes up the ordering
that save_template_attributes set up.  Fixed by splitting
save_template_attributes around the call to decl_attributes.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/104245
PR c++/51344

gcc/cp/ChangeLog:

* decl2.cc (save_template_attributes): Take late attrs as parm.
(cplus_decl_attributes): Call it after decl_attributes,
splice_template_attributes before.

gcc/testsuite/ChangeLog:

* g++.dg/lto/alignas1_0.C: New test.
---
 gcc/cp/decl2.cc   | 17 -
 gcc/testsuite/g++.dg/lto/alignas1_0.C |  7 +++
 2 files changed, 15 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/alignas1_0.C

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index b68cf96fa81..a2aa5f1de4e 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1352,18 +1352,14 @@ splice_template_attributes (tree *attr_p, tree decl)
   return late_attrs;
 }
 
-/* Remove any late attributes from the list in ATTR_P and attach them to
-   DECL_P.  */
+/* Attach any LATE_ATTRS to DECL_P, after the non-dependent attributes have
+   been applied by a previous call to decl_attributes.  */
 
 static void
-save_template_attributes (tree *attr_p, tree *decl_p, int flags)
+save_template_attributes (tree late_attrs, tree *decl_p, int flags)
 {
   tree *q;
 
-  if (attr_p && *attr_p == error_mark_node)
-return;
-
-  tree late_attrs = splice_template_attributes (attr_p, *decl_p);
   if (!late_attrs)
 return;
 
@@ -1666,12 +1662,12 @@ cplus_decl_attributes (tree *decl, tree attributes, int 
flags)
}
 }
 
+  tree late_attrs = NULL_TREE;
   if (processing_template_decl)
 {
   if (check_for_bare_parameter_packs (attributes))
return;
-
-  save_template_attributes (, decl, flags);
+  late_attrs = splice_template_attributes (, *decl);
 }
 
   cp_check_const_attributes (attributes);
@@ -1717,6 +1713,9 @@ cplus_decl_attributes (tree *decl, tree attributes, int 
flags)
   decl_attributes (decl, attributes, flags, last_decl);
 }
 
+  if (late_attrs)
+save_template_attributes (late_attrs, decl, flags);
+
   /* Propagate deprecation out to the template.  */
   if (TREE_DEPRECATED (*decl))
 if (tree ti = get_template_info (*decl))
diff --git a/gcc/testsuite/g++.dg/lto/alignas1_0.C 
b/gcc/testsuite/g++.dg/lto/alignas1_0.C
new file mode 100644
index 000..a2fc72ad894
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/alignas1_0.C
@@ -0,0 +1,7 @@
+// PR c++/104245
+// { dg-lto-do assemble }
+// { dg-require-effective-target c++11 }
+
+template  struct A { alignas(T) alignas(int) int a; };
+struct B { B(const char *, const char *, int, int); A b; };
+B c {"", "", 0, 0};

base-commit: ae1b50e2e03aad06408b64c876f5d0511121de0d
-- 
2.27.0

[PATCH v4] x86: Add -m[no-]direct-extern-access

2022-01-27 Thread H.J. Lu via Gcc-patches

The v3 patch was posted at

https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574847.html

There is no progress with repeated pings since then.  Glibc 2.35 and
binutils 2.38 will support GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS.
I'd like to remove copy relocation to improve security and performance
on x86 in GCC 12.  Here is the v4 patch

1. Rename the common option to x86 specific -mdirect-extern-access option.
2. Remove GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS linker check.
3. Use the existing GNU property function in x86 backend.

This new behavior is off by default.  Are there any objections?

Changes in the v3 patch.

1. GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support has been added to
GNU binutils 2.38.  But the -z indirect-extern-access linker option is
only available for Linux/x86.  However, the --max-cache-size=SIZE linker
option was also addded within a day.  --max-cache-size=SIZE is used to
check for GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS support.

Changes in the v2 patch.

1. Rename the option to -fdirect-extern-access.

---
On systems with copy relocation:
* A copy in executable is created for the definition in a shared library
at run-time by ld.so.
* The copy is referenced by executable and shared libraries.
* Executable can access the copy directly.

Issues are:
* Overhead of a copy, time and space, may be visible at run-time.
* Read-only data in the shared library becomes read-write copy in
executable at run-time.
* Local access to data with the STV_PROTECTED visibility in the shared
library must use GOT.

On systems without function descriptor, function pointers vary depending
on where and how the functions are defined.
* If the function is defined in executable, it can be the address of
function body.
* If the function, including the function with STV_PROTECTED visibility,
is defined in the shared library, it can be the address of the PLT entry
in executable or shared library.

Issues are:
* The address of function body may not be used as its function pointer.
* ld.so needs to search loaded shared libraries for the function pointer
of the function with STV_PROTECTED visibility.

Here is a proposal to remove copy relocation and use canonical function
pointer:

1. Accesses, including in PIE and non-PIE, to undefined symbols must
use GOT.
  a. Linker may optimize out GOT access if the data is defined in PIE or
  non-PIE.
2. Read-only data in the shared library remain read-only at run-time
3. Address of global data with the STV_PROTECTED visibility in the shared
library is the address of data body.
  a. Can use IP-relative access.
  b. May need GOT without IP-relative access.
4. For systems without function descriptor,
  a. All global function pointers of undefined functions in PIE and
  non-PIE must use GOT.  Linker may optimize out GOT access if the
  function is defined in PIE or non-PIE.
  b. Function pointer of functions with the STV_PROTECTED visibility in
  executable and shared library is the address of function body.
   i. Can use IP-relative access.
   ii. May need GOT without IP-relative access.
   iii. Branches to undefined functions may use PLT.
5. Single global definition marker:

Add GNU_PROPERTY_1_NEEDED:

#define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO

to indicate the needed properties by the object file.

Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:

#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)

to indicate that the object file requires canonical function pointers and
cannot be used with copy relocation.  This bit should be cleared in
executable when there are non-GOT or non-PLT relocations in relocatable
input files without this bit set.

  a. Protected symbol access within the shared library can be treated as
  local.
  b. Copy relocation should be disallowed at link-time and run-time.
  c. GOT function pointer reference is required at link-time and run-time.

The indirect external access marker can be used in the following ways:

1. Linker can decide the best way to resolve a relocation against a
protected symbol before seeing all relocations against the symbol.
2. Dynamic linker can decide if it is an error to have a copy relocation
in executable against the protected symbol in a shared library by checking
if the shared library is built with -mno-direct-extern-access.

Add a compiler option, -mdirect-extern-access. -mdirect-extern-access is
the default.  With -mno-direct-extern-access:

1. Always to use GOT to access undefined symbols, including in PIE and
non-PIE.  This is safe to do and does not break the ABI.
2. In executable and shared library, for symbols with the STV_PROTECTED
visibility:
  a. The address of data symbol is the address of data body.
  b. For systems without function descriptor, the function pointer is
  the address of function body.
These break the ABI and resulting shared libraries may not be compatible
with executables which are not compiled with -mno-direct-extern-access.
3. Generate an

Re: [PATCH] testsuite: Fix gfortran.dg/ieee/signaling_?.f90 tests for x86 targets

2022-01-27 Thread FX via Gcc-patches

Hi Uroš,

> Please note that check_effective_target_ia32 test tries to compile code that
> uses __i386__ target-dependent preprocessor definition, so it is guaranteed
> to fail on all non-ia32 targets.

Thanks, I didn’t know the difference!
OK to push.

FX

Re: [PATCH] rs6000: Fix up #include or [PR104239]

2022-01-27 Thread Paul A. Clarke via Gcc-patches

On Wed, Jan 26, 2022 at 03:50:35PM -0500, David Edelsohn via Gcc-patches wrote:
> On Wed, Jan 26, 2022 at 3:45 PM Jakub Jelinek  wrote:
> > r12-4717-g7d37abedf58d66 added immintrin.h and x86gprintrin.h headers
> > to rs6000, these headers are on x86 standalone headers that various
> > programs include directly rather than including them through
> > .
> > Unfortunately, for that change the bmiintrin.h and bmi2intrin.h
> > headers haven't been adjusted, so the effect is that if one includes them
> > (without including also x86intrin.h first) #error will trigger.
> > Furthermore, when including such headers conditionally as some real-world
> > packages do, this means a regression.
> >
> > The following patch fixes it and matches what the x86 bmi{,2}intrin.h
> > headers do.
> 
> Okay.
> 
> Thanks for catching this.

Indeed, thanks. And thanks for reviewing, David.

Should we add similar compile-only tests for all of the standalone include
files?

PC

[PATCH] testsuite: Fix gfortran.dg/ieee/signaling_?.f90 tests for x86 targets

2022-01-27 Thread Uros Bizjak via Gcc-patches

As stated in signaling_?.f90 tests, x86-32 ABI is not suitable to
correctly handle signaling NaNs.  However, XFAIL is not the correct choice
to disable these tests, since various optimizations can generate code
that avoids moves from registers to memory (and back), resulting
in the code that executes correctly, producing spurious XFAIL.
These tests should be disabled on x86-32 using { ! ia32 } dg-directive
which rules out x32 ilp32 ABI, where tests execute without problems.

Please note that check_effective_target_ia32 test tries to compile code that
uses __i386__ target-dependent preprocessor definition, so it is guaranteed
to fail on all non-ia32 targets.

2022-01-27  Uroš Bizjak  

gcc/testsuite/ChangeLog:

* gfortran.dg/ieee/signaling_1.f90 (dg-do):
Run only on non-ia32 targets.
* gfortran.dg/ieee/signaling_2.f90 (dg-do): Ditto.
* gfortran.dg/ieee/signaling_3.f90 (dg-do): Ditto.

Tested on x86_64-linux-gnu {,-m32}.

OK for master?

Uros.
diff --git a/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90 
b/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90
index 2b156811c1e..19fee283f54 100644
--- a/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90
+++ b/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail { { i?86-*-* x86_64-*-* } && ilp32 } } }
+! { dg-do run { target { ! ia32 } } }
 ! x87 / x86-32 ABI is unsuitable for signaling NaNs
 !
 ! { dg-additional-sources signaling_1_c.c }
diff --git a/gcc/testsuite/gfortran.dg/ieee/signaling_2.f90 
b/gcc/testsuite/gfortran.dg/ieee/signaling_2.f90
index ee3805272a0..03b04c783eb 100644
--- a/gcc/testsuite/gfortran.dg/ieee/signaling_2.f90
+++ b/gcc/testsuite/gfortran.dg/ieee/signaling_2.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail { { i?86-*-* x86_64-*-* } && ilp32 } } }
+! { dg-do run { target { ! ia32 } } }
 ! x87 / x86-32 ABI is unsuitable for signaling NaNs
 !
 ! { dg-require-effective-target issignaling } */
diff --git a/gcc/testsuite/gfortran.dg/ieee/signaling_3.f90 
b/gcc/testsuite/gfortran.dg/ieee/signaling_3.f90
index 22b36980896..ff2585d2589 100644
--- a/gcc/testsuite/gfortran.dg/ieee/signaling_3.f90
+++ b/gcc/testsuite/gfortran.dg/ieee/signaling_3.f90
@@ -1,4 +1,4 @@
-! { dg-do run { xfail { { i?86-*-* x86_64-*-* } && ilp32 } } }
+! { dg-do run { target { ! ia32 } } }
 ! x87 / x86-32 ABI is unsuitable for signaling NaNs
 !
 program test

[r12-6893 Regression] FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b 1 blank line(s) in output on Linux/x86_64

2022-01-27 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

76ef38e3178a11e76a66b4d4c0e10e85fe186a45 is the first bad commit
commit 76ef38e3178a11e76a66b4d4c0e10e85fe186a45
Author: Martin Liska 
Date:   Thu Jan 27 11:22:42 2022 +0100

Improve wording for -freport-bug option.

caused

FAIL: g++.dg/modules/xtreme-header-5_a.H module-cmi  
(gcm.cache/$srcdir/g++.dg/modules/xtreme-header-5_a.H.gcm)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++17  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++17 (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2a  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2a (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2b  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2b (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header-5_a.H -std=c++2b (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H module-cmi  
(gcm.cache/$srcdir/g++.dg/modules/xtreme-header_a.H.gcm)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++17  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++17 (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++17 (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2a  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2a (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2a (test for excess errors)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b  1 blank line(s) in output
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b (internal compiler error: 
tree check: expected none of template_decl, have template_decl in 
add_specializations, at cp/module.cc:12969)
FAIL: g++.dg/modules/xtreme-header_a.H -std=c++2b (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6893/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-5_a.H 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header-5_a.H 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header_a.H 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="modules.exp=g++.dg/modules/xtreme-header_a.H 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-27 Thread Maciej W. Rozycki

On Thu, 27 Jan 2022, Richard Biener wrote:

> > > > Index: gcc/gcc/c/c-typeck.cc
> > > > ===
> > > > --- gcc.orig/gcc/c/c-typeck.cc
> > > > +++ gcc/gcc/c/c-typeck.cc
> > > > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> > > >  #include "gomp-constants.h"
> > > >  #include "spellcheck-tree.h"
> > > >  #include "gcc-rich-location.h"
> > > > +#include "optabs-query.h"
> > > >  #include "stringpool.h"
> > > >  #include "attribs.h"
> > > >  #include "asan.h"
> > > > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> > > >bool maybe_const = true;
> > > >tree sc;
> > > >sc = c_fully_fold (op0, false, _const);
> > > > -  sc = save_expr (sc);
> > > > + if (optab_handler (vec_duplicate_optab,
> > > > +TYPE_MODE (type1)) == CODE_FOR_nothing)
> > > > +   sc = save_expr (sc);
> > >
> > > This doesn't make much sense - I suppose the CONSTRUCTOR retains
> > > TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> > > and thus should have been cleared during gimplification or in the end
> > > ignored by RTL expansion.
> >
> >  This is how the expression built here eventually looks in
> > `store_constructor':
> >
> > (gdb) print exp
> > $41 = 
> > (gdb) pt
> >   > type  > type  > size 
> > unit-size 
> > align:32 warn_if_not_align:0 symtab:0 alias-set -1 
> > canonical-type 0x75cf1260 precision:32
> > pointer_to_this >
> > sizes-gimplified V4SF
> > size 
> > unit-size 
> > align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> > 0x75d19648 nunits:4 context  > v4sf-dup.c>>
> > side-effects length:4
> > val 
> > visited var 
> > def_stmt GIMPLE_NOP
> > version:2>
> > val 
> > visited var 
> > def_stmt GIMPLE_NOP
> > version:2>
> > val 
> > visited var 
> > def_stmt GIMPLE_NOP
> > version:2>
> > val 
> > visited var 
> > def_stmt GIMPLE_NOP
> > version:2>>
> > (gdb)
> >
> > The `side-effects' flag prevents this conditional from executing:
> >
> > /* Try using vec_duplicate_optab for uniform vectors.  */
> > if (!TREE_SIDE_EFFECTS (exp)
> > && VECTOR_MODE_P (mode)
> > && eltmode == GET_MODE_INNER (mode)
> > && ((icode = optab_handler (vec_duplicate_optab, mode))
> > != CODE_FOR_nothing)
> > && (elt = uniform_vector_p (exp))
> > && !VECTOR_TYPE_P (TREE_TYPE (elt)))
> >   {
> > class expand_operand ops[2];
> > create_output_operand ([0], target, mode);
> > create_input_operand ([1], expand_normal (elt), eltmode);
> > expand_insn (icode, 2, ops);
> > if (!rtx_equal_p (target, ops[0].value))
> >   emit_move_insn (target, ops[0].value);
> > break;
> >   }
> >
> > I don't know what's supposed to clear the flag (and what the purpose of
> > setting it in the first place would be then).
> 
> It's probably safe to remove the !TREE_SIDE_EFFECTS check above
> but already gimplification should have made sure all side-effects are
> pushed to separate stmts.  gimplifiation usually calls recompute_side_effects
> but that doesn't seem to touch CONSTRUCTORs.  But I do remember fixing
> some spurious TREE_SIDE_EFFECTS on CTORs before.
> 
> Might be worth verifying in verify_gimple_assign_single that CTORs
> do not have TREE_SIDE_EFFECTS set (unless this is a clobber).

 OK, so maybe there's another bug somewhere that causes the side-effects 
flag not to be cleared where expected, however I an inconvinced as to 
withdrawing my original point.  That is why treat code like:

v4sf
odd_even (v4sf x, float y)
{
  return x + f;
}

effectively like:

v4sf
odd_even (v4sf x, volatile float y)
{
  return x + f;
}

which I infer from the terse justification in the discussions referred is 
the sole purpose of making use of `save_expr' here, also for targets that 
have a cheap (or free if combined with another operation) `vec_duplicateM' 
machine operation?

 While removing the !TREE_SIDE_EFFECTS check may cause `vec_duplicateM' to 
be used, the middle end will still ensure the element broadcast operation 
won't be repeated, e.g. at the cost of consuming a temporary register to 
carry a vector of identical elements, where it may not be the least costly 
approach.  Where we have an actual `vec_duplicateM' insn we can use its 
cost to determine the best approach, can't we?

 Am I still missing something?

  Maciej

[PATCH] constrain PHI handling in -Wuse-after-free (PR104232)

2022-01-27 Thread Martin Sebor via Gcc-patches


The indiscriminate PHI handling by -Wuse-after-free has caused
the false positive reported in PR 104232.  The attached patch
refines the handling to only consider PHIs all of whose operands
refer to the same object and disregard the rest.

Tested on x86_64-linux and by compiling a few toolchain projects,
including Glibc and Binutils/GDB, to verify the absence of false
positives.

MartinConstrain PHI handling in -Wuse-after-free [PR104232].

Resolves:
PR middle-end/104232 - spurious -Wuse-after-free after conditional free

gcc/ChangeLog:

	PR middle-end/104232
	* gimple-ssa-warn-access.cc (pointers_related_p): Add argument.
	Handle PHIs.  Add a synonymous overload.
	(pass_waccess::check_pointer_uses): Call pointers_related_p.

gcc/testsuite/ChangeLog:

	PR middle-end/104232
	* g++.dg/warn/Wuse-after-free4.C: New test.
	* gcc.dg/Wuse-after-free-2.c: New test.
	* gcc.dg/Wuse-after-free-3.c: New test.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 3dcaf4230b8..ad5e2f4458e 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -4080,7 +4080,8 @@ maybe_warn_mismatched_realloc (tree ptr, gimple *realloc_stmt, gimple *stmt)
either don't or their relationship cannot be determined.  */
 
 static bool
-pointers_related_p (gimple *stmt, tree p, tree q, pointer_query )
+pointers_related_p (gimple *stmt, tree p, tree q, pointer_query ,
+		auto_bitmap )
 {
   if (!ptr_derefs_may_alias_p (p, q))
 return false;
@@ -4093,7 +4094,47 @@ pointers_related_p (gimple *stmt, tree p, tree q, pointer_query )
it involves a self-referential PHI.  Return a conservative result.  */
 return false;
 
-  return pref.ref == qref.ref;
+  if (pref.ref == qref.ref)
+return true;
+
+  /* If either pointer is a PHI, iterate over all its operands and
+ return true if they're all related to the other pointer.  */
+  tree ptr = q;
+  unsigned version;
+  gphi *phi = pref.phi ();
+  if (phi)
+version = SSA_NAME_VERSION (pref.ref);
+  else
+{
+  phi = qref.phi ();
+  if (!phi)
+	return false;
+
+  ptr = p;
+  version = SSA_NAME_VERSION (qref.ref);
+}
+
+  if (!bitmap_set_bit (visited, version))
+return true;
+
+  unsigned nargs = gimple_phi_num_args (phi);
+  for (unsigned i = 0; i != nargs; ++i)
+{
+  tree arg = gimple_phi_arg_def (phi, i);
+  if (!pointers_related_p (stmt, arg, ptr, qry, visited))
+	return false;
+}
+
+  return true;
+}
+
+/* Convenience wrapper for the above.  */
+
+static bool
+pointers_related_p (gimple *stmt, tree p, tree q, pointer_query )
+{
+  auto_bitmap visited;
+  return pointers_related_p (stmt, p, q, qry, visited);
 }
 
 /* For a STMT either a call to a deallocation function or a clobber, warn
@@ -4192,7 +4233,12 @@ pass_waccess::check_pointer_uses (gimple *stmt, tree ptr,
 	{
 	  if (gimple_code (use_stmt) == GIMPLE_PHI)
 		{
+		  /* Only add a PHI result to POINTERS if all its
+		 operands are related to PTR, otherwise continue.  */
 		  tree lhs = gimple_phi_result (use_stmt);
+		  if (!pointers_related_p (stmt, lhs, ptr, m_ptr_qry))
+		continue;
+
 		  if (TREE_CODE (lhs) == SSA_NAME)
 		{
 		  pointers.safe_push (lhs);
diff --git a/gcc/testsuite/g++.dg/warn/Wuse-after-free4.C b/gcc/testsuite/g++.dg/warn/Wuse-after-free4.C
new file mode 100644
index 000..599d9bfe3c7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wuse-after-free4.C
@@ -0,0 +1,27 @@
+/* PR middle-end/104232 - spurious -Wuse-after-free after conditional free
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+char* f (void);
+
+struct A
+{
+  char *p;
+  A (): p () { }
+  ~A ()
+  {
+__builtin_free (p);   // { dg-bogus "-Wuse-after-free" }
+  }
+};
+
+int test_no_warn (void)
+{
+  A px, qx;
+
+  qx.p = f ();
+  if (!qx.p)
+return 0;
+
+  px.p = f ();
+  return 1;
+}
diff --git a/gcc/testsuite/gcc.dg/Wuse-after-free-2.c b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
new file mode 100644
index 000..9f7ed4529f0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wuse-after-free-2.c
@@ -0,0 +1,115 @@
+/* PR middle-end/104232 - spurious -Wuse-after-free after conditional free
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+void free (void*);
+
+void sink (void*);
+
+void nowarn_cond_2 (char *p0, char *q0, int i)
+{
+  char *r = i ? p0 : q0;
+
+  free (p0);
+
+  /* The use of a PHI operand could be diagnosed using the "maybe" form
+ of the warning at level 2 but it's not done.  If it ever changes
+ this test and those below will need to be updated.  */
+  sink (r);
+}
+
+void nowarn_cond_2_null (char *p0, int i)
+{
+  char *r = i ? p0 : 0;
+
+  free (p0);
+  sink (r);
+}
+
+void nowarn_cond_3 (char *p0, char *q0, int i)
+{
+  char *r = i < 0 ? p0 - 1 : 0 < i ? p0 + 1 : q0;
+
+  free (p0);
+  sink (r);
+}
+
+void nowarn_cond_3_null (char *p0, int i)
+{
+  char *r = i < 0 ? p0 - 1 : 0 < i ? p0 + 1 : 0;
+
+  free (p0);
+  sink (r);
+}
+
+void

Re: [PATCH] git-undescr.sh: Support full output of git-descr.sh.

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 07:29:20PM +0100, Martin Liška wrote:
> I would like to support:
> 
> $ git gcc-undescr `git gcc-descr`
> 9cbfbe2497c0117b0598e35658248bc723c0b931
> 
> Which is done in the patch.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> contrib/ChangeLog:
> 
>   * git-undescr.sh: Support full output of git-descr.sh.

Ok.

Jakub

[PATCH] git-undescr.sh: Support full output of git-descr.sh.

2022-01-27 Thread Martin Liška


I would like to support:

$ git gcc-undescr `git gcc-descr`
9cbfbe2497c0117b0598e35658248bc723c0b931

Which is done in the patch.

Ready to be installed?
Thanks,
Martin

contrib/ChangeLog:

* git-undescr.sh: Support full output of git-descr.sh.
---
 contrib/git-undescr.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/git-undescr.sh b/contrib/git-undescr.sh
index 7d2efe1a8e7..9d882a6814e 100755
--- a/contrib/git-undescr.sh
+++ b/contrib/git-undescr.sh
@@ -3,8 +3,8 @@
 # Script to undescribe a GCC revision
 
 o=$(git config --get gcc-config.upstream);

-r=$(echo $1 | sed -n 's,^r\([0-9]\+\)-[0-9]\+$,\1,p');
-n=$(echo $1 | sed -n 's,^r[0-9]\+-\([0-9]\+\)$,\1,p');
+r=$(echo $1 | sed -n 's,^r\([0-9]\+\)-[0-9]\+\(-g[0-9a-f]\+\)\?$,\1,p');
+n=$(echo $1 | sed -n 's,^r[0-9]\+-\([0-9]\+\)\(-g[0-9a-f]\+\)\?$,\1,p');
 
 test -z $r && echo Invalid id $1 && exit 1;

 h=$(git rev-parse --verify --quiet ${o:-origin}/releases/gcc-$r);
--
2.34.1

Re: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of booleans

2022-01-27 Thread Christophe Lyon via Gcc-patches

On Thu, Jan 27, 2022 at 5:29 PM Kyrylo Tkachov via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> Hi Christophe,
>
> > -Original Message-
> > From: Gcc-patches  > bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> > Lyon via Gcc-patches
> > Sent: Thursday, January 13, 2022 2:56 PM
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of
> > booleans
> >
> > This patch implements support for vectors of booleans to support MVE
> > predicates, instead of HImode.  Since the ABI mandates pred16_t (aka
> > uint16_t) to represent predicates in intrinsics prototypes, we
> > introduce a new "predicate" type qualifier so that we can map relevant
> > builtins HImode arguments and return value to the appropriate vector
> > of booleans (VxBI).
> >
> > We have to update test_vector_ops_duplicate, because it iterates using
> > an offset in bytes, where we would need to iterate in bits: we stop
> > iterating when we reach the end of the vector of booleans.
> >
> > In addition, we have to fix the underlying definition of vectors of
> > booleans because ARM/MVE needs a different representation than
> > AArch64/SVE. With ARM/MVE the 'true' bit is duplicated over the
> > element size, so that a true element of V4BI is represented by
> > '0b'.  This patch updates the aarch64 definition of VNx*BI as
> > needed.
> >
> > 2022-01-13  Christophe Lyon  
> >   Richard Sandiford  
> >
> >   gcc/
> >   PR target/100757
> >   PR target/101325
> >   * config/aarch64/aarch64-modes.def (VNx16BI, VNx8BI, VNx4BI,
> >   VNx2BI): Update definition.
> >   * config/arm/arm-builtins.c (arm_init_simd_builtin_types): Add new
> >   simd types.
> >   (arm_init_builtin): Map predicate vectors arguments to HImode.
> >   (arm_expand_builtin_args): Move HImode predicate arguments to
> > VxBI
> >   rtx. Move return value to HImode rtx.
> >   * config/arm/arm-builtins.h (arm_type_qualifiers): Add
> > qualifier_predicate.
> >   * config/arm/arm-modes.def (B2I, B4I, V16BI, V8BI, V4BI): New
> > modes.
> >   * config/arm/arm-simd-builtin-types.def (Pred1x16_t,
> >   Pred2x8_t,Pred4x4_t): New.
> >   * emit-rtl.c (init_emit_once): Handle all boolean modes.
> >   * genmodes.c (mode_data): Add boolean field.
> >   (blank_mode): Initialize it.
> >   (make_complex_modes): Fix handling of boolean modes.
> >   (make_vector_modes): Likewise.
> >   (VECTOR_BOOL_MODE): Use new COMPONENT parameter.
> >   (make_vector_bool_mode): Likewise.
> >   (BOOL_MODE): New.
> >   (make_bool_mode): New.
> >   (emit_insn_modes_h): Fix generation of boolean modes.
> >   (emit_class_narrowest_mode): Likewise.
> >   * machmode.def: Use new BOOL_MODE instead of
> > FRACTIONAL_INT_MODE
> >   to define BImode.
> >   * rtx-vector-builder.c (rtx_vector_builder::find_cached_value):
> >   Fix handling of constm1_rtx for VECTOR_BOOL.
> >   * simplify-rtx.c (native_encode_rtx): Fix support for VECTOR_BOOL.
> >   (native_decode_vector_rtx): Likewise.
> >   (test_vector_ops_duplicate): Skip vec_merge test
> >   with vectors of booleans.
> >   * varasm.c (output_constant_pool_2): Likewise.
>
> The arm parts look ok. I guess Richard is best placed to approve the
> midend parts, but I see he's on the ChangeLog so maybe he needs others to
> review them. But then again Richard is maintainer of the gen* machinery
> that's the most complicated part of the patch so he can self-approve 
>

Thanks Kyrill,

Regarding the ARM part, Andre had a concern, I don't know if my proposal is
OK for him?

Christophe


> Thanks,
> Kyrill
>
> >
> > diff --git a/gcc/config/aarch64/aarch64-modes.def
> > b/gcc/config/aarch64/aarch64-modes.def
> > index 976bf9b42be..8f399225a80 100644
> > --- a/gcc/config/aarch64/aarch64-modes.def
> > +++ b/gcc/config/aarch64/aarch64-modes.def
> > @@ -47,10 +47,10 @@ ADJUST_FLOAT_FORMAT (HF, _half_format);
> >
> >  /* Vector modes.  */
> >
> > -VECTOR_BOOL_MODE (VNx16BI, 16, 2);
> > -VECTOR_BOOL_MODE (VNx8BI, 8, 2);
> > -VECTOR_BOOL_MODE (VNx4BI, 4, 2);
> > -VECTOR_BOOL_MODE (VNx2BI, 2, 2);
> > +VECTOR_BOOL_MODE (VNx16BI, 16, BI, 2);
> > +VECTOR_BOOL_MODE (VNx8BI, 8, BI, 2);
> > +VECTOR_BOOL_MODE (VNx4BI, 4, BI, 2);
> > +VECTOR_BOOL_MODE (VNx2BI, 2, BI, 2);
> >
> >  ADJUST_NUNITS (VNx16BI, aarch64_sve_vg * 8);
> >  ADJUST_NUNITS (VNx8BI, aarch64_sve_vg * 4);
> > diff --git a/gcc/config/arm/arm-builtins.c
> b/gcc/config/arm/arm-builtins.c
> > index 9c645722230..2ccfa37c302 100644
> > --- a/gcc/config/arm/arm-builtins.c
> > +++ b/gcc/config/arm/arm-builtins.c
> > @@ -1548,6 +1548,13 @@ arm_init_simd_builtin_types (void)
> >arm_simd_types[Bfloat16x4_t].eltype = arm_bf16_type_node;
> >arm_simd_types[Bfloat16x8_t].eltype = arm_bf16_type_node;
> >
> > +  if (TARGET_HAVE_MVE)
> > +{
> > +  arm_simd_types[Pred1x16_t].eltype = unsigned_intHI_type_node;
> > +

Re: [PATCH] c++: var tmpl w/ dependent constrained auto type [PR103341]

2022-01-27 Thread Patrick Palka via Gcc-patches

On Thu, 27 Jan 2022, Patrick Palka wrote:

> When deducing the type of a variable template specialization with a
> constrained auto type, we might need the template arguments for
> satisfaction in case the constraint is dependent.

It looks like templated static data members need similar treatment.
Here's a patch that handles both kinds of entities, what they have
in common is that they're templated and not at function scope
(bootstrap and regtesting in progress):

-- >8 --

Subject: [PATCH] c++: var tmpl w/ dependent constrained auto type [PR103341]

When deducing the type of a variable template specialization (or a
template static data member) with a constrained auto type, we might
need the template arguments during satisfaction from do_auto_deduction
in case the constraint is dependent.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk and perhaps 11?

PR c++/103341

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Pass the template arguments of a
variable template specialization or a templated static data
member to do_auto_deduction when the auto is constrained.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-class4.C: New test.
* g++.dg/cpp2a/concepts-var-templ2.C: New test.
---
 gcc/cp/decl.cc   | 12 +++-
 gcc/testsuite/g++.dg/cpp2a/concepts-class4.C | 12 
 gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C | 13 +
 3 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 10e6956117e..465f5620f2c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -7958,9 +7958,19 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
   enum auto_deduction_context adc = adc_variable_type;
   if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
adc = adc_decomp_type;
+  tree outer_targs = NULL_TREE;
+  if (PLACEHOLDER_TYPE_CONSTRAINTS_INFO (auto_node)
+ && VAR_P (decl)
+ && DECL_LANG_SPECIFIC (decl)
+ && DECL_TEMPLATE_INFO (decl)
+ && !DECL_FUNCTION_SCOPE_P (decl))
+   /* The outer template arguments might be needed for satisfaction.
+  For function scope decls, do_auto_deduction obtains the outer
+  template arguments from current_function_decl.  */
+   outer_targs = DECL_TI_ARGS (decl);
   type = TREE_TYPE (decl) = do_auto_deduction (type, d_init, auto_node,
   tf_warning_or_error, adc,
-  NULL_TREE, flags);
+  outer_targs, flags);
   if (type == error_mark_node)
return;
   if (TREE_CODE (type) == FUNCTION_TYPE)
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
new file mode 100644
index 000..9d694e758e0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-class4.C
@@ -0,0 +1,12 @@
+// PR c++/103341
+// { dg-do compile { target c++20 } }
+
+template concept C = __is_same(T, U);
+
+template
+struct A {
+  static inline C auto value = 0; // { dg-error "constraint" }
+};
+
+template struct A; // { dg-bogus "" }
+template struct A; // { dg-message "required from here" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
new file mode 100644
index 000..e1802aca75f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
@@ -0,0 +1,13 @@
+// PR c++/103341
+// { dg-do compile { target c++20 } }
+
+template concept same_as = __is_same(T, U);
+template same_as auto v1a = 1;
+template same_as auto v1b = T();
+template same_as auto v2a = 1; // { dg-error "constraints" }
+template same_as auto v2b = T(); // { dg-error "constraints" }
+
+template int v1a;
+template int v1b;
+template int v2a; // { dg-message "required from here" }
+template int v2b; // { dg-message "required from here" }
-- 
2.35.0

Re: [PATCH] contrib: Put gcc-descr and gcc-undescr to file.

2022-01-27 Thread Martin Liška


On 1/27/22 16:35, Jakub Jelinek wrote:

For backwards compatibility, I'd prefer --full to be an alias to --long,
and maybe the --short handling should short=yes; long=no and
similarly --long/--full handling should long=yes; short=no
so that --short --long is --long and --long --short is --short.


I like the suggestions, applied that!

Similarly to Jonathan's comment, thank for it.

I've just pushed that to master.

Martin



Otherwise LGTM.

Re: [PATCH v2] rs6000: Disable MMA if no VSX support [PR103627]

2022-01-27 Thread Segher Boessenkool

Hi!

On Thu, Jan 27, 2022 at 07:21:33PM +0800, Kewen.Lin wrote:
>   PR target/103627
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
>   MMA if !TARGET_VSX.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/103627
>   * gcc.target/powerpc/pr103627-1.c: New test.
>   * gcc.target/powerpc/pr103627-2.c: New test.

Okay for trunk.  Thanks!


Segher

[PATCH] c++: var tmpl w/ dependent constrained auto type [PR103341]

2022-01-27 Thread Patrick Palka via Gcc-patches

When deducing the type of a variable template specialization with a
constrained auto type, we might need the template arguments for
satisfaction in case the constraint is dependent.

Bootstrapped and reegtested on x86_64-pc-linux-gnu, does this look OK
for trunk and perhaps 11?

PR c++/103341

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Pass the template arguments of a
variable template specialization to do_auto_deduction when
the auto is constrained.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-var-templ2.C: New test.
---
 gcc/cp/decl.cc   |  7 ++-
 gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C | 13 +
 2 files changed, 19 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 10e6956117e..b698fa9b95a 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -7958,9 +7958,14 @@ cp_finish_decl (tree decl, tree init, bool 
init_const_expr_p,
   enum auto_deduction_context adc = adc_variable_type;
   if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
adc = adc_decomp_type;
+  tree outer_targs = NULL_TREE;
+  if (is_constrained_auto (auto_node)
+ && variable_template_specialization_p (decl))
+   /* The outer template arguments might be needed for satisfaction.  */
+   outer_targs = DECL_TI_ARGS (decl);
   type = TREE_TYPE (decl) = do_auto_deduction (type, d_init, auto_node,
   tf_warning_or_error, adc,
-  NULL_TREE, flags);
+  outer_targs, flags);
   if (type == error_mark_node)
return;
   if (TREE_CODE (type) == FUNCTION_TYPE)
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
new file mode 100644
index 000..56cead5e8c3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-var-templ2.C
@@ -0,0 +1,13 @@
+// PR c++/103341
+// { dg-do compile { target c++20 } }
+
+template concept same_as = __is_same(T, U);
+template same_as auto v1a{1};
+template same_as auto v1b{1};
+template same_as auto v2a{T()}; // { dg-error "constraints" }
+template same_as auto v2b{T()}; // { dg-error "constraints" }
+
+template int v1a;
+template int v1b;
+template int v2a; // { dg-message "required from here" }
+template int v2b; // { dg-message "required from here" }
-- 
2.35.0

RE: [PATCH v3 13/15] arm: Convert more MVE/CDE builtins to predicate qualifiers

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 13/15] arm: Convert more MVE/CDE builtins to predicate
> qualifiers
> 
> This patch covers a few non-load/store builtins where we do not use
> the  iterator and thus we cannot use .
> 

Ok.
Thanks,
Kyrill

> 2022-01-13  Christophe Lyon  
> 
>   gcc/
>   PR target/100757
>   PR target/101325
>   * config/arm/arm-builtins.c (CX_UNARY_UNONE_QUALIFIERS): Use
>   predicate.
>   (CX_BINARY_UNONE_QUALIFIERS): Likewise.
>   (CX_TERNARY_UNONE_QUALIFIERS): Likewise.
>   (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
>   (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
>   (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS):
> Delete.
>   * config/arm/arm_mve_builtins.def: Use predicated qualifiers.
>   * config/arm/mve.md: Use VxBI instead of HI.
> 
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 73678a00398..f9437752a22 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -295,7 +295,7 @@ static enum arm_type_qualifiers
>  arm_cx_unary_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_immediate, qualifier_none,
>qualifier_unsigned_immediate,
> -  qualifier_unsigned };
> +  qualifier_predicate };
>  #define CX_UNARY_UNONE_QUALIFIERS (arm_cx_unary_unone_qualifiers)
> 
>  /* T (immediate, T, T, unsigned immediate).  */
> @@ -304,7 +304,7 @@
> arm_cx_binary_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_immediate,
>qualifier_none, qualifier_none,
>qualifier_unsigned_immediate,
> -  qualifier_unsigned };
> +  qualifier_predicate };
>  #define CX_BINARY_UNONE_QUALIFIERS (arm_cx_binary_unone_qualifiers)
> 
>  /* T (immediate, T, T, T, unsigned immediate).  */
> @@ -313,7 +313,7 @@
> arm_cx_ternary_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_immediate,
>qualifier_none, qualifier_none, qualifier_none,
>qualifier_unsigned_immediate,
> -  qualifier_unsigned };
> +  qualifier_predicate };
>  #define CX_TERNARY_UNONE_QUALIFIERS
> (arm_cx_ternary_unone_qualifiers)
> 
>  /* The first argument (return type) of a store should be void type,
> @@ -509,12 +509,6 @@
> arm_ternop_none_none_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define TERNOP_NONE_NONE_NONE_IMM_QUALIFIERS \
>(arm_ternop_none_none_none_imm_qualifiers)
> 
> -static enum arm_type_qualifiers
> -
> arm_ternop_none_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS
> ]
> -  = { qualifier_none, qualifier_none, qualifier_none, qualifier_unsigned };
> -#define TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS \
> -  (arm_ternop_none_none_none_unone_qualifiers)
> -
>  static enum arm_type_qualifiers
>  arm_ternop_none_none_none_pred_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_none, qualifier_none, qualifier_predicate };
> @@ -567,13 +561,6 @@
> arm_quadop_unone_unone_none_none_pred_qualifiers[SIMD_MAX_BUILTI
> N_ARGS]
>  #define QUADOP_UNONE_UNONE_NONE_NONE_PRED_QUALIFIERS \
>(arm_quadop_unone_unone_none_none_pred_qualifiers)
> 
> -static enum arm_type_qualifiers
> -
> arm_quadop_none_none_none_none_unone_qualifiers[SIMD_MAX_BUILTI
> N_ARGS]
> -  = { qualifier_none, qualifier_none, qualifier_none, qualifier_none,
> -qualifier_unsigned };
> -#define QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS \
> -  (arm_quadop_none_none_none_none_unone_qualifiers)
> -
>  static enum arm_type_qualifiers
> 
> arm_quadop_none_none_none_none_pred_qualifiers[SIMD_MAX_BUILTIN_
> ARGS]
>= { qualifier_none, qualifier_none, qualifier_none, qualifier_none,
> @@ -588,13 +575,6 @@
> arm_quadop_none_none_none_imm_pred_qualifiers[SIMD_MAX_BUILTIN_
> ARGS]
>  #define QUADOP_NONE_NONE_NONE_IMM_PRED_QUALIFIERS \
>(arm_quadop_none_none_none_imm_pred_qualifiers)
> 
> -static enum arm_type_qualifiers
> -
> arm_quadop_unone_unone_unone_unone_unone_qualifiers[SIMD_MAX_B
> UILTIN_ARGS]
> -  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
> -qualifier_unsigned, qualifier_unsigned };
> -#define QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS \
> -  (arm_quadop_unone_unone_unone_unone_unone_qualifiers)
> -
>  static enum arm_type_qualifiers
> 
> arm_quadop_unone_unone_unone_unone_pred_qualifiers[SIMD_MAX_BUI
> LTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
> diff --git a/gcc/config/arm/arm_mve_builtins.def
> b/gcc/config/arm/arm_mve_builtins.def
> index 7db6d47867e..1c8ee34f5cb 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -87,8 +87,8 @@ VAR4 (BINOP_UNONE_UNONE_UNONE, vcreateq_u,
> v16qi, v8hi, v4si, v2di)
>  VAR4 (BINOP_NONE_UNONE_UNONE, vcreateq_s, v16qi, v8hi, v4si, v2di)

RE: [PATCH v3 12/15] arm: Convert more load/store MVE builtins to predicate qualifiers

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 12/15] arm: Convert more load/store MVE builtins to
> predicate qualifiers
> 
> This patch covers a few builtins where we do not use the 
> iterator and thus we cannot use .
> 
> For v2di instructions, we keep the HI mode for predicates.

Ok.
Thanks,
Kyrill

> 
> 2022-01-13  Christophe Lyon  
> 
>   gcc/
>   PR target/100757
>   PR target/101325
>   * config/arm/arm-builtins.c (STRSBS_P_QUALIFIERS): Use predicate
>   qualifier.
>   (STRSBU_P_QUALIFIERS): Likewise.
>   (LDRGBS_Z_QUALIFIERS): Likewise.
>   (LDRGBU_Z_QUALIFIERS): Likewise.
>   (LDRGBWBXU_Z_QUALIFIERS): Likewise.
>   (LDRGBWBS_Z_QUALIFIERS): Likewise.
>   (LDRGBWBU_Z_QUALIFIERS): Likewise.
>   (STRSBWBS_P_QUALIFIERS): Likewise.
>   (STRSBWBU_P_QUALIFIERS): Likewise.
>   * config/arm/mve.md: Use VxBI instead of HI.
> 
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 0b063b5f037..73678a00398 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -689,13 +689,13 @@
> arm_strss_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  static enum arm_type_qualifiers
>  arm_strsbs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_void, qualifier_unsigned, qualifier_immediate,
> -  qualifier_none, qualifier_unsigned};
> +  qualifier_none, qualifier_predicate};
>  #define STRSBS_P_QUALIFIERS (arm_strsbs_p_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_strsbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_void, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned, qualifier_unsigned};
> +  qualifier_unsigned, qualifier_predicate};
>  #define STRSBU_P_QUALIFIERS (arm_strsbu_p_qualifiers)
> 
>  static enum arm_type_qualifiers
> @@ -731,13 +731,13 @@ arm_ldrgbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  static enum arm_type_qualifiers
>  arm_ldrgbs_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned};
> +  qualifier_predicate};
>  #define LDRGBS_Z_QUALIFIERS (arm_ldrgbs_z_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_ldrgbu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned};
> +  qualifier_predicate};
>  #define LDRGBU_Z_QUALIFIERS (arm_ldrgbu_z_qualifiers)
> 
>  static enum arm_type_qualifiers
> @@ -777,7 +777,7 @@
> arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  static enum arm_type_qualifiers
>  arm_ldrgbwbxu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned};
> +  qualifier_predicate};
>  #define LDRGBWBXU_Z_QUALIFIERS (arm_ldrgbwbxu_z_qualifiers)
> 
>  static enum arm_type_qualifiers
> @@ -793,13 +793,13 @@
> arm_ldrgbwbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  static enum arm_type_qualifiers
>  arm_ldrgbwbs_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned};
> +  qualifier_predicate};
>  #define LDRGBWBS_Z_QUALIFIERS (arm_ldrgbwbs_z_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_ldrgbwbu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate,
> -  qualifier_unsigned};
> +  qualifier_predicate};
>  #define LDRGBWBU_Z_QUALIFIERS (arm_ldrgbwbu_z_qualifiers)
> 
>  static enum arm_type_qualifiers
> @@ -815,13 +815,13 @@
> arm_strsbwbu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  static enum arm_type_qualifiers
>  arm_strsbwbs_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_const,
> -  qualifier_none, qualifier_unsigned};
> +  qualifier_none, qualifier_predicate};
>  #define STRSBWBS_P_QUALIFIERS (arm_strsbwbs_p_qualifiers)
> 
>  static enum arm_type_qualifiers
>  arm_strsbwbu_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_const,
> -  qualifier_unsigned, qualifier_unsigned};
> +  qualifier_unsigned, qualifier_predicate};
>  #define STRSBWBU_P_QUALIFIERS (arm_strsbwbu_p_qualifiers)
> 
>  static enum arm_type_qualifiers
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index a8087815c22..9633b7187f6 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -7282,7 +7282,7 @@ (define_insn
> "mve_vstrwq_scatter_base_p_v4si"
>   [(match_operand:V4SI 0 "s_register_operand" "w")
>(match_operand:SI 1 "immediate_operand" "i")
>(match_operand:V4SI 2 "s_register_operand" "w")
> -  (match_operand:HI 3 "vpr_register_operand" "Up")]
> +  (match_operand:V4BI 3

RE: [PATCH v3 09/15] arm: Fix vcond_mask expander for MVE (PR target/100757)

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches

Hi Christophe,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 09/15] arm: Fix vcond_mask expander for MVE (PR
> target/100757)
> 
> The problem in this PR is that we call VPSEL with a mask of vector
> type instead of HImode. This happens because operand 3 in vcond_mask
> is the pre-computed vector comparison and has vector type.
> 
> This patch fixes it by implementing TARGET_VECTORIZE_GET_MASK_MODE,
> returning the appropriate VxBI mode when targeting MVE.  In turn, this
> implies implementing vec_cmp,
> vec_cmpu and vcond_mask_,
> and we can
> move vec_cmp, vec_cmpu and
> vcond_mask_ back to neon.md since they are not
> used by MVE anymore.  The new * patterns listed above are
> implemented in mve.md since they are only valid for MVE. However this
> may make maintenance/comparison more painful than having all of them
> in vec-common.md.
> 
> In the process, we can get rid of the recently added vcond_mve
> parameter of arm_expand_vector_compare.
> 
> Compared to neon.md's vcond_mask_ before my
> "arm:
> Auto-vectorization for MVE: vcmp" patch (r12-834), it keeps the VDQWH
> iterator added in r12-835 (to have V4HF/V8HF support), as well as the
> (! || flag_unsafe_math_optimizations) condition which
> was not present before r12-834 although SF modes were enabled by VDQW
> (I think this was a bug).
> 
> Using TARGET_VECTORIZE_GET_MASK_MODE has the advantage that we no
> longer need to generate vpsel with vectors of 0 and 1: the masks are
> now merged via scalar 'ands' instructions operating on 16-bit masks
> after converting the boolean vectors.
> 
> In addition, this patch fixes a problem in arm_expand_vcond() where
> the result would be a vector of 0 or 1 instead of operand 1 or 2.
> 
> Since we want to skip gcc.dg/signbit-2.c for MVE, we also add a new
> arm_mve effective target.
> 
> Reducing the number of iterations in pr100757-3.c from 32 to 8, we
> generate the code below:
> 
> float a[32];
> float fn1(int d) {
>   float c = 4.0f;
>   for (int b = 0; b < 8; b++)
> if (a[b] != 2.0f)
>   c = 5.0f;
>   return c;
> }
> 
> fn1:
>   ldr r3, .L3+48
>   vldr.64 d4, .L3  // q2=(2.0,2.0,2.0,2.0)
>   vldr.64 d5, .L3+8
>   vldrw.32q0, [r3] // q0=a(0..3)
>   addsr3, r3, #16
>   vcmp.f32eq, q0, q2   // cmp a(0..3) == (2.0,2.0,2.0,2.0)
>   vldrw.32q1, [r3] // q1=a(4..7)
>   vmrs r3, P0
>   vcmp.f32eq, q1, q2   // cmp a(4..7) == (2.0,2.0,2.0,2.0)
>   vmrsr2, P0  @ movhi
>   andsr3, r3, r2   // r3=select(a(0..3]) & select(a(4..7))
>   vldr.64 d4, .L3+16   // q2=(5.0,5.0,5.0,5.0)
>   vldr.64 d5, .L3+24
>   vmsr P0, r3
>   vldr.64 d6, .L3+32   // q3=(4.0,4.0,4.0,4.0)
>   vldr.64 d7, .L3+40
>   vpsel q3, q3, q2 // q3=vcond_mask(4.0,5.0)
>   vmov.32 r2, q3[1]// keep the scalar max
>   vmov.32 r0, q3[3]
>   vmov.32 r3, q3[2]
>   vmov.f32s11, s12
>   vmovs15, r2
>   vmovs14, r3
>   vmaxnm.f32  s15, s11, s15
>   vmaxnm.f32  s15, s15, s14
>   vmovs14, r0
>   vmaxnm.f32  s15, s15, s14
>   vmovr0, s15
>   bx  lr
>   .L4:
>   .align  3
>   .L3:
>   .word   1073741824  // 2.0f
>   .word   1073741824
>   .word   1073741824
>   .word   1073741824
>   .word   1084227584  // 5.0f
>   .word   1084227584
>   .word   1084227584
>   .word   1084227584
>   .word   1082130432  // 4.0f
>   .word   1082130432
>   .word   1082130432
>   .word   1082130432
> 
> 2022-01-13  Christophe Lyon  
> 
>   PR target/100757
>   gcc/
>   * config/arm/arm-protos.h (arm_get_mask_mode): New prototype.
>   (arm_expand_vector_compare): Update prototype.
>   * config/arm/arm.c (TARGET_VECTORIZE_GET_MASK_MODE): New.
>   (arm_vector_mode_supported_p): Add support for VxBI modes.
>   (arm_expand_vector_compare): Remove useless generation of vpsel.
>   (arm_expand_vcond): Fix select operands.
>   (arm_get_mask_mode): New.
>   * config/arm/mve.md (vec_cmp): New.
>   (vec_cmpu): New.
>   (vcond_mask_): New.
>   * config/arm/vec-common.md (vec_cmp)
>   (vec_cmpu):
> Move to ...
>   * config/arm/neon.md (vec_cmp)
>   (vec_cmpu): ...
> here
>   and disable for MVE.
>   * doc/sourcebuild.texi (arm_mve): Document new effective-target.
> 
>   gcc/testsuite/
>   * gcc.dg/signbit-2.c: Skip when targeting ARM/MVE.
>   * lib/target-supports.exp (check_effective_target_arm_mve): New.
> 
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index b978adf2038..a84613104b1 100644
> --- a/gcc/config/arm/arm-protos.h
> +++

[PATCH] openmp, nvptx: low-lat memory access traits

2022-01-27 Thread Andrew Stubbs

This patch adjusts the NVPTX low-latency allocator that I have 
previously posted (awaiting re-review). The patch assumes that all my 
previously posted patches are applied already.


Given that any memory allocated from the low-latency memory space cannot 
support the "access=all" allocator trait (because the hardware does not 
provide any means to do so) it seems reasonable that omp_alloc should 
fail (or fall back) when asked to do so. Unfortunately the "all" setting 
is the default when the trait is not specified explicitly, so it must 
also fail in that case also.


This patch implements the restriction accordingly. The validation 
applies only to the NVPTX configuration, so some future implementation 
for another target can do whatever it needs with "access".


Without explicitly saying so, this change means that the 
omp_low_latency_mem_alloc predefined allocator now implies 
"access=pteam" (at least on NVPTX).


OK for stage 1?

Thanks

Andrewopenmp, nvptx: low-lat memory access traits

The NVPTX low latency memory is not accessible outside the team that allocates
it, and therefore should be unavailable for allocators with the access trait
"all".  This change means that the omp_low_lat_mem_alloc predefined
allocator now implicitly implies the "pteam" trait.

libgomp/ChangeLog:

* allocator.c (MEMSPACE_VALIDATE): New macro.
(omp_aligned_alloc): Use MEMSPACE_VALIDATE.
(omp_aligned_calloc): Likewise.
(omp_realloc): Likewise.
* config/nvptx/allocator.c (nvptx_memspace_validate): New function.
(MEMSPACE_VALIDATE): New macro.
* testsuite/libgomp.c/allocators-4.c (main): Add access trait.
* testsuite/libgomp.c/allocators-6.c (main): Add access trait.
* testsuite/libgomp.c/allocators-7.c: New test.

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index b1f41ccc0d4..000ccc2dd9c 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -51,6 +51,9 @@
 #define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
   (PIN ? NULL : free (ADDR))
 #endif
+#ifndef MEMSPACE_VALIDATE
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) 1
+#endif
 
 /* Map the predefined allocators to the correct memory space.
The index to this table is the omp_allocator_handle_t enum value.  */
@@ -279,6 +282,10 @@ retry:
   if (__builtin_add_overflow (size, new_size, _size))
 goto fail;
 
+  if (allocator_data
+  && !MEMSPACE_VALIDATE (allocator_data->memspace, allocator_data->access))
+goto fail;
+
   if (__builtin_expect (allocator_data
&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
 {
@@ -505,6 +512,10 @@ retry:
   if (__builtin_add_overflow (size_temp, new_size, _size))
 goto fail;
 
+  if (allocator_data
+  && !MEMSPACE_VALIDATE (allocator_data->memspace, allocator_data->access))
+goto fail;
+
   if (__builtin_expect (allocator_data
&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
 {
@@ -672,6 +683,10 @@ retry:
 goto fail;
   old_size = data->size;
 
+  if (allocator_data
+  && !MEMSPACE_VALIDATE (allocator_data->memspace, allocator_data->access))
+goto fail;
+
   if (__builtin_expect (allocator_data
&& allocator_data->pool_size < ~(uintptr_t) 0, 0))
 {
diff --git a/libgomp/config/nvptx/allocator.c b/libgomp/config/nvptx/allocator.c
index f740b97f6ac..0102680b717 100644
--- a/libgomp/config/nvptx/allocator.c
+++ b/libgomp/config/nvptx/allocator.c
@@ -358,6 +358,15 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, 
void *addr,
 return realloc (addr, size);
 }
 
+static inline int
+nvptx_memspace_validate (omp_memspace_handle_t memspace, unsigned access)
+{
+  /* Disallow use of low-latency memory when it must be accessible by
+ all threads.  */
+  return (memspace != omp_low_lat_mem_space
+ || access != omp_atv_all);
+}
+
 #define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
   nvptx_memspace_alloc (MEMSPACE, SIZE)
 #define MEMSPACE_CALLOC(MEMSPACE, SIZE, PIN) \
@@ -366,5 +375,7 @@ nvptx_memspace_realloc (omp_memspace_handle_t memspace, 
void *addr,
   nvptx_memspace_realloc (MEMSPACE, ADDR, OLDSIZE, SIZE)
 #define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
   nvptx_memspace_free (MEMSPACE, ADDR, SIZE)
+#define MEMSPACE_VALIDATE(MEMSPACE, ACCESS) \
+  nvptx_memspace_validate (MEMSPACE, ACCESS)
 
 #include "../../allocator.c"
diff --git a/libgomp/testsuite/libgomp.c/allocators-4.c 
b/libgomp/testsuite/libgomp.c/allocators-4.c
index 9fa6aa1624f..cae27ea33c1 100644
--- a/libgomp/testsuite/libgomp.c/allocators-4.c
+++ b/libgomp/testsuite/libgomp.c/allocators-4.c
@@ -23,10 +23,11 @@ main ()
   #pragma omp target
   {
 /* Ensure that the memory we get *is* low-latency with a null-fallback.  */
-omp_alloctrait_t traits[1]
-  = { { omp_atk_fallback, omp_atv_null_fb } };
+omp_alloctrait_t traits[2]
+  = { { omp_atk_fallback, omp_atv_null_fb },
+  { omp_atk_access, omp_atv_pteam } };

RE: [PATCH v3 08/15] arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches

Hi Christophe,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 08/15] arm: Implement auto-vectorized MVE
> comparisons with vectors of boolean predicates
> 
> We make use of qualifier_predicate to describe MVE builtins
> prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins,
> as they are exercised by the tests added earlier in the series.
> 
> Special handling is needed for mve_vpselq because it has a v2di
> variant, which has no natural VPR.P0 representation: we keep HImode
> for it.
> 
> The vector_compare expansion code is updated to use the right VxBI
> mode instead of HI for the result.
> 
> We extend the existing thumb2_movhi_vfp and thumb2_movhi_fp16
> patterns
> to use the new MVE_7_HI iterator which covers HI and the new VxBI
> modes, in conjunction with the new DB constraint for a constant vector
> of booleans.
> 
> 2022-01-13  Christophe Lyon 
>   Richard Sandiford  
> 
>   gcc/
>   PR target/100757
>   PR target/101325
>   * config/arm/arm-builtins.c
> (BINOP_PRED_UNONE_UNONE_QUALIFIERS)
>   (BINOP_PRED_NONE_NONE_QUALIFIERS)
>   (TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS)
>   (TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
>   * config/arm/arm-protos.h (mve_const_bool_vec_to_hi): New.
>   * config/arm/arm.c (arm_hard_regno_mode_ok): Handle new VxBI
>   modes.
>   (arm_mode_to_pred_mode): New.
>   (arm_expand_vector_compare): Use the right VxBI mode instead of
>   HI.
>   (arm_expand_vcond): Likewise.
>   (simd_valid_immediate): Handle MODE_VECTOR_BOOL.
>   (mve_const_bool_vec_to_hi): New.
>   (neon_make_constant): Call mve_const_bool_vec_to_hi when
> needed.
>   * config/arm/arm_mve_builtins.def (vcmpneq_, vcmphiq_,
> vcmpcsq_)
>   (vcmpltq_, vcmpleq_, vcmpgtq_, vcmpgeq_, vcmpeqq_, vcmpneq_f)
>   (vcmpltq_f, vcmpleq_f, vcmpgtq_f, vcmpgeq_f, vcmpeqq_f,
> vpselq_u)
>   (vpselq_s, vpselq_f): Use new predicated qualifiers.
>   * config/arm/constraints.md (DB): New.
>   * config/arm/iterators.md (MVE_7, MVE_7_HI): New mode iterators.
>   (MVE_VPRED, MVE_vpred): New attribute iterators.
>   * config/arm/mve.md (@mve_vcmpq_)
>   (@mve_vcmpq_f,
> @mve_vpselq_)
>   (@mve_vpselq_f): Use MVE_VPRED instead of HI.
>   (@mve_vpselq_v2di): Define separately.
>   (mov): New expander for VxBI modes.
>   * config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16):
> Use
>   MVE_7_HI iterator and add support for DB constraint.
> 
>   gcc/testsuite/
>   PR target/100757
>   PR target/101325
>   * gcc.dg/rtl/arm/mve-vxbi.c: New test.
> 
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 2ccfa37c302..36d71ab1a13 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -420,6 +420,12 @@
> arm_binop_unone_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define BINOP_UNONE_UNONE_UNONE_QUALIFIERS \
>(arm_binop_unone_unone_unone_qualifiers)
> 
> +static enum arm_type_qualifiers
> +arm_binop_pred_unone_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_predicate, qualifier_unsigned, qualifier_unsigned };
> +#define BINOP_PRED_UNONE_UNONE_QUALIFIERS \
> +  (arm_binop_pred_unone_unone_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_binop_unone_none_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_none, qualifier_immediate };
> @@ -438,6 +444,12 @@
> arm_binop_unone_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>  #define BINOP_UNONE_NONE_NONE_QUALIFIERS \
>(arm_binop_unone_none_none_qualifiers)
> 
> +static enum arm_type_qualifiers
> +arm_binop_pred_none_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_predicate, qualifier_none, qualifier_none };
> +#define BINOP_PRED_NONE_NONE_QUALIFIERS \
> +  (arm_binop_pred_none_none_qualifiers)
> +
>  static enum arm_type_qualifiers
>  arm_binop_unone_unone_none_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned, qualifier_none };
> @@ -509,6 +521,12 @@
> arm_ternop_none_none_none_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS
> ]
>  #define TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS \
>(arm_ternop_none_none_none_unone_qualifiers)
> 
> +static enum arm_type_qualifiers
> +arm_ternop_none_none_none_pred_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_none, qualifier_none, qualifier_none, qualifier_predicate };
> +#define TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS \
> +  (arm_ternop_none_none_none_pred_qualifiers)
> +
>  static enum arm_type_qualifiers
> 
> arm_ternop_none_none_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_none, qualifier_none, qualifier_immediate, qualifier_unsigned
> };
> @@ -528,6 +546,13 @@
> arm_ternop_unone_unone_unone_unone_qualifiers[SIMD_MAX_BUILTIN_A
> RGS]
>

RE: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of booleans

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches

Hi Christophe,

> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of
> booleans
> 
> This patch implements support for vectors of booleans to support MVE
> predicates, instead of HImode.  Since the ABI mandates pred16_t (aka
> uint16_t) to represent predicates in intrinsics prototypes, we
> introduce a new "predicate" type qualifier so that we can map relevant
> builtins HImode arguments and return value to the appropriate vector
> of booleans (VxBI).
> 
> We have to update test_vector_ops_duplicate, because it iterates using
> an offset in bytes, where we would need to iterate in bits: we stop
> iterating when we reach the end of the vector of booleans.
> 
> In addition, we have to fix the underlying definition of vectors of
> booleans because ARM/MVE needs a different representation than
> AArch64/SVE. With ARM/MVE the 'true' bit is duplicated over the
> element size, so that a true element of V4BI is represented by
> '0b'.  This patch updates the aarch64 definition of VNx*BI as
> needed.
> 
> 2022-01-13  Christophe Lyon  
>   Richard Sandiford  
> 
>   gcc/
>   PR target/100757
>   PR target/101325
>   * config/aarch64/aarch64-modes.def (VNx16BI, VNx8BI, VNx4BI,
>   VNx2BI): Update definition.
>   * config/arm/arm-builtins.c (arm_init_simd_builtin_types): Add new
>   simd types.
>   (arm_init_builtin): Map predicate vectors arguments to HImode.
>   (arm_expand_builtin_args): Move HImode predicate arguments to
> VxBI
>   rtx. Move return value to HImode rtx.
>   * config/arm/arm-builtins.h (arm_type_qualifiers): Add
> qualifier_predicate.
>   * config/arm/arm-modes.def (B2I, B4I, V16BI, V8BI, V4BI): New
> modes.
>   * config/arm/arm-simd-builtin-types.def (Pred1x16_t,
>   Pred2x8_t,Pred4x4_t): New.
>   * emit-rtl.c (init_emit_once): Handle all boolean modes.
>   * genmodes.c (mode_data): Add boolean field.
>   (blank_mode): Initialize it.
>   (make_complex_modes): Fix handling of boolean modes.
>   (make_vector_modes): Likewise.
>   (VECTOR_BOOL_MODE): Use new COMPONENT parameter.
>   (make_vector_bool_mode): Likewise.
>   (BOOL_MODE): New.
>   (make_bool_mode): New.
>   (emit_insn_modes_h): Fix generation of boolean modes.
>   (emit_class_narrowest_mode): Likewise.
>   * machmode.def: Use new BOOL_MODE instead of
> FRACTIONAL_INT_MODE
>   to define BImode.
>   * rtx-vector-builder.c (rtx_vector_builder::find_cached_value):
>   Fix handling of constm1_rtx for VECTOR_BOOL.
>   * simplify-rtx.c (native_encode_rtx): Fix support for VECTOR_BOOL.
>   (native_decode_vector_rtx): Likewise.
>   (test_vector_ops_duplicate): Skip vec_merge test
>   with vectors of booleans.
>   * varasm.c (output_constant_pool_2): Likewise.

The arm parts look ok. I guess Richard is best placed to approve the midend 
parts, but I see he's on the ChangeLog so maybe he needs others to review them. 
But then again Richard is maintainer of the gen* machinery that's the most 
complicated part of the patch so he can self-approve 
Thanks,
Kyrill

> 
> diff --git a/gcc/config/aarch64/aarch64-modes.def
> b/gcc/config/aarch64/aarch64-modes.def
> index 976bf9b42be..8f399225a80 100644
> --- a/gcc/config/aarch64/aarch64-modes.def
> +++ b/gcc/config/aarch64/aarch64-modes.def
> @@ -47,10 +47,10 @@ ADJUST_FLOAT_FORMAT (HF, _half_format);
> 
>  /* Vector modes.  */
> 
> -VECTOR_BOOL_MODE (VNx16BI, 16, 2);
> -VECTOR_BOOL_MODE (VNx8BI, 8, 2);
> -VECTOR_BOOL_MODE (VNx4BI, 4, 2);
> -VECTOR_BOOL_MODE (VNx2BI, 2, 2);
> +VECTOR_BOOL_MODE (VNx16BI, 16, BI, 2);
> +VECTOR_BOOL_MODE (VNx8BI, 8, BI, 2);
> +VECTOR_BOOL_MODE (VNx4BI, 4, BI, 2);
> +VECTOR_BOOL_MODE (VNx2BI, 2, BI, 2);
> 
>  ADJUST_NUNITS (VNx16BI, aarch64_sve_vg * 8);
>  ADJUST_NUNITS (VNx8BI, aarch64_sve_vg * 4);
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 9c645722230..2ccfa37c302 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -1548,6 +1548,13 @@ arm_init_simd_builtin_types (void)
>arm_simd_types[Bfloat16x4_t].eltype = arm_bf16_type_node;
>arm_simd_types[Bfloat16x8_t].eltype = arm_bf16_type_node;
> 
> +  if (TARGET_HAVE_MVE)
> +{
> +  arm_simd_types[Pred1x16_t].eltype = unsigned_intHI_type_node;
> +  arm_simd_types[Pred2x8_t].eltype = unsigned_intHI_type_node;
> +  arm_simd_types[Pred4x4_t].eltype = unsigned_intHI_type_node;
> +}
> +
>for (i = 0; i < nelts; i++)
>  {
>tree eltype = arm_simd_types[i].eltype;
> @@ -1695,6 +1702,11 @@ arm_init_builtin (unsigned int fcode,
> arm_builtin_datum *d,
>if (qualifiers & qualifier_map_mode)
>   op_mode = d->mode;
> 
> +  /* MVE Predicates use

Re: [PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Iain Buclaw via Gcc-patches

Excerpts from Martin Liška's message of Januar 27, 2022 1:40 pm:
> That's follow up patch based on the discussion with Jakub.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/host-darwin.cc (segv_crash_handler):
>   Do not use leading capital letter.
>   (segv_handler): Likewise.
>   * ipa-sra.cc (verify_splitting_accesses): Likewise.
>   * varasm.cc (get_section): Likewise.
> 
> gcc/d/ChangeLog:
> 
>   * decl.cc (d_finish_decl): Do not use leading capital letter.
> ---
>   gcc/config/rs6000/host-darwin.cc | 4 ++--
>   gcc/d/decl.cc| 2 +-
>   gcc/ipa-sra.cc   | 4 ++--
>   gcc/varasm.cc| 2 +-
>   4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc
> index c7a1e4652f8..5ecc6269780 100644
> --- a/gcc/d/decl.cc
> +++ b/gcc/d/decl.cc
> @@ -1597,7 +1597,7 @@ d_finish_decl (tree decl)
>   {
> tree name = DECL_ASSEMBLER_NAME (decl);
>   
> -   internal_error ("Mismatch between declaration %qE size (%wd) and "
> +   internal_error ("mismatch between declaration %qE size (%wd) and "
> "its initializer size (%wd).",
> IDENTIFIER_PRETTY_NAME (name)
> ? IDENTIFIER_PRETTY_NAME (name) : name,


This part is OK for me.

Iain.

RE: [PATCH v3 04/15] arm: Add GENERAL_AND_VPR_REGS regclass

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Christophe
> Lyon via Gcc-patches
> Sent: Thursday, January 13, 2022 2:56 PM
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH v3 04/15] arm: Add GENERAL_AND_VPR_REGS regclass
> 
> At some point during the development of this patch series, it appeared
> that in some cases the register allocator wants “VPR or general”
> rather than “VPR or general or FP” (which is the same thing as
> ALL_REGS).  The series does not seem to require this anymore, but it
> seems to be a good thing to do anyway, to give the register allocator
> more freedom.
> 
> CLASS_MAX_NREGS and arm_hard_regno_nregs need adjustment to avoid a
> regression in gcc.dg/stack-usage-1.c when compiled with -mthumb
> -mfloat-abi=hard -march=armv8.1-m.main+mve.fp+fp.dp.

Given the discussions I've seen on this patch (thanks Andre and Richard) this 
is ok.
Though please rebase this as we've since renamed arm.c to arm.cc

Thanks,
Kyrill

> 
> 2022-01-13  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm.h (reg_class): Add GENERAL_AND_VPR_REGS.
>   (REG_CLASS_NAMES): Likewise.
>   (REG_CLASS_CONTENTS): Likewise.
>   (CLASS_MAX_NREGS): Handle VPR.
>   * config/arm/arm.c (arm_hard_regno_nregs): Handle VPR.
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index bb75921f32d..c3559ca8703 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25287,6 +25287,9 @@ thumb2_asm_output_opcode (FILE * stream)
>  static unsigned int
>  arm_hard_regno_nregs (unsigned int regno, machine_mode mode)
>  {
> +  if (IS_VPR_REGNUM (regno))
> +return CEIL (GET_MODE_SIZE (mode), 2);
> +
>if (TARGET_32BIT
>&& regno > PC_REGNUM
>&& regno != FRAME_POINTER_REGNUM
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index dacce2b7f08..2416fb5ef64 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -1287,6 +1287,7 @@ enum reg_class
>SFP_REG,
>AFP_REG,
>VPR_REG,
> +  GENERAL_AND_VPR_REGS,
>ALL_REGS,
>LIM_REG_CLASSES
>  };
> @@ -1316,6 +1317,7 @@ enum reg_class
>"SFP_REG", \
>"AFP_REG", \
>"VPR_REG", \
> +  "GENERAL_AND_VPR_REGS", \
>"ALL_REGS" \
>  }
> 
> @@ -1344,6 +1346,7 @@ enum reg_class
>{ 0x, 0x, 0x, 0x0040 }, /* SFP_REG */
>   \
>{ 0x, 0x, 0x, 0x0080 }, /* AFP_REG */
>   \
>{ 0x, 0x, 0x, 0x0400 }, /* VPR_REG.  */
>   \
> +  { 0x5FFF, 0x, 0x, 0x0400 }, /*
> GENERAL_AND_VPR_REGS.  */ \
>{ 0x7FFF, 0x, 0x, 0x000F }  /* ALL_REGS.  */   
> \
>  }
> 
> @@ -1453,7 +1456,9 @@ extern const char
> *fp_sysreg_names[NB_FP_SYSREGS];
> ARM regs are UNITS_PER_WORD bits.
> FIXME: Is this true for iWMMX?  */
>  #define CLASS_MAX_NREGS(CLASS, MODE)  \
> -  (ARM_NUM_REGS (MODE))
> +  (CLASS == VPR_REG)   \
> +  ? CEIL (GET_MODE_SIZE (MODE), 2)\
> +  : (ARM_NUM_REGS (MODE))
> 
>  /* If defined, gives a class of registers that cannot be used as the
> operand of a SUBREG that changes the mode of the object illegally.  */
> --
> 2.25.1

Re: [PATCH] contrib: Put gcc-descr and gcc-undescr to file.

2022-01-27 Thread Jonathan Wakely via Gcc-patches

On Thu, 27 Jan 2022, 15:06 Martin Liška,  wrote:

> Hello.
>
> I've finished Martin's work and put the 2 aliases into files. The
> git-undescr.sh is basically
> unchanged, while I added better option parsing for git-descr.sh script so
> that it supports:
>
> $ git gcc-descr
> r12-6895-g14f339894db6ca
>
> $ git gcc-descr HEAD~10
> r12-6886-geaa59070343326
>
> $ git gcc-descr HEAD~10 --long
> r12-6886-geaa5907034332649c918f0579da805b6e786aa47
>
> $ git gcc-descr --short HEAD~10 --long
> r12-6886
>
> $ git gcc-descr --short --short --long HEAD~10
> r12-6886
>
> Ready to be installed?
>

> +expr match ${r:-no} '^r[0-9]\+$' >/dev/null && r=${r}-0-g$(git
rev-parse $c);


Please remove the ^ from the 'expr' regex. The regex for an expr match can
only match at the beginning, so the ^ is implicit, but some implementations
ignore it (e.g. on Linux) and others march it as a normal character (e.g.
macOS).

Re: [PATCH] contrib: Put gcc-descr and gcc-undescr to file.

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 04:06:05PM +0100, Martin Liška wrote:
> I've finished Martin's work and put the 2 aliases into files. The 
> git-undescr.sh is basically
> unchanged, while I added better option parsing for git-descr.sh script so 
> that it supports:
> 
> $ git gcc-descr
> r12-6895-g14f339894db6ca
> 
> $ git gcc-descr HEAD~10
> r12-6886-geaa59070343326
> 
> $ git gcc-descr HEAD~10 --long
> r12-6886-geaa5907034332649c918f0579da805b6e786aa47
> 
> $ git gcc-descr --short HEAD~10 --long
> r12-6886
> 
> $ git gcc-descr --short --short --long HEAD~10
> r12-6886
> 
> Ready to be installed?

For backwards compatibility, I'd prefer --full to be an alias to --long,
and maybe the --short handling should short=yes; long=no and
similarly --long/--full handling should long=yes; short=no
so that --short --long is --long and --long --short is --short.

Otherwise LGTM.

> From feb3f83724cd0764f7ad3fbd1504c0d43266c88a Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Thu, 27 Jan 2022 16:01:55 +0100
> Subject: [PATCH] contrib: Put gcc-descr and gcc-undescr to file.
> 
> contrib/ChangeLog:
> 
>   * git-descr.sh: New file.
>   * git-undescr.sh: New file.
>   Support optional arguments --long, --short and default
>   to 14 characters of git hash.
> 
> contrib/ChangeLog:
> 
>   * gcc-git-customization.sh: Use the created files.
>   * git-descr.sh: New file.
>   * git-undescr.sh: New file.
> 
> Co-Authored-By: Martin Jambor 
> ---
>  contrib/gcc-git-customization.sh |  4 ++--
>  contrib/git-descr.sh | 37 
>  contrib/git-undescr.sh   | 13 +++
>  3 files changed, 52 insertions(+), 2 deletions(-)
>  create mode 100755 contrib/git-descr.sh
>  create mode 100755 contrib/git-undescr.sh
> 
> diff --git a/contrib/gcc-git-customization.sh 
> b/contrib/gcc-git-customization.sh
> index 2eec17937af..b24948d9874 100755
> --- a/contrib/gcc-git-customization.sh
> +++ b/contrib/gcc-git-customization.sh
> @@ -22,8 +22,8 @@ git config alias.svn-rev '!f() { rev=$1; shift; git log 
> --all --grep="^From-SVN:
>  
>  # Add git commands to convert git commit to monotonically increasing 
> revision number
>  # and vice versa
> -git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then 
> c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 
> 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); 
> expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse 
> \${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 
> 'basepoints/gcc-[0-9]*' \$c | sed -n 
> 's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p');
>  fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); 
> rr=\$(echo \$r | sed -n 
> 's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git 
> rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then 
> m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c 
> \${o:-origin}/\$m && \echo \${r}; fi; }; f"
> -git config alias.gcc-undescr \!"f() { o=\$(git config --get 
> gcc-config.upstream); r=\$(echo \$1 | sed -n 
> 's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 
> 's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && 
> exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); 
> test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); 
> p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 
> 's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p');
>  git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
> +git config alias.gcc-descr '!f() { "`git rev-parse 
> --show-toplevel`/contrib/git-descr.sh" $@; } ; f'
> +git config alias.gcc-undescr '!f() { "`git rev-parse 
> --show-toplevel`/contrib/git-undescr.sh" $@; } ; f'
>  
>  git config alias.gcc-verify '!f() { "`git rev-parse 
> --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'
>  git config alias.gcc-backport '!f() { "`git rev-parse 
> --show-toplevel`/contrib/git-backport.py" $@; } ; f'
> diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
> new file mode 100755
> index 000..0d3e42940b3
> --- /dev/null
> +++ b/contrib/git-descr.sh
> @@ -0,0 +1,37 @@
> +#!/bin/sh
> +
> +# Script to describe a GCC revision based on git hash
> +
> +short=no
> +long=no
> +c=master
> +
> +for arg in "$@"
> +do
> +case "$arg" in
> +  --short) short=yes
> + ;;
> +  --long) long=yes
> + ;;
> +  *) c=$arg
> +esac
> +done
> +
> +if test x$short = xyes; then
> +r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 
> 's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
> +elif test

Re: [PATCH] c++: constrained partial spec using qualified name [PR92944]

2022-01-27 Thread Jason Merrill via Gcc-patches


On 1/26/22 12:49, Patrick Palka wrote:

In the nested_name_specifier branch within cp_parser_class_head, we need
to update TYPE with the result of maybe_process_partial_specialization
just like we do in the template_id_p branch.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/92944

gcc/cp/ChangeLog:

* parser.cc (cp_parser_class_head): Update TYPE with the result
of maybe_process_partial_specialization in the
nested_name_specifier branch too.  Refactor nearby code to
accomodate that maybe_process_partial_specialization returns a
_TYPE, not a TYPE_DECL, and eliminate local variable CLASS_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-partial-spec10.C: New test.
---
  gcc/cp/parser.cc   | 18 +++---
  .../g++.dg/cpp2a/concepts-partial-spec10.C |  6 ++
  2 files changed, 13 insertions(+), 11 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index ed219d79dc9..aa1768ffab1 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -26534,7 +26534,7 @@ cp_parser_class_head (cp_parser* parser,
  }
else if (nested_name_specifier)
  {
-  tree class_type;
+  type = TREE_TYPE (type);
  
/* Given:
  
@@ -26544,31 +26544,27 @@ cp_parser_class_head (cp_parser* parser,

 we will get a TYPENAME_TYPE when processing the definition of
 `S::T'.  We need to resolve it to the actual type before we
 try to define it.  */
-  if (TREE_CODE (TREE_TYPE (type)) == TYPENAME_TYPE)
+  if (TREE_CODE (type) == TYPENAME_TYPE)
{
- class_type = resolve_typename_type (TREE_TYPE (type),
- /*only_current_p=*/false);
- if (TREE_CODE (class_type) != TYPENAME_TYPE)
-   type = TYPE_NAME (class_type);
- else
+ type = resolve_typename_type (type, /*only_current_p=*/false);
+ if (TREE_CODE (type) == TYPENAME_TYPE)
{
  cp_parser_error (parser, "could not resolve typename type");
  type = error_mark_node;
}
}
  
-  if (maybe_process_partial_specialization (TREE_TYPE (type))

- == error_mark_node)
+  type = maybe_process_partial_specialization (type);
+  if (type == error_mark_node)
{
  type = NULL_TREE;
  goto done;
}
  
-  class_type = current_class_type;

/* Enter the scope indicated by the nested-name-specifier.  */
pushed_scope = push_scope (nested_name_specifier);
/* Get the canonical version of this type.  */
-  type = TYPE_MAIN_DECL (TREE_TYPE (type));
+  type = TYPE_MAIN_DECL (type);
/* Call push_template_decl if it seems like we should be defining a
 template either from the template headers or the type we're
 defining, so that we diagnose both extra and missing headers.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C
new file mode 100644
index 000..8504a055ac5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec10.C
@@ -0,0 +1,6 @@
+// PR c++/92944
+// { dg-do compile { target c++20 } }
+
+namespace ns { template struct A { }; }
+template requires true struct ns::A { };
+template requires false struct ns::A { };

Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]

2022-01-27 Thread Jason Merrill via Gcc-patches


On 1/27/22 09:02, Patrick Palka wrote:

On Wed, 26 Jan 2022, Patrick Palka wrote:


On Wed, 26 Jan 2022, Patrick Palka wrote:


On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill  wrote:


On 1/19/22 11:15, Patrick Palka wrote:

Here we're emitting a bogus error during ahead of time evaluation of a
non-dependent immediate member function call such as a.f(args) because
the defacto templated form for such a call is (a.f)(args) but we're
trying to evaluate it using the intermediate CALL_EXPR built by
build_over_call, which has the non-member form f(a, args).  The defacto
member form is built in build_new_method_call, so it seems we should
handle the immediate call there instead.


Hmm, there's already a bunch of code in build_over_call to try to fix up
the object argument, and there seem to be many places other than
build_new_method_call that call build_over_call for member functions; I
think it's probably better to build the needed COMPONENT_REF in
build_over_call.


Ah, but in build_over_call the arguments (including the implicit
object argument) are potentially wrapped in a NON_DEPENDENT_EXPR.  So
even if we built a COMPONENT_REF in build_over_call, we'd  have to
keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in
terms of the original arguments in build_new_method_call, IIUC.

On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a
problem for non-member functions too, because the constexpr engine
punts on them:

   struct fixed_string { };

   static consteval void size(fixed_string) { }

   template
   void VerifyHash(fixed_string s) {
 size(s);  // error: expression ‘s’ is not a constant expression
(because it's wrapped in NON_DEPENDENT_EXPR)
   }

I wonder if this means we should be evaluating non-dependent
non-member immediate calls elsewhere, e.g. in finish_call_expr?  Or
perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr
evaluation?


Actually, for that particular example, we probably should just avoid
wrapping PARM_DECL in a NON_DEPENDENT_EXPR...


Here's a patch that makes build_over_call build the COMPONENT_REF-using
form for non-dependent member calls, and in passing makes us avoid
wrapping PARM_DECL in a NON_DEPENDENT_EXPR.

-- >8 --

Subject: [PATCH] c++: non-dependent immediate member fn call [PR99895]

Here we're emitting a bogus error during ahead of time evaluation of a
non-dependent immediate member function call such as a.f(args) because
the defacto templated form for such a call is (a.f)(args) but we're
trying to evaluate it using the intermediate CALL_EXPR built by
build_over_call, which has the non-member form f(a, args).  The defacto
member form is built in build_new_method_call, so it seems we should
handle the immediate call there instead, or perhaps make build_over_call
build the correct form in the first place.

Giiven that there are many spots other than build_new_method_call that
call build_over_call for member functions, this patch takes the latter
approach.

In passing, this patch makes us avoid wrapping PARM_DECL in
NON_DEPENDENT_EXPR for benefit of the third testcase below.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/99895

gcc/cp/ChangeLog:

* call.cc (build_over_call): For a non-dependent member call,
build up a CALL_EXPR using a COMPONENT_REF callee, as in
build_new_method_call.
* pt.cc (build_non_dependent_expr): Don't wrap PARM_DECL either.
* tree.cc (build_min_non_dep_op_overload): Adjust accordingly
after the build_over_call change.

gcc/ChangeLog:

* tree.cc (build_call_vec): Add const to second parameter.
* tree.h (build_call_vec): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/consteval-memfn1.C: New test.
* g++.dg/cpp2a/consteval-memfn2.C: New test.
* g++.dg/cpp2a/consteval28.C: New test.
---
  gcc/cp/call.cc| 38 +++
  gcc/cp/pt.cc  |  6 ++-
  gcc/cp/tree.cc|  5 ++-
  gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 27 +
  gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +
  gcc/testsuite/g++.dg/cpp2a/consteval28.C  | 10 +
  gcc/tree.cc   |  2 +-
  gcc/tree.h|  2 +-
  8 files changed, 94 insertions(+), 30 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval28.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index bc157cdd1fb..b2e89c5d783 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9204,11 +9204,6 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
   errors will be deferred until the template is instantiated.  */
if (processing_template_decl)
  {
-

[PATCH] contrib: Put gcc-descr and gcc-undescr to file.

2022-01-27 Thread Martin Liška


Hello.

I've finished Martin's work and put the 2 aliases into files. The 
git-undescr.sh is basically
unchanged, while I added better option parsing for git-descr.sh script so that 
it supports:

$ git gcc-descr
r12-6895-g14f339894db6ca

$ git gcc-descr HEAD~10
r12-6886-geaa59070343326

$ git gcc-descr HEAD~10 --long
r12-6886-geaa5907034332649c918f0579da805b6e786aa47

$ git gcc-descr --short HEAD~10 --long
r12-6886

$ git gcc-descr --short --short --long HEAD~10
r12-6886

Ready to be installed?
Thanks,
MartinFrom feb3f83724cd0764f7ad3fbd1504c0d43266c88a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 27 Jan 2022 16:01:55 +0100
Subject: [PATCH] contrib: Put gcc-descr and gcc-undescr to file.

contrib/ChangeLog:

	* git-descr.sh: New file.
	* git-undescr.sh: New file.
	Support optional arguments --long, --short and default
	to 14 characters of git hash.

contrib/ChangeLog:

	* gcc-git-customization.sh: Use the created files.
	* git-descr.sh: New file.
	* git-undescr.sh: New file.

Co-Authored-By: Martin Jambor 
---
 contrib/gcc-git-customization.sh |  4 ++--
 contrib/git-descr.sh | 37 
 contrib/git-undescr.sh   | 13 +++
 3 files changed, 52 insertions(+), 2 deletions(-)
 create mode 100755 contrib/git-descr.sh
 create mode 100755 contrib/git-undescr.sh

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 2eec17937af..b24948d9874 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -22,8 +22,8 @@ git config alias.svn-rev '!f() { rev=$1; shift; git log --all --grep="^From-SVN:
 
 # Add git commands to convert git commit to monotonically increasing revision number
 # and vice versa
-git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse \${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p'); fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); rr=\$(echo \$r | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c \${o:-origin}/\$m && \echo \${r}; fi; }; f"
-git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream); r=\$(echo \$1 | sed -n 's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p'); git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
+git config alias.gcc-descr '!f() { "`git rev-parse --show-toplevel`/contrib/git-descr.sh" $@; } ; f'
+git config alias.gcc-undescr '!f() { "`git rev-parse --show-toplevel`/contrib/git-undescr.sh" $@; } ; f'
 
 git config alias.gcc-verify '!f() { "`git rev-parse --show-toplevel`/contrib/gcc-changelog/git_check_commit.py" $@; } ; f'
 git config alias.gcc-backport '!f() { "`git rev-parse --show-toplevel`/contrib/git-backport.py" $@; } ; f'
diff --git a/contrib/git-descr.sh b/contrib/git-descr.sh
new file mode 100755
index 000..0d3e42940b3
--- /dev/null
+++ b/contrib/git-descr.sh
@@ -0,0 +1,37 @@
+#!/bin/sh
+
+# Script to describe a GCC revision based on git hash
+
+short=no
+long=no
+c=master
+
+for arg in "$@"
+do
+case "$arg" in
+  --short) short=yes
+	;;
+  --long) long=yes
+	;;
+  *) c=$arg
+esac
+done
+
+if test x$short = xyes; then
+r=$(git describe --all --match 'basepoints/gcc-[0-9]*' $c | sed -n 's,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)-\([0-9]\+\)-g[0-9a-f]*$,r\2-\3,p;s,^\(tags/\)\?basepoints/gcc-\([0-9]\+\)$,r\2-0,p');
+elif test x$long = xyes; then
+r=$(git describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' $c | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p')
+else
+r=$(git describe --all --abbrev=14 --match 'basepoints/gcc-[0-9]*' $c | sed -n 's,^\(tags/\)\?basepoints/gcc-,r,p');
+expr match ${r:-no} '^r[0-9]\+$' >/dev/null && r=${r}-0-g$(git rev-parse $c);
+fi;
+if test -n $r; then
+o=$(git config --get gcc-config.upstream);
+rr=$(echo $r | sed -n 's,^r\([0-9]\+\)-[0-9]\+\(-g[0-9a-f]\+\)\?$,\1,p');
+if git rev-parse --verify --quiet ${o:-origin}/releases/gcc-$rr >/dev/null; then
+	m=releases/gcc-$rr;
+else
+

Re: [PATCH] libstdc++: fix typo in acinclude.m4.

2022-01-27 Thread Martin Liška


On 1/27/22 15:25, Jonathan Wakely wrote:

I have the same patch ready to push, but I'm away from the computer. Please 
push.


;) I've just pushed that to master.

Martin

Re: [committed] libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]

2022-01-27 Thread Jonathan Wakely via Gcc-patches

On Wed, 26 Jan 2022, 22:12 Jonathan Wakely via Libstdc++, <
libstd...@gcc.gnu.org> wrote:

> On Wed, 26 Jan 2022 at 22:08, Dimitar Dimitrov  wrote:
> >
> > On Tue, Jan 25, 2022 at 09:09:51PM +, Jonathan Wakely via
> Gcc-patches wrote:
> > > Tested x86_64-linux, pushed to trunk. Backports to follow.
> > >
> > >
> > > This adds a new internal flag to the filesystem::directory_iterator
> > > constructor that makes it fail if the path is a symlink that resolves
> to
> > > a directory. This prevents filesystem::remove_all from following a
> > > symlink to a directory, rather than deleting the symlink itself.
> > >
> > > We can also use that new flag in recursive_directory_iterator to ensure
> > > that we don't follow symlinks if the follow_directory_symlink option is
> > > not set.
> > >
> > > This also moves an error check in filesystem::remove_all after the
> while
> > > loop, so that errors from the directory_iterator constructor are
> > > reproted, instead of continuing to the filesystem::remove call below.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >   PR libstdc++/104161
> > >   * acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
> > >   fdopendir.
> > >   * config.h.in: Regenerate.
> > >   * configure: Regenerate.
> > >   * src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor
> > >   and pass it to base class constructor.
> > >   (directory_iterator): Pass nofollow flag to _Dir constructor.
> > >   (fs::recursive_directory_iterator::increment): Likewise.
> > >   * src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for
> > >   directory_iterator constructor. Move error check outside loop.
> > >   * src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to
> > >   constructor and when it's set use ::open with O_NOFOLLOW and
> > >   O_DIRECTORY.
> > >   * src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor
> > >   and pass it to base class constructor.
> > >   (directory_iterator): Pass nofollow flag to _Dir constructor.
> > >   (fs::recursive_directory_iterator::increment): Likewise.
> > >   * src/filesystem/ops.cc (remove_all): Use nofollow option for
> > >   directory_iterator constructor. Move error check outside loop.
> > > ---
> > >  libstdc++-v3/acinclude.m4| 12 ++
> > >  libstdc++-v3/config.h.in |  3 ++
> > >  libstdc++-v3/configure   | 55 
> > >  libstdc++-v3/src/c++17/fs_dir.cc | 13 --
> > >  libstdc++-v3/src/c++17/fs_ops.cc | 12 +++---
> > >  libstdc++-v3/src/filesystem/dir-common.h | 48 -
> > >  libstdc++-v3/src/filesystem/dir.cc   | 13 --
> > >  libstdc++-v3/src/filesystem/ops.cc   |  6 +--
> > >  8 files changed, 134 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> > > index d996477254c..7b6b807114a 100644
> > > --- a/libstdc++-v3/acinclude.m4
> > > +++ b/libstdc++-v3/acinclude.m4
> > > @@ -4735,6 +4735,18 @@ dnl
> > >if test $glibcxx_cv_truncate = yes; then
> > >  AC_DEFINE(HAVE_TRUNCATE, 1, [Define if truncate is available in
> .])
> > >fi
> > > +dnl
> > > +  AC_CACHE_CHECK([for fdopendir],
> > > +glibcxx_cv_fdopendir, [dnl
> > > +GCC_TRY_COMPILE_OR_LINK(
> > > +  [#include ],
> > > +  [::fdopendir(1);],
> > > +  [glibcxx_cv_fdopendir=yes],
> > > +  [glibcxx_cv_fdopendir=no])
> > > +  ])
> > > +  if test $glibcxx_cv_truncate = yes; then
> >
> > This is a typo. Should check glibcxx_cv_fdopendir.
>
> Oops, thanks! Copy
>

Martin Liška is fixing this now (thanks!)

Re: [PATCH] libstdc++: fix typo in acinclude.m4.

2022-01-27 Thread Jonathan Wakely via Gcc-patches

On Thu, 27 Jan 2022, 13:49 Martin Liška,  wrote:

> Hi.
>
> May I install an obvious typo that causes build error (described in the
> PR).
>
> Ready to be installed?
>

I have the same patch ready to push, but I'm away from the computer. Please
push.



Thanks,
> Martin
>
> PR libstdc++/104259
>
> libstdc++-v3/ChangeLog:
>
> * acinclude.m4: Fix typo.
> * configure: Regenerate.
> ---
>   libstdc++-v3/acinclude.m4 | 2 +-
>   libstdc++-v3/configure| 2 +-
>   2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index 7b6b807114a..a93252076b9 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -4744,7 +4744,7 @@ dnl
> [glibcxx_cv_fdopendir=yes],
> [glibcxx_cv_fdopendir=no])
> ])
> -  if test $glibcxx_cv_truncate = yes; then
> +  if test $glibcxx_cv_fdopendir = yes; then
>   AC_DEFINE(HAVE_FDOPENDIR, 1, [Define if fdopendir is available in
> .])
> fi
>   dnl
> diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> index 5c6e51f5c11..2b2f85782b1 100755
> --- a/libstdc++-v3/configure
> +++ b/libstdc++-v3/configure
> @@ -77080,7 +77080,7 @@ fi
>   fi
>   { $as_echo "$as_me:${as_lineno-$LINENO}: result: $glibcxx_cv_fdopendir"
> >&5
>   $as_echo "$glibcxx_cv_fdopendir" >&6; }
> -  if test $glibcxx_cv_truncate = yes; then
> +  if test $glibcxx_cv_fdopendir = yes; then
>
>   $as_echo "#define HAVE_FDOPENDIR 1" >>confdefs.h
>
> --
> 2.34.1
>
>

Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-27 Thread Richard Biener via Gcc-patches

On Thu, Jan 27, 2022 at 2:59 PM Maciej W. Rozycki  wrote:
>
> On Thu, 27 Jan 2022, Richard Biener wrote:
>
> > > Index: gcc/gcc/c/c-typeck.cc
> > > ===
> > > --- gcc.orig/gcc/c/c-typeck.cc
> > > +++ gcc/gcc/c/c-typeck.cc
> > > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> > >  #include "gomp-constants.h"
> > >  #include "spellcheck-tree.h"
> > >  #include "gcc-rich-location.h"
> > > +#include "optabs-query.h"
> > >  #include "stringpool.h"
> > >  #include "attribs.h"
> > >  #include "asan.h"
> > > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> > >bool maybe_const = true;
> > >tree sc;
> > >sc = c_fully_fold (op0, false, _const);
> > > -  sc = save_expr (sc);
> > > + if (optab_handler (vec_duplicate_optab,
> > > +TYPE_MODE (type1)) == CODE_FOR_nothing)
> > > +   sc = save_expr (sc);
> >
> > This doesn't make much sense - I suppose the CONSTRUCTOR retains
> > TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> > and thus should have been cleared during gimplification or in the end
> > ignored by RTL expansion.
>
>  This is how the expression built here eventually looks in
> `store_constructor':
>
> (gdb) print exp
> $41 = 
> (gdb) pt
>   type  type  size 
> unit-size 
> align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x75cf1260 precision:32
> pointer_to_this >
> sizes-gimplified V4SF
> size 
> unit-size 
> align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x75d19648 nunits:4 context  v4sf-dup.c>>
> side-effects length:4
> val 
> visited var 
> def_stmt GIMPLE_NOP
> version:2>
> val 
> visited var 
> def_stmt GIMPLE_NOP
> version:2>
> val 
> visited var 
> def_stmt GIMPLE_NOP
> version:2>
> val 
> visited var 
> def_stmt GIMPLE_NOP
> version:2>>
> (gdb)
>
> The `side-effects' flag prevents this conditional from executing:
>
> /* Try using vec_duplicate_optab for uniform vectors.  */
> if (!TREE_SIDE_EFFECTS (exp)
> && VECTOR_MODE_P (mode)
> && eltmode == GET_MODE_INNER (mode)
> && ((icode = optab_handler (vec_duplicate_optab, mode))
> != CODE_FOR_nothing)
> && (elt = uniform_vector_p (exp))
> && !VECTOR_TYPE_P (TREE_TYPE (elt)))
>   {
> class expand_operand ops[2];
> create_output_operand ([0], target, mode);
> create_input_operand ([1], expand_normal (elt), eltmode);
> expand_insn (icode, 2, ops);
> if (!rtx_equal_p (target, ops[0].value))
>   emit_move_insn (target, ops[0].value);
> break;
>   }
>
> I don't know what's supposed to clear the flag (and what the purpose of
> setting it in the first place would be then).

It's probably safe to remove the !TREE_SIDE_EFFECTS check above
but already gimplification should have made sure all side-effects are
pushed to separate stmts.  gimplifiation usually calls recompute_side_effects
but that doesn't seem to touch CONSTRUCTORs.  But I do remember fixing
some spurious TREE_SIDE_EFFECTS on CTORs before.

Might be worth verifying in verify_gimple_assign_single that CTORs
do not have TREE_SIDE_EFFECTS set (unless this is a clobber).

>
> > You do not add a testcase so it's hard to see what goes wrong where exactly.
>
>  This only happens with an out-of-tree microarchitecture and I was unable
> to find an in-tree target that ever uses the `vec_duplicateM' standard RTL
> pattern, so I wasn't able to produce a test case that would trigger with
> upstream code.  Code is essentially like:
>
> typedef float v4sf __attribute__ ((vector_size (16)));
>
> v4sf
> odd_even (v4sf x, float y)
> {
>   return x + f;
> }
>
>  I considered the Aarch64 one the most likely candidate as it is the one
> commit be4c1d4a42c5 has been for, but as I noted in the description it
> does not appear to use it either.  It uses `vec_initMN' instead and does
> the broadcast by hand in the backend.
>
>  Maybe I'm missing something.
>
>   Maciej

Re: [PATCH] c++: non-dependent immediate member fn call [PR99895]

2022-01-27 Thread Patrick Palka via Gcc-patches

On Wed, 26 Jan 2022, Patrick Palka wrote:

> On Wed, 26 Jan 2022, Patrick Palka wrote:
> 
> > On Wed, Jan 19, 2022 at 2:19 PM Jason Merrill  wrote:
> > >
> > > On 1/19/22 11:15, Patrick Palka wrote:
> > > > Here we're emitting a bogus error during ahead of time evaluation of a
> > > > non-dependent immediate member function call such as a.f(args) because
> > > > the defacto templated form for such a call is (a.f)(args) but we're
> > > > trying to evaluate it using the intermediate CALL_EXPR built by
> > > > build_over_call, which has the non-member form f(a, args).  The defacto
> > > > member form is built in build_new_method_call, so it seems we should
> > > > handle the immediate call there instead.
> > >
> > > Hmm, there's already a bunch of code in build_over_call to try to fix up
> > > the object argument, and there seem to be many places other than
> > > build_new_method_call that call build_over_call for member functions; I
> > > think it's probably better to build the needed COMPONENT_REF in
> > > build_over_call.
> > 
> > Ah, but in build_over_call the arguments (including the implicit
> > object argument) are potentially wrapped in a NON_DEPENDENT_EXPR.  So
> > even if we built a COMPONENT_REF in build_over_call, we'd  have to
> > keep rebuilding the entire CALL_EXPR (including the COMPONENT_REF) in
> > terms of the original arguments in build_new_method_call, IIUC.
> > 
> > On a related note, the NON_DEPENDENT_EXPR wrapping seems to be a
> > problem for non-member functions too, because the constexpr engine
> > punts on them:
> > 
> >   struct fixed_string { };
> > 
> >   static consteval void size(fixed_string) { }
> > 
> >   template
> >   void VerifyHash(fixed_string s) {
> > size(s);  // error: expression ‘s’ is not a constant expression
> > (because it's wrapped in NON_DEPENDENT_EXPR)
> >   }
> > 
> > I wonder if this means we should be evaluating non-dependent
> > non-member immediate calls elsewhere, e.g. in finish_call_expr?  Or
> > perhaps we should stop punting on NON_DEPENDENT_EXPR during constexpr
> > evaluation?
> 
> Actually, for that particular example, we probably should just avoid
> wrapping PARM_DECL in a NON_DEPENDENT_EXPR...

Here's a patch that makes build_over_call build the COMPONENT_REF-using
form for non-dependent member calls, and in passing makes us avoid
wrapping PARM_DECL in a NON_DEPENDENT_EXPR.

-- >8 --

Subject: [PATCH] c++: non-dependent immediate member fn call [PR99895]

Here we're emitting a bogus error during ahead of time evaluation of a
non-dependent immediate member function call such as a.f(args) because
the defacto templated form for such a call is (a.f)(args) but we're
trying to evaluate it using the intermediate CALL_EXPR built by
build_over_call, which has the non-member form f(a, args).  The defacto
member form is built in build_new_method_call, so it seems we should
handle the immediate call there instead, or perhaps make build_over_call
build the correct form in the first place.

Giiven that there are many spots other than build_new_method_call that
call build_over_call for member functions, this patch takes the latter
approach.

In passing, this patch makes us avoid wrapping PARM_DECL in
NON_DEPENDENT_EXPR for benefit of the third testcase below.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/99895

gcc/cp/ChangeLog:

* call.cc (build_over_call): For a non-dependent member call,
build up a CALL_EXPR using a COMPONENT_REF callee, as in
build_new_method_call.
* pt.cc (build_non_dependent_expr): Don't wrap PARM_DECL either.
* tree.cc (build_min_non_dep_op_overload): Adjust accordingly
after the build_over_call change.

gcc/ChangeLog:

* tree.cc (build_call_vec): Add const to second parameter.
* tree.h (build_call_vec): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/consteval-memfn1.C: New test.
* g++.dg/cpp2a/consteval-memfn2.C: New test.
* g++.dg/cpp2a/consteval28.C: New test.
---
 gcc/cp/call.cc| 38 +++
 gcc/cp/pt.cc  |  6 ++-
 gcc/cp/tree.cc|  5 ++-
 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C | 27 +
 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C | 34 +
 gcc/testsuite/g++.dg/cpp2a/consteval28.C  | 10 +
 gcc/tree.cc   |  2 +-
 gcc/tree.h|  2 +-
 8 files changed, 94 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval-memfn2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval28.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index bc157cdd1fb..b2e89c5d783 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9204,11 +9204,6 @@ build_over_call (struct z_candidate *cand, int flags,

Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-27 Thread Maciej W. Rozycki

On Thu, 27 Jan 2022, Richard Biener wrote:

> > Index: gcc/gcc/c/c-typeck.cc
> > ===
> > --- gcc.orig/gcc/c/c-typeck.cc
> > +++ gcc/gcc/c/c-typeck.cc
> > @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
> >  #include "gomp-constants.h"
> >  #include "spellcheck-tree.h"
> >  #include "gcc-rich-location.h"
> > +#include "optabs-query.h"
> >  #include "stringpool.h"
> >  #include "attribs.h"
> >  #include "asan.h"
> > @@ -11923,7 +11924,9 @@ build_binary_op (location_t location, en
> >bool maybe_const = true;
> >tree sc;
> >sc = c_fully_fold (op0, false, _const);
> > -  sc = save_expr (sc);
> > + if (optab_handler (vec_duplicate_optab,
> > +TYPE_MODE (type1)) == CODE_FOR_nothing)
> > +   sc = save_expr (sc);
> 
> This doesn't make much sense - I suppose the CONSTRUCTOR retains
> TREE_SIDE_EFFECTS but such flag has no meaning on GIMPLE
> and thus should have been cleared during gimplification or in the end
> ignored by RTL expansion.

 This is how the expression built here eventually looks in 
`store_constructor':

(gdb) print exp
$41 = 
(gdb) pt
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x75cf1260 precision:32
pointer_to_this >
sizes-gimplified V4SF
size 
unit-size 
align:128 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
0x75d19648 nunits:4 context >
side-effects length:4
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>
val 
visited var 
def_stmt GIMPLE_NOP
version:2>>
(gdb)

The `side-effects' flag prevents this conditional from executing:

/* Try using vec_duplicate_optab for uniform vectors.  */
if (!TREE_SIDE_EFFECTS (exp)
&& VECTOR_MODE_P (mode)
&& eltmode == GET_MODE_INNER (mode)
&& ((icode = optab_handler (vec_duplicate_optab, mode))
!= CODE_FOR_nothing)
&& (elt = uniform_vector_p (exp))
&& !VECTOR_TYPE_P (TREE_TYPE (elt)))
  {
class expand_operand ops[2];
create_output_operand ([0], target, mode);
create_input_operand ([1], expand_normal (elt), eltmode);
expand_insn (icode, 2, ops);
if (!rtx_equal_p (target, ops[0].value))
  emit_move_insn (target, ops[0].value);
break;
  }

I don't know what's supposed to clear the flag (and what the purpose of 
setting it in the first place would be then).

> You do not add a testcase so it's hard to see what goes wrong where exactly.

 This only happens with an out-of-tree microarchitecture and I was unable 
to find an in-tree target that ever uses the `vec_duplicateM' standard RTL 
pattern, so I wasn't able to produce a test case that would trigger with 
upstream code.  Code is essentially like:

typedef float v4sf __attribute__ ((vector_size (16)));

v4sf
odd_even (v4sf x, float y)
{
  return x + f;
}

 I considered the Aarch64 one the most likely candidate as it is the one
commit be4c1d4a42c5 has been for, but as I noted in the description it 
does not appear to use it either.  It uses `vec_initMN' instead and does 
the broadcast by hand in the backend.

 Maybe I'm missing something.

  Maciej

[PATCH] libstdc++: fix typo in acinclude.m4.

2022-01-27 Thread Martin Liška


Hi.

May I install an obvious typo that causes build error (described in the PR).

Ready to be installed?
Thanks,
Martin

PR libstdc++/104259

libstdc++-v3/ChangeLog:

* acinclude.m4: Fix typo.
* configure: Regenerate.
---
 libstdc++-v3/acinclude.m4 | 2 +-
 libstdc++-v3/configure| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 7b6b807114a..a93252076b9 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4744,7 +4744,7 @@ dnl
   [glibcxx_cv_fdopendir=yes],
   [glibcxx_cv_fdopendir=no])
   ])
-  if test $glibcxx_cv_truncate = yes; then
+  if test $glibcxx_cv_fdopendir = yes; then
 AC_DEFINE(HAVE_FDOPENDIR, 1, [Define if fdopendir is available in 
.])
   fi
 dnl
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index 5c6e51f5c11..2b2f85782b1 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -77080,7 +77080,7 @@ fi
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $glibcxx_cv_fdopendir" >&5
 $as_echo "$glibcxx_cv_fdopendir" >&6; }
-  if test $glibcxx_cv_truncate = yes; then
+  if test $glibcxx_cv_fdopendir = yes; then
 
 $as_echo "#define HAVE_FDOPENDIR 1" >>confdefs.h
 
--

2.34.1

Re: [PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-27 Thread Richard Biener via Gcc-patches

On Thu, Jan 27, 2022 at 1:16 PM Maciej W. Rozycki  wrote:
>
> Do not mark vector element broadcast resulting from OpenCL operations as
> having side effects for targets that have a suitable hardware operation,
> so that the `vec_duplicateM' standard RTL pattern can be directly used
> for them.  This does not happen currently, because any RTL pattern named
> `vec_duplicateM' cannot express the side effects presumed, and therefore
> a `!TREE_SIDE_EFFECTS' condition placed in `store_constructor' correctly
> prevents any `vec_duplicateM' pattern from being used.
>
> This side-effect annotation was introduced for the C frontend with:
> , and then:
>  ("Scalar
> vector binary operation"), and finally landed with commit 0e3a99ae913c
> ("c-typeck.c (scalar_to_vector): New function."), along with original
> support for scalar-to-vector OpenCL operations.
>
> Support for these operations was then gradually added to C++ with:
> , and then:
>  ("[C++] Mixed
> scalar-vector operations"), landing with commit a212e43fca22 ("re PR
> c++/54427 (Expose more vector extensions)"), followed by:
> , and then:
>  ("[C++]
> Handle ?: for vectors"), landing with commit 93100c6b5b45 ("re PR
> c++/54427 (Expose more vector extensions)").
>
> And side-effect annotation for C++ was retrofitted, again gradually,
> with:  ("[C++]
> Missing save_expr in vector-scalar ops"), landing with commit
> 6698175d1591 ("typeck.c (cp_build_binary_op): Call save_expr before
> build_vector_from_val."), and then:
>  ("vector
> conditional expression with scalar arguments"), landing with commit
> 07298ffd6f7c ("call.c (build_conditional_expr_1): Handle the case with
> 1 vector and 2 scalars.")
>
> This was all long before we gained support for the `vec_duplicateM'
> standard RTL pattern, with commit be4c1d4a42c5 ("Add VEC_DUPLICATE_EXPR
> and associated optab") and it may have had sense for where the element
> broadcast operation was always open-coded.  However where provided by
> hardware the element broadcast operation is just like any other vector
> operation, presumably cheap, and whether to repeat it or not should be
> based on the cost of the operation and not hardwired.
>
> So there is no point in artificially marking the operation as having a
> side effect for the lone purpose of preventing repetition.  This does
> not currently prevent targets like the Aarch64 and MIPS ones from using
> the hardware element broadcast operation, but it requires the backend to
> wire it explicitly via the `vec_initMN' standard RTL pattern rather than
> relying on the middle end to pick it up via the `vec_duplicateM' pattern
> automatically.  This change lifts this limitation.
>
> gcc/
> * c/c-typeck.cc (build_binary_op) Do not annotate vector element
> broadcast operations as having a side effect if there is such an
> operation provided by the backend.
> * cp/call.cc (build_conditional_expr): Likewise.
> * cp/typeck.cc (cp_build_binary_op): Likewise.
> ---
> Hi,
>
>  I've come across this while working on an out-of-tree microarchitecture,
> which does not require a `vec_initMN' RTL pattern, however has hardware
> support suitable for cheap `vec_duplicateM'.  I could not persuade the
> middle end to make use of the new `vec_duplicateM' pattern however until I
> removed the side-effect anbnotation.  So I think this is a move in the
> right direction even if we do not have support for said microarchitecture
> at the moment.
>
>  For some reason I could not persuade our current Aarch64 code to use the
> `vec_duplicateM' pattern directly either.  This is because at the time
> `store_constructor' is called the vector mode for the element broadcast
> operation is say `V4SFmode' (for a vector of floats), however there is no
> corresponding `vec_duplicatev4sf' pattern.  Code in the `vec_initv4sfsf'
> pattern does the right thing though, referring to `aarch64_simd_dupv4sf'
> by hand instead.  There is a `vec_duplicatevnx4sf' pattern corresponding
> to `VNx4SFmode', but I was unable to figure out how to activate it.
>
>  This change has passed regression-testing with the `aarch64-linux-gnu'
> target (with `-march=armv9-a') and QEMU in the user emulation mode across
> all the GCC languages and libraries.
>
>  NB there are indentation issues with code surrounding places I have
> modified, but I have chosen not to address them so as not to obfuscate
> this change.  Likewise there are unnecessary compound statements used in
> the switch statements in the C++ frontend parts

Re: [PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]

2022-01-27 Thread Segher Boessenkool

On Thu, Jan 27, 2022 at 07:24:53PM +0800, Kewen.Lin wrote:
> on 2022/1/27 上午1:57, Segher Boessenkool wrote:
> > I like your original patch better.  But for stage 1, sorry.
> 
> Thanks Segher!  Is it ok to commit it then?  Or I'll repost this once
> next stage1 starts.

Either is fine in this case.


Segher

Re: [PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Jan Hubicka via Gcc-patches

> That's follow up patch based on the discussion with Jakub.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/host-darwin.cc (segv_crash_handler):
>   Do not use leading capital letter.
>   (segv_handler): Likewise.
>   * ipa-sra.cc (verify_splitting_accesses): Likewise.
>   * varasm.cc (get_section): Likewise.
> 
> gcc/d/ChangeLog:
> 
>   * decl.cc (d_finish_decl): Do not use leading capital letter.
> ---
>  gcc/config/rs6000/host-darwin.cc | 4 ++--
>  gcc/d/decl.cc| 2 +-
>  gcc/ipa-sra.cc   | 4 ++--
>  gcc/varasm.cc| 2 +-
>  4 files changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/config/rs6000/host-darwin.cc 
> b/gcc/config/rs6000/host-darwin.cc
> index 541f7e1c81c..efb1965004e 100644
> --- a/gcc/config/rs6000/host-darwin.cc
> +++ b/gcc/config/rs6000/host-darwin.cc
> @@ -128,7 +128,7 @@ segv_handler (int sig ATTRIBUTE_UNUSED,
>fprintf (stderr, "[address=%08lx pc=%08x]\n",
>  uc->uc_mcontext->MC_FLD(es).MC_FLD(dar),
>  uc->uc_mcontext->MC_FLD(ss).MC_FLD(srr0));
> -  internal_error ("Segmentation Fault");
> +  internal_error ("egmentation fault");
segmentation or perhaps eggmentation? :)

Honza

[PATCH] internal_error - do not use leading capital letter

2022-01-27 Thread Martin Liška


That's follow up patch based on the discussion with Jakub.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/rs6000/host-darwin.cc (segv_crash_handler):
Do not use leading capital letter.
(segv_handler): Likewise.
* ipa-sra.cc (verify_splitting_accesses): Likewise.
* varasm.cc (get_section): Likewise.

gcc/d/ChangeLog:

* decl.cc (d_finish_decl): Do not use leading capital letter.
---
 gcc/config/rs6000/host-darwin.cc | 4 ++--
 gcc/d/decl.cc| 2 +-
 gcc/ipa-sra.cc   | 4 ++--
 gcc/varasm.cc| 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rs6000/host-darwin.cc b/gcc/config/rs6000/host-darwin.cc
index 541f7e1c81c..efb1965004e 100644
--- a/gcc/config/rs6000/host-darwin.cc
+++ b/gcc/config/rs6000/host-darwin.cc
@@ -58,7 +58,7 @@ extern int sigaltstack(const struct sigaltstack *, struct 
sigaltstack *);
 static void
 segv_crash_handler (int sig ATTRIBUTE_UNUSED)
 {
-  internal_error ("Segmentation Fault (code)");
+  internal_error ("segmentation fault (code)");
 }
 
 static void

@@ -128,7 +128,7 @@ segv_handler (int sig ATTRIBUTE_UNUSED,
   fprintf (stderr, "[address=%08lx pc=%08x]\n",
   uc->uc_mcontext->MC_FLD(es).MC_FLD(dar),
   uc->uc_mcontext->MC_FLD(ss).MC_FLD(srr0));
-  internal_error ("Segmentation Fault");
+  internal_error ("egmentation fault");
   exit (FATAL_EXIT_CODE);
 }
 
diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc

index c7a1e4652f8..5ecc6269780 100644
--- a/gcc/d/decl.cc
+++ b/gcc/d/decl.cc
@@ -1597,7 +1597,7 @@ d_finish_decl (tree decl)
{
  tree name = DECL_ASSEMBLER_NAME (decl);
 
-	  internal_error ("Mismatch between declaration %qE size (%wd) and "

+ internal_error ("mismatch between declaration %qE size (%wd) and "
  "its initializer size (%wd).",
  IDENTIFIER_PRETTY_NAME (name)
  ? IDENTIFIER_PRETTY_NAME (name) : name,
diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc
index 969160f9806..f8a4549c9b0 100644
--- a/gcc/ipa-sra.cc
+++ b/gcc/ipa-sra.cc
@@ -2493,10 +2493,10 @@ verify_splitting_accesses (cgraph_node *node, bool 
certain_must_exist)
 
   bool certain_access_present = !certain_must_exist;

   if (overlapping_certain_accesses_p (desc, _access_present))
-   internal_error ("Function %qs, parameter %u, has IPA-SRA accesses "
+   internal_error ("function %qs, parameter %u, has IPA-SRA accesses "
"which overlap", node->dump_name (), pidx);
   if (!certain_access_present)
-   internal_error ("Function %s, parameter %u, is used but does not "
+   internal_error ("function %qs, parameter %u, is used but does not "
"have any certain IPA-SRA access",
node->dump_name (), pidx);
 }
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 5bc30f0c26e..330ec293711 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -312,7 +312,7 @@ get_section (const char *name, unsigned int flags, tree 
decl,
   else
 {
   if (not_existing)
-   internal_error ("Section already exists: %qs", name);
+   internal_error ("section already exists: %qs", name);
 
   sect = *slot;

   /* It is fine if one of the sections has SECTION_NOTYPE as long as
--
2.34.1

[PATCH][GCC13] Don't force side effects for hardware vector element broadcast

2022-01-27 Thread Maciej W. Rozycki

Do not mark vector element broadcast resulting from OpenCL operations as 
having side effects for targets that have a suitable hardware operation, 
so that the `vec_duplicateM' standard RTL pattern can be directly used 
for them.  This does not happen currently, because any RTL pattern named 
`vec_duplicateM' cannot express the side effects presumed, and therefore 
a `!TREE_SIDE_EFFECTS' condition placed in `store_constructor' correctly 
prevents any `vec_duplicateM' pattern from being used.

This side-effect annotation was introduced for the C frontend with: 
, and then: 
 ("Scalar 
vector binary operation"), and finally landed with commit 0e3a99ae913c 
("c-typeck.c (scalar_to_vector): New function."), along with original 
support for scalar-to-vector OpenCL operations.

Support for these operations was then gradually added to C++ with: 
, and then: 
 ("[C++] Mixed 
scalar-vector operations"), landing with commit a212e43fca22 ("re PR 
c++/54427 (Expose more vector extensions)"), followed by: 
, and then: 
 ("[C++] 
Handle ?: for vectors"), landing with commit 93100c6b5b45 ("re PR 
c++/54427 (Expose more vector extensions)").

And side-effect annotation for C++ was retrofitted, again gradually, 
with:  ("[C++] 
Missing save_expr in vector-scalar ops"), landing with commit 
6698175d1591 ("typeck.c (cp_build_binary_op): Call save_expr before 
build_vector_from_val."), and then: 
 ("vector 
conditional expression with scalar arguments"), landing with commit 
07298ffd6f7c ("call.c (build_conditional_expr_1): Handle the case with 
1 vector and 2 scalars.")

This was all long before we gained support for the `vec_duplicateM' 
standard RTL pattern, with commit be4c1d4a42c5 ("Add VEC_DUPLICATE_EXPR 
and associated optab") and it may have had sense for where the element 
broadcast operation was always open-coded.  However where provided by 
hardware the element broadcast operation is just like any other vector 
operation, presumably cheap, and whether to repeat it or not should be 
based on the cost of the operation and not hardwired.

So there is no point in artificially marking the operation as having a 
side effect for the lone purpose of preventing repetition.  This does 
not currently prevent targets like the Aarch64 and MIPS ones from using 
the hardware element broadcast operation, but it requires the backend to 
wire it explicitly via the `vec_initMN' standard RTL pattern rather than 
relying on the middle end to pick it up via the `vec_duplicateM' pattern 
automatically.  This change lifts this limitation.

gcc/
* c/c-typeck.cc (build_binary_op) Do not annotate vector element 
broadcast operations as having a side effect if there is such an 
operation provided by the backend.
* cp/call.cc (build_conditional_expr): Likewise.
* cp/typeck.cc (cp_build_binary_op): Likewise.
---
Hi,

 I've come across this while working on an out-of-tree microarchitecture, 
which does not require a `vec_initMN' RTL pattern, however has hardware 
support suitable for cheap `vec_duplicateM'.  I could not persuade the 
middle end to make use of the new `vec_duplicateM' pattern however until I 
removed the side-effect anbnotation.  So I think this is a move in the 
right direction even if we do not have support for said microarchitecture 
at the moment.

 For some reason I could not persuade our current Aarch64 code to use the 
`vec_duplicateM' pattern directly either.  This is because at the time 
`store_constructor' is called the vector mode for the element broadcast 
operation is say `V4SFmode' (for a vector of floats), however there is no 
corresponding `vec_duplicatev4sf' pattern.  Code in the `vec_initv4sfsf' 
pattern does the right thing though, referring to `aarch64_simd_dupv4sf' 
by hand instead.  There is a `vec_duplicatevnx4sf' pattern corresponding 
to `VNx4SFmode', but I was unable to figure out how to activate it.

 This change has passed regression-testing with the `aarch64-linux-gnu' 
target (with `-march=armv9-a') and QEMU in the user emulation mode across 
all the GCC languages and libraries.

 NB there are indentation issues with code surrounding places I have 
modified, but I have chosen not to address them so as not to obfuscate 
this change.  Likewise there are unnecessary compound statements used in 
the switch statements in the C++ frontend parts modified.

 Any questions or comments?  Otherwise OK once GCC 13 has opened?

  Maciej
---
 gcc/c/c-typeck.cc |9 +++--
 gcc/cp/call.cc|   19

Re: * SPAM * [PATCH] PR fortran/104128 - ICE in gfc_widechar_to_char, at fortran/scanner.c:199

2022-01-27 Thread Mikael Morin


Le 23/01/2022 à 22:08, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

conversions between different character kinds using TRANSFER exhibit
inconsistencies that can occur between expr->representation.string
(which is char*) on the one hand, and expr->->value.character.string.

One issue (in target-memory.cc) is easily fixed by simply passing
a conversion flag that was likely forgotten in the past.

The other issue happens in gfc_copy_expr.  Before we unconditionally
converted an existing representation.string to wide char, which is
definitely wrong.  Restricting that code path to default character
kind fixed the problems I could find and produces dumps that looked
fine to me.  Maybe some expert here can find a better fix.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?


This was submitted on time before stage4, so it seems to remain valid.
OK.


Maybe 11-branch?


This is not a regression; I would rather not.

Thanks for the patch.

Re: [PATCH] rs6000: Move the hunk affecting VSX/ALTIVEC ahead [PR103627]

2022-01-27 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2022/1/27 上午6:42, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Dec 23, 2021 at 10:12:19AM +0800, Kewen.Lin wrote:
>>  PR target/103627
>>  * config/rs6000/rs6000.c (rs6000_option_override_internal): Move the
>>  hunk affecting VSX and ALTIVEC to the appropriate place.
>>
>> gcc/testsuite/ChangeLog:
>>
>>  PR target/103627
>>  * gcc.target/powerpc/pr103627-3.c: New test.
> 
> 
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -3955,6 +3955,15 @@ rs6000_option_override_internal (bool global_init_p)
>>else if (TARGET_ALTIVEC)
>>  rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);
>>
>> +  /* Disable VSX and Altivec silently if the user switched cpus to power7 
>> in a
>> + target attribute or pragma which automatically enables both options,
>> + unless the altivec ABI was set.  This is set by default for 64-bit, but
>> + not for 32-bit.  Don't move this before the above code using 
>> ignore_masks,
>> + since it can reset the cleared VSX/ALTIVEC flag again.  */
>> +  if (main_target_opt != NULL && !main_target_opt->x_rs6000_altivec_abi)
>> +rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
>> +  & ~rs6000_isa_flags_explicit);
> 
> Could you at the same time get rid of the != NULL please?
>   if (bla != NULL)
> is sillier than
>   if (bla != 0)
> which is about the same as
>   if (!!bla)
> but that is certainly better than
>   if (bla != 0 != 0)
> although I am not sure about the more stylish
>   if (bla != 0 != 0 != 0 != 0 != 0)
> but what is wrong with
>   if (bla)
> ?  :-)
> 
>> +/* There are no error messages for either LE or BE 64bit.  */
>> +/* { dg-require-effective-target be }*/
> 
> (space before */)
> 
>> +/* { dg-require-effective-target ilp32 } */
>> +/* We don't have one powerpc.*_ok for Power6, use altivec_ok 
>> conservatively.  */
>> +/* { dg-require-effective-target powerpc_altivec_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power6" } */
> 
> It is okay always, no _ok at all please.
> 
> Okay for trunk with those things (but do test of course).  Thanks!
> 
> 

Thanks for the review comments!  I've addressed them in the attached patch.
Besides, I updated the case a bit by adding -mno-avoid-indexed-addresses,
otherwise the associated case would have one unexpected warning which is
supposed to be fixed by [1].  But even with this adjustment, this patch
still relies on [2], otherwise the associated case will get ICE instead.
btw, it can survives in all the testings as before together with [2].

So I'll hold this to commit until [2] gets landed.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587449.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589325.html

BR,
Kewen
>From 90565d869395df08a65d521ddad8684a99e844a5 Mon Sep 17 00:00:00 2001
From: Kewen Lin 
Date: Thu, 27 Jan 2022 00:25:29 -0600
Subject: [PATCH 3/3] rs6000: Move the hunk affecting VSX/ALTIVEC ahead
 [PR103627]

The modified hunk can update VSX and ALTIVEC flag, we have some codes
to check/warn for some flags related to VSX and ALTIVEC sitting where
the hunk is proprosed to be moved to.  Without this adjustment, the
VSX and ALTIVEC update is too late, it can cause the incompatibility
and result in unexpected behaviors, the associated test case is one
typical case.

Since we already have the code which sets TARGET_FLOAT128_TYPE and lays
after the moved place, and OPTION_MASK_FLOAT128_KEYWORD will rely on
TARGET_FLOAT128_TYPE, so it just simply remove them.

gcc/ChangeLog:

PR target/103627
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Move the
hunk affecting VSX and ALTIVEC to appropriate place.

gcc/testsuite/ChangeLog:

PR target/103627
* gcc.target/powerpc/pr103627-3.c: New test.
---
 gcc/config/rs6000/rs6000.cc   | 21 ---
 gcc/testsuite/gcc.target/powerpc/pr103627-3.c | 20 ++
 2 files changed, 29 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-3.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 634937e052f..f90fd8955cd 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -3943,6 +3943,15 @@ rs6000_option_override_internal (bool global_init_p)
   else if (TARGET_ALTIVEC)
 rs6000_isa_flags |= (OPTION_MASK_PPC_GFXOPT & ~ignore_masks);
 
+  /* Disable VSX and Altivec silently if the user switched cpus to power7 in a
+ target attribute or pragma which automatically enables both options,
+ unless the altivec ABI was set.  This is set by default for 64-bit, but
+ not for 32-bit.  Don't move this before the above code using ignore_masks,
+ since it can reset the cleared VSX/ALTIVEC flag again.  */
+  if (main_target_opt && !main_target_opt->x_rs6000_altivec_abi)
+rs6000_isa_flags &= ~((OPTION_MASK_VSX | OPTION_MASK_ALTIVEC)
+ &

Re: [PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 12:38:47PM +0100, Martin Liška wrote:
> Yep, we have many more examples where a leading capital letter is used:
> 
> gcc/config/cris/cris.cc:  internal_error ("MULT case in %");
> gcc/config/cris/cris.h: do { if (!(x)) internal_error ("CRIS-port assertion 
> failed: %s", #x); } while (0)
> gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Last named vararg 
> would not fit in a register");
> gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
> %d", regno);
> gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
> %d", regno);
> gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: Missing %qc 
> case in %", code);
> gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
> %d", regno);
> gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: %s is not a 
> shiftable integer", s);
> gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: %s is not a 
> shiftable integer", s);
> gcc/config/rs6000/host-darwin.cc:  internal_error ("Segmentation Fault 
> (code)");
> gcc/config/rs6000/host-darwin.cc:  internal_error ("Segmentation Fault");
> gcc/d/decl.cc:internal_error ("Mismatch between declaration %qE size 
> (%wd) and "
> gcc/fortran/arith.cc:   gfc_internal_error ("Fix min_int calculation");
> gcc/fortran/data.cc:  gfc_internal_error ("TODO: Vector sections in data 
> statements");
> gcc/fortran/decl.cc:gfc_internal_error ("Cannot set pointee 
> array spec.");
> gcc/fortran/decl.cc:  gfc_internal_error ("Missing symbol");
> gcc/fortran/decl.cc:gfc_internal_error ("Cannot set Cray pointee 
> array spec.");
> gcc/fortran/decl.cc:  gfc_internal_error ("Failed to create structure 
> type '%s' at %C", name);
> gcc/fortran/frontend-passes.cc: gfc_internal_error ("Illegal id in 
> copy_walk_reduction_arg");
> gcc/fortran/frontend-passes.cc:   gfc_internal_error ("Scalarization 
> using DIMEN_RANGE unimplemented");
> gcc/fortran/interface.cc:  gfc_internal_error ("Unable to find symbol %qs", 
> sym->name);
> gcc/fortran/intrinsic.cc:  gfc_internal_error ("Invalid standard code on 
> intrinsic %qs (%d)",
> gcc/fortran/intrinsic.cc:  gfc_internal_error ("Cannot convert %qs to %qs at 
> %L", type_name,
> gcc/fortran/simplify.cc:gfc_internal_error ("IBITS: Bad bit");
> gcc/fortran/simplify.cc:gfc_internal_error ("Reshaped array too large 
> at %C");
> gcc/fortran/simplify.cc:gfc_internal_error ("Bad type in 
> gfc_simplify_sign");
> gcc/fortran/simplify.cc:gfc_internal_error ("Failure getting length 
> of a constant array.");
> gcc/fortran/simplify.cc:gfc_internal_error ("Failure getting length of a 
> constant array.");
> gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
> gfc_element_size.");
> gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
> gfc_target_encode_expr.");
> gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
> gfc_target_interpret_expr.");
> gcc/fortran/trans-intrinsic.cc:  gfc_internal_error ("Intrinsic function 
> %qs (%d) not recognized",
> gcc/fortran/trans-io.cc:  gfc_internal_error ("Bad IO basetype (%d)", 
> ts->type);
> gcc/ipa-sra.cc:  internal_error ("IPA-SRA access verification failed");
> gcc/ipa-sra.cc: internal_error ("Function %qs, parameter %u, has IPA-SRA 
> accesses "
> gcc/ipa-sra.cc: internal_error ("Function %s, parameter %u, is used but does 
> not "
> gcc/rtl.cc:  internal_error ("RTL check: expected code '%s', have '%s' in %s, 
> at %s:%d",
> gcc/tree-into-ssa.cc:   internal_error ("SSA corruption");
> gcc/tree-outof-ssa.cc:internal_error ("SSA corruption");
> gcc/tree-ssa-coalesce.cc:  internal_error ("SSA corruption");
> gcc/varasm.cc:  internal_error ("Section already exists: %qs", name);
> 
> I can prepare a separate patch for next stage1 if you want?

Ok.  MMIX, MULT, CRIS, IPA-SRA, SSA all look desirable to be
capitalized.  And I'd leave out gcc/fortran/, it has very different rules
and I think generally capitalizes diagnostics.

>   PR web/104254
> 
> gcc/ChangeLog:
> 
>   * diagnostic.cc (diagnostic_initialize):
>   Initialize report_bug flag.
>   (diagnostic_action_after_output):
>   Explain that -freport-bug option can be used for pre-processed
>   file creation.  Make the message shorter.
>   (error_recursion): Rename Internal to internal.
>   * diagnostic.h (struct diagnostic_context): New field.
>   * opts.cc (common_handle_option): Init the field here.

Ok, thanks.

Jakub

Re: [PATCH] PR fortran/84784 - ICEs: verify_gimple failed with -fdefault-integer-8

2022-01-27 Thread Mikael Morin


Le 26/01/2022 à 21:59, Harald Anlauf via Fortran a écrit :

Dear Fortranners,

the use of -fdefault-integer-8 exhibits several cases where
we missed to convert the result of an intrinsic from the
declared to the effective resulting type.

The attached obvious patch fixes this for IMAGE_STATUS,
TEAM_NUMBER, and POPCNT/POPPAR.

OK for mainline if regtesting passes on x86_64-pc-linux-gnu?


This is not a regression, but should be safe.
Can you add a test of POPPAR, similar to that of POPCNT?
OK with that change.

[PATCH][pushed] analyzer: fix -Wformat warnings on i686

2022-01-27 Thread Martin Liška


Pushed as obvious.

Martin

PR analyzer/104247

gcc/analyzer/ChangeLog:

* constraint-manager.cc (bounded_ranges_manager::log_stats):
Cast to long for format purpose.
* region-model-manager.cc (log_uniq_map): Likewise.
---
 gcc/analyzer/constraint-manager.cc   | 2 +-
 gcc/analyzer/region-model-manager.cc | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/analyzer/constraint-manager.cc 
b/gcc/analyzer/constraint-manager.cc
index 88b0988513a..ac1e4feaee5 100644
--- a/gcc/analyzer/constraint-manager.cc
+++ b/gcc/analyzer/constraint-manager.cc
@@ -1031,7 +1031,7 @@ void
 bounded_ranges_manager::log_stats (logger *logger, bool show_objs) const
 {
   LOG_SCOPE (logger);
-  logger->log ("  # %s: %li", "ranges", m_map.elements ());
+  logger->log ("  # %s: %li", "ranges", (long)m_map.elements ());
   if (!show_objs)
 return;
 
diff --git a/gcc/analyzer/region-model-manager.cc b/gcc/analyzer/region-model-manager.cc

index e765e7f484f..ba835cba22c 100644
--- a/gcc/analyzer/region-model-manager.cc
+++ b/gcc/analyzer/region-model-manager.cc
@@ -1573,7 +1573,7 @@ static void
 log_uniq_map (logger *logger, bool show_objs, const char *title,
  const hash_map _map)
 {
-  logger->log ("  # %s: %li", title, uniq_map.elements ());
+  logger->log ("  # %s: %li", title, (long)uniq_map.elements ());
   if (!show_objs)
 return;
   auto_vec vec_objs (uniq_map.elements ());
@@ -1597,7 +1597,7 @@ static void
 log_uniq_map (logger *logger, bool show_objs, const char *title,
  const consolidation_map )
 {
-  logger->log ("  # %s: %li", title, map.elements ());
+  logger->log ("  # %s: %li", title, (long)map.elements ());
   if (!show_objs)
 return;
 
--

2.34.1

Re: [PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Martin Liška


On 1/27/22 12:06, Jakub Jelinek wrote:

On Thu, Jan 27, 2022 at 11:59:51AM +0100, Martin Liška wrote:

@@ -1988,7 +1994,7 @@ error_recursion (diagnostic_context *context)
  pp_newline_and_flush (context->printer);
  
fnotice (stderr,

-  "Internal compiler error: Error reporting routines re-entered.\n");
+  "internal compiler error: Error reporting routines re-entered.\n");


Why do we capitalize the "E" in there?


That should not be capitalized. Fixed in attached patch.


Looking for internal_error calls, I think we mostly use lower case after
"internal compiler error: ", except for some cases where it should be
capitalized like "SSA corruption" etc., but
varasm.cc:  internal_error ("Section already exists: %qs", name);
looks wrong.


Yep, we have many more examples where a leading capital letter is used:

gcc/config/cris/cris.cc:  internal_error ("MULT case in %");
gcc/config/cris/cris.h: do { if (!(x)) internal_error ("CRIS-port assertion failed: 
%s", #x); } while (0)
gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Last named vararg would 
not fit in a register");
gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
%d", regno);
gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
%d", regno);
gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: Missing %qc case in 
%", code);
gcc/config/mmix/mmix.cc:internal_error ("MMIX Internal: Bad register: 
%d", regno);
gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: %s is not a shiftable 
integer", s);
gcc/config/mmix/mmix.cc:  internal_error ("MMIX Internal: %s is not a shiftable 
integer", s);
gcc/config/rs6000/host-darwin.cc:  internal_error ("Segmentation Fault (code)");
gcc/config/rs6000/host-darwin.cc:  internal_error ("Segmentation Fault");
gcc/d/decl.cc:internal_error ("Mismatch between declaration %qE size (%wd) and 
"
gcc/fortran/arith.cc:   gfc_internal_error ("Fix min_int calculation");
gcc/fortran/data.cc:  gfc_internal_error ("TODO: Vector sections in data 
statements");
gcc/fortran/decl.cc:gfc_internal_error ("Cannot set pointee array 
spec.");
gcc/fortran/decl.cc:  gfc_internal_error ("Missing symbol");
gcc/fortran/decl.cc:gfc_internal_error ("Cannot set Cray pointee array 
spec.");
gcc/fortran/decl.cc:  gfc_internal_error ("Failed to create structure type '%s' 
at %C", name);
gcc/fortran/frontend-passes.cc: gfc_internal_error ("Illegal id in 
copy_walk_reduction_arg");
gcc/fortran/frontend-passes.cc:   gfc_internal_error ("Scalarization using 
DIMEN_RANGE unimplemented");
gcc/fortran/interface.cc:  gfc_internal_error ("Unable to find symbol %qs", 
sym->name);
gcc/fortran/intrinsic.cc:  gfc_internal_error ("Invalid standard code on 
intrinsic %qs (%d)",
gcc/fortran/intrinsic.cc:  gfc_internal_error ("Cannot convert %qs to %qs at 
%L", type_name,
gcc/fortran/simplify.cc:gfc_internal_error ("IBITS: Bad bit");
gcc/fortran/simplify.cc:gfc_internal_error ("Reshaped array too large at 
%C");
gcc/fortran/simplify.cc:gfc_internal_error ("Bad type in 
gfc_simplify_sign");
gcc/fortran/simplify.cc:gfc_internal_error ("Failure getting length of a 
constant array.");
gcc/fortran/simplify.cc:gfc_internal_error ("Failure getting length of a 
constant array.");
gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
gfc_element_size.");
gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
gfc_target_encode_expr.");
gcc/fortran/target-memory.cc:  gfc_internal_error ("Invalid expression in 
gfc_target_interpret_expr.");
gcc/fortran/trans-intrinsic.cc:  gfc_internal_error ("Intrinsic function %qs 
(%d) not recognized",
gcc/fortran/trans-io.cc:  gfc_internal_error ("Bad IO basetype (%d)", 
ts->type);
gcc/ipa-sra.cc:  internal_error ("IPA-SRA access verification failed");
gcc/ipa-sra.cc: internal_error ("Function %qs, parameter %u, has IPA-SRA accesses 
"
gcc/ipa-sra.cc: internal_error ("Function %s, parameter %u, is used but does not 
"
gcc/rtl.cc:  internal_error ("RTL check: expected code '%s', have '%s' in %s, at 
%s:%d",
gcc/tree-into-ssa.cc:   internal_error ("SSA corruption");
gcc/tree-outof-ssa.cc:internal_error ("SSA corruption");
gcc/tree-ssa-coalesce.cc:  internal_error ("SSA corruption");
gcc/varasm.cc:  internal_error ("Section already exists: %qs", name);

I can prepare a separate patch for next stage1 if you want?

Thanks,
Martin



Jakub

From 6126a82dea18c849e0440a9f5cded8f2a7c65f14 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 27 Jan 2022 11:22:42 +0100
Subject: [PATCH] Improve wording for -freport-bug option.

	PR web/104254

gcc/ChangeLog:

	* diagnostic.cc (diagnostic_initialize):
	Initialize report_bug flag.
	(diagnostic_action_after_output):
	Explain that -freport-bug option can be used for pre-processed
	file creation.  Make

Re: [PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]

2022-01-27 Thread Kewen.Lin via Gcc-patches

Hi Segher,

on 2022/1/27 上午6:19, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Dec 23, 2021 at 10:09:27AM +0800, Kewen.Lin wrote:
>> As PR103627 shows, there is an unexpected case where !TARGET_VSX
>> and TARGET_MMA co-exist.  As ISA3.1 claims, SIMD is a requirement
>> for MMA.  By looking into the ICE, I noticed that the current
>> MMA implementation depends on vector pairs load/store, but since
>> we don't have a separated option to control Power10 vector, this
>> patch is to check for Power9 vector instead.
>>
>> Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
>> powerpc64-linux-gnu P8.
>>
>> Is it ok for trunk?
> 
> No, sorry.
> 
>> +  /* MMA requires SIMD support as ISA 3.1 claims and our implementation
>> + such as "*movoo" uses vector pair access which are only supported
>> + from ISA 3.1.  But since we don't have one separated option to
>> + control Power10 vector, check for Power9 vector instead.  */
>> +  if (TARGET_MMA && !TARGET_P9_VECTOR)
>> +{
>> +  if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
>> +error ("%qs requires %qs", "-mmma", "-mpower9-vector");
>> +  rs6000_isa_flags &= ~OPTION_MASK_MMA;
>> +}
> 
> -mpower9-vector is a workaround that should go away.  TARGET_MMA should
> require ISA 3.1 (or POWER10) directly, and require VSX.


OK, I see.  Thanks for the detailed explanation below.  I guess that's why
we don't have one option like "-mpower10-vector" any more?

I posted patch v2 guarded with VSX at the below link instead
 
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589325.html
 
Hope it looks better to you.  :) 

BR,
Kewen

> 
>> +/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */
> 
> It should be impossible to select this at all.  Either you have vectors
> or you don't, but it should be impossible to selectively disable part of
> vector support.  That way madness lies.
> 
> We can allow no VSRs, only VRs, or all VSRs.  There is precedent for
> that (see -msoft-float for example, which means "do not use FPRs") --
> when compiling certain code we want to disallow whole register banks.
> But disallowing or allowing some instructions separately is not a good
> idea: it doesn't give any useful extra functionality, it is hard to use,
> and it is a lot of extra work for us (it is impossible to test, already,
> too many combinations).
> 
> 
> Segher
>

Re: [PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]

2022-01-27 Thread Kewen.Lin via Gcc-patches

on 2022/1/27 上午1:57, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jan 26, 2022 at 10:26:45AM +0800, Kewen.Lin wrote:
>> on 2022/1/14 上午12:31, David Edelsohn wrote:
>> Yeah, but IMHO it still can confuse new comers at first glance.
> 
> Yes, or at least cause to read (well, grep) the whole backend and
> scratch heads.
> 
 2) Bootstrapped and tested one below patch to remove all the code using
 RS6000_CONSTRAINT_v on powerpc64le-linux-gnu P10 and P9,
 powerpc64-linux-gnu P8 and P7 with no regressions.
> 
>>> I would prefer that we not make gratuitous changes to this code, but
>>> maybe Segher has a different opinion.
>>
>> Thanks David for the comments!
>>
>> Hi Segher, what's your preference on this?
> 
> I like your original patch better.  But for stage 1, sorry.
> 

Thanks Segher!  Is it ok to commit it then?  Or I'll repost this once
next stage1 starts.

BR,
Kewen

> Indeed using ALTIVEC_REGS directly in the define_regiater_constraint
> works fine, but it isn't as clear as it could be that is correct.
> 
> 
> Segher
>

[PATCH v2] rs6000: Disable MMA if no VSX support [PR103627]

2022-01-27 Thread Kewen.Lin via Gcc-patches

Hi,

As PR103627 shows, there is an unexpected case where !TARGET_VSX
and TARGET_MMA co-exist.  As ISA3.1 claims, SIMD is a requirement
for MMA.  By looking into the ICE, I noticed that the current
MMA implementation depends on vector pairs load/store which use
VSX register, but we don't have a separated option to control
Power10 vector support and Segher pointed out "-mpower9-vector is
a workaround that should go away" and more explanations in [1].
So comparing to v1[2], this patch makes MMA require VSX instead.

Bootstrapped and regtested on powerpc64le-linux-gnu P9 & P10 and
powerpc64-linux-gnu P8.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589303.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html

BR,
Kewen
-
gcc/ChangeLog:

PR target/103627
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Disable
MMA if !TARGET_VSX.

gcc/testsuite/ChangeLog:

PR target/103627
* gcc.target/powerpc/pr103627-1.c: New test.
* gcc.target/powerpc/pr103627-2.c: New test.


---
 gcc/config/rs6000/rs6000.cc   | 10 ++
 gcc/testsuite/gcc.target/powerpc/pr103627-1.c | 16 
 gcc/testsuite/gcc.target/powerpc/pr103627-2.c | 16 
 3 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-2.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index a07db3a48bc..634937e052f 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4491,6 +4491,16 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_MMA;
 }
 
+  /* MMA requires SIMD support as ISA 3.1 claims and our implementation
+ such as "*movoo" uses vector pair access which use VSX registers.
+ So make MMA require VSX support here.  */
+  if (TARGET_MMA && !TARGET_VSX)
+{
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
+   error ("%qs requires %qs", "-mmma", "-mvsx");
+  rs6000_isa_flags &= ~OPTION_MASK_MMA;
+}
+
   if (!TARGET_PCREL && TARGET_PCREL_OPT)
 rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
new file mode 100644
index 000..5cecf515e58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
@@ -0,0 +1,16 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -mno-vsx" } */
+
+/* Verify compiler emits error message instead of ICE.  */
+
+extern float *dest;
+extern __vector_quad src;
+
+int
+foo ()
+{
+  __builtin_mma_disassemble_acc (dest, );
+  /* { dg-error "'__builtin_mma_disassemble_acc' requires the '-mmma' option" 
"" { target *-*-* } .-1 } */
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
new file mode 100644
index 000..89ae4f607bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
@@ -0,0 +1,16 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -mmma -mno-vsx" } */
+
+/* Verify the emitted error message.  */
+
+extern float *dest;
+extern __vector_quad src;
+
+int
+foo ()
+{
+  __builtin_mma_disassemble_acc (dest, );
+  /* { dg-error "'-mmma' requires '-mvsx'" "mma" { target *-*-* } 0 } */
+  return 0;
+}
+
-- 
2.27.0

Re: PING^2 [PATCH] rs6000: Fix an assertion in update_target_cost_per_stmt [PR103702]

2022-01-27 Thread Kewen.Lin via Gcc-patches

on 2022/1/26 下午3:28, Richard Biener wrote:
> On Wed, Jan 26, 2022 at 3:15 AM Kewen.Lin via Gcc-patches
>  wrote:
>>
>> Gentle ping:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587309.html
> 
> OK.
> 

Thanks Richi!  Rebased, re-tested and committed as r12-6891.

BR,
Kewen

> Thanks,
> Richard.
> 
>> BR,
>> Kewen
>>
>>> on 2021/12/23 上午10:06, Kewen.Lin via Gcc-patches wrote:
 Hi,

 This patch is to fix one wrong assertion which is too aggressive.
 Vectorizer can do vec_construct costing for the vector type which
 only has one unit.  For the failed case, the passed-in vector type
 is "vector(1) int", though it doesn't end up with any construction
 eventually.  We have to handle this kind of input in function
 rs6000_cost_data::update_target_cost_per_stmt.

 Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
 powerpc64-linux-gnu P8.

 Is it ok for trunk?

 BR,
 Kewen
 -
 gcc/ChangeLog:

  PR target/103702
  * config/rs6000/rs6000.c
  (rs6000_cost_data::update_target_cost_per_stmt): Fix one wrong
  assertion with early return.

 gcc/testsuite/ChangeLog:

  PR target/103702
  * gcc.target/powerpc/pr103702.c: New test.
 ---
  gcc/config/rs6000/rs6000.c  |  7 --
  gcc/testsuite/gcc.target/powerpc/pr103702.c | 24 +
  2 files changed, 29 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103702.c

 diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
 index 0b09713b2f5..37f07fe5358 100644
 --- a/gcc/config/rs6000/rs6000.c
 +++ b/gcc/config/rs6000/rs6000.c
 @@ -5461,8 +5461,11 @@ rs6000_cost_data::update_target_cost_per_stmt 
 (vect_cost_for_stmt kind,
  {
tree vectype = STMT_VINFO_VECTYPE (stmt_info);
unsigned int nunits = vect_nunits_for_cost (vectype);
 -  /* We don't expect strided/elementwise loads for just 1 nunit.  */
 -  gcc_assert (nunits > 1);
 +  /* As PR103702 shows, it's possible that vectorizer wants to do
 + costings for only one unit here, it's no need to do any
 + penalization for it, so simply early return here.  */
 +  if (nunits == 1)
 +return;
/* i386 port adopts nunits * stmt_cost as the penalized cost
   for this kind of penalization, we used to follow it but
   found it could result in an unreliable body cost especially
 diff --git a/gcc/testsuite/gcc.target/powerpc/pr103702.c 
 b/gcc/testsuite/gcc.target/powerpc/pr103702.c
 new file mode 100644
 index 000..585946fd64b
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/powerpc/pr103702.c
 @@ -0,0 +1,24 @@
 +/* We don't have one powerpc.*_ok for Power6, use altivec_ok 
 conservatively.  */
 +/* { dg-require-effective-target powerpc_altivec_ok } */
 +/* { dg-options "-mdejagnu-cpu=power6 -O2 -ftree-loop-vectorize 
 -fno-tree-scev-cprop" } */
 +
 +/* Verify there is no ICE.  */
 +
 +unsigned short a, e;
 +int *b, *d;
 +int c;
 +extern int fn2 ();
 +void
 +fn1 ()
 +{
 +  void *f;
 +  for (;;)
 +{
 +  fn2 ();
 +  b = f;
 +  e = 0;
 +  for (; e < a; ++e)
 +b[e] = d[e * c];
 +}
 +}
 +
 --
 2.27.0

>>

Re: [PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 11:59:51AM +0100, Martin Liška wrote:
> @@ -1988,7 +1994,7 @@ error_recursion (diagnostic_context *context)
>  pp_newline_and_flush (context->printer);
>  
>fnotice (stderr,
> -"Internal compiler error: Error reporting routines re-entered.\n");
> +"internal compiler error: Error reporting routines re-entered.\n");

Why do we capitalize the "E" in there?
Looking for internal_error calls, I think we mostly use lower case after
"internal compiler error: ", except for some cases where it should be
capitalized like "SSA corruption" etc., but
varasm.cc:  internal_error ("Section already exists: %qs", name);
looks wrong.

Jakub

Re: [PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Martin Liška


On 1/27/22 11:47, Jakub Jelinek wrote:

On Thu, Jan 27, 2022 at 11:37:29AM +0100, Martin Liška wrote:

@@ -665,12 +667,12 @@ diagnostic_action_after_output (diagnostic_context 
*context,
if (context->abort_on_error)
  real_abort ();
-   fnotice (stderr, "Please submit a full bug report,\n"
-"with preprocessed source if appropriate.\n");
-   if (count > 0)
- fnotice (stderr,
-  ("Please include the complete backtrace "
-   "with any bug report.\n"));
+   const char *details
+ = context->report_bug ? "" : "(by using -freport-bug) ";
+   const char *ending = count > 0  ? "and the complete backtrace" : "";
+
+   fnotice (stderr, "\nPlease submit a full bug report, "
+"with preprocessed source %s%s.\n", details, ending);


This is highly translation unfriendly.
You need to use separate full strings for all the different cases.

Jakub



I see, there's updated version that should be fine.

$ ./xgcc -B. ~/Programming/testcases/pr100019.C -c
...
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.

$ ./xgcc -B. ~/Programming/testcases/pr100019.C -c -freport-bug
...
Please submit a full bug report, with preprocessed source.
Please include the complete backtrace with any bug report.
See  for instructions.
Preprocessed source stored into /tmp/ccReyRI7.out file, please attach this to 
your bugreport.


MartinFrom a438f0b40dfbc4580ce5de2a6fba004d1fdb3930 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 27 Jan 2022 11:22:42 +0100
Subject: [PATCH] Improve wording for -freport-bug option.

	PR web/104254

gcc/ChangeLog:

	* diagnostic.cc (diagnostic_initialize):
	Initialize report_bug flag.
	(diagnostic_action_after_output):
	Explain that -freport-bug option can be used for pre-processed
	file creation.  Make the message shorter.
	(error_recursion): Rename Internal to internal.
	* diagnostic.h (struct diagnostic_context): New field.
	* opts.cc (common_handle_option): Init the field here.
---
 gcc/diagnostic.cc | 22 ++
 gcc/diagnostic.h  |  3 +++
 gcc/opts.cc   |  4 
 3 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index f5f43d5e9a7..29f21a16afe 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -219,6 +219,8 @@ diagnostic_initialize (diagnostic_context *context, int n_opts)
   context->show_line_numbers_p = false;
   context->min_margin_width = 0;
   context->show_ruler_p = false;
+  context->report_bug = false;
+
   if (const char *var = getenv ("GCC_EXTRA_DIAGNOSTIC_OUTPUT"))
 {
   if (!strcmp (var, "fixits-v1"))
@@ -665,12 +667,16 @@ diagnostic_action_after_output (diagnostic_context *context,
 	if (context->abort_on_error)
 	  real_abort ();
 
-	fnotice (stderr, "Please submit a full bug report,\n"
-		 "with preprocessed source if appropriate.\n");
+	if (context->report_bug)
+	  fnotice (stderr, "\nPlease submit a full bug report, "
+		   "with preprocessed source.\n");
+	else
+	  fnotice (stderr, "\nPlease submit a full bug report, "
+		   "with preprocessed source (by using -freport-bug).\n");
+
 	if (count > 0)
-	  fnotice (stderr,
-		   ("Please include the complete backtrace "
-		"with any bug report.\n"));
+	  fnotice (stderr, "Please include the complete backtrace "
+		   "with any bug report.\n");
 	fnotice (stderr, "See %s for instructions.\n", bug_report_url);
 
 	exit (ICE_EXIT_CODE);
@@ -1437,8 +1443,8 @@ num_digits (int value)
 
 /* Given a partial pathname as input, return another pathname that
shares no directory elements with the pathname of __FILE__.  This
-   is used by fancy_abort() to print `Internal compiler error in expr.cc'
-   instead of `Internal compiler error in ../../GCC/gcc/expr.cc'.  */
+   is used by fancy_abort() to print `internal compiler error in expr.cc'
+   instead of `internal compiler error in ../../GCC/gcc/expr.cc'.  */
 
 const char *
 trim_filename (const char *name)
@@ -1988,7 +1994,7 @@ error_recursion (diagnostic_context *context)
 pp_newline_and_flush (context->printer);
 
   fnotice (stderr,
-	   "Internal compiler error: Error reporting routines re-entered.\n");
+	   "internal compiler error: Error reporting routines re-entered.\n");
 
   /* Call diagnostic_action_after_output to get the "please submit a bug
  report" message.  */
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index ccaa33b5817..3ca32979dde 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -340,6 +340,9 @@ struct diagnostic_context
  source output.  */
   bool show_ruler_p;
 
+  /* True if -freport-bug option is used.  */
+  bool report_bug;
+
   /* Used to specify additional diagnostic output to be emitted after the
  rest of the diagnostic.  This is for implementing

Re: [PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 11:37:29AM +0100, Martin Liška wrote:
> @@ -665,12 +667,12 @@ diagnostic_action_after_output (diagnostic_context 
> *context,
>   if (context->abort_on_error)
> real_abort ();
> - fnotice (stderr, "Please submit a full bug report,\n"
> -  "with preprocessed source if appropriate.\n");
> - if (count > 0)
> -   fnotice (stderr,
> -("Please include the complete backtrace "
> - "with any bug report.\n"));
> + const char *details
> +   = context->report_bug ? "" : "(by using -freport-bug) ";
> + const char *ending = count > 0  ? "and the complete backtrace" : "";
> +
> + fnotice (stderr, "\nPlease submit a full bug report, "
> +  "with preprocessed source %s%s.\n", details, ending);

This is highly translation unfriendly.
You need to use separate full strings for all the different cases.

Jakub

Re: [wwwdocs] Document __builtin_dynamic_object_size addition for GCC 12

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 27, 2022 at 04:00:01PM +0530, Siddhesh Poyarekar wrote:
> Signed-off-by: Siddhesh Poyarekar 
> ---
>  htdocs/gcc-12/changes.html | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
> index c69b301e..c6baee75 100644
> --- a/htdocs/gcc-12/changes.html
> +++ b/htdocs/gcc-12/changes.html
> @@ -157,6 +157,8 @@ a work-in-progress.
>A new built-in function, __builtin_assoc_barrier, was 
> added.
>It can be used to inhibit re-association of floating-point
>expressions.
> +  Support for __builtin_dynamic_object_size compatible with
> +  the clang language extension was added.
>New warnings:
>  
>-Wbidi-chars warns about potentially misleading UTF-8

Ok.

Jakub

[PATCH] Improve wording for -freport-bug option.

2022-01-27 Thread Martin Liška


Hi.

The patch addresses a few things related to ICE report:

- 'Internal compiler error: Error reporting routines re-entered.' - unify 
'internal' wording
- I shortened:

'''
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
'''

- let's recommend using -freport-bug.

Now one can see:

$ ./xgcc -B. ~/Programming/testcases/pr100019.C -c
...
Please submit a full bug report, with preprocessed source (by using 
-freport-bug) and the complete backtrace.
See  for instructions.

$ ./xgcc -B. ~/Programming/testcases/pr100019.C -c -freport-bug
...
Please submit a full bug report, with preprocessed source and the complete 
backtrace.
See  for instructions.
Preprocessed source stored into /tmp/ccLgekCL.out file, please attach this to 
your bugreport.

Ready to be installed?
Thanks,
Martin


PR web/104254

gcc/ChangeLog:

* diagnostic.cc (diagnostic_initialize):
Initialize report_bug flag.
(diagnostic_action_after_output):
Explain that -freport-bug option can be used for pre-processed
file creation.  Make the message shorter.
(error_recursion): Rename Internal to internal.
* diagnostic.h (struct diagnostic_context): New field.
* opts.cc (common_handle_option): Init the field here.
---
 gcc/diagnostic.cc | 20 +++-
 gcc/diagnostic.h  |  3 +++
 gcc/opts.cc   |  4 
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index f5f43d5e9a7..7acdee49551 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -219,6 +219,8 @@ diagnostic_initialize (diagnostic_context *context, int 
n_opts)
   context->show_line_numbers_p = false;
   context->min_margin_width = 0;
   context->show_ruler_p = false;
+  context->report_bug = false;
+
   if (const char *var = getenv ("GCC_EXTRA_DIAGNOSTIC_OUTPUT"))
 {
   if (!strcmp (var, "fixits-v1"))
@@ -665,12 +667,12 @@ diagnostic_action_after_output (diagnostic_context 
*context,
if (context->abort_on_error)
  real_abort ();
 
-	fnotice (stderr, "Please submit a full bug report,\n"

-"with preprocessed source if appropriate.\n");
-   if (count > 0)
- fnotice (stderr,
-  ("Please include the complete backtrace "
-   "with any bug report.\n"));
+   const char *details
+ = context->report_bug ? "" : "(by using -freport-bug) ";
+   const char *ending = count > 0  ? "and the complete backtrace" : "";
+
+   fnotice (stderr, "\nPlease submit a full bug report, "
+"with preprocessed source %s%s.\n", details, ending);
fnotice (stderr, "See %s for instructions.\n", bug_report_url);
 
 	exit (ICE_EXIT_CODE);

@@ -1437,8 +1439,8 @@ num_digits (int value)
 
 /* Given a partial pathname as input, return another pathname that

shares no directory elements with the pathname of __FILE__.  This
-   is used by fancy_abort() to print `Internal compiler error in expr.cc'
-   instead of `Internal compiler error in ../../GCC/gcc/expr.cc'.  */
+   is used by fancy_abort() to print `internal compiler error in expr.cc'
+   instead of `internal compiler error in ../../GCC/gcc/expr.cc'.  */
 
 const char *

 trim_filename (const char *name)
@@ -1988,7 +1990,7 @@ error_recursion (diagnostic_context *context)
 pp_newline_and_flush (context->printer);
 
   fnotice (stderr,

-  "Internal compiler error: Error reporting routines re-entered.\n");
+  "internal compiler error: Error reporting routines re-entered.\n");
 
   /* Call diagnostic_action_after_output to get the "please submit a bug

  report" message.  */
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index ccaa33b5817..3ca32979dde 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -340,6 +340,9 @@ struct diagnostic_context
  source output.  */
   bool show_ruler_p;
 
+  /* True if -freport-bug option is used.  */

+  bool report_bug;
+
   /* Used to specify additional diagnostic output to be emitted after the
  rest of the diagnostic.  This is for implementing
  -fdiagnostics-parseable-fixits and GCC_EXTRA_DIAGNOSTIC_OUTPUT.  */
diff --git a/gcc/opts.cc b/gcc/opts.cc
index f21c821ab2e..7d30deb8e4b 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -3158,6 +3158,10 @@ common_handle_option (struct gcc_options *opts,
dc->tabstop = value;
   break;
 
+case OPT_freport_bug:

+  dc->report_bug = value;
+  break;
+
 default:
   /* If the flag was handled in a standard way, assume the lack of
 processing here is intentional.  */
--
2.34.1

[PATCH] Fix aarch64/104201: branch-protection-attr.c fails after quoting difference

2022-01-27 Thread apinski--- via Gcc-patches

From: Andrew Pinski 

After the quoting changes in r12-6521-g03a1a86b5ee40d4e240, 
branch-protection-attr.c
fails due to expecting a different quoting type for "leaf".
This patch changes the quoting from "" to '' as that is what is used now.

Committed as obvious after a test of the testcase.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/branch-protection-attr.c: Fix quoting for
the expected error message on line 5 of leaf.
---
 gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c 
b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
index 229ce1ca7be..1d6e55f3907 100644
--- a/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
+++ b/gcc/testsuite/gcc.target/aarch64/branch-protection-attr.c
@@ -4,7 +4,7 @@ void __attribute__ ((target("branch-protection=leaf")))
 foo1 ()
 {
 }
-/* { dg-error {invalid protection type \("leaf"\) in 
'target\("branch-protection="\)' pragma or attribute} "" { target *-*-* } 5 } */
+/* { dg-error {invalid protection type \('leaf'\) in 
'target\("branch-protection="\)' pragma or attribute} "" { target *-*-* } 5 } */
 /* { dg-error {pragma or attribute 'target\("branch-protection=leaf"\)' is not 
valid} "" { target *-*-* } 5 } */
 
 void __attribute__ ((target("branch-protection=none+pac-ret")))
-- 
2.17.1

[wwwdocs] Document __builtin_dynamic_object_size addition for GCC 12

2022-01-27 Thread Siddhesh Poyarekar

Signed-off-by: Siddhesh Poyarekar 
---
 htdocs/gcc-12/changes.html | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index c69b301e..c6baee75 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -157,6 +157,8 @@ a work-in-progress.
   A new built-in function, __builtin_assoc_barrier, was added.
   It can be used to inhibit re-association of floating-point
   expressions.
+  Support for __builtin_dynamic_object_size compatible with
+  the clang language extension was added.
   New warnings:
 
   -Wbidi-chars warns about potentially misleading UTF-8
-- 
2.34.1

RE: [PATCH][1/4][committed] aarch64: Add support for Armv8.8-a memory operations and memcpy expansion

2022-01-27 Thread Kyrylo Tkachov via Gcc-patches

Hi Richard,

Sorry for the delay in getting back to this. I'm now working on a patch to 
adjust this.

> -Original Message-
> From: Richard Sandiford 
> Sent: Tuesday, December 14, 2021 10:48 AM
> To: Kyrylo Tkachov via Gcc-patches 
> Cc: Kyrylo Tkachov 
> Subject: Re: [PATCH][1/4][committed] aarch64: Add support for Armv8.8-a
> memory operations and memcpy expansion
> 
> Kyrylo Tkachov via Gcc-patches  writes:
> > @@ -23568,6 +23568,28 @@
> aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
> >*dst = aarch64_progress_pointer (*dst);
> >  }
> >
> > +/* Expand a cpymem using the MOPS extension.  OPERANDS are taken
> > +   from the cpymem pattern.  Return true iff we succeeded.  */
> > +static bool
> > +aarch64_expand_cpymem_mops (rtx *operands)
> > +{
> > +  if (!TARGET_MOPS)
> > +return false;
> > +  rtx addr_dst = XEXP (operands[0], 0);
> > +  rtx addr_src = XEXP (operands[1], 0);
> > +  rtx sz_reg = operands[2];
> > +
> > +  if (!REG_P (sz_reg))
> > +sz_reg = force_reg (DImode, sz_reg);
> > +  if (!REG_P (addr_dst))
> > +addr_dst = force_reg (DImode, addr_dst);
> > +  if (!REG_P (addr_src))
> > +addr_src = force_reg (DImode, addr_src);
> > +  emit_insn (gen_aarch64_cpymemdi (addr_dst, addr_src, sz_reg));
> > +
> > +  return true;
> > +}
> 
> On this, I think it would be better to adjust the original src and dst
> MEMs if possible, since they contain metadata about the size of the
> access and alias set information.  It looks like the code above
> generates an instruction with a wild read and a wild write instead.
> 

Hmm, do you mean adjust the address of the MEMs in operands with something like 
replace_equiv_address_nv?

> It should be possible to do that with a define_expand/define_insn
> pair, where the define_expand takes two extra operands for the MEMs,
> but the define_insn contains the same operands as now.

Could you please expand on this a bit? This path is reached from the cpymemdi 
expander that already takes the two MEMs as operands and generates the 
aarch64_cpymemdi define_insn that uses just the address registers as operands. 
Should we carry the MEMs around in the define_insn as well after expand?


> 
> Since the instruction clobbers the three registers, I think we have to
> use copy_to_reg (unconditionally) to force a fresh register.  The ultimate
> caller is not expecting the values of the registers in the original
> address to change.

Thanks,
Kyrill

> 
> Thanks,
> Richard
> 
> 
> 
> > +
> >  /* Expand cpymem, as if from a __builtin_memcpy.  Return true if
> > we succeed, otherwise return false, indicating that a libcall to
> > memcpy should be emitted.  */
> > @@ -23581,19 +23603,25 @@ aarch64_expand_cpymem (rtx *operands)
> >rtx base;
> >machine_mode cur_mode = BLKmode;
> >
> > -  /* Only expand fixed-size copies.  */
> > +  /* Variable-sized memcpy can go through the MOPS expansion if
> available.  */
> >if (!CONST_INT_P (operands[2]))
> > -return false;
> > +return aarch64_expand_cpymem_mops (operands);
> >
> >unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
> >
> > -  /* Try to inline up to 256 bytes.  */
> > -  unsigned HOST_WIDE_INT max_copy_size = 256;
> > +  /* Try to inline up to 256 bytes or use the MOPS threshold if available. 
> >  */
> > +  unsigned HOST_WIDE_INT max_copy_size
> > += TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
> >
> >bool size_p = optimize_function_for_size_p (cfun);
> >
> > +  /* Large constant-sized cpymem should go through MOPS when possible.
> > + It should be a win even for size optimization in the general case.
> > + For speed optimization the choice between MOPS and the SIMD
> sequence
> > + depends on the size of the copy, rather than number of instructions,
> > + alignment etc.  */
> >if (size > max_copy_size)
> > -return false;
> > +return aarch64_expand_cpymem_mops (operands);
> >
> >int copy_bits = 256;
> >
> > @@ -23643,9 +23671,9 @@ aarch64_expand_cpymem (rtx *operands)
> >nops += 2;
> >n -= mode_bits;
> >
> > -  /* Emit trailing copies using overlapping unaligned accesses - this 
> > is
> > -smaller and faster.  */
> > -  if (n > 0 && n < copy_bits / 2)
> > +  /* Emit trailing copies using overlapping unaligned accesses
> > +   (when !STRICT_ALIGNMENT) - this is smaller and faster.  */
> > +  if (n > 0 && n < copy_bits / 2 && !STRICT_ALIGNMENT)
> > {
> >   machine_mode next_mode = smallest_mode_for_size (n,
> MODE_INT);
> >   int n_bits = GET_MODE_BITSIZE (next_mode).to_constant ();
> > @@ -23657,9 +23685,25 @@ aarch64_expand_cpymem (rtx *operands)
> >  }
> >rtx_insn *seq = get_insns ();
> >end_sequence ();
> > +  /* MOPS sequence requires 3 instructions for the memory copying + 1 to
> move
> > + the constant size into a register.  */
> > +  unsigned mops_cost = 3 + 1;
> > +
> > +  /* If MOPS is available at this point we don't consider the

Re: [PATCH 3/7] arm: Add option for mitigating against Cortex-A CPU erratum for AES

2022-01-27 Thread Jakub Jelinek via Gcc-patches

On Thu, Jan 20, 2022 at 11:27:20AM +, Richard Earnshaw via Gcc-patches 
wrote:
> gcc/ChangeLog:
> 
>   * config/arm/arm.opt (mfix-cortex-a57-aes-1742098): New command-line
>   option.
>   (mfix-cortex-a72-aes-1655431): New option alias.

> --- a/gcc/config/arm/arm.opt
> +++ b/gcc/config/arm/arm.opt
> @@ -272,6 +272,16 @@ mfix-cmse-cve-2021-35465
>  Target Var(fix_vlldm) Init(2)
>  Mitigate issues with VLLDM on some M-profile devices (CVE-2021-35465).
>  
> +mfix-cortex-a57-aes-1742098
> +Target Var(fix_aes_erratum_1742098) Init(2) Save
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1742098
> +
> +mfix-cortex-a72-aes-1655431
> +Target Alias(mfix-cortex-a57-aes-1742098)
> +Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
> +Arm erratum #1655431
> +
>  munaligned-access
>  Target Var(unaligned_access) Init(2) Save
>  Enable unaligned word and halfword accesses to packed data.

This breaks:
Running /usr/src/gcc/gcc/testsuite/gcc.misc-tests/help.exp ...
FAIL: compiler driver --help=target option(s): "^ +-.*[^:.]$" absent from 
output: "  -mfix-cortex-a57-aes-1742098 Mitigate issues with AES instructions 
on Cortex-A57 and Cortex-A72. Arm erratum #1742098"

help.exp with help of lib/options.exp tests whether all non-empty descriptions 
of
options are terminated with . or :.

The following patch fixes that, ok for trunk?

2022-01-27  Jakub Jelinek  

* config/arm/arm.opt (mfix-cortex-a57-aes-1742098,
mfix-cortex-a72-aes-1655431): Ensure description ends with full stop.

--- gcc/config/arm/arm.opt.jj   2022-01-21 22:43:22.879719389 +0100
+++ gcc/config/arm/arm.opt  2022-01-27 11:02:29.457553296 +0100
@@ -274,13 +274,13 @@ Mitigate issues with VLLDM on some M-pro
 
 mfix-cortex-a57-aes-1742098
 Target Var(fix_aes_erratum_1742098) Init(2) Save
-Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
-Arm erratum #1742098
+Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72
+(Arm erratum #1742098).
 
 mfix-cortex-a72-aes-1655431
 Target Alias(mfix-cortex-a57-aes-1742098)
-Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72.
-Arm erratum #1655431
+Mitigate issues with AES instructions on Cortex-A57 and Cortex-A72
+(Arm erratum #1655431).
 
 munaligned-access
 Target Var(unaligned_access) Init(2) Save

Jakub

Re: [PATCH 1/2] Check negative combined step

2022-01-27 Thread Richard Biener via Gcc-patches

On Thu, 27 Jan 2022, Jiufu Guo wrote:

> Hi Richard,
> 
> Richard Biener  writes:
> 
> > On Tue, 25 Jan 2022, Richard Biener wrote:
> >
> >> On Tue, 25 Jan 2022, Jiufu Guo wrote:
> >>
> ...
> >> > >
> >> > > The point is that we may not change the iteration number at which
> >> > > overflow occurs since that alters the result of the < compare.
> >> > > Only if we know there is no overflow with the present expression
> >> > > during the loop evaluation we can do the transform and then only
> >> > > if we do not introduce overflow.
> >> > Exactly, this is also what I mean :)
> >> > 
> >> > >
> >> > > We are basically doing the transform that fold_comparison
> >> > > in fold-const.cc does:
> >> > >
> >> > >   /* Transform comparisons of the form X +- C1 CMP Y +- C2 to
> >> > >  X CMP Y +- C2 +- C1 for signed X, Y.  This is valid if
> >> > >  the resulting offset is smaller in absolute value than the
> >> > >  original one and has the same sign.  */
> >> > >   if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
> >> > >   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
> >> > > ...
> >> > >
> >> > This transform seems not the scenario which we are care about in
> >> > number_of_iterations_cond.
> >> > For example, 'X + 1 < Y + 4' ==> 'X < Y + 3' would be correct if
> >> > no overflow happen.
> >> > But for loop, the niter for '{X, 1} < {Y, 4}' would be totally
> >> > different with niter for '{X, 0} < {Y, 3}'.
> >> > for (iv0 = X, iv1 = Y; iv0 < iv1; iv0 += 1, iv1 += 4)
> >> >in this loop, iv1 walks to overflow faster, step is 4.
> >> > vs.
> >> > for (iv0 = X, iv1 = Y; iv0 < iv1; iv1 += 3) (iv1 overflow slow)
> >> >in this loop, iv1 overflows slower, step is 3.
> >> 
> >> Huh?  But we are _exactly_ doing this, analyzing {X, + 1} < {Y, + 4}
> >> as X < {Y, + 3} (well, OK, we're only trying {X, -3} which now
> >> fails - we should try the other way around as well).
> >> 
> >> > Actually, the transformation 'X + 1 < Y + 4' ==> 'X < Y + 3',
> >> > may not always correct.  e.g. for below code, X=6, and Y=2147483645
> >> > it may output "b0 + 1 < b1 + 4 is true".
> >> 
> >> But Y + 4 overflows with 2147483645 so X + 1 < Y + 4 invokes UB
> >> and we can ignore this situation.
> >> 
> ...
> >> > >> > Or it may also ok if we can compute an assumption, under which
> >> > >> > all three ivs are not overflowed/wrapped.
> >> > >> >
> >> > >> >> only guaranteed if the absolute value of C1 - C2 is smaller than
> >> > >> >> that of C1 and if it has the same sign.
> >> > >> > I'm thinking this in another way:
> >> > >> > When trying to do this transform in number_of_iterations_cond,
> >> > >> > GT/GE is inverted to LT/LE, then the compare code would be:
> >> > >> > LT/LE or NE.
> >> > >> >
> >> > >> > For LT/LE, like {X, C1} < {Y, C2}, we can look it as iv0 is
> >> > >> > chasing iv1.  We would able to assume X < Y (may_be_zero would
> >> > >> > be set later via number_of_iterations_lt/le).
> >> > >> > 1. If C1 < C2, iv0 can never catch up iv1. For examples:
> >> > >> > {X, 1} < {Y, 2}; {X, -2} < {Y, -1}; {X, -2} < {Y, 1}.
> >> > >> > And there must be at least one overflow/wrap in iv0,iv1, or iv.
> >> > >> > This indicates, if the sign of (C1 - C1) is negative, then the
> >> > >> > transform would be incorrect.
> >> > >> > 2. If C1 > C2, we still need to make sure all the ivs (iv0,
> >> > >> > iv1 and combined iv) are not wrapped.
> >> > >> > For C2 > 0, {Y,C2} should not cross MAX before {X, C1} catch up.
> >> > >> >the assumption may like : (MAX-Y)/C2 > (Y-X)/(C1-C1)
> >> > >> There is still some cases: iv0 step is too large, then iv0 wraps
> >> > >> first, e.g. {MAX-5, 10} < {MAX-3, 1}. For this, the assumption
> >> > >> would need to and with (MAX-X)/C1 > (Y-X)/(C1-C1).
> >> > >> 
> >> > >> > For C1 < 0, {X,C1} should not down cross MIN
> >> > >> >the assumption may like : (X-MIN)/-C1 > (Y-X)/(C1-C1)
> >> > >>   Also add the assumption: (Y-MIN)/-C2 > (Y-X)/(C1-C1)
> >> > >> 
> >> > >> > For C1 > 0 and C2 < 0, iv0 and iv1 are walking to each other,
> >> > >> > it would be almost safe.
> >> > >> For this case, we may still add the assumption to avoid wraping
> >> > >> at the first iteration.
> >> > >> 
> >> > >> BR,
> >> > >> Jiufu
> >> > >> 
> >> > >> >
> >> > >> > For NE, it seems more interesting. The transformation depends
> >> > >> > on 3 things: 1. the relation between X and Y; 2 the sign
> >> > >> > of (C1-C2); 3. if iv0 and iv1 can be equal finally.
> >> > >> > The 3rd one may be more special.
> >> > >> > The good news is, number_of_iterations_ne seems able to handle NE.
> >> > >> >
> >> > >> >>
> >> > >> >> With the same reasoning we then know the new IV0 doesn't overflow.
> >> > >> >>
> >> > >> >> So something like the following.  IIRC I've proposed sth similar
> >> > >> >> a while back.  I'm going to give it some testing, collecting
> >> > >> >> testcases from related PRs.
> >> > >> >>
> >> > >> >> diff --git a/gcc/tree-ssa-loop-niter.cc 
> >> > >> >> b/gcc/tree-ssa-loop-niter.cc
> >> > >>

Re: [PATCH] reassoc: Fix up inter-bb range optimization [PR104196]

2022-01-27 Thread Richard Biener via Gcc-patches

On Thu, 27 Jan 2022, Jakub Jelinek wrote:

> Hi!
> 
> As mentioned in the PR, reassoc1 miscompiles following testcase.
> We have:
>   if (a.1_1 >= 0)
> goto ; [59.00%]
>   else
> goto ; [41.00%]
> 
>[local count: 440234144]:
>   _3 = -2147483647 - a.1_1;
>   _9 = a.1_1 != -2147479551;
>   _4 = _3 == 1;
>   _8 = _4 | _9;
>   if (_8 != 0)
> goto ; [34.51%]
>   else
> goto ; [65.49%]
> 
> and the inter-bb range test optimization treats it as:
>   if ((a.1_1 >= 0)
>   | (-2147483647 - a.1_1 == 1)
>   | (a.1_1 != -2147479551))
> goto bb5;
>   else
> goto bb3;
> and recognizes that a.1_1 >= 0 is redundant with a.1_1 != -2147479551
> and so will optimize it into:
>   if (0
>   | (-2147483647 - a.1_1 == 1)
>   | (a.1_1 != -2147479551))
> goto bb5;
>   else
> goto bb3;
> When merging 2 comparisons, we use update_range_test which picks one
> of the comparisons as the one holding the result (currently always
> the RANGE one rather than all the OTHERRANGE* ones) and adjusts the
> others to be true or false.
> The problem with doing that is that means the
>   _3 = -2147483647 - a.1_1;
> stmt with undefined behavior on overflow used to be conditional before
> but now is unconditional.  reassoc performs a no_side_effect_bb check
> which among other checks punts on gimple_has_side_effects and
> gimple_assign_rhs_could_trap_p stmts as well as ones that have uses of
> their lhs outside of the same bb, but it doesn't punt for this potential
> signed overflow case.
> 
> Now, for this testcase, it can be fixed in update_range_test by being
> smarter and choosing the other comparison to modify.  This is achieved
> by storing into oe->id index of the bb with GIMPLE_COND the
> comparison feeds into not just for the cases where the comparison is
> the GIMPLE_COND itself, but in all cases, and then finding oe->id that
> isn't dominated by others.  If we find such, use that oe for the merge
> test and if successful, swap range->idx and swap_with->idx.
> So for the above case we optimize it into:
>   if ((a.1_1 != -2147479551)
>   | (-2147483647 - a.1_1 == 1)
>   | 0)
> goto bb5;
>   else
> goto bb3;
> instead.
> 
> Unfortunately, this doesn't work in all cases,
> optimize_range_tests_to_bit_test and
> optimize_range_tests_cmp_bitwise optimizations use non-NULL seq
> to update_range_test and they carefully choose a particular comparison
> because the sequence then uses SSA_NAMEs that may be defined only in
> their blocks.  For that case, the patch keeps using the chosen comparison
> but if the merge is successful, rewrites stmts with signed overflow behavior
> into defined overflow.
> For this I ran into a problem, rewrite_to_defined_overflow returns a
> sequence that includes the original stmt with modified arguments, this means
> it needs to be gsi_remove first.  Unfortunately, gsi_remove regardless of
> the remove_permanently argument creates debug temps for the lhs, which I
> think is quite undesirable here.  So I've added an argument (default to
> false) to rewrite_to_defined_overflow to do the modification in place
> without the need to remove the stmt.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2022-01-27  Jakub Jelinek  
> 
>   PR tree-optimization/104196
>   * gimple-fold.h (rewrite_to_defined_overflow): Add IN_PLACE argument.
>   * gimple-fold.cc (rewrite_to_defined_overflow): Likewise.  If true,
>   return NULL and emit needed stmts before and after stmt.
>   * tree-ssa-reassoc.cc (update_range_test): For inter-bb range opt
>   pick as operand_entry that will hold the merged test the one feeding
>   earliest condition, ensure that by swapping range->idx with some
>   other range's idx if needed.  If seq is non-NULL, don't actually swap
>   it but instead rewrite stmts with undefined overflow in between
>   the two locations.
>   (maybe_optimize_range_tests): Set ops[]->id to bb->index with the
>   corresponding condition even if they have non-NULL ops[]->op.
>   Formatting fix.
> 
>   * gcc.c-torture/execute/pr104196.c: New test.
> 
> --- gcc/gimple-fold.h.jj  2022-01-18 11:58:59.615981585 +0100
> +++ gcc/gimple-fold.h 2022-01-26 18:46:04.524911097 +0100
> @@ -62,7 +62,7 @@ extern tree gimple_fold_indirect_ref (tr
>  extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *);
>  extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *);
>  extern bool arith_code_with_undefined_signed_overflow (tree_code);
> -extern gimple_seq rewrite_to_defined_overflow (gimple *);
> +extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false);
>  extern void replace_call_with_value (gimple_stmt_iterator *, tree);
>  extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, 
> tree);
>  
> --- gcc/gimple-fold.cc.jj 2022-01-21 11:01:12.473449208 +0100
> +++ gcc/gimple-fold.cc2022-01-26 18:52:52.094085990

Re: [PATCH 1/2] Check negative combined step

2022-01-27 Thread Jiufu Guo via Gcc-patches

Hi Richard,

Richard Biener  writes:

> On Tue, 25 Jan 2022, Richard Biener wrote:
>
>> On Tue, 25 Jan 2022, Jiufu Guo wrote:
>>
...
>> > >
>> > > The point is that we may not change the iteration number at which
>> > > overflow occurs since that alters the result of the < compare.
>> > > Only if we know there is no overflow with the present expression
>> > > during the loop evaluation we can do the transform and then only
>> > > if we do not introduce overflow.
>> > Exactly, this is also what I mean :)
>> > 
>> > >
>> > > We are basically doing the transform that fold_comparison
>> > > in fold-const.cc does:
>> > >
>> > >   /* Transform comparisons of the form X +- C1 CMP Y +- C2 to
>> > >  X CMP Y +- C2 +- C1 for signed X, Y.  This is valid if
>> > >  the resulting offset is smaller in absolute value than the
>> > >  original one and has the same sign.  */
>> > >   if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
>> > >   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
>> > > ...
>> > >
>> > This transform seems not the scenario which we are care about in
>> > number_of_iterations_cond.
>> > For example, 'X + 1 < Y + 4' ==> 'X < Y + 3' would be correct if
>> > no overflow happen.
>> > But for loop, the niter for '{X, 1} < {Y, 4}' would be totally
>> > different with niter for '{X, 0} < {Y, 3}'.
>> > for (iv0 = X, iv1 = Y; iv0 < iv1; iv0 += 1, iv1 += 4)
>> >in this loop, iv1 walks to overflow faster, step is 4.
>> > vs.
>> > for (iv0 = X, iv1 = Y; iv0 < iv1; iv1 += 3) (iv1 overflow slow)
>> >in this loop, iv1 overflows slower, step is 3.
>> 
>> Huh?  But we are _exactly_ doing this, analyzing {X, + 1} < {Y, + 4}
>> as X < {Y, + 3} (well, OK, we're only trying {X, -3} which now
>> fails - we should try the other way around as well).
>> 
>> > Actually, the transformation 'X + 1 < Y + 4' ==> 'X < Y + 3',
>> > may not always correct.  e.g. for below code, X=6, and Y=2147483645
>> > it may output "b0 + 1 < b1 + 4 is true".
>> 
>> But Y + 4 overflows with 2147483645 so X + 1 < Y + 4 invokes UB
>> and we can ignore this situation.
>> 
...
>> > >> > Or it may also ok if we can compute an assumption, under which
>> > >> > all three ivs are not overflowed/wrapped.
>> > >> >
>> > >> >> only guaranteed if the absolute value of C1 - C2 is smaller than
>> > >> >> that of C1 and if it has the same sign.
>> > >> > I'm thinking this in another way:
>> > >> > When trying to do this transform in number_of_iterations_cond,
>> > >> > GT/GE is inverted to LT/LE, then the compare code would be:
>> > >> > LT/LE or NE.
>> > >> >
>> > >> > For LT/LE, like {X, C1} < {Y, C2}, we can look it as iv0 is
>> > >> > chasing iv1.  We would able to assume X < Y (may_be_zero would
>> > >> > be set later via number_of_iterations_lt/le).
>> > >> > 1. If C1 < C2, iv0 can never catch up iv1. For examples:
>> > >> > {X, 1} < {Y, 2}; {X, -2} < {Y, -1}; {X, -2} < {Y, 1}.
>> > >> > And there must be at least one overflow/wrap in iv0,iv1, or iv.
>> > >> > This indicates, if the sign of (C1 - C1) is negative, then the
>> > >> > transform would be incorrect.
>> > >> > 2. If C1 > C2, we still need to make sure all the ivs (iv0,
>> > >> > iv1 and combined iv) are not wrapped.
>> > >> > For C2 > 0, {Y,C2} should not cross MAX before {X, C1} catch up.
>> > >> >the assumption may like : (MAX-Y)/C2 > (Y-X)/(C1-C1)
>> > >> There is still some cases: iv0 step is too large, then iv0 wraps
>> > >> first, e.g. {MAX-5, 10} < {MAX-3, 1}. For this, the assumption
>> > >> would need to and with (MAX-X)/C1 > (Y-X)/(C1-C1).
>> > >> 
>> > >> > For C1 < 0, {X,C1} should not down cross MIN
>> > >> >the assumption may like : (X-MIN)/-C1 > (Y-X)/(C1-C1)
>> > >>   Also add the assumption: (Y-MIN)/-C2 > (Y-X)/(C1-C1)
>> > >> 
>> > >> > For C1 > 0 and C2 < 0, iv0 and iv1 are walking to each other,
>> > >> > it would be almost safe.
>> > >> For this case, we may still add the assumption to avoid wraping
>> > >> at the first iteration.
>> > >> 
>> > >> BR,
>> > >> Jiufu
>> > >> 
>> > >> >
>> > >> > For NE, it seems more interesting. The transformation depends
>> > >> > on 3 things: 1. the relation between X and Y; 2 the sign
>> > >> > of (C1-C2); 3. if iv0 and iv1 can be equal finally.
>> > >> > The 3rd one may be more special.
>> > >> > The good news is, number_of_iterations_ne seems able to handle NE.
>> > >> >
>> > >> >>
>> > >> >> With the same reasoning we then know the new IV0 doesn't overflow.
>> > >> >>
>> > >> >> So something like the following.  IIRC I've proposed sth similar
>> > >> >> a while back.  I'm going to give it some testing, collecting
>> > >> >> testcases from related PRs.
>> > >> >>
>> > >> >> diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
>> > >> >> index b767056aeb0..74fa4f66ee2 100644
>> > ...
>> > >> >> +  if (TREE_CODE (step) != INTEGER_CST
>> > >> >> + || !iv0->no_overflow || !iv1->no_overflow)
>> > >> >> +   {
>> > >> > I was also trying to leverage no_overflow of iv0 and iv1. While it

[PATCH] reassoc: Fix up inter-bb range optimization [PR104196]

2022-01-27 Thread Jakub Jelinek via Gcc-patches

Hi!

As mentioned in the PR, reassoc1 miscompiles following testcase.
We have:
  if (a.1_1 >= 0)
goto ; [59.00%]
  else
goto ; [41.00%]

   [local count: 440234144]:
  _3 = -2147483647 - a.1_1;
  _9 = a.1_1 != -2147479551;
  _4 = _3 == 1;
  _8 = _4 | _9;
  if (_8 != 0)
goto ; [34.51%]
  else
goto ; [65.49%]

and the inter-bb range test optimization treats it as:
  if ((a.1_1 >= 0)
  | (-2147483647 - a.1_1 == 1)
  | (a.1_1 != -2147479551))
goto bb5;
  else
goto bb3;
and recognizes that a.1_1 >= 0 is redundant with a.1_1 != -2147479551
and so will optimize it into:
  if (0
  | (-2147483647 - a.1_1 == 1)
  | (a.1_1 != -2147479551))
goto bb5;
  else
goto bb3;
When merging 2 comparisons, we use update_range_test which picks one
of the comparisons as the one holding the result (currently always
the RANGE one rather than all the OTHERRANGE* ones) and adjusts the
others to be true or false.
The problem with doing that is that means the
  _3 = -2147483647 - a.1_1;
stmt with undefined behavior on overflow used to be conditional before
but now is unconditional.  reassoc performs a no_side_effect_bb check
which among other checks punts on gimple_has_side_effects and
gimple_assign_rhs_could_trap_p stmts as well as ones that have uses of
their lhs outside of the same bb, but it doesn't punt for this potential
signed overflow case.

Now, for this testcase, it can be fixed in update_range_test by being
smarter and choosing the other comparison to modify.  This is achieved
by storing into oe->id index of the bb with GIMPLE_COND the
comparison feeds into not just for the cases where the comparison is
the GIMPLE_COND itself, but in all cases, and then finding oe->id that
isn't dominated by others.  If we find such, use that oe for the merge
test and if successful, swap range->idx and swap_with->idx.
So for the above case we optimize it into:
  if ((a.1_1 != -2147479551)
  | (-2147483647 - a.1_1 == 1)
  | 0)
goto bb5;
  else
goto bb3;
instead.

Unfortunately, this doesn't work in all cases,
optimize_range_tests_to_bit_test and
optimize_range_tests_cmp_bitwise optimizations use non-NULL seq
to update_range_test and they carefully choose a particular comparison
because the sequence then uses SSA_NAMEs that may be defined only in
their blocks.  For that case, the patch keeps using the chosen comparison
but if the merge is successful, rewrites stmts with signed overflow behavior
into defined overflow.
For this I ran into a problem, rewrite_to_defined_overflow returns a
sequence that includes the original stmt with modified arguments, this means
it needs to be gsi_remove first.  Unfortunately, gsi_remove regardless of
the remove_permanently argument creates debug temps for the lhs, which I
think is quite undesirable here.  So I've added an argument (default to
false) to rewrite_to_defined_overflow to do the modification in place
without the need to remove the stmt.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-01-27  Jakub Jelinek  

PR tree-optimization/104196
* gimple-fold.h (rewrite_to_defined_overflow): Add IN_PLACE argument.
* gimple-fold.cc (rewrite_to_defined_overflow): Likewise.  If true,
return NULL and emit needed stmts before and after stmt.
* tree-ssa-reassoc.cc (update_range_test): For inter-bb range opt
pick as operand_entry that will hold the merged test the one feeding
earliest condition, ensure that by swapping range->idx with some
other range's idx if needed.  If seq is non-NULL, don't actually swap
it but instead rewrite stmts with undefined overflow in between
the two locations.
(maybe_optimize_range_tests): Set ops[]->id to bb->index with the
corresponding condition even if they have non-NULL ops[]->op.
Formatting fix.

* gcc.c-torture/execute/pr104196.c: New test.

--- gcc/gimple-fold.h.jj2022-01-18 11:58:59.615981585 +0100
+++ gcc/gimple-fold.h   2022-01-26 18:46:04.524911097 +0100
@@ -62,7 +62,7 @@ extern tree gimple_fold_indirect_ref (tr
 extern bool gimple_fold_builtin_sprintf (gimple_stmt_iterator *);
 extern bool gimple_fold_builtin_snprintf (gimple_stmt_iterator *);
 extern bool arith_code_with_undefined_signed_overflow (tree_code);
-extern gimple_seq rewrite_to_defined_overflow (gimple *);
+extern gimple_seq rewrite_to_defined_overflow (gimple *, bool = false);
 extern void replace_call_with_value (gimple_stmt_iterator *, tree);
 extern tree tree_vec_extract (gimple_stmt_iterator *, tree, tree, tree, tree);
 
--- gcc/gimple-fold.cc.jj   2022-01-21 11:01:12.473449208 +0100
+++ gcc/gimple-fold.cc  2022-01-26 18:52:52.094085990 +0100
@@ -8539,11 +8539,12 @@ arith_code_with_undefined_signed_overflo
its operand, carrying out the operation in the corresponding unsigned
type and converting the result back to the original type.
 
-   Returns a sequence of statements that replace STMT

76 matches

Mail list logo