Re: [PATCH v2] match.pd: rewrite select to branchless expression

2022-11-10 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 11 Nov 2022 at 07:58, Michael Collison  wrote:
>
> This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
> (-(typeof(y))(x & 0x1) & z) <op> y, where op is a '^' or a '|'. It also
> transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
> 0x1)) & z ) op y.
>
> Matching these patterns allows GCC to generate branchless code for one of
> the functions in coremark.
>
> Bootstrapped and tested on x86 and RISC-V. Okay?
>
> Michael.
>
> 2022-11-10  Michael Collison  
>
>  * match.pd ((x & 0x1) == 0) ? y : z <op> y
>  -> (-(typeof(y))(x & 0x1) & z) <op> y.
>
> 2022-11-10  Michael Collison 
>
>  * gcc.dg/tree-ssa/branchless-cond.c: New test.
>
> ---
>
> Changes in v2:
>
> - Rewrite comment to use C syntax
>
> - Guard against 1-bit types
>
> - Simplify pattern by using zero_one_valued_p
>
>   gcc/match.pd  | 24 +
>   .../gcc.dg/tree-ssa/branchless-cond.c | 26 +++
>   2 files changed, 50 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 194ba8f5188..258531e9046 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3486,6 +3486,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
> (max @2 @1))
>
> +/* ((x & 0x1) == 0) ? y : z <op> y -> (-(typeof(y))(x & 0x1) & z) <op> y */
> +(for op (bit_xor bit_ior)
> + (simplify
> +  (cond (eq zero_one_valued_p@0
> +integer_zerop)
> +@1
> +(op:c @2 @1))
> +  (if (INTEGRAL_TYPE_P (type)
> +   && TYPE_PRECISION (type) > 1
> +   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
> +   (op (bit_and (negate (convert:type @0)) @2) @1))))
> +
> +/* ((x & 0x1) == 0) ? z <op> y : y -> (-(typeof(y))(x & 0x1) & z) <op> y */
> +(for op (bit_xor bit_ior)
> + (simplify
> +  (cond (ne zero_one_valued_p@0
> +integer_zerop)
> +   (op:c @2 @1)
> +@1)
> +  (if (INTEGRAL_TYPE_P (type)
> +   && TYPE_PRECISION (type) > 1
> +   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
> +   (op (bit_and (negate (convert:type @0)) @2) @1))))
> +
>   /* Simplifications of shift and rotates.  */
>
>   (for rotate (lrotate rrotate)
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> new file mode 100644
> index 000..68087ae6568
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +int f1(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z ^ y;
> +}
> +
> +int f2(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z ^ y : y;
> +}
> +
> +int f3(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) == 0) ? y : z | y;
> +}
> +
> +int f4(unsigned int x, unsigned int y, unsigned int z)
> +{
> +  return ((x & 1) != 0) ? z | y : y;
> +}
Sorry to nitpick -- since the pattern gates on INTEGRAL_TYPE_P, would it be
a good idea to have these tests for other integral types besides int too,
such as {char, short, long}?

Thanks,
Prathamesh
> +
> +/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
> +/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
> --
> 2.34.1
>
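
For readers following the thread, here is a minimal sketch of the identity the
patch relies on, written as ordinary C++ rather than match.pd.  The function
names are hypothetical and are not part of the patch; the templates are only
there so the same check can be run on several integral types, as suggested in
the review comment.

```cpp
// Reference form: the conditional select from the patch description.
template <typename T>
T select_xor(unsigned x, T y, T z) {
  return ((x & 1u) == 0) ? y : static_cast<T>(z ^ y);
}

// Branchless form: -(T)(x & 1) is either all-zeros or all-ones,
// so ANDing it with z yields either 0 or z before the final XOR.
template <typename T>
T branchless_xor(unsigned x, T y, T z) {
  const T mask = static_cast<T>(-static_cast<T>(x & 1u));
  return static_cast<T>((mask & z) ^ y);
}

// Same rewrite with '|' as the operator.
template <typename T>
T branchless_ior(unsigned x, T y, T z) {
  const T mask = static_cast<T>(-static_cast<T>(x & 1u));
  return static_cast<T>((mask & z) | y);
}
```

When the low bit of x is clear the mask is zero and both forms return y; when
it is set the mask is all-ones and both return z op y.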


[PATCH] c++: Implement CWG 2654 - Un-deprecation of compound volatile assignments

2022-11-10 Thread Jakub Jelinek via Gcc-patches
Hi!

Again, because stage1 close is near, posting the following patch
to implement CWG 2654.

Ok for trunk if it passes bootstrap/regtest and is voted into C++23
and C++20 as a DR?

2022-11-11  Jakub Jelinek  

* typeck.cc (cp_build_modify_expr): Implement CWG 2654
- Un-deprecation of compound volatile assignments.  Remove
-Wvolatile warning about compound volatile assignments.

* g++.dg/cpp2a/volatile1.C (fn2, fn3, raccoon): Adjust expected
diagnostics.
* g++.dg/cpp2a/volatile3.C (fn2, fn3, raccoon): Likewise.
* g++.dg/cpp2a/volatile5.C (f): Likewise.
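
As a rough illustration of the kind of code affected, the sketch below uses
both arithmetic and bitwise compound assignments on a volatile object.  The
variable and function names are hypothetical; according to the comment removed
by this patch, only the arithmetic form used to be diagnosed by -Wvolatile.

```cpp
// Hypothetical counter, modelled as an ordinary volatile object.
volatile unsigned int tick_count = 0;

unsigned int bump_ticks(unsigned int step) {
  tick_count += step;    // arithmetic compound assignment on a volatile lvalue
  tick_count |= 0x100u;  // bitwise compound assignment, not covered by the old warning
  return tick_count;
}
```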

--- gcc/cp/typeck.cc.jj 2022-11-09 11:22:42.617628059 +0100
+++ gcc/cp/typeck.cc 2022-11-10 23:19:00.394228067 +0100
@@ -9513,19 +9513,6 @@ cp_build_modify_expr (location_t loc, tr
 && MAYBE_CLASS_TYPE_P (TREE_TYPE (lhstype)))
|| MAYBE_CLASS_TYPE_P (lhstype)));
 
- /* An expression of the form E1 op= E2.  [expr.ass] says:
-"Such expressions are deprecated if E1 has volatile-qualified
-type and op is not one of the bitwise operators |, &, ^."
-We warn here rather than in cp_genericize_r because
-for compound assignments we are supposed to warn even if the
-assignment is a discarded-value expression.  */
- if (modifycode != BIT_AND_EXPR
- && modifycode != BIT_IOR_EXPR
- && modifycode != BIT_XOR_EXPR
- && (TREE_THIS_VOLATILE (lhs) || CP_TYPE_VOLATILE_P (lhstype)))
-   warning_at (loc, OPT_Wvolatile,
-   "compound assignment with %<volatile%>-qualified left "
-   "operand is deprecated");
  /* Preevaluate the RHS to make sure its evaluation is complete
 before the lvalue-to-rvalue conversion of the LHS:
 
--- gcc/testsuite/g++.dg/cpp2a/volatile1.C.jj   2022-08-16 13:15:22.739043862 
+0200
+++ gcc/testsuite/g++.dg/cpp2a/volatile1.C  2022-11-10 23:23:18.949717772 
+0100
@@ -74,17 +74,17 @@ fn2 ()
   decltype(i = vi = 42) x3 = i;
 
   // Compound assignments.
-  vi += i; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
-  vi -= i; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
-  vi %= i; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
+  vi += i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
+  vi -= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
+  vi %= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
   vi ^= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
   vi |= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
   vi &= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
-  vi /= i; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
+  vi /= i; // { dg-bogus "assignment with .volatile.-qualified left operand is 
deprecated" }
   vi = vi += 42; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
   vi += vi = 42; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
   i *= vi;
-  decltype(vi -= 42) x2 = vi; // { dg-warning "assignment with 
.volatile.-qualified left operand is deprecated" "" { target c++20 } }
+  decltype(vi -= 42) x2 = vi; // { dg-bogus "assignment with 
.volatile.-qualified left operand is deprecated" }
 
   // Structured bindings.
   int a[] = { 10, 5 };
@@ -107,12 +107,12 @@ fn3 ()
   volatile U u;
   u.c = 42;
   i = u.c = 42; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
-  u.c += 42; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
+  u.c += 42; // { dg-bogus "assignment with .volatile.-qualified left operand 
is deprecated" }
 
   volatile T t;
   t.a = 3;
   j = t.a = 3; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
-  t.a += 3; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
+  t.a += 3; // { dg-bogus "assignment with .volatile.-qualified left operand 
is deprecated" }
 
   volatile int *src = 
   *src; // No assignment, don't warn.
@@ -135,7 +135,7 @@ void raccoon ()
   volatile T t, u;
   t = 42;
   u = t = 42; // { dg-warning "assignment with .volatile.-qualified left 
operand is deprecated" "" { target c++20 } }
-  t += 42; // { dg-warning "assignment with .volatile.-qualified left operand 
is deprecated" "" { target c++20 } }
+  t += 42; // { dg-bogus "assignment with 

[PATCH] c++: Implement C++23 P2589R1 - static operator[]

2022-11-10 Thread Jakub Jelinek via Gcc-patches
Hi!

As stage1 is very close, here is a patch that implements the static
operator[] paper.
One thing that doesn't work properly is the same problem I filed
yesterday for static operator() (PR107624): side-effects of the
postfix-expression on which the call or subscript operator is applied
are thrown away.  I assume we have to add them into a COMPOUND_EXPR
somewhere after we find out that we've chosen a static member function
operator.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
provided the paper gets voted into C++23?

2022-11-11  Jakub Jelinek  

gcc/c-family/
* c-cppbuiltin.cc (c_cpp_builtins): Bump C++23
__cpp_multidimensional_subscript macro value to 202211L.
gcc/cp/
* decl.cc (grok_op_properties): Implement C++23 P2589R1
- static operator[].  Handle operator[] similarly to operator()
- allow static member functions, but pedwarn on it for C++20 and
older.  Unlike operator(), perform rest of checks on it though for
C++20.
* call.cc (add_operator_candidates): For operator[] with class
typed first parameter, pass that parameter as first_arg and
an adjusted arglist without that parameter.
gcc/testsuite/
* g++.dg/cpp23/subscript9.C: New test.
* g++.dg/cpp23/feat-cxx2b.C: Expect a newer
__cpp_multidimensional_subscript value.
* g++.old-deja/g++.bugs/900210_10.C: Don't expect an error
for C++23 or later.
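
A small sketch of how user code might take advantage of the bumped
feature-test macro.  The type and function names are made up for
illustration; the fallback branch keeps the example valid for compilers or
language modes that do not advertise static operator[] support.

```cpp
struct Squares {
#if defined(__cpp_multidimensional_subscript) \
    && __cpp_multidimensional_subscript >= 202211L
  // P2589R1: the subscript operator needs no object state, so make it static.
  static constexpr int operator[](int i) { return i * i; }
#else
  // Older dialects: operator[] has to be a non-static member function.
  constexpr int operator[](int i) const { return i * i; }
#endif
};

// Works with either definition above.
inline int square_at(int i) { return Squares{}[i]; }
```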

--- gcc/c-family/c-cppbuiltin.cc.jj 2022-10-14 09:35:56.182990495 +0200
+++ gcc/c-family/c-cppbuiltin.cc 2022-11-10 22:29:12.539832741 +0100
@@ -1075,7 +1075,7 @@ c_cpp_builtins (cpp_reader *pfile)
  cpp_define (pfile, "__cpp_size_t_suffix=202011L");
  cpp_define (pfile, "__cpp_if_consteval=202106L");
  cpp_define (pfile, "__cpp_constexpr=202110L");
- cpp_define (pfile, "__cpp_multidimensional_subscript=202110L");
+ cpp_define (pfile, "__cpp_multidimensional_subscript=202211L");
  cpp_define (pfile, "__cpp_named_character_escapes=202207L");
  cpp_define (pfile, "__cpp_static_call_operator=202207L");
  cpp_define (pfile, "__cpp_implicit_move=202207L");
--- gcc/cp/decl.cc.jj   2022-11-08 09:54:37.313400209 +0100
+++ gcc/cp/decl.cc  2022-11-10 21:26:06.891359343 +0100
@@ -15377,7 +15377,15 @@ grok_op_properties (tree decl, bool comp
  an enumeration, or a reference to an enumeration.  13.4.0.6 */
   if (! methodp || DECL_STATIC_FUNCTION_P (decl))
 {
-  if (operator_code == CALL_EXPR)
+  if (operator_code == TYPE_EXPR
+ || operator_code == COMPONENT_REF
+ || operator_code == NOP_EXPR)
+   {
+ error_at (loc, "%qD must be a non-static member function", decl);
+ return false;
+   }
+
+  if (operator_code == CALL_EXPR || operator_code == ARRAY_REF)
{
  if (! DECL_STATIC_FUNCTION_P (decl))
{
@@ -15386,52 +15394,41 @@ grok_op_properties (tree decl, bool comp
}
  if (cxx_dialect < cxx23
  /* For lambdas we diagnose static lambda specifier elsewhere.  */
- && ! LAMBDA_FUNCTION_P (decl)
+ && (operator_code == ARRAY_REF || ! LAMBDA_FUNCTION_P (decl))
  /* For instantiations, we have diagnosed this already.  */
  && ! DECL_USE_TEMPLATE (decl))
pedwarn (loc, OPT_Wc__23_extensions, "%qD may be a static member "
- "function only with %<-std=c++23%> or %<-std=gnu++23%>", decl);
- /* There are no further restrictions on the arguments to an
-overloaded "operator ()".  */
- return true;
-   }
-  if (operator_code == TYPE_EXPR
- || operator_code == COMPONENT_REF
- || operator_code == ARRAY_REF
- || operator_code == NOP_EXPR)
-   {
- error_at (loc, "%qD must be a non-static member function", decl);
- return false;
+"function only with %<-std=c++23%> or %<-std=gnu++23%>",
+decl);
}
-
-  if (DECL_STATIC_FUNCTION_P (decl))
+  else if (DECL_STATIC_FUNCTION_P (decl))
{
  error_at (loc, "%qD must be either a non-static member "
"function or a non-member function", decl);
  return false;
}
-
-  for (tree arg = argtypes; ; arg = TREE_CHAIN (arg))
-   {
- if (!arg || arg == void_list_node)
-   {
- if (complain)
-   error_at(loc, "%qD must have an argument of class or "
-"enumerated type", decl);
- return false;
-   }
+  else
+   for (tree arg = argtypes; ; arg = TREE_CHAIN (arg))
+ {
+   if (!arg || arg == void_list_node)
+ {
+   if (complain)
+ error_at (loc, "%qD must have an argument of class or "
+   "enumerated type", decl);
+   

[committed] libstdc++: Fix tests with non-const operator==

2022-11-10 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

These tests fail in strict -std=c++20 mode, but their equality ops don't
need to be non-const; it looks like an accident.

This fixes two FAILs with -std=c++20:
FAIL: 20_util/tuple/swap.cc (test for excess errors)
FAIL: 26_numerics/valarray/87641.cc (test for excess errors)
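
A minimal sketch of the pattern being fixed, using a hypothetical type rather
than the ones in the testsuite: a const-qualified operator== can be called
through const references, which is how generic code usually sees its operands.

```cpp
// Hypothetical stand-in for the test types.
struct Value {
  int i;
  // const-qualified, so it is usable on const objects and temporaries.
  bool operator==(Value const& other) const { return i == other.i; }
};

// Generic-style helper that only ever sees const references.
inline bool equal_through_const_refs(const Value& a, const Value& b) {
  return a == b;
}
```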

libstdc++-v3/ChangeLog:

* testsuite/20_util/tuple/swap.cc (MoveOnly::operator==): Add
const qualifier.
* testsuite/26_numerics/valarray/87641.cc (X::operator==):
Likewise.
---
 libstdc++-v3/testsuite/20_util/tuple/swap.cc | 2 +-
 libstdc++-v3/testsuite/26_numerics/valarray/87641.cc | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/20_util/tuple/swap.cc 
b/libstdc++-v3/testsuite/20_util/tuple/swap.cc
index c086a4f1a8e..30c8322f01c 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/swap.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/swap.cc
@@ -38,7 +38,7 @@ struct MoveOnly
   MoveOnly(MoveOnly const&) = delete;
   MoveOnly& operator=(MoveOnly const&) = delete;
 
-  bool operator==(MoveOnly const& m)
+  bool operator==(MoveOnly const& m) const
   { return i == m.i; }
 
   void swap(MoveOnly& m)
diff --git a/libstdc++-v3/testsuite/26_numerics/valarray/87641.cc 
b/libstdc++-v3/testsuite/26_numerics/valarray/87641.cc
index 38c35851716..4a6e402831d 100644
--- a/libstdc++-v3/testsuite/26_numerics/valarray/87641.cc
+++ b/libstdc++-v3/testsuite/26_numerics/valarray/87641.cc
@@ -39,7 +39,7 @@ struct X
   X() : val(1) { }
 
   X& operator+=(const X& x) { val += x.val; return *this; }
-  bool operator==(const X& x) { return val == x.val; }
+  bool operator==(const X& x) const { return val == x.val; }
 
   int val;
 };
-- 
2.38.1



[committed] libstdc++: Add missing definition for <charconv> in C++14 mode

2022-11-10 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

We support <charconv> in C++14 as an extension, but that means that
constexpr static data members are not implicitly inline. Add an
out-of-class definition for C++14 mode.

This fixes a FAIL when -std=gnu++14 is used:
FAIL: 20_util/from_chars/1.cc (test for excess errors)
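
The same technique in a self-contained sketch, with hypothetical names that
are unrelated to the libstdc++ internals.  Before C++17 a constexpr static
data member is only declared in the class, so any odr-use (for example binding
a reference to it) needs a separate out-of-class definition.

```cpp
template <bool DecOnly>
struct digit_table {
  // C++14: a declaration only.  C++17 and later: implicitly inline.
  static constexpr int radix = DecOnly ? 10 : 36;
};

#if !defined(__cpp_inline_variables)
// C++14 and earlier: supply the definition that odr-uses will link against.
template <bool DecOnly>
constexpr int digit_table<DecOnly>::radix;
#endif

// Binding a reference odr-uses the member, so a definition must exist.
inline const int& decimal_radix() { return digit_table<true>::radix; }
```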

libstdc++-v3/ChangeLog:

* include/std/charconv (__from_chars_alnum_to_val_table::value):
[!__cpp_inline_variables]: Add non-inline definition.
---
 libstdc++-v3/include/std/charconv | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libstdc++-v3/include/std/charconv 
b/libstdc++-v3/include/std/charconv
index 09163af7fc9..acad865f8aa 100644
--- a/libstdc++-v3/include/std/charconv
+++ b/libstdc++-v3/include/std/charconv
@@ -444,6 +444,12 @@ namespace __detail
   static constexpr type value = (_DecOnly, _S_make_table());
 };
 
+#if ! __cpp_inline_variables
+  template<bool _DecOnly>
+const typename __from_chars_alnum_to_val_table<_DecOnly>::type
+  __from_chars_alnum_to_val_table<_DecOnly>::value;
+#endif
+
   // If _DecOnly is true: if the character is a decimal digit, then
   // return its corresponding base-10 value, otherwise return a value >= 127.
   // If _DecOnly is false: if the character is an alphanumeric digit, then
-- 
2.38.1



[committed] libstdc++: Fix test that uses C++17 variable template in C++14

2022-11-10 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

This test fails if run with -std=gnu++14 because it should be using
is_convertible instead of is_convertible_v.
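
For comparison, a sketch of both spellings side by side.  The struct is a
hypothetical stand-in for the test's type; the C++17-only variable template is
guarded so that the snippet still compiles in C++14 mode.

```cpp
#include <type_traits>

// Hypothetical type whose conversion to const int* is declared but deleted.
struct NoPtrConversion {
  operator const int*() const = delete;
};

// C++11/14-compatible spelling: read the trait's ::value member.
static_assert(!std::is_convertible<const NoPtrConversion, const int*>::value,
              "deleted conversion operator is not usable");

#if __cplusplus >= 201703L
// The _v variable template only exists from C++17 onwards.
static_assert(!std::is_convertible_v<const NoPtrConversion, const int*>,
              "same check, C++17 spelling");
#endif

constexpr bool converts_to_int_ptr =
    std::is_convertible<const NoPtrConversion, const int*>::value;
```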

libstdc++-v3/ChangeLog:

* testsuite/experimental/propagate_const/observers/107525.cc:
Use type trait instead of C++17 variable template.
---
 .../experimental/propagate_const/observers/107525.cc  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/experimental/propagate_const/observers/107525.cc 
b/libstdc++-v3/testsuite/experimental/propagate_const/observers/107525.cc
index e7ecff73c1a..37e970c3af4 100644
--- a/libstdc++-v3/testsuite/experimental/propagate_const/observers/107525.cc
+++ b/libstdc++-v3/testsuite/experimental/propagate_const/observers/107525.cc
@@ -32,10 +32,10 @@ test_const_conversion()
 operator const int*() const = delete;
   };
 
-  static_assert(!std::is_convertible_v<const X, const int*>,
+  static_assert(!std::is_convertible<const X, const int*>::value,
"Cannot convert const X to const int*");
   // So should not be able to convert const propagate_const<X> to const int*.
-  static_assert(!std::is_convertible_v<const propagate_const<X>, const int*>,
+  static_assert(!std::is_convertible<const propagate_const<X>, const int*>::value,
"So should not be able to convert const propagate_const<X> to "
"const int* (although this is not what LFTSv3 says)");
 }
-- 
2.38.1



[committed] libstdc++: Avoid redundant checks in std::use_facet [PR103755]

2022-11-10 Thread Jonathan Wakely via Gcc-patches
As discussed in the PR, this makes it three times faster to construct
iostreams objects.

Tested x86_64-linux. Pushed to trunk.

-- >8 --

We do not need to do bounds checks or a runtime dynamic_cast when using
std::has_facet and std::use_facet to access the default facets that are
guaranteed to be present in every std::locale object. We can just index
straight into the array and use a static_cast for the conversion.

This patch adds a new std::__try_use_facet function that is like
std::use_facet but returns a pointer, so can be used to implement both
std::has_facet and std::use_facet. We can then do the necessary
metaprogramming to skip the redundant checks in std::__try_use_facet.

To avoid having to export (or hide) instantiations of the new function
from libstdc++.so the instantiations are given hidden visibility. This
allows them to be used in the library, but user code will instantiate it
again using the definition in the header. That would happen anyway,
because there are no explicit instantiation declarations for any of
std::has_facet, std::use_facet, or the new std::__try_use_facet.
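
To make the interface idea concrete, here is a sketch of a pointer-returning
lookup built only from the public API.  It is not the libstdc++
implementation (the patch goes the other way round and implements has_facet
and use_facet on top of the internal function), and the helper and facet names
are hypothetical.

```cpp
#include <cstddef>
#include <locale>

// Hypothetical helper: returns a pointer to the facet, or nullptr if absent.
template <typename Facet>
const Facet* try_use_facet_sketch(const std::locale& loc) {
  return std::has_facet<Facet>(loc) ? &std::use_facet<Facet>(loc) : nullptr;
}

// A user-defined facet that no locale contains unless installed explicitly.
struct UnitFacet : std::locale::facet {
  static std::locale::id id;
  explicit UnitFacet(std::size_t refs = 0) : std::locale::facet(refs) {}
};
std::locale::id UnitFacet::id;
```

A single nullable result lets one function answer both "is the facet there?"
and "give me the facet", which is the property the commit message describes.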

libstdc++-v3/ChangeLog:

PR libstdc++/103755
* config/abi/pre/gnu.ver: Tighten patterns for facets in the
base version. Add exports for __try_use_facet.
* include/bits/basic_ios.tcc (basic_ios::_M_cache_locale): Use
__try_use_facet instead of has_facet and use_facet.
* include/bits/fstream.tcc (basic_filebuf::basic_filebuf()):
Likewise.
(basic_filebuf::imbue): Likewise.
* include/bits/locale_classes.h (locale, locale::id)
(locale::_Impl): Declare __try_use_facet as a friend.
* include/bits/locale_classes.tcc (__try_use_facet): Define new
function template with special cases for default facets.
(has_facet, use_facet): Call __try_use_facet.
* include/bits/locale_facets.tcc (__try_use_facet): Declare
explicit instantiations.
* include/bits/locale_facets_nonio.tcc (__try_use_facet):
Likewise.
* src/c++11/locale-inst-monetary.h (INSTANTIATE_FACET_ACCESSORS):
Use new macro for facet accessor instantiations.
* src/c++11/locale-inst-numeric.h (INSTANTIATE_FACET_ACCESSORS):
Likewise.
* src/c++11/locale-inst.cc (INSTANTIATE_USE_FACET): Define new
macro for instantiating __try_use_facet and use_facet.
(INSTANTIATE_FACET_ACCESSORS): Define new macro for also
defining has_facet.
* src/c++98/compatibility-ldbl.cc (__try_use_facet):
Instantiate.
* testsuite/22_locale/ctype/is/string/89728_neg.cc: Adjust
expected errors.
---
 libstdc++-v3/config/abi/pre/gnu.ver   |  43 ++-
 libstdc++-v3/include/bits/basic_ios.tcc   |  17 +--
 libstdc++-v3/include/bits/fstream.tcc |   8 +-
 libstdc++-v3/include/bits/locale_classes.h|  12 ++
 libstdc++-v3/include/bits/locale_classes.tcc  |  99 ++---
 libstdc++-v3/include/bits/locale_facets.tcc   |  34 +-
 .../include/bits/locale_facets_nonio.tcc  |  64 +++
 libstdc++-v3/src/c++11/locale-inst-monetary.h |   8 +-
 libstdc++-v3/src/c++11/locale-inst-numeric.h  |   8 +-
 libstdc++-v3/src/c++11/locale-inst.cc | 105 --
 libstdc++-v3/src/c++98/compatibility-ldbl.cc  |   8 ++
 .../22_locale/ctype/is/string/89728_neg.cc|   5 +-
 12 files changed, 276 insertions(+), 135 deletions(-)

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver 
b/libstdc++-v3/config/abi/pre/gnu.ver
index 4d97ec37147..225d6dc482b 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -133,17 +133,18 @@ GLIBCXX_3.4 {
 # std::logic_error::~l*;
 # std::[m-r]*;
 # std::[m]*;
-  std::messages[^_]*;
+# std::messages[^_]*;
 # std::messages_byname*;
-  std::money_*;
-  std::moneypunct[^_]*;
+# std::money_*;
+  std::money_base*;
+# std::moneypunct[^_]*;
 # std::moneypunct_byname*;
 # std::n[^u]*;
   std::n[^aueo]*;
   std::nothrow;
   std::nu[^m]*;
-  std::num[^ep]*;
-  std::numpunct[^_]*;
+  std::num[^_ep]*;
+# std::numpunct[^_]*;
 # std::numpunct_byname*;
   std::ostrstream*;
 # std::out_of_range::o*;
@@ -597,28 +598,49 @@ GLIBCXX_3.4 {
 _ZNSt12ctype_bynameI[cw]ED*;
 
 # std::num_get
+
_ZNSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE[CD][012]*;
+_ZNSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE2idE;
 
_ZNKSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE[2-9]*;
 
_ZNKSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE14_M_extract_intI*;
-
_ZNKSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE16_M_extract_floatI*;
+
_ZNKSt7num_getI[cw]St19istreambuf_iteratorI[cw]St11char_traitsI[cw]EEE16_M_extract_float*;
 
 # std::num_put
+

Re: [PATCH v2 2/4] LoongArch: Add ftint{,rm,rp}.{w,l}.{s,d} instructions

2022-11-10 Thread Xi Ruoyao via Gcc-patches
Lulu:

So I think the code is correct:

+   (
+|| flag_fp_int_builtin_inexact
+|| !flag_trapping_math)"

 is 1 for lrint, 0 for lceil and lfloor.  As N3054
says:

   The lrint and llrint functions provide floating-to-integer conversion as
   prescribed by IEC 60559.  They round according to the current rounding
   direction.  If the rounded value is outside the range of the return type,
   the numeric result is unspecified and the "invalid" floating-point
   exception is raised.  When they raise no other floating-point exception
   and the result differs from the argument, they raise the "inexact"
   floating-point exception.
   
If flag_fp_int_builtin_inexact is set, we allow lceil and lfloor to
raise "inexact".

If flag_trapping_math is not set (it's set by default and can only be
unset explicitly with -fno-trapping-math: it's not even implied by
-ffast-math), we don't care about whether exceptions are raised or not.

So lceil and lfloor can be used with either -ffp-int-builtin-inexact or
-fno-trapping-math.
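
The "inexact" behaviour of lrint quoted above can be observed from user code
with the <cfenv> interface.  This is only a sketch: the helper name is
hypothetical, it assumes a hard-float target that implements the
floating-point exception flags, and the volatile objects are there to keep the
compiler from folding the call at compile time.

```cpp
#include <cfenv>
#include <cmath>

// Returns true if std::lrint(value) raised the "inexact" exception.
bool lrint_raises_inexact(double value) {
  volatile double input = value;     // read at run time, after the flags are cleared
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile long rounded = std::lrint(input);
  (void)rounded;                     // the result itself is not needed here
  return std::fetestexcept(FE_INEXACT) != 0;
}
```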

On Fri, 2022-11-11 at 00:07 +, Joseph Myers wrote:
> On Thu, 10 Nov 2022, Xi Ruoyao via Gcc-patches wrote:
> 
> > Joseph: can you confirm that -ftrapping-math allows floor and ceil to
> > raise inexact exception?  The man page currently says:
> > 
> > The default is -ffp-int-builtin-inexact, allowing the exception to be
> > raised, unless C2X or a later C standard is selected.  This option does 
> >    ^^^
> > nothing unless -ftrapping-math is in effect.
> > 
> > To me it's not very clear whether "this option" stands for
> > -fno-fp-int-builtin-inexact or -ffp-int-builtin-inexact.
> 
> The -ftrapping-math option (which is on by default) means that we care
> about whether operations raise exceptions: they should raise exceptions if 
> and only if the relevant standard permits them to do so.
> 
> The combination of -ftrapping-math with -fno-fp-int-builtin-inexact means 
> the listed built-in functions must not raise "inexact".
> 
> If -fno-trapping-math is used, then we don't care about whether exceptions 
> are raised or not (for any floating-point operations, not just those 
> functions).  So given -fno-trapping-math, there is no difference between 
> -fno-fp-int-builtin-inexact and -ffp-int-builtin-inexact.
> 
> If -ffp-int-builtin-inexact (default before C2X), we don't care about 
> whether those functions raise "inexact" (but still care about other 
> exceptions and exceptions for other operations, unless 
> -fno-trapping-math).

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH v2] Add condition coverage profiling

2022-11-10 Thread Jørgen Kvalsvik via Gcc-patches
From: Jørgen Kvalsvik 

This patch adds support in gcc+gcov for modified condition/decision
coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
test/code coverage and it is particularly important in the aviation and
automotive industries for safety-critical applications. MC/DC is
required for or recommended by:

* DO-178C for the most critical software (Level A) in avionics
* IEC 61508 for SIL 4
* ISO 26262-6 for ASIL D

From the SQLite webpage:

Two methods of measuring test coverage were described above:
"statement" and "branch" coverage. There are many other test
coverage metrics besides these two. Another popular metric is
"Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
MC/DC as follows:

* Each decision tries every possible outcome.
* Each condition in a decision takes on every possible outcome.
* Each entry and exit point is invoked.
* Each condition in a decision is shown to independently affect
  the outcome of the decision.

In the C programming language where && and || are "short-circuit"
operators, MC/DC and branch coverage are very nearly the same thing.
The primary difference is in boolean vector tests. One can test for
any of several bits in bit-vector and still obtain 100% branch test
coverage even though the second element of MC/DC - the requirement
that each condition in a decision take on every possible outcome -
might not be satisfied.

https://sqlite.org/testing.html#mcdc

Whalen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
MC/DC" describes an algorithm for adding instrumentation by carrying
over information from the AST, but my algorithm analyses the control
flow graph to instrument for coverage. This has the benefit of being
programming language independent and faithful to compiler decisions
and transformations, although I have only tested it on constructs in C
and C++, see testsuite/gcc.misc-tests and testsuite/g++.dg.

Like Whalen et al., this implementation records coverage in fixed-size
bitsets which gcov knows how to interpret. This is very fast, but
limits the number of terms in a single boolean expression to the number
of bits in a gcov_unsigned_type (which is typedef'd to uint64_t), which
should be acceptable for most practical purposes. This limitation is in
the implementation and not the algorithm, so support for more conditions
can be added by also introducing arbitrary-sized bitsets.

For space overhead, the instrumentation needs two accumulators
(gcov_unsigned_type) per condition in the program which will be written
to the gcov file. In addition, every function gets a pair of local
accumulators, but these accumulators are reused between conditions in the
same function.

For time overhead, there is a zeroing of the local accumulators for
every condition and one or two bitwise operations on every edge taken in
an expression.
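
To give a feel for the fixed-size bitset representation, here is a deliberately
simplified sketch.  It is not the instrumentation from this patch: it only
records which outcomes were observed and leaves out the masking that MC/DC
requires, and all names in it are hypothetical.

```cpp
#include <bitset>
#include <cstdint>

// One pair of 64-bit accumulators: bit N belongs to condition N.
struct DecisionCoverage {
  std::uint64_t seen_true = 0;
  std::uint64_t seen_false = 0;

  // Record that condition `index` (0..63) evaluated to `outcome`.
  void record(unsigned index, bool outcome) {
    const std::uint64_t bit = std::uint64_t{1} << index;
    (outcome ? seen_true : seen_false) |= bit;
  }

  // Number of (condition, outcome) pairs observed so far.
  unsigned covered() const {
    return static_cast<unsigned>(std::bitset<64>(seen_true).count()
                                 + std::bitset<64>(seen_false).count());
  }
};
```

Each update is a single OR into a machine word, which is where both the speed
and the 64-term limit come from.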

In action it looks pretty similar to the branch coverage. The -g short
option carries no significance; it was chosen because both it and its
upper-case form were still available.

gcov --conditions:

3:   17:void fn (int a, int b, int c, int d) {
3:   18:if ((a && (b || c)) && d)
condition outcomes covered 3/8
condition  0 not covered (true false)
condition  1 not covered (true)
condition  2 not covered (true)
condition  3 not covered (true)
1:   19:x = 1;
-:   20:else
2:   21:x = 2;
3:   22:}
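
For the decision in the listing above, "independently affects the outcome"
can be checked by brute force.  The sketch below is unrelated to how gcov
computes its numbers, and the function names and the bit-to-condition mapping
(bit 0 is a, bit 3 is d) are assumptions made for illustration: it looks for
an input where flipping only one condition flips the whole decision.

```cpp
// The decision from line 18 of the listing.
bool decision(bool a, bool b, bool c, bool d) {
  return (a && (b || c)) && d;
}

// Evaluate the decision on a 4-bit input: bit 0 = a, 1 = b, 2 = c, 3 = d.
bool eval_bits(unsigned v) {
  return decision((v & 1u) != 0, (v & 2u) != 0, (v & 4u) != 0, (v & 8u) != 0);
}

// True if some input exists where toggling only condition `index`
// changes the result of the decision.
bool independently_affects(unsigned index) {
  for (unsigned bits = 0; bits < 16; ++bits) {
    if (eval_bits(bits) != eval_bits(bits ^ (1u << index)))
      return true;
  }
  return false;
}
```

A test suite reaches full MC/DC for this decision only if it actually
exercises such a pair of inputs for every one of the four conditions.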

gcov --conditions --json-format:

"conditions": [
{
"not_covered_false": [
0
],
"count": 8,
"covered": 3,
"not_covered_true": [
0,
1,
2,
3
]
}
],

Some expressions, mostly those without else-blocks, are effectively
"rewritten" in the CFG construction making the algorithm unable to
distinguish them:

and.c:

if (a && b && c)
x = 1;

ifs.c:

if (a)
if (b)
if (c)
x = 1;

gcc will build the same graph for both these programs, and gcov will
report both as 3-term expressions. It is vital that it is not
interpreted the other way around (which is consistent with the shape of
the graph) because otherwise the masking would be wrong for the and.c
program, which is a more severe error. While surprising, users would
probably expect some minor rewriting of semantically-identical
expressions.

and.c.gcov:
#:2:if (a && b && c)
condition outcomes covered 6/6
#:3:x = 1;

ifs.c.gcov:
#:2:if (a)
#:3:if (b)
#:4:if (c)
#:5:x = 1;
condition outcomes covered 6/6

Adding else clauses alters the program (ifs.c can have 3 elses, and.c
only 1) and coverage becomes less surprising


[PATCH] configure: Implement --enable-host-bind-now

2022-11-10 Thread Marek Polacek via Gcc-patches
This is a rebased version of the patch I posted in February:
.

Fortunately it is much simpler than the patch implementing --enable-host-pie.
I've converted the install.texi part into configuration.rst, otherwise
there are no changes to the original version.

With --enable-host-bind-now --enable-host-pie:
$ readelf -Wd ./gcc/cc1 ./gcc/cc1plus | grep FLAGS
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE
 0x001e (FLAGS)  BIND_NOW
 0x6ffb (FLAGS_1)Flags: NOW PIE

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --

As promised in the --enable-host-pie patch, this patch adds another
configure option, --enable-host-bind-now, which adds -z now when linking
the compiler executables in order to extend hardening.  BIND_NOW with RELRO
allows the GOT to be marked RO; this prevents GOT modification attacks.

This option does not affect linking of target libraries; you can use
LDFLAGS_FOR_TARGET=-Wl,-z,relro,-z,now to enable RELRO/BIND_NOW.

c++tools/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.
* configure: Regenerate.

gcc/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Add
-Wl,-z,now to LD_PICFLAG if --enable-host-bind-now.
* configure: Regenerate.
* doc/install/configuration.rst: Document --enable-host-bind-now.

lto-plugin/ChangeLog:

* configure.ac (--enable-host-bind-now): New check.  Link with
-z,now.
* configure: Regenerate.
---
 c++tools/configure| 11 +++
 c++tools/configure.ac |  7 +++
 gcc/configure | 20 ++--
 gcc/configure.ac  | 13 -
 gcc/doc/install/configuration.rst |  7 +++
 lto-plugin/configure  | 20 ++--
 lto-plugin/configure.ac   | 11 +++
 7 files changed, 84 insertions(+), 5 deletions(-)

diff --git a/c++tools/configure b/c++tools/configure
index 88087009383..006efe07b35 100755
--- a/c++tools/configure
+++ b/c++tools/configure
@@ -628,6 +628,7 @@ EGREP
 GREP
 CXXCPP
 LD_PICFLAG
+enable_host_bind_now
 PICFLAG
 MAINTAINER
 CXX_AUX_TOOLS
@@ -702,6 +703,7 @@ enable_maintainer_mode
 enable_checking
 enable_default_pie
 enable_host_pie
+enable_host_bind_now
 with_gcc_major_version_only
 '
   ac_precious_vars='build_alias
@@ -1336,6 +1338,7 @@ Optional Features:
   yes,no,all,none,release.
  --enable-default-pie    enable Position Independent Executable as default
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
 
 Optional Packages:
  --with-PACKAGE[=ARG]    use PACKAGE [ARG=yes]
@@ -3007,6 +3010,14 @@ fi
 
 
 
+# Enable --enable-host-bind-now
+# Check whether --enable-host-bind-now was given.
+if test "${enable_host_bind_now+set}" = set; then :
+  enableval=$enable_host_bind_now; LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
+fi
+
+
+
 
 # Check if O_CLOEXEC is defined by fcntl
 
diff --git a/c++tools/configure.ac b/c++tools/configure.ac
index 1e42689f2eb..d3f23f66f00 100644
--- a/c++tools/configure.ac
+++ b/c++tools/configure.ac
@@ -110,6 +110,13 @@ AC_ARG_ENABLE(host-pie,
[build host code as PIE])],
 [PICFLAG=-fPIE; LD_PICFLAG=-pie], [])
 AC_SUBST(PICFLAG)
+
+# Enable --enable-host-bind-now
+AC_ARG_ENABLE(host-bind-now,
+[AS_HELP_STRING([--enable-host-bind-now],
+   [link host code as BIND_NOW])],
+[LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"], [])
+AC_SUBST(enable_host_bind_now)
 AC_SUBST(LD_PICFLAG)
 
 # Check if O_CLOEXEC is defined by fcntl
diff --git a/gcc/configure b/gcc/configure
index 3e303f7e5bd..fb88e41f712 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -635,6 +635,7 @@ CET_HOST_FLAGS
 LD_PICFLAG
 PICFLAG
 enable_default_pie
+enable_host_bind_now
 enable_host_pie
 enable_host_shared
 enable_plugin
@@ -1030,6 +1031,7 @@ enable_version_specific_runtime_libs
 enable_plugin
 enable_host_shared
 enable_host_pie
+enable_host_bind_now
 enable_libquadmath_support
 with_linker_hash_style
 with_diagnostics_color
@@ -1793,6 +1795,7 @@ Optional Features:
   --enable-plugin enable plugin support
   --enable-host-shared    build host code as shared libraries
   --enable-host-pie   build host code as PIE
+  --enable-host-bind-now  link host code as BIND_NOW
   --disable-libquadmath-support
   disable libquadmath support for Fortran
   --enable-default-pie    enable Position Independent Executable as default
@@ -19764,7 +19767,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19779 "configure"
+#line 19782 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19870,7 +19873,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   

[PATCH] configure: Implement --enable-host-pie

2022-11-10 Thread Marek Polacek via Gcc-patches
This is a rebased version of the patch I posted in March:

which Alex sort of approved here:

but it was too late to commit the patch in GCC 12.

There are no changes except that I've converted the documentation
part into the ReST format, and of course regenerated configure.

With --enable-host-pie enabled:
$ file ./gcc/cc1 ./gcc/cc1plus
./gcc/cc1: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
3.2.0, with debug_info, not stripped
./gcc/cc1plus: ELF 64-bit LSB pie executable, x86-64, version 1 (GNU/Linux), 
dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 
3.2.0, with debug_info, not stripped

Bootstrapped/regtested on x86_64-pc-linux-gnu w/ and w/o --enable-host-pie,
ok for trunk?

-- >8 --

This patch implements the --enable-host-pie configure option which
makes the compiler executables PIE.  This can be used to enhance
protection against ROP attacks, and can be viewed as part of a wider
trend to harden binaries.

It is similar to the option --enable-host-shared, except that --e-h-s
won't add -shared to the linker flags whereas --e-h-p will add -pie.
It is different from --enable-default-pie because that option just
adds an implicit -fPIE/-pie when the compiler is invoked, but the
compiler itself isn't PIE.

Since r12-5768-gfe7c3ecf, PCH works well with PIE, so there are no PCH
regressions.

When building the compiler, the build process may use various in-tree
libraries; these need to be built with -fPIE so that it's possible to
use them when building a PIE.  For instance, when --with-included-gettext
is in effect, intl object files must be compiled with -fPIE.  Similarly,
when building in-tree gmp, isl, mpfr and mpc, they must be compiled with
-fPIE.

I plan to add an option to link with -Wl,-z,now.

ChangeLog:

* Makefile.def: Pass $(PICFLAG) to AM_CFLAGS for gmp, mpfr, mpc, and
isl.
* Makefile.in: Regenerate.
* Makefile.tpl: Set PICFLAG.
* configure.ac (--enable-host-pie): New check.  Set PICFLAG after this
check.
* configure: Regenerate.

c++tools/ChangeLog:

* Makefile.in: Rename PIEFLAG to PICFLAG.  Set LD_PICFLAG.  Use it.
Use pic/libiberty.a if PICFLAG is set.
* configure.ac (--enable-default-pie): Set PICFLAG instead of PIEFLAG.
(--enable-host-pie): New check.
* configure: Regenerate.

fixincludes/ChangeLog:

* Makefile.in: Set and use PICFLAG and LD_PICFLAG.  Use the "pic"
build of libiberty if PICFLAG is set.
* configure.ac:
* configure: Regenerate.

gcc/ChangeLog:

* Makefile.in: Set LD_PICFLAG.  Use it.  Set enable_host_pie.
Remove NO_PIE_CFLAGS and NO_PIE_FLAG.  Pass LD_PICFLAG to
ALL_LINKERFLAGS.  Use the "pic" build of libiberty if --enable-host-pie.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.
* doc/install/configuration.rst: Document --enable-host-pie.

gcc/d/ChangeLog:

* Make-lang.in: Remove NO_PIE_CFLAGS.

intl/ChangeLog:

* Makefile.in: Use @PICFLAG@ in COMPILE as well.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libcody/ChangeLog:

* Makefile.in: Pass LD_PICFLAG to LDFLAGS.
* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG and LD_PICFLAG after this
check.
* configure: Regenerate.

libcpp/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libdecnumber/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.

libiberty/ChangeLog:

* configure.ac: Also set shared when enable_host_pie.
* configure: Regenerate.

zlib/ChangeLog:

* configure.ac (--enable-host-shared): Don't set PICFLAG here.
(--enable-host-pie): New check.  Set PICFLAG after this check.
* configure: Regenerate.
---
 Makefile.def  |   7 +-
 Makefile.in   | 273 +++---
 Makefile.tpl  |   1 +
 c++tools/Makefile.in  |  11 +-
 c++tools/configure|  17 +-
 c++tools/configure.ac |  11 +-
 configure |  22 +++
 configure.ac  |  16 ++
 fixincludes/Makefile.in   

[PATCH] i386: Add AMX-TILE dependency for AMX related ISAs

2022-11-10 Thread Haochen Jiang via Gcc-patches
Hi all,

All AMX-related ISAs depend on AMX-TILE; without it we do not even have
basic AMX support.

This patch adds those dependencies. Ok for trunk?
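
The dependency can be pictured with plain bit masks.  What follows is a
minimal sketch, not the code from i386-common.cc: the bit values and the
helpers isa_enable/isa_disable are made up for illustration, and only the
shape of the _SET/_UNSET macros follows the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real OPTION_MASK_ISA2_* values differ.  */
#define AMX_TILE (UINT32_C (1) << 0)
#define AMX_INT8 (UINT32_C (1) << 1)
#define AMX_BF16 (UINT32_C (1) << 2)
#define AMX_FP16 (UINT32_C (1) << 3)

/* Enabling a dependent ISA also enables AMX-TILE.  */
#define AMX_INT8_SET (AMX_TILE | AMX_INT8)
#define AMX_BF16_SET (AMX_TILE | AMX_BF16)
#define AMX_FP16_SET (AMX_TILE | AMX_FP16)

/* Disabling AMX-TILE also disables everything that depends on it.  */
#define AMX_TILE_UNSET (AMX_TILE | AMX_INT8 | AMX_BF16 | AMX_FP16)

/* Hypothetical helper: turn on every bit in SET_MASK.  */
static uint32_t
isa_enable (uint32_t flags, uint32_t set_mask)
{
  return flags | set_mask;
}

/* Hypothetical helper: turn off every bit in UNSET_MASK.  */
static uint32_t
isa_disable (uint32_t flags, uint32_t unset_mask)
{
  return flags & ~unset_mask;
}

/* Enabling BF16 pulls in TILE; disabling TILE drops all dependents.  */
static void
check_amx_masks (void)
{
  uint32_t flags = isa_enable (0, AMX_BF16_SET);
  assert (flags == (AMX_TILE | AMX_BF16));

  flags = isa_enable (flags, AMX_INT8_SET);
  flags = isa_disable (flags, AMX_TILE_UNSET);
  assert (flags == 0);
}
```

With masks shaped like this, -mamx-bf16 alone is enough to get AMX-TILE,
which is why the tests no longer need to pass -mamx-tile explicitly.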

BRs,
Haochen

gcc/ChangeLog:

* common/config/i386/i386-common.cc
(OPTION_MASK_ISA2_AMX_INT8_SET): Add AMX-TILE dependency.
(OPTION_MASK_ISA2_AMX_BF16_SET): Ditto.
(OPTION_MASK_ISA2_AMX_FP16_SET): Ditto.
(OPTION_MASK_ISA2_AMX_TILE_UNSET): Disable AMX_{INT8,
BF16, FP16} when disabling AMX_TILE.

gcc/testsuite/ChangeLog:

* gcc.target/i386/amxbf16-dpbf16ps-2.c: Remove -mamx-tile.
* gcc.target/i386/amxfp16-dpfp16ps-2.c: Ditto.
* gcc.target/i386/amxint8-dpbssd-2.c: Ditto.
* gcc.target/i386/amxint8-dpbsud-2.c: Ditto.
* gcc.target/i386/amxint8-dpbusd-2.c: Ditto.
* gcc.target/i386/amxint8-dpbuud-2.c: Ditto.
---
 gcc/common/config/i386/i386-common.cc  | 13 +
 gcc/testsuite/gcc.target/i386/amxbf16-dpbf16ps-2.c |  3 +--
 gcc/testsuite/gcc.target/i386/amxfp16-dpfp16ps-2.c |  3 +--
 gcc/testsuite/gcc.target/i386/amxint8-dpbssd-2.c   |  3 +--
 gcc/testsuite/gcc.target/i386/amxint8-dpbsud-2.c   |  3 +--
 gcc/testsuite/gcc.target/i386/amxint8-dpbusd-2.c   |  3 +--
 gcc/testsuite/gcc.target/i386/amxint8-dpbuud-2.c   |  3 +--
 7 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/gcc/common/config/i386/i386-common.cc 
b/gcc/common/config/i386/i386-common.cc
index 431fd0d3ad1..5e6d3da0306 100644
--- a/gcc/common/config/i386/i386-common.cc
+++ b/gcc/common/config/i386/i386-common.cc
@@ -106,12 +106,15 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
 #define OPTION_MASK_ISA2_AVX512VP2INTERSECT_SET 
OPTION_MASK_ISA2_AVX512VP2INTERSECT
 #define OPTION_MASK_ISA2_AMX_TILE_SET OPTION_MASK_ISA2_AMX_TILE
-#define OPTION_MASK_ISA2_AMX_INT8_SET OPTION_MASK_ISA2_AMX_INT8
-#define OPTION_MASK_ISA2_AMX_BF16_SET OPTION_MASK_ISA2_AMX_BF16
+#define OPTION_MASK_ISA2_AMX_INT8_SET \
+  (OPTION_MASK_ISA2_AMX_TILE | OPTION_MASK_ISA2_AMX_INT8)
+#define OPTION_MASK_ISA2_AMX_BF16_SET \
+  (OPTION_MASK_ISA2_AMX_TILE | OPTION_MASK_ISA2_AMX_BF16)
 #define OPTION_MASK_ISA2_AVXVNNIINT8_SET OPTION_MASK_ISA2_AVXVNNIINT8
 #define OPTION_MASK_ISA2_AVXNECONVERT_SET OPTION_MASK_ISA2_AVXNECONVERT
 #define OPTION_MASK_ISA2_CMPCCXADD_SET OPTION_MASK_ISA2_CMPCCXADD
-#define OPTION_MASK_ISA2_AMX_FP16_SET OPTION_MASK_ISA2_AMX_FP16
+#define OPTION_MASK_ISA2_AMX_FP16_SET \
+  (OPTION_MASK_ISA2_AMX_TILE | OPTION_MASK_ISA2_AMX_FP16)
 #define OPTION_MASK_ISA2_PREFETCHI_SET OPTION_MASK_ISA2_PREFETCHI
 #define OPTION_MASK_ISA2_RAOINT_SET OPTION_MASK_ISA2_RAOINT
 
@@ -277,7 +280,9 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA2_SERIALIZE_UNSET OPTION_MASK_ISA2_SERIALIZE
 #define OPTION_MASK_ISA2_AVX512VP2INTERSECT_UNSET 
OPTION_MASK_ISA2_AVX512VP2INTERSECT
 #define OPTION_MASK_ISA2_TSXLDTRK_UNSET OPTION_MASK_ISA2_TSXLDTRK
-#define OPTION_MASK_ISA2_AMX_TILE_UNSET OPTION_MASK_ISA2_AMX_TILE
+#define OPTION_MASK_ISA2_AMX_TILE_UNSET \
+  (OPTION_MASK_ISA2_AMX_TILE | OPTION_MASK_ISA2_AMX_INT8_UNSET \
+   | OPTION_MASK_ISA2_AMX_BF16_UNSET | OPTION_MASK_ISA2_AMX_FP16_UNSET)
 #define OPTION_MASK_ISA2_AMX_INT8_UNSET OPTION_MASK_ISA2_AMX_INT8
 #define OPTION_MASK_ISA2_AMX_BF16_UNSET OPTION_MASK_ISA2_AMX_BF16
 #define OPTION_MASK_ISA2_UINTR_UNSET OPTION_MASK_ISA2_UINTR
diff --git a/gcc/testsuite/gcc.target/i386/amxbf16-dpbf16ps-2.c 
b/gcc/testsuite/gcc.target/i386/amxbf16-dpbf16ps-2.c
index b00bc13ec78..35881e7682a 100644
--- a/gcc/testsuite/gcc.target/i386/amxbf16-dpbf16ps-2.c
+++ b/gcc/testsuite/gcc.target/i386/amxbf16-dpbf16ps-2.c
@@ -1,7 +1,6 @@
 /* { dg-do run { target { ! ia32 } } } */
-/* { dg-require-effective-target amx_tile } */
 /* { dg-require-effective-target amx_bf16 } */
-/* { dg-options "-O2 -mamx-tile -mamx-bf16" } */
+/* { dg-options "-O2 -mamx-bf16" } */
 #include 
 
 #define AMX_BF16
diff --git a/gcc/testsuite/gcc.target/i386/amxfp16-dpfp16ps-2.c 
b/gcc/testsuite/gcc.target/i386/amxfp16-dpfp16ps-2.c
index 2d359a689ea..a1fafbcbfeb 100644
--- a/gcc/testsuite/gcc.target/i386/amxfp16-dpfp16ps-2.c
+++ b/gcc/testsuite/gcc.target/i386/amxfp16-dpfp16ps-2.c
@@ -1,8 +1,7 @@
 /* { dg-do run { target { ! ia32 } } } */
-/* { dg-require-effective-target amx_tile } */
 /* { dg-require-effective-target amx_fp16 } */
 /* { dg-require-effective-target avx512fp16 } */
-/* { dg-options "-O2 -mamx-tile -mamx-fp16 -mavx512fp16" } */
+/* { dg-options "-O2 -mamx-fp16 -mavx512fp16" } */
 #define AMX_FP16
 #define DO_TEST test_amx_fp16_dpfp16ps
 void test_amx_fp16_dpfp16ps ();
diff --git a/gcc/testsuite/gcc.target/i386/amxint8-dpbssd-2.c 
b/gcc/testsuite/gcc.target/i386/amxint8-dpbssd-2.c
index 74ad71be5c5..d7efb3d20c2 100644
--- a/gcc/testsuite/gcc.target/i386/amxint8-dpbssd-2.c
+++ b/gcc/testsuite/gcc.target/i386/amxint8-dpbssd-2.c
@@ -1,7 +1,6 @@
 /* { dg-do run { target { ! ia32 } } } */
-/* { 

[PATCH v2] match.pd: rewrite select to branchless expression

2022-11-10 Thread Michael Collison
This patch transforms ((x & 0x1) == 0) ? y : z <op> y into
(-(typeof(y))(x & 0x1) & z) <op> y, where op is a '^' or a '|'. It also
transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
0x1)) & z ) op y.


Matching these patterns allows GCC to generate branchless code for one of
the functions in coremark.
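
To see why the rewrite is valid, here is a small sketch in plain C (not
part of the patch; the function names are made up for illustration).  It
relies on -(x & 1) being 0 when the low bit is clear and all ones when it
is set, so the AND passes either 0 or z to the final '^' or '|'.

```c
#include <assert.h>

/* Original form with a conditional, op = '^'.  */
static unsigned int
select_xor_cond (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) == 0) ? y : (z ^ y);
}

/* Branchless form, op = '^'.  */
static unsigned int
select_xor_branchless (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1u) & z) ^ y;
}

/* Original form with a conditional, op = '|'.  */
static unsigned int
select_ior_cond (unsigned int x, unsigned int y, unsigned int z)
{
  return ((x & 1) == 0) ? y : (z | y);
}

/* Branchless form, op = '|'.  */
static unsigned int
select_ior_branchless (unsigned int x, unsigned int y, unsigned int z)
{
  return (-(x & 1u) & z) | y;
}

/* Compare both forms over a small grid of inputs.  */
static void
check_select_equivalence (void)
{
  for (unsigned int x = 0; x < 4; x++)
    for (unsigned int y = 0; y < 16; y++)
      for (unsigned int z = 0; z < 16; z++)
        {
          assert (select_xor_cond (x, y, z) == select_xor_branchless (x, y, z));
          assert (select_ior_cond (x, y, z) == select_ior_branchless (x, y, z));
        }
}
```

The same trick would not work for a 1-bit type, where negation cannot
produce a wider all-ones mask, which is presumably why v2 guards on
TYPE_PRECISION (type) > 1.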


Bootstrapped and tested on x86 and RISC-V. Okay?

Michael.

2022-11-10  Michael Collison  

    * match.pd ((x & 0x1) == 0) ? y : z <op> y
    -> (-(typeof(y))(x & 0x1) & z) <op> y.

2022-11-10  Michael Collison 

    * gcc.dg/tree-ssa/branchless-cond.c: New test.

---

Changes in v2:

- Rewrite comment to use C syntax

- Guard against 1-bit types

- Simplify pattern by using zero_one_valued_p

 gcc/match.pd  | 24 +
 .../gcc.dg/tree-ssa/branchless-cond.c | 26 +++
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 194ba8f5188..258531e9046 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3486,6 +3486,30 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
   (max @2 @1))
 
+/* ((x & 0x1) == 0) ? y : z <op> y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (eq zero_one_valued_p@0
+integer_zerop)
+@1
+(op:c @2 @1))
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > 1
+   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+   (op (bit_and (negate (convert:type @0)) @2) @1))))
+
+/* ((x & 0x1) == 0) ? z <op> y : y -> (-(typeof(y))(x & 0x1) & z) <op> y */
+(for op (bit_xor bit_ior)
+ (simplify
+  (cond (ne zero_one_valued_p@0
+integer_zerop)
+   (op:c @2 @1)
+@1)
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > 1
+   && (INTEGRAL_TYPE_P (TREE_TYPE (@0))))
+   (op (bit_and (negate (convert:type @0)) @2) @1))))
+
 /* Simplifications of shift and rotates.  */
 
 (for rotate (lrotate rrotate)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c 
b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
new file mode 100644
index 000..68087ae6568
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int f1(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z ^ y;
+}
+
+int f2(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z ^ y : y;
+}
+
+int f3(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) == 0) ? y : z | y;
+}
+
+int f4(unsigned int x, unsigned int y, unsigned int z)
+{
+  return ((x & 1) != 0) ? z | y : y;
+}
+
+/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
+/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
+/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
--
2.34.1



Re: [PATCH] range-op: Implement floating point multiplication fold_range [PR107569]

2022-11-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 10, 2022 at 08:20:06PM +0100, Jakub Jelinek via Gcc-patches wrote:
> So, maybe for now a selftest will be better than a testcase, or
> alternatively a plugin test which acts like a selftest.

For a testsuite/g*.dg/plugin/ range-op-plugin.c test, would be
nice to write a short plugin that does:
1) register 2 new attributes, say gcc_range and gcc_expected_range,
   parse their arguments
2) register some new pass (dunno where it would be best, say before evrp
   or so), which would:
   - for function parameters with gcc_range attributes set global? range
 on default def of that parameter
   - if function itself has a gcc_expected_range attribute, propagate
 ranges from arguments to the function result and compare against
 gcc_expected_range
   - if function itself has gcc_range attribute and one of the arguments
 gcc_expected_range itself, try to propagate range backwards from
 result to the argument
Then we could say write tests like:
__attribute__((gcc_expected_range (12.0, 32.0, 0))) double
foo (double x __attribute__((gcc_range (2.0, 4.0, 0))), double y
__attribute__((gcc_range (6.0, 8.0, 0))))
{
  return x * y;
}
with for floating point types (parameter type or function result type)
the arguments of the attribute being (constant folded)
low bound, high bound, integer about NAN (say 0 meaning clear_nan,
bit 0 meaning +NAN, bit 1 -NAN, bit 2 meaning known NAN (then
the 2 bounds would be ignored)).
Eventually we could do something similar for integral types, pointer types
etc.
I think this would be far more useful compared to writing selftests for it,
and compared to the testcase I've posted would be easier to request or check
exact range rather than a rough range.  And we could easily
test not just very simple binary ops (in both directions), but builtin calls
etc.

Thoughts on this?

I can try to write a plugin that registers the attributes, parses them
and registers a pass, but would appreciate your or Andrew's help in filling
the actual pass.

Jakub



RE: [PATCH V2] Enable small loop unrolling for O2

2022-11-10 Thread Wang, Hongyu via Gcc-patches
Thanks for the notification! I wasn't aware of the Compile Farm before. I will
look into the impact of my patch on those targets.

Regards,
Hongyu, Wang

From: David Edelsohn 
Sent: Thursday, November 10, 2022 1:22 AM
To: Wang, Hongyu 
Cc: GCC Patches 
Subject: Re: [PATCH V2] Enable small loop unrolling for O2

> This patch does not change rs6000/s390 since I don't have machines to
> test them, but I suppose the default behavior is the same since they
> enable flag_unroll_loops at O2.

There are Power (rs6000) systems in the Compile Farm.

Trial Linux on Z (s390x) VMs are available through the Linux Community Cloud.
https://linuxone.cloud.marist.edu/#/register?flag=VM

Thanks, David




Re: [PATCH] i386: Add ISA check for newly introduced prefetch builtins.

2022-11-10 Thread Hongtao Liu via Gcc-patches
On Wed, Nov 9, 2022 at 3:15 PM Haochen Jiang via Gcc-patches
 wrote:
>
> Hi all,
>
> As Hongtao said, the fail on pentiumpro is caused by missing ISA check
> since we are using emit_insn () through new builtins and it won't check
> if the TARGET matches. Previously, the builtin in middle-end will check
> that.
>
> On pentiumpro, we won't have anything that supports any prefetch so that
> it dropped into the pattern and then failed.
>
> I have added the restrictions just like what middle-end builtin_prefetch
> does. Also I added missing checks for PREFETCHI. Ok for trunk?
Ok.
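
For context, this is how the generic builtin is normally used from C.  It
is only a sketch (sum_with_prefetch and the prefetch distance of 8 are
made up for illustration) and assumes GCC or a compatible compiler that
provides __builtin_prefetch (addr, rw, locality).  On a target without
any prefetch instruction, such as pentiumpro, the hint is expected to
expand to nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Sum an array while hinting the element 8 slots ahead.
   rw = 0 means prefetch for read, locality = 3 means keep in all caches.  */
static long
sum_with_prefetch (const long *data, size_t n)
{
  long sum = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (i + 8 < n)
        __builtin_prefetch (&data[i + 8], 0, 3);
      sum += data[i];
    }
  return sum;
}

/* The prefetch is only a hint, so the result must not depend on it.  */
static void
check_sum_with_prefetch (void)
{
  long v[16];

  for (size_t i = 0; i < 16; i++)
    v[i] = (long) i + 1;
  assert (sum_with_prefetch (v, 16) == 136);
}
```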
>
> BRs,
> Haochen
>
> gcc/ChangeLog:
>
> * config/i386/i386-builtin.def (BDESC): Add
> OPTION_MASK_ISA2_PREFETCHI for prefetchi builtin.
> * config/i386/i386-expand.cc (ix86_expand_builtin):
> Add ISA check before emit_insn.
> * config/i386/prfchiintrin.h: Add target for intrin.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/prefetchi-5.c: New test.
> ---
>  gcc/config/i386/i386-builtin.def|  2 +-
>  gcc/config/i386/i386-expand.cc  | 11 +--
>  gcc/config/i386/prfchiintrin.h  | 14 +-
>  gcc/testsuite/gcc.target/i386/prefetchi-5.c |  4 
>  4 files changed, 27 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/prefetchi-5.c
>
> diff --git a/gcc/config/i386/i386-builtin.def 
> b/gcc/config/i386/i386-builtin.def
> index ea3aff7f125..5e0461acc00 100644
> --- a/gcc/config/i386/i386-builtin.def
> +++ b/gcc/config/i386/i386-builtin.def
> @@ -498,7 +498,7 @@ BDESC (0, OPTION_MASK_ISA2_WIDEKL, CODE_FOR_nothing, 
> "__builtin_ia32_aesencwide1
>  BDESC (0, OPTION_MASK_ISA2_WIDEKL, CODE_FOR_nothing, 
> "__builtin_ia32_aesencwide256kl_u8", IX86_BUILTIN_AESENCWIDE256KLU8, UNKNOWN, 
> (int) UINT8_FTYPE_PV2DI_PCV2DI_PCVOID)
>
>  /* PREFETCHI */
> -BDESC (0, 0, CODE_FOR_prefetchi, "__builtin_ia32_prefetchi", 
> IX86_BUILTIN_PREFETCHI, UNKNOWN, (int) VOID_FTYPE_PCVOID_INT)
> +BDESC (0, OPTION_MASK_ISA2_PREFETCHI, CODE_FOR_prefetchi, 
> "__builtin_ia32_prefetchi", IX86_BUILTIN_PREFETCHI, UNKNOWN, (int) 
> VOID_FTYPE_PCVOID_INT)
>  BDESC (0, 0, CODE_FOR_nothing, "__builtin_ia32_prefetch", 
> IX86_BUILTIN_PREFETCH, UNKNOWN, (int) VOID_FTYPE_PCVOID_INT_INT_INT)
>
>  BDESC_END (SPECIAL_ARGS, PURE_ARGS)
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index 9c92b07d5cd..0e45c195390 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -13131,7 +13131,7 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
>
> if (INTVAL (op3) == 1)
>   {
> -   if (TARGET_64BIT
> +   if (TARGET_64BIT && TARGET_PREFETCHI
> && local_func_symbolic_operand (op0, GET_MODE (op0)))
>   emit_insn (gen_prefetchi (op0, op2));
> else
> @@ -13150,7 +13150,14 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
> op0 = convert_memory_address (Pmode, op0);
> op0 = copy_addr_to_reg (op0);
>   }
> -   emit_insn (gen_prefetch (op0, op1, op2));
> +
> +   if (TARGET_3DNOW || TARGET_PREFETCH_SSE
> +   || TARGET_PRFCHW || TARGET_PREFETCHWT1)
> + emit_insn (gen_prefetch (op0, op1, op2));
> +   else if (!MEM_P (op0) && side_effects_p (op0))
> + /* Don't do anything with direct references to volatile memory,
> +but generate code to handle other side effects.  */
> + emit_insn (op0);
>   }
>
> return 0;
> diff --git a/gcc/config/i386/prfchiintrin.h b/gcc/config/i386/prfchiintrin.h
> index 06deef488ba..996a4be1aba 100644
> --- a/gcc/config/i386/prfchiintrin.h
> +++ b/gcc/config/i386/prfchiintrin.h
> @@ -30,6 +30,13 @@
>
>  #ifdef __x86_64__
>
> +
> +#ifndef __PREFETCHI__
> +#pragma GCC push_options
> +#pragma GCC target("prefetchi")
> +#define __DISABLE_PREFETCHI__
> +#endif /* __PREFETCHI__ */
> +
>  extern __inline void
>  __attribute__((__gnu_inline__, __always_inline__, __artificial__))
>  _m_prefetchit0 (void* __P)
> @@ -44,6 +51,11 @@ _m_prefetchit1 (void* __P)
>__builtin_ia32_prefetchi (__P, 2);
>  }
>
> -#endif
> +#ifdef __DISABLE_PREFETCHI__
> +#undef __DISABLE_PREFETCHI__
> +#pragma GCC pop_options
> +#endif /* __DISABLE_PREFETCHI__ */
> +
> +#endif /* __x86_64__ */
>
>  #endif /* _PRFCHIINTRIN_H_INCLUDED */
> diff --git a/gcc/testsuite/gcc.target/i386/prefetchi-5.c 
> b/gcc/testsuite/gcc.target/i386/prefetchi-5.c
> new file mode 100644
> index 000..8c26540f96a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/prefetchi-5.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile { target { ia32 } } } */
> +/* { dg-options "-O0 -march=pentiumpro" } */
> +
> +#include "prefetchi-4.c"
> --
> 2.18.1
>


-- 
BR,
Hongtao


[PATCH 0/2] Support HWASAN with Intel LAM

2022-11-10 Thread liuhongt via Gcc-patches
  Two years ago, Arm folks added HWASAN[1] support to GCC[2] and introduced
several target hooks (many thanks for their work) so that other backends with
a similar feature can do the same.
  Intel LAM (Linear Address Masking)[3, Chapter 14] provides a similar feature:
the upper bits of pointers can be used as metadata. LAM supports two modes:
  LAM_U48: bits 48-62 can be used as metadata
  LAM_U57: bits 57-62 can be used as metadata.
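
As an illustration of what "upper bits as metadata" means, here is a
minimal sketch for the LAM_U57 layout (a 6-bit tag in bits 57-62).  The
macro and function names are hypothetical and this is plain integer
arithmetic, not the RTL the target hooks generate.

```c
#include <assert.h>
#include <stdint.h>

/* LAM_U57: tag lives in bits 57-62, bit 63 is left alone.  */
#define LAM_U57_SHIFT 57
#define LAM_U57_TAG_BITS 6
#define LAM_U57_TAG_MASK \
  (((UINT64_C (1) << LAM_U57_TAG_BITS) - 1) << LAM_U57_SHIFT)

/* Replace the tag field of ADDR with TAG (extra tag bits are dropped).  */
static uint64_t
lam_set_tag (uint64_t addr, unsigned int tag)
{
  uint64_t field = ((uint64_t) tag << LAM_U57_SHIFT) & LAM_U57_TAG_MASK;
  return (addr & ~LAM_U57_TAG_MASK) | field;
}

/* Read the tag field back out of ADDR.  */
static unsigned int
lam_extract_tag (uint64_t addr)
{
  return (unsigned int) ((addr & LAM_U57_TAG_MASK) >> LAM_U57_SHIFT);
}

/* Clear the tag field, giving back the plain address.  */
static uint64_t
lam_untag (uint64_t addr)
{
  return addr & ~LAM_U57_TAG_MASK;
}

/* Round-trip a tag through a made-up user-space address.  */
static void
check_lam_tagging (void)
{
  uint64_t addr = UINT64_C (0x00007ffd12345678);
  uint64_t tagged = lam_set_tag (addr, 0x2a);

  assert (lam_extract_tag (tagged) == 0x2a);
  assert (lam_untag (tagged) == addr);
  assert ((tagged >> 63) == 0);
}
```

Code pointers are a separate matter: as patch 1/2 notes, LAM does not
cover them, so a tagged function pointer has to be untagged before the
call.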

These two patches mainly implement those target hooks; HWASAN will not really
be enabled until the LAM kernel interface is finalized, which may take quite a
long time. We have verified our patches locally with a "fake" interface[4], and
decided to push the backend patches to GCC 13 to make other HWASAN developers'
work easier.

[1] https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557857.html
[3] 
https://www.intel.com/content/dam/develop/external/us/en/documents/architecture-instruction-set-extensions-programming-reference.pdf
[4] https://gitlab.com/x86-gcc/gcc/-/tree/users/intel/lam/master


Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

liuhongt (2):
  Implement hwasan target_hook.
  Enable hwasan for x86-64.

 gcc/config/i386/i386-expand.cc  |  12
 gcc/config/i386/i386-options.cc |   3 +
 gcc/config/i386/i386-opts.h     |   6 ++
 gcc/config/i386/i386-protos.h   |   2 +
 gcc/config/i386/i386.cc         | 123
 gcc/config/i386/i386.opt        |  16 +
 libsanitizer/configure.tgt      |   1 +
 7 files changed, 163 insertions(+)

-- 
2.18.1



[PATCH 1/2] Implement hwasan target_hook.

2022-11-10 Thread liuhongt via Gcc-patches
gcc/ChangeLog:

* config/i386/i386-opts.h (enum lam_type): New enum.
* config/i386/i386.cc (ix86_memtag_can_tag_addresses): New.
(ix86_memtag_set_tag): Ditto.
(ix86_memtag_extract_tag): Ditto.
(ix86_memtag_add_tag): Ditto.
(ix86_memtag_tag_size): Ditto.
(ix86_memtag_untagged_pointer): Ditto.
(TARGET_MEMTAG_CAN_TAG_ADDRESSES): New.
(TARGET_MEMTAG_ADD_TAG): Ditto.
(TARGET_MEMTAG_SET_TAG): Ditto.
(TARGET_MEMTAG_EXTRACT_TAG): Ditto.
(TARGET_MEMTAG_UNTAGGED_POINTER): Ditto.
(TARGET_MEMTAG_TAG_SIZE): Ditto.
(IX86_HWASAN_SHIFT): Ditto.
(IX86_HWASAN_TAG_SIZE): Ditto.
* config/i386/i386-expand.cc (ix86_expand_call): Untag code
pointer.
* config/i386/i386-options.cc (ix86_option_override_internal):
Error when -mlam=[u48|u57] is enabled for 32-bit code.
* config/i386/i386.opt: Add -mlam=[none|u48|u57].
* config/i386/i386-protos.h (ix86_memtag_untagged_pointer):
Declare.
(ix86_memtag_can_tag_addresses): Ditto.
---
 gcc/config/i386/i386-expand.cc  |  12
 gcc/config/i386/i386-options.cc |   3 +
 gcc/config/i386/i386-opts.h     |   6 ++
 gcc/config/i386/i386-protos.h   |   2 +
 gcc/config/i386/i386.cc         | 123
 gcc/config/i386/i386.opt        |  16 +
 6 files changed, 162 insertions(+)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 9c92b07d5cd..1af50c86c39 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -92,6 +92,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "i386-options.h"
 #include "i386-builtins.h"
 #include "i386-expand.h"
+#include "asan.h"
 
 /* Split one or more double-mode RTL references into pairs of half-mode
references.  The RTL can be REG, offsettable MEM, integer constant, or
@@ -9436,6 +9437,17 @@ ix86_expand_call (rtx retval, rtx fnaddr, rtx callarg1,
   fnaddr = gen_rtx_MEM (QImode, copy_to_mode_reg (word_mode, fnaddr));
 }
 
+  /* PR100665: Hwasan may tag code pointer which is not supported by LAM,
+ mask off code pointers here.
+ TODO: also need to handle indirect jump.  */
+  if (ix86_memtag_can_tag_addresses () && !fndecl
+  && sanitize_flags_p (SANITIZE_HWADDRESS))
+{
+  rtx untagged_addr = ix86_memtag_untagged_pointer (XEXP (fnaddr, 0),
+   NULL_RTX);
+  fnaddr = gen_rtx_MEM (QImode, untagged_addr);
+}
+
   call = gen_rtx_CALL (VOIDmode, fnaddr, callarg1);
 
   if (retval)
diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index e5c77f3a84d..b59ed5aee45 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -2006,6 +2006,9 @@ ix86_option_override_internal (bool main_args_p,
   if (TARGET_UINTR && !TARGET_64BIT)
 error ("%<-muintr%> not supported for 32-bit code");
 
+  if (ix86_lam_type && !TARGET_LP64)
+error ("%<-mlam=%> option: [u48|u57] not supported for 32-bit code");
+
   if (!opts->x_ix86_arch_string)
 opts->x_ix86_arch_string
   = TARGET_64BIT_P (opts->x_ix86_isa_flags)
diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index 8f71e89fa9a..d3bfeed0af2 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -128,4 +128,10 @@ enum harden_sls {
   harden_sls_all = harden_sls_return | harden_sls_indirect_jmp
 };
 
+enum lam_type {
+  lam_none = 0,
+  lam_u48 = 1,
+  lam_u57
+};
+
 #endif
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 5318fc7fddf..2533f17006d 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -227,6 +227,8 @@ extern void ix86_expand_atomic_fetch_op_loop (rtx, rtx, 
rtx, enum rtx_code,
  bool, bool);
 extern void ix86_expand_cmpxchg_loop (rtx *, rtx, rtx, rtx, rtx, rtx,
  bool, rtx_code_label *);
+extern rtx ix86_memtag_untagged_pointer (rtx, rtx);
+extern bool ix86_memtag_can_tag_addresses (void);
 
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f8586499cd1..e6609cc12bb 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24260,6 +24260,111 @@ ix86_push_rounding (poly_int64 bytes)
   return ROUND_UP (bytes, UNITS_PER_WORD);
 }
 
+/* Use 8 bits metadata start from bit48 for LAM_U48,
+   6 bits metadata start from bit57 for LAM_U57.  */
+#define IX86_HWASAN_SHIFT (ix86_lam_type == lam_u48\
+  ? 48 \
+  : (ix86_lam_type == lam_u57 ? 57 : 0))
+#define IX86_HWASAN_TAG_SIZE (ix86_lam_type == lam_u48 \
+ ? 8   \
+ : 

[PATCH 2/2] Enable hwasan for x86-64.

2022-11-10 Thread liuhongt via Gcc-patches
libsanitizer
* configure.tgt: Enable hwasan for x86-64.
---
 libsanitizer/configure.tgt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libsanitizer/configure.tgt b/libsanitizer/configure.tgt
index 87d8a2c3820..72385a4a39d 100644
--- a/libsanitizer/configure.tgt
+++ b/libsanitizer/configure.tgt
@@ -29,6 +29,7 @@ case "${target}" in
TSAN_SUPPORTED=yes
LSAN_SUPPORTED=yes
TSAN_TARGET_DEPENDENT_OBJECTS=tsan_rtl_amd64.lo
+   HWASAN_SUPPORTED=yes
fi
;;
   powerpc*-*-linux*)
-- 
2.18.1



Re: [PATCH v2 2/4] LoongArch: Add ftint{,rm,rp}.{w,l}.{s,d} instructions

2022-11-10 Thread Joseph Myers
On Thu, 10 Nov 2022, Xi Ruoyao via Gcc-patches wrote:

> Joseph: can you confirm that -ftrapping-math allows floor and ceil to
> raise inexact exception?  The man page currently says:
> 
> The default is -ffp-int-builtin-inexact, allowing the exception to be
> raised, unless C2X or a later C standard is selected.  This option does 
>^^^
> nothing unless -ftrapping-math is in effect.
> 
> To me it's not very clear that "this option" stands for -fno-fp-int-
> builtin-inexact or -ffp-int-builtin-inexact.

The -ftrapping-math option (which is on by default) means that we care 
about whether operations raise exceptions: they should raise exceptions if 
and only if the relevant standard permits them to do so.

The combination of -ftrapping-math with -fno-fp-int-builtin-inexact means 
the listed built-in functions must not raise "inexact".

If -fno-trapping-math is used, then we don't care about whether exceptions 
are raised or not (for any floating-point operations, not just those 
functions).  So given -fno-trapping-math, there is no difference between 
-fno-fp-int-builtin-inexact and -ffp-int-builtin-inexact.

With -ffp-int-builtin-inexact (the default before C2X), we don't care about
whether those functions raise "inexact" (but still care about other
exceptions and exceptions for other operations, unless
-fno-trapping-math is used).
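
The combinations above can be written down as a small decision table.
This is only a model of the explanation, not GCC code; the enum and the
function name are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* What the compiler has to guarantee about "inexact" for the listed
   built-in functions (floor, ceil and friends).  */
enum inexact_policy
{
  INEXACT_DONT_CARE,      /* May or may not be raised.  */
  INEXACT_MUST_NOT_RAISE  /* Must not be raised.  */
};

/* TRAPPING_MATH models -ftrapping-math; FP_INT_BUILTIN_INEXACT models
   -ffp-int-builtin-inexact.  */
static enum inexact_policy
fp_int_builtin_inexact_policy (bool trapping_math, bool fp_int_builtin_inexact)
{
  /* With -fno-trapping-math no exception is tracked at all, so the
     second option makes no difference.  */
  if (!trapping_math)
    return INEXACT_DONT_CARE;

  return fp_int_builtin_inexact ? INEXACT_DONT_CARE : INEXACT_MUST_NOT_RAISE;
}

/* Walk the four combinations described in the reply.  */
static void
check_inexact_policy (void)
{
  assert (fp_int_builtin_inexact_policy (true, false) == INEXACT_MUST_NOT_RAISE);
  assert (fp_int_builtin_inexact_policy (true, true) == INEXACT_DONT_CARE);
  assert (fp_int_builtin_inexact_policy (false, false) == INEXACT_DONT_CARE);
  assert (fp_int_builtin_inexact_policy (false, true) == INEXACT_DONT_CARE);
}
```

So only -ftrapping-math together with -fno-fp-int-builtin-inexact
constrains the instructions a backend may use for these functions.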

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c-family: Support #pragma region/endregion [PR85487]

2022-11-10 Thread Joseph Myers
On Thu, 10 Nov 2022, Jonathan Wakely via Gcc-patches wrote:

> Something similar has been proposed before, but didn't get approval.
> Jeff wanted a more general framework for ignoring pragmas. It might make
> sense to do that, and reuse it for the Darwin-specific 'mark' pragmas.
> But as I said in the PR, I looked at the darwin ones and they are unique
> among all pragmas in GCC. I am not going there, sorry :-)
> 
> In the PR it was suggested that we should check for syntax errors in
> these pragmas or make sure there are matching region/endregion pairs.
> I disagree. This is a simple, low-risk patch that removes unhelpful
> warnings for users who have these macros for their editor to process. 
> It's not our business to check for correct use of these macros, their
> meaning is determined by other tools that we don't control. We should
> just not complain when we see them, and no more.
> 
> Tested powerpc64le-linux. OK for trunk?

OK.
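
For readers who have not met these pragmas, a minimal sketch of the kind
of source the patch is about (the helper names are made up for
illustration).  The pragmas only mark a foldable block for an editor; the
compiler is expected to ignore them, and without the patch GCC would
presumably report them under -Wunknown-pragmas.

```c
#include <assert.h>

#pragma region Arithmetic helpers
/* Everything between the two pragmas is ordinary code.  */
static int
add (int a, int b)
{
  return a + b;
}

static int
mul (int a, int b)
{
  return a * b;
}
#pragma endregion

/* The pragmas must not change what the code does.  */
static void
check_region_helpers (void)
{
  assert (add (2, 3) == 5);
  assert (mul (2, 3) == 6);
}
```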

-- 
Joseph S. Myers
jos...@codesourcery.com


demangler: Templated lambda demangling

2022-11-10 Thread Nathan Sidwell via Gcc-patches

Templated lambdas have a template-head, which is part of their
signature.  GCC ABI 18 mangles that into the lambda name.  This adds
support to the demangler.  We have to introduce artificial template
parameter names, as we need to refer to them from later components of
the lambda signature. We use $T:n, $N:n and $TT:n for type, non-type
and template parameters.  Non-type parameter names are not shown in
the strictly correct location -- for instance 'int () ()' would be
shown as 'int (&) $N:n'.  That's unfortunate, but an orthogonal issue.
The 'is_lambda_arg' field is now repurposed as indicating the number
of explicit template parameters (1-based).

I'll commit in a few days.

nathan


--
Nathan Sidwell

From b7f0ba90011b8c9cae7b7278463f609ba21cd44b Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Mon, 7 Nov 2022 11:24:14 -0500
Subject: [PATCH] demangler: Templated lambda demangling

Templated lambdas have a template-head, which is part of their
signature.  GCC ABI 18 mangles that into the lambda name.  This adds
support to the demangler.  We have to introduce artificial template
parameter names, as we need to refer to them from later components of
the lambda signature. We use $T:n, $N:n and $TT:n for type, non-type
and template parameters.  Non-type parameter names are not shown in
the strictly correct location -- for instance 'int () ()' would be
shown as 'int (&) $N:n'.  That's unfortunate, but an orthogonal issue.
The 'is_lambda_arg' field is now repurposed as indicating the number
of explicit template parameters (1-based).

	include/
	* demangle.h (enum demangle_component_type): Add
	DEMANGLE_COMPONENT_TEMPLATE_HEAD,
	DEMANGLE_COMPONENT_TEMPLATE_TYPE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_NON_TYPE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_TEMPLATE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_PACK_PARM.
	libiberty/
	* cp-demangle.c (struct d_print_info): Rename is_lambda_arg to
	lambda_tpl_parms.  Augment semantics.
	(d_make_comp): Add checks for new components.
	(d_template_parm, d_template_head): New.
	(d_lambda): Add templated lambda support.
	(d_print_init): Adjust.
	(d_print_lambda_parm_name): New.
	(d_print_comp_inner): Support templated lambdas.
	* testsuite/demangle-expected: Add testcases.
---
 include/demangle.h|   6 +
 libiberty/cp-demangle.c   | 260 +++---
 libiberty/testsuite/demangle-expected |  53 ++
 3 files changed, 295 insertions(+), 24 deletions(-)

diff --git a/include/demangle.h b/include/demangle.h
index 81d4353a86f..66637ebdc16 100644
--- a/include/demangle.h
+++ b/include/demangle.h
@@ -458,6 +458,12 @@ enum demangle_component_type
   DEMANGLE_COMPONENT_MODULE_ENTITY,
   DEMANGLE_COMPONENT_MODULE_INIT,
 
+  DEMANGLE_COMPONENT_TEMPLATE_HEAD,
+  DEMANGLE_COMPONENT_TEMPLATE_TYPE_PARM,
+  DEMANGLE_COMPONENT_TEMPLATE_NON_TYPE_PARM,
+  DEMANGLE_COMPONENT_TEMPLATE_TEMPLATE_PARM,
+  DEMANGLE_COMPONENT_TEMPLATE_PACK_PARM,
+
   /* A builtin type with argument.  This holds the builtin type
  information.  */
   DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 8413dcdc785..ad533f6085e 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -347,9 +347,9 @@ struct d_print_info
   /* Number of times d_print_comp was recursively called.  Should not
  be bigger than MAX_RECURSION_COUNT.  */
   int recursion;
-  /* Non-zero if we're printing a lambda argument.  A template
- parameter reference actually means 'auto', a pack expansion means T...  */
-  int is_lambda_arg;
+  /* 1 more than the number of explicit template parms of a lambda.  Template
+ parm references >= are actually 'auto'.  */
+  int lambda_tpl_parms;
   /* The current index into any template argument packs we are using
  for printing, or -1 to print the whole pack.  */
   int pack_index;
@@ -491,6 +491,10 @@ static struct demangle_component *d_local_name (struct d_info *);
 
 static int d_discriminator (struct d_info *);
 
+static struct demangle_component *d_template_parm (struct d_info *, int *bad);
+
+static struct demangle_component *d_template_head (struct d_info *, int *bad);
+
 static struct demangle_component *d_lambda (struct d_info *);
 
 static struct demangle_component *d_unnamed_type (struct d_info *);
@@ -1028,6 +1032,10 @@ d_make_comp (struct d_info *di, enum demangle_component_type type,
 case DEMANGLE_COMPONENT_TPARM_OBJ:
 case DEMANGLE_COMPONENT_STRUCTURED_BINDING:
 case DEMANGLE_COMPONENT_MODULE_INIT:
+case DEMANGLE_COMPONENT_TEMPLATE_HEAD:
+case DEMANGLE_COMPONENT_TEMPLATE_NON_TYPE_PARM:
+case DEMANGLE_COMPONENT_TEMPLATE_TEMPLATE_PARM:
+case DEMANGLE_COMPONENT_TEMPLATE_PACK_PARM:
   if (left == NULL)
 	return NULL;
   break;
@@ -1050,6 +1058,7 @@ d_make_comp (struct d_info *di, enum demangle_component_type type,
 case DEMANGLE_COMPONENT_CONST:
 case DEMANGLE_COMPONENT_ARGLIST:
 case DEMANGLE_COMPONENT_TEMPLATE_ARGLIST:
+case 

[PATCH] d: Update __FreeBSD_version values [PR107469]

2022-11-10 Thread Lorenzo Salvadore via Gcc-patches
Hello,

I would like to submit the patch below. Gerald Pfeifer already
volunteered to commit it once approved.

Thanks,

Lorenzo Salvadore

---

Update __FreeBSD_version values for the latest FreeBSD supported
versions. In particular, add __FreeBSD_version for FreeBSD 14, which is
necessary to compile libphobos successfully on FreeBSD 14.

The patch has already been applied successfully in the official FreeBSD
ports tree for the ports lang/gcc11 and lang/gcc11-devel. Please see the
following commits:

https://cgit.freebsd.org/ports/commit/?id=f61fb49b2e76fd4f7a5b7a11510b5109206c19f2
https://cgit.freebsd.org/ports/commit/?id=57936dba89ea208e5dbc1bd2d7fda3d29a1838b3

libphobos/ChangeLog:

2022-11-10  Lorenzo Salvadore  

PR d/107469.
* libdruntime/core/sys/freebsd/config.d: Update __FreeBSD_version.

---
 libphobos/libdruntime/core/sys/freebsd/config.d | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/libphobos/libdruntime/core/sys/freebsd/config.d 
b/libphobos/libdruntime/core/sys/freebsd/config.d
index 5e3129e2422..9d502e52e32 100644
--- a/libphobos/libdruntime/core/sys/freebsd/config.d
+++ b/libphobos/libdruntime/core/sys/freebsd/config.d
@@ -14,8 +14,9 @@ public import core.sys.posix.config;
 // NOTE: When adding newer versions of FreeBSD, verify all current versioned
 // bindings are still compatible with the release.

- version (FreeBSD_13) enum __FreeBSD_version = 1300000;
-else version (FreeBSD_12) enum __FreeBSD_version = 1202000;
+ version (FreeBSD_14) enum __FreeBSD_version = 1400000;
+else version (FreeBSD_13) enum __FreeBSD_version = 1301000;
+else version (FreeBSD_12) enum __FreeBSD_version = 1203000;
 else version (FreeBSD_11) enum __FreeBSD_version = 1104000;
 else version (FreeBSD_10) enum __FreeBSD_version = 1004000;
 else version (FreeBSD_9)  enum __FreeBSD_version = 903000;
--
2.38.0


[PATCH] Fortran: fix treatment of character, value, optional dummy arguments [PR107444]

2022-11-10 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

the attached patch is a follow-up to the fix for PR107441,
as it finally fixes the treatment of character dummy arguments
that have the value,optional attribute, and allows for checking
of the presence of such arguments.

This entails a small ABI clarification, as the previous text
was not really clear on the argument passing conventions,
and the previously generated code was inconsistent at best,
or rather wrong, for this kind of procedure arguments.
(E.g. the number of passed arguments was varying...)

Testcase cross-checked with NAG 7.1.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From d87e299dd2b7f4be6ca829e80cd94babc53fa12f Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 10 Nov 2022 22:30:27 +0100
Subject: [PATCH] Fortran: fix treatment of character, value, optional dummy
 arguments [PR107444]

Fix handling of character dummy arguments that have the optional+value
attribute.  Change name of internal symbols that carry the hidden presence
status of optional arguments to distinguish them from the internal hidden
character length.  Update documentation to clarify the gfortran ABI.

gcc/fortran/ChangeLog:

	PR fortran/107444
	* trans-decl.cc (create_function_arglist): Extend presence status
	to all intrinsic types, and change prefix of internal symbol to '.'.
	* trans-expr.cc (gfc_conv_expr_present): Align to changes in
	create_function_arglist.
	(gfc_conv_procedure_call): Fix generation of procedure arguments for
	the case of character dummy arguments with optional+value attribute.
	* trans-types.cc (gfc_get_function_type): Synchronize with changes
	to create_function_arglist.
	* doc/gfortran/naming-and-argument-passing-conventions.rst: Clarify
	the gfortran argument passing conventions with regard to OPTIONAL
	dummy arguments of intrinsic type.

gcc/testsuite/ChangeLog:

	PR fortran/107444
	* gfortran.dg/optional_absent_7.f90: Adjust regex.
	* gfortran.dg/optional_absent_8.f90: New test.
---
 ...aming-and-argument-passing-conventions.rst |  3 +-
 gcc/fortran/trans-decl.cc | 10 ++--
 gcc/fortran/trans-expr.cc | 25 ++---
 gcc/fortran/trans-types.cc| 14 ++---
 .../gfortran.dg/optional_absent_7.f90 |  2 +-
 .../gfortran.dg/optional_absent_8.f90 | 53 +++
 6 files changed, 84 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/optional_absent_8.f90

diff --git a/gcc/fortran/doc/gfortran/naming-and-argument-passing-conventions.rst b/gcc/fortran/doc/gfortran/naming-and-argument-passing-conventions.rst
index 4baaee9bfec..fa999fac355 100644
--- a/gcc/fortran/doc/gfortran/naming-and-argument-passing-conventions.rst
+++ b/gcc/fortran/doc/gfortran/naming-and-argument-passing-conventions.rst
@@ -142,8 +142,7 @@ is used for dummy arguments; with ``VALUE``, those variables are
 passed by value.

 For ``OPTIONAL`` dummy arguments, an absent argument is denoted
-by a NULL pointer, except for scalar dummy arguments of type
-``INTEGER``, ``LOGICAL``, ``REAL`` and ``COMPLEX``
+by a NULL pointer, except for scalar dummy arguments of intrinsic type
 which have the ``VALUE`` attribute.  For those, a hidden Boolean
 argument (``logical(kind=C_bool),value``) is used to indicate
 whether the argument is present.
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 94988b8690e..217de6b8da0 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -2708,16 +2708,16 @@ create_function_arglist (gfc_symbol * sym)
 		type = gfc_sym_type (f->sym);
 	}
 	}
-  /* For noncharacter scalar intrinsic types, VALUE passes the value,
+  /* For scalar intrinsic types, VALUE passes the value,
 	 hence, the optional status cannot be transferred via a NULL pointer.
 	 Thus, we will use a hidden argument in that case.  */
-  else if (f->sym->attr.optional && f->sym->attr.value
-	   && !f->sym->attr.dimension && f->sym->ts.type != BT_CLASS
-	   && !gfc_bt_struct (f->sym->ts.type))
+  if (f->sym->attr.optional && f->sym->attr.value
+	  && !f->sym->attr.dimension && f->sym->ts.type != BT_CLASS
+	  && !gfc_bt_struct (f->sym->ts.type))
 	{
   tree tmp;
   strcpy (&name[1], f->sym->name);
-  name[0] = '_';
+	  name[0] = '.';
   tmp = build_decl (input_location,
 			PARM_DECL, get_identifier (name),
 			boolean_type_node);
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index f3fbb527157..b95c5cf2f96 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -1985,15 +1985,14 @@ gfc_conv_expr_present (gfc_symbol * sym, bool use_saved_desc)

   /* Intrinsic scalars with VALUE attribute which are passed by value
  use a hidden argument to denote the present status.  */
-  if (sym->attr.value && sym->ts.type != BT_CHARACTER
-  && sym->ts.type != BT_CLASS && sym->ts.type != BT_DERIVED
-  && !sym->attr.dimension)
+  if (sym->attr.value && 

[PATCH] RISC-V: Optimize masking with two clear bits not a SMALL_OPERAND

2022-11-10 Thread Philipp Tomsich
Add a split for cases where we can use two bclri (or one bclri and an
andi) to clear two bits.

gcc/ChangeLog:

* config/riscv/bitmanip.md (*bclri_nottwobits): New pattern.
(*bclridisi_nottwobits): New pattern, handling the sign-bit.
* config/riscv/predicates.md (const_nottwobits_operand):
New predicate.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bclri.c: New test.

Signed-off-by: Philipp Tomsich 
---

 gcc/config/riscv/bitmanip.md   | 38 ++
 gcc/config/riscv/predicates.md |  5 +++
 gcc/testsuite/gcc.target/riscv/zbs-bclri.c | 12 +++
 3 files changed, 55 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-bclri.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 7fa8461bb71..f1d8f24c2d3 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -560,6 +560,44 @@
   "bclri\t%0,%1,%T2"
   [(set_attr "type" "bitmanip")])
 
+;; In case we have "val & ~IMM" where ~IMM has 2 bits set.
+(define_insn_and_split "*bclri_nottwobits"
+  [(set (match_operand:X 0 "register_operand" "=r")
+   (and:X (match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "const_nottwobits_operand" "i")))]
+  "TARGET_ZBS && !paradoxical_subreg_p (operands[1])"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (and:X (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (and:X (match_dup 0) (match_dup 4)))]
+{
+   unsigned HOST_WIDE_INT bits = ~UINTVAL (operands[2]);
+   unsigned HOST_WIDE_INT topbit = HOST_WIDE_INT_1U << floor_log2 (bits);
+
+   operands[3] = GEN_INT (~bits | topbit);
+   operands[4] = GEN_INT (~topbit);
+})
+
+;; In case of a paradoxical subreg, the sign bit and the high bits are
+;; not allowed to be changed
+(define_insn_and_split "*bclridisi_nottwobits"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (and:DI (match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2 "const_nottwobits_operand" "i")))]
+  "TARGET_64BIT && TARGET_ZBS
+   && clz_hwi (~UINTVAL (operands[2])) > 33"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (and:DI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (and:DI (match_dup 0) (match_dup 4)))]
+{
+   unsigned HOST_WIDE_INT bits = ~UINTVAL (operands[2]);
+   unsigned HOST_WIDE_INT topbit = HOST_WIDE_INT_1U << floor_log2 (bits);
+
+   operands[3] = GEN_INT (~bits | topbit);
+   operands[4] = GEN_INT (~topbit);
+})
+
 (define_insn "*binv"
   [(set (match_operand:X 0 "register_operand" "=r")
(xor:X (ashift:X (const_int 1)
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 6de9b39e39b..b368c11c930 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -304,6 +304,11 @@
(match_test "ctz_hwi (INTVAL (op)) > 0")
(match_test "SMALL_OPERAND (INTVAL (op) >> ctz_hwi (INTVAL (op)))")))
 
+;; A CONST_INT operand that has exactly two bits cleared.
+(define_predicate "const_nottwobits_operand"
+  (and (match_code "const_int")
+   (match_test "popcount_hwi (~UINTVAL (op)) == 2")))
+
 ;; A CONST_INT operand that fits into the unsigned half of a
 ;; signed-immediate after the top bit has been cleared.
 (define_predicate "uimm_extra_bit_operand"
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bclri.c 
b/gcc/testsuite/gcc.target/riscv/zbs-bclri.c
new file mode 100644
index 000..12e2063436c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bclri.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+/* bclri + bclri */
+long long f5 (long long a)
+{
+  return a & ~0x11000;
+}
+
+/* { dg-final { scan-assembler-times "bclri\t" 2 } } */
+
-- 
2.34.1
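
For readers who want to convince themselves that the two-step split is equivalent to the original AND, here is a minimal host-side sketch of the arithmetic in the splitter body. It is not GCC code: highest_set_bit is a hypothetical stand-in for floor_log2, and the function names are made up for illustration.

```c
#include <stdint.h>

/* Index of the highest set bit; a hypothetical stand-in for GCC's
   floor_log2.  The caller must pass a nonzero value.  */
static int highest_set_bit (uint64_t v)
{
  int n = 0;
  while (v >>= 1)
    n++;
  return n;
}

/* Split MASK (all ones except two cleared bits) into two masks that
   each clear a single bit, mirroring how the splitter body computes
   operands[3] and operands[4].  */
static void split_two_clear_bits (uint64_t mask, uint64_t *first,
                                  uint64_t *second)
{
  uint64_t bits = ~mask;                        /* the two bits to clear */
  uint64_t topbit = UINT64_C (1) << highest_set_bit (bits);

  *first = ~bits | topbit;                      /* clears only the lower bit */
  *second = ~topbit;                            /* clears only the upper bit */
}

/* Apply the two single-bit masks one after the other.  */
static uint64_t and_in_two_steps (uint64_t a, uint64_t mask)
{
  uint64_t m1, m2;
  split_two_clear_bits (mask, &m1, &m2);
  return (a & m1) & m2;
}
```

For the test-case constant ~0x11000 this yields the masks ~0x1000 and ~0x10000, each of which has a single cleared bit and so should be a candidate for one bclri.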



[PATCH] RISC-V: Use binvi to cover more immediates than with xori alone

2022-11-10 Thread Philipp Tomsich
Sequences of the form "a ^ C" with C being the positive half of a
signed immediate's range with one extra bit set in addition are mapped
to xori and one binvi to avoid using a temporary (and a multi-insn
sequence to load C into that temporary).

gcc/ChangeLog:

* config/riscv/bitmanip.md (*binvi_extrabit): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-binvi.c: New test.

Signed-off-by: Philipp Tomsich 
---
- Depends on a predicate posted in "RISC-V: Optimize branches testing
  a bit-range or a shifted immediate".  Depending on the order of
  applying these, I'll take care to pull that part out of the other
  patch if needed.

 gcc/config/riscv/bitmanip.md   | 19 +++
 gcc/testsuite/gcc.target/riscv/zbs-binvi.c | 22 ++
 2 files changed, 41 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-binvi.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 436ff4ba958..7fa8461bb71 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -577,6 +577,25 @@
   "binvi\t%0,%1,%S2"
   [(set_attr "type" "bitmanip")])
 
+; Catch those cases where we can use a binvi + xori or binvi + binvi
+; instead of a lui + addi + xor sequence.
+(define_insn_and_split "*binvi_extrabit"
+  [(set (match_operand:X 0 "register_operand" "=r")
+   (xor:X (match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "uimm_extra_bit_operand" "i")))]
+  "TARGET_ZBS"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (xor:X (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (xor:X (match_dup 0) (match_dup 4)))]
+{
+   unsigned HOST_WIDE_INT bits = UINTVAL (operands[2]);
+   unsigned HOST_WIDE_INT topbit = HOST_WIDE_INT_1U << floor_log2 (bits);
+
+   operands[3] = GEN_INT (bits &~ topbit);
+   operands[4] = GEN_INT (topbit);
+})
+
 (define_insn "*bext"
   [(set (match_operand:X 0 "register_operand" "=r")
(zero_extract:X (match_operand:X 1 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-binvi.c 
b/gcc/testsuite/gcc.target/riscv/zbs-binvi.c
new file mode 100644
index 000..c2d6725b53b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-binvi.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long f3(long long a)
+{
+  return a ^ 0x1100;
+}
+
+long long f4 (long long a)
+{
+  return a ^ 0x80ffull;
+}
+
+long long f5 (long long a)
+{
+  return a ^ 0x8010ull;
+}
+
+/* { dg-final { scan-assembler-times "binvi\t" 4 } } */
+/* { dg-final { scan-assembler-times "xori\t" 2 } } */
+
-- 
2.34.1
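
As a sanity check of the split used above, the following sketch reproduces the operand computation on the host. It is only an illustration under assumptions: top_set_bit is a hypothetical helper that replaces HOST_WIDE_INT_1U << floor_log2 (bits), and the constant in the usage note is chosen for the example rather than taken from the test case.

```c
#include <stdint.h>

/* Isolate the highest set bit of V by repeatedly clearing the lowest
   set bit until only one remains.  Returns 0 for V == 0.  */
static uint64_t top_set_bit (uint64_t v)
{
  while (v & (v - 1))
    v &= v - 1;
  return v;
}

/* XOR with C in two steps: first the low part (the candidate for
   xori), then the single top bit (the candidate for binvi).  */
static uint64_t xor_in_two_steps (uint64_t a, uint64_t c)
{
  uint64_t topbit = top_set_bit (c);
  uint64_t low = c & ~topbit;   /* corresponds to operands[3] */

  return (a ^ low) ^ topbit;    /* topbit corresponds to operands[4] */
}
```

Because XOR is associative and the two parts do not overlap, xor_in_two_steps (a, 0x10000100) gives the same result as a ^ 0x10000100 for any a.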



[PATCH] RISC-V: Use bseti to cover more immediates than with ori alone

2022-11-10 Thread Philipp Tomsich
Sequences of the form "a | C" with C being the positive half of a
signed immediate's range with one extra bit set in addition are mapped
to ori and one bseti to avoid using a temporary (and a multi-insn
sequence to load C into that temporary).

gcc/ChangeLog:

* config/riscv/bitmanip.md (*bseti_extrabit): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bseti.c: New test.

Signed-off-by: Philipp Tomsich 
---
- Depends on a predicate posted in "RISC-V: Optimize branches testing
  a bit-range or a shifted immediate".  Depending on the order of
  applying these, I'll take care to pull that part out of the other
  patch if needed.

 gcc/config/riscv/bitmanip.md   | 19 +++
 gcc/testsuite/gcc.target/riscv/zbs-bseti.c | 27 ++
 2 files changed, 46 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-bseti.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 06126ac4819..436ff4ba958 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -512,6 +512,25 @@
   "bseti\t%0,%1,%S2"
   [(set_attr "type" "bitmanip")])
 
+; Catch those cases where we can use a bseti + ori or bseti + bseti
+; instead of a lui + addi + or sequence.
+(define_insn_and_split "*bseti_extrabit"
+  [(set (match_operand:X 0 "register_operand" "=r")
+   (ior:X (match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "uimm_extra_bit_operand" "i")))]
+  "TARGET_ZBS"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (ior:X (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (ior:X (match_dup 0) (match_dup 4)))]
+{
+   unsigned HOST_WIDE_INT bits = UINTVAL (operands[2]);
+   unsigned HOST_WIDE_INT topbit = HOST_WIDE_INT_1U << floor_log2 (bits);
+
+   operands[3] = GEN_INT (bits &~ topbit);
+   operands[4] = GEN_INT (topbit);
+})
+
 ;; As long as the SImode operand is not a partial subreg, we can use a
 ;; bseti without postprocessing, as the middle end is smart enough to
 ;; stay away from the signbit.
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bseti.c 
b/gcc/testsuite/gcc.target/riscv/zbs-bseti.c
new file mode 100644
index 000..5738add6348
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bseti.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long foo1 (long long a)
+{
+  return a | 0x1100;
+}
+
+long long foo2 (long long a)
+{
+  return a | 0x80ffull;
+}
+
+long long foo3 (long long a)
+{
+  return a | 0x8001ull;
+}
+
+long long foo4 (long long a)
+{
+  return a | 0xfff;
+}
+
+/* { dg-final { scan-assembler-times "bseti\t" 5 } } */
+/* { dg-final { scan-assembler-times "ori\t" 3 } } */
+
-- 
2.34.1
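
The sketch below illustrates which constants the ori + bseti split is meant for and that the split preserves the result. It rests on an assumption: fits_ori_plus_bseti encodes one reading of the uimm_extra_bit_operand comment (after clearing the top bit, the rest fits in 0..2047), not the actual predicate, and it does not exclude constants that a single ori already covers. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Isolate the highest set bit of V; returns 0 for V == 0.  */
static uint64_t highest_bit_of (uint64_t v)
{
  while (v & (v - 1))
    v &= v - 1;
  return v;
}

/* Assumed condition: once the top bit is removed, the remainder must
   fit in the non-negative half of a 12-bit signed immediate.  */
static bool fits_ori_plus_bseti (uint64_t c)
{
  if (c == 0)
    return false;
  return (c & ~highest_bit_of (c)) <= 0x7ff;
}

/* OR with C in two steps: the low part first (ori candidate), then
   the single top bit (bseti candidate).  */
static uint64_t or_in_two_steps (uint64_t a, uint64_t c)
{
  uint64_t topbit = highest_bit_of (c);

  return (a | (c & ~topbit)) | topbit;
}
```

Under this reading 0xfff qualifies (0x7ff plus bit 11), while 0x1800 does not, because its remainder 0x800 no longer fits the non-negative immediate range.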



[PATCH v2] RISC-V: costs: support shift-and-add in strength-reduction

2022-11-10 Thread Philipp Tomsich
The strength-reduction implementation in expmed.cc will assess the
profitability of using shift-and-add using a RTL expression that wraps
a MULT (with a power-of-2) in a PLUS.  Unless the RISC-V rtx_costs
function recognizes this as expressing a sh[123]add instruction, we
will return an inflated cost---thus defeating the optimization.

This change adds the necessary idiom recognition to provide an
accurate cost for this form of expressing sh[123]add.

Instead of expanding to
li      a5,200
mulw    a0,a5,a0
with this change, the expression 'a * 200' is synthesized as:
sh2add  a0,a0,a0   // *5 = a + 4 * a
sh2add  a0,a0,a0   // *5 = a + 4 * a
slli    a0,a0,3    // *8

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_rtx_costs): Recognize shNadd,
if expressed as a plus and multiplication with a power-of-2.
Split costing for MINUS from PLUS.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zba-shNadd-07.c: New test.

Signed-off-by: Philipp Tomsich 
---

Changes in v2:
- Split rtx_costs calculation for MINUS from PLUS to ensure that
  (minus reg (ashift reg SHAMT)) is not mistaken for a shNadd
- Add testcase

 gcc/config/riscv/riscv.cc | 19 
 .../gcc.target/riscv/zba-shNadd-07.c  | 31 +++
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-07.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3e2dc8192e4..2a94482b8ed 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2428,6 +2428,12 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
   return false;
 
 case MINUS:
+  if (float_mode_p)
+   *total = tune_param->fp_add[mode == DFmode];
+  else
+   *total = riscv_binary_cost (x, 1, 4);
+  return false;
+
 case PLUS:
   /* add.uw pattern for zba.  */
   if (TARGET_ZBA
@@ -2451,6 +2457,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
  *total = COSTS_N_INSNS (1);
  return true;
}
+  /* Before strength-reduction, the shNadd can be expressed as the addition
+of a multiplication with a power-of-two.  If this case is not handled,
+the strength-reduction in expmed.c will calculate an inflated cost. */
+  if (TARGET_ZBA
+ && mode == word_mode
+ && GET_CODE (XEXP (x, 0)) == MULT
+ && REG_P (XEXP (XEXP (x, 0), 0))
+ && CONST_INT_P (XEXP (XEXP (x, 0), 1))
+ && IN_RANGE (pow2p_hwi (INTVAL (XEXP (XEXP (x, 0), 1))), 1, 3))
+   {
+ *total = COSTS_N_INSNS (1);
+ return true;
+   }
   /* shNadd.uw pattern for zba.
 [(set (match_operand:DI 0 "register_operand" "=r")
   (plus:DI
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shNadd-07.c 
b/gcc/testsuite/gcc.target/riscv/zba-shNadd-07.c
new file mode 100644
index 000..98d35e1da9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shNadd-07.c
@@ -0,0 +1,31 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zba -mabi=lp64 -O2" } */
+
+unsigned long
+f1 (unsigned long i)
+{
+  return i * 200;
+}
+
+unsigned long
+f2 (unsigned long i)
+{
+  return i * 783;
+}
+
+unsigned long
+f3 (unsigned long i)
+{
+  return i * 784;
+}
+
+unsigned long
+f4 (unsigned long i)
+{
+  return i * 1574;
+}
+
+/* { dg-final { scan-assembler-times "sh2add" 2 } } */
+/* { dg-final { scan-assembler-times "sh1add" 2 } } */
+/* { dg-final { scan-assembler-times "slli" 5 } } */
+/* { dg-final { scan-assembler-times "mul" 1 } } */
-- 
2.34.1
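
To make the arithmetic behind the synthesized sequences easy to check, here is a small C model of them. It is a sketch, not compiler output: sh1add and sh2add are modeled as plain functions following the Zba definition rd = rs2 + (rs1 << N), and mul200 and mul783 are hypothetical names for the 'a * 200' sequence from the commit message and the 'val * 783' sequence discussed in the v1 review thread.

```c
#include <stdint.h>

/* Model of the Zba instructions: rd = rs2 + (rs1 << N).  */
static uint64_t sh1add (uint64_t rs1, uint64_t rs2)
{
  return (rs1 << 1) + rs2;
}

static uint64_t sh2add (uint64_t rs1, uint64_t rs2)
{
  return (rs1 << 2) + rs2;
}

/* a * 200 = ((a * 5) * 5) * 8  */
static uint64_t mul200 (uint64_t a)
{
  a = sh2add (a, a);            /* a * 5   */
  a = sh2add (a, a);            /* a * 25  */
  return a << 3;                /* a * 200 */
}

/* a * 783 = ((3a * 16 + a) * 16) - a  */
static uint64_t mul783 (uint64_t a)
{
  uint64_t t = sh1add (a, a);   /* 3a   */
  t <<= 4;                      /* 48a  */
  t += a;                       /* 49a  */
  t <<= 4;                      /* 784a */
  return t - a;                 /* 783a */
}
```

Since unsigned arithmetic wraps modulo 2^64, both functions agree with a plain multiplication for every input, which is what lets the cost model choose between the two forms freely.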



Re: [PATCH] RISC-V: costs: support shift-and-add in strength-reduction

2022-11-10 Thread Philipp Tomsich
On Thu, 10 Nov 2022 at 21:47, Palmer Dabbelt  wrote:
>
> On Thu, 10 Nov 2022 07:09:35 PST (-0800), philipp.toms...@vrull.eu wrote:
> > On Thu, 10 Nov 2022 at 02:46, Palmer Dabbelt  wrote:
> >>
> >> On Tue, 08 Nov 2022 11:54:34 PST (-0800), philipp.toms...@vrull.eu wrote:
> >> > The strength-reduction implementation in expmed.c will assess the
> >> > profitability of using shift-and-add using a RTL expression that wraps
> >> > a MULT (with a power-of-2) in a PLUS.  Unless the RISC-V rtx_costs
> >> > function recognizes this as expressing a sh[123]add instruction, we
> >> > will return an inflated cost---thus defeating the optimization.
> >> >
> >> > This change adds the necessary idiom recognition to provide an
> >> > accurate cost for this form of expressing sh[123]add.
> >> >
> >> > Instead of expanding to
> >> >   li      a5,200
> >> >   mulw    a0,a5,a0
> >> > with this change, the expression 'a * 200' is synthesized as:
> >> >   sh2add  a0,a0,a0   // *5 = a + 4 * a
> >> >   sh2add  a0,a0,a0   // *5 = a + 4 * a
> >> >   slli    a0,a0,3    // *8
> >>
> >> That's more instructions, but multiplication is generally expensive.  At
> >> some point I remember the SiFive cores getting very fast integer
> >> multipliers, but I don't see that reflected in the cost model anywhere
> >> so maybe I'm just wrong?  Andrew or Kito might remember...
> >>
> >> If the mul-based sequences are still faster on the SiFive cores then we
> >> should probably find a way to keep emitting them, which may just be a
> >> matter of adjusting those multiply costs.  Moving to the shift-based
> >> sequences seems reasonable for a generic target, though.
> >
> > The cost for a regular MULT is COSTS_N_INSNS(4) for the series-7 (see
> > the SImode and DImode entries in the int_mul line):
> > /* Costs to use when optimizing for Sifive 7 Series.  */
> > static const struct riscv_tune_param sifive_7_tune_info = {
> >   {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_add */
> >   {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_mul */
> >   {COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */
> >   {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},   /* int_mul */
> >   {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},   /* int_div */
> >   2,/* issue_rate */
> >   4,/* branch_cost */
> >   3,/* memory_cost */
> >   8,/* fmv_cost */
> >   true, /* slow_unaligned_access */
> > };
> >
> > So the break-even is at COSTS_N_INSNS(4) + rtx_cost(immediate).
> >
> > Testing against series-7, we get up to 5 (4 for the mul + 1 for the
> > li) instructions from strength reduction:
> >
> > val * 783
> > =>
> > sh1add a5,a0,a0
> > slli a5,a5,4
> > add a5,a5,a0
> > slli a5,a5,4
> > sub a0,a5,a0
> >
> > but fall back to a mul, once the cost exceeds this:
> >
> > val * 1574
> > =>
> > li a5,1574
> > mul a0,a0,a5
>
> That's just the cost model, though, not the hardware.  My argument was
> essentially that the cost model is wrong, assuming how I remember the
> hardware is right.  That was a while ago and there's a lot of options,
> though, so I'm not sure what these things actually look like.
>
> IMO that doesn't need to block this patch, though: having one incorrect
> cost model so it cancels out another one is a great way to lose our
> minds.
>
> >> Either way, it probably warrants a test case to make sure we don't
> >> regress in the future.
> >
> > Ack. Will be added for v2.
> >
> >>
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> >   * config/riscv/riscv.c (riscv_rtx_costs): Recognize shNadd,
> >> >   if expressed as a plus and multiplication with a power-of-2.
> >
> > This will still need to be regenerated (it's referring to a '.c'
> > extension still).
> >
> >> >
> >> > ---
> >> >
> >> >  gcc/config/riscv/riscv.cc | 13 +
> >> >  1 file changed, 13 insertions(+)
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >> > index ab6c745c722..0b2c4b3599d 100644
> >> > --- a/gcc/config/riscv/riscv.cc
> >> > +++ b/gcc/config/riscv/riscv.cc
> >> > @@ -2451,6 +2451,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
> >> > outer_code, int opno ATTRIBUTE_UN
> >> > *total = COSTS_N_INSNS (1);
> >> > return true;
> >> >   }
> >> > +  /* Before strength-reduction, the shNadd can be expressed as the 
> >> > addition
> >> > +  of a multiplication with a power-of-two.  If this case is not 
> >> > handled,
> >> > +  the strength-reduction in expmed.c will calculate an inflated 
> >> > cost. */
> >> > +  if (TARGET_ZBA
> >> > +   && mode == word_mode
> >> > +   && GET_CODE (XEXP (x, 0)) == MULT
> >> > +   && REG_P (XEXP (XEXP (x, 0), 0))
> >> > +   && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> >> > +   && IN_RANGE (pow2p_hwi (INTVAL (XEXP (XEXP (x, 

Re: [PATCH] RISC-V: costs: support shift-and-add in strength-reduction

2022-11-10 Thread Palmer Dabbelt

On Thu, 10 Nov 2022 07:09:35 PST (-0800), philipp.toms...@vrull.eu wrote:

On Thu, 10 Nov 2022 at 02:46, Palmer Dabbelt  wrote:


On Tue, 08 Nov 2022 11:54:34 PST (-0800), philipp.toms...@vrull.eu wrote:
> The strength-reduction implementation in expmed.c will assess the
> profitability of using shift-and-add using a RTL expression that wraps
> a MULT (with a power-of-2) in a PLUS.  Unless the RISC-V rtx_costs
> function recognizes this as expressing a sh[123]add instruction, we
> will return an inflated cost---thus defeating the optimization.
>
> This change adds the necessary idiom recognition to provide an
> accurate cost for this form of expressing sh[123]add.
>
> Instead of expanding to
>   li      a5,200
>   mulw    a0,a5,a0
> with this change, the expression 'a * 200' is synthesized as:
>   sh2add  a0,a0,a0   // *5 = a + 4 * a
>   sh2add  a0,a0,a0   // *5 = a + 4 * a
>   slli    a0,a0,3    // *8

That's more instructions, but multiplication is generally expensive.  At
some point I remember the SiFive cores getting very fast integer
multipliers, but I don't see that reflected in the cost model anywhere
so maybe I'm just wrong?  Andrew or Kito might remember...

If the mul-based sequences are still faster on the SiFive cores then we
should probably find a way to keep emitting them, which may just be a
matter of adjusting those multiply costs.  Moving to the shift-based
sequences seems reasonable for a generic target, though.


The cost for a regular MULT is COSTS_N_INSNS(4) for the series-7 (see
the SImode and DImode entries in the int_mul line):
/* Costs to use when optimizing for Sifive 7 Series.  */
static const struct riscv_tune_param sifive_7_tune_info = {
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_add */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_mul */
  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},   /* int_mul */
  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},   /* int_div */
  2,/* issue_rate */
  4,/* branch_cost */
  3,/* memory_cost */
  8,/* fmv_cost */
  true, /* slow_unaligned_access */
};

So the break-even is at COSTS_N_INSNS(4) + rtx_cost(immediate).

Testing against series-7, we get up to 5 (4 for the mul + 1 for the
li) instructions from strength reduction:

val * 783
=>
sh1add a5,a0,a0
slli a5,a5,4
add a5,a5,a0
slli a5,a5,4
sub a0,a5,a0

but fall back to a mul, once the cost exceeds this:

val * 1574
=>
li a5,1574
mul a0,a0,a5


That's just the cost model, though, not the hardware.  My argument was 
essentially that the cost model is wrong, assuming how I remember the 
hardware is right.  That was a while ago and there's a lot of options, 
though, so I'm not sure what these things actually look like.


IMO that doesn't need to block this patch, though: having one incorrect 
cost model so it cancels out another one is a great way to lose our 
minds.



Either way, it probably warrants a test case to make sure we don't
regress in the future.


Ack. Will be added for v2.



>
> gcc/ChangeLog:
>
>   * config/riscv/riscv.c (riscv_rtx_costs): Recognize shNadd,
>   if expressed as a plus and multiplication with a power-of-2.


This will still need to be regenerated (it's referring to a '.c'
extension still).


>
> ---
>
>  gcc/config/riscv/riscv.cc | 13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index ab6c745c722..0b2c4b3599d 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -2451,6 +2451,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
outer_code, int opno ATTRIBUTE_UN
> *total = COSTS_N_INSNS (1);
> return true;
>   }
> +  /* Before strength-reduction, the shNadd can be expressed as the 
addition
> +  of a multiplication with a power-of-two.  If this case is not handled,
> +  the strength-reduction in expmed.c will calculate an inflated cost. */
> +  if (TARGET_ZBA
> +   && mode == word_mode
> +   && GET_CODE (XEXP (x, 0)) == MULT
> +   && REG_P (XEXP (XEXP (x, 0), 0))
> +   && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> +   && IN_RANGE (pow2p_hwi (INTVAL (XEXP (XEXP (x, 0), 1))), 1, 3))

IIUC the fall-through is biting us here and this matches power-of-2 +1
and power-of-2 -1.  That looks to be the case for the one below, though,
so not sure if I'm just missing something?


The strength-reduction in expmed.cc uses "(PLUS (reg) (MULT (reg)
))" to express a shift-then-add.
Here's one of the relevant snippets (from the internal costing in expmed.cc):
  all.shift_mult = gen_rtx_MULT (mode, all.reg, all.reg);
  all.shift_add = gen_rtx_PLUS (mode, all.shift_mult, all.reg);

So while we normally encounter a 

Re: [PATCH 2/2] c++: remove i_c_e_p parm from tsubst_copy_and_build

2022-11-10 Thread Patrick Palka via Gcc-patches
On Thu, 10 Nov 2022, Patrick Palka wrote:

> AFAICT the only purpose of tsubst_copy_and_build's
> integral_constant_expression_p boolean parameter is to diagnose certain
> constructs that aren't allowed to appear in a C++98 integral constant
> expression context, specifically casts to a non-integral type (diagnosed
> from the *_CAST_EXPR case of tsubst_copy_and_build) or dependent names
> that resolve to a non-constant decl (diagnosed from the IDENTIFIER_NODE
> case of tsubst_copy_and_build).  The parameter has no effect outside of
> C++98 AFAICT.

I should add that the parameter was added to tsubst_copy_and_build
by r116276 which predates the constexpr machinery.

> 
> But diagnosing such constructs should arguably be done by
> is_constant_expression after substitution, and doing it during
> substitution by way of an additional parameter complicates the API of
> this workhorse function for functionality that's specific to C++98.
> And it seems is_constant_expression already does a good job of diagnosing
> the aforementioned two constructs in C++98 mode, at least as far as our
> testsuite is concerned.
> 
> So this patch gets rid of this parameter from tsubst_copy_and_build,
> tsubst_expr and tsubst_copy_and_build_call_args.  The only interesting
> changes are those to potential_constant_expression_1 and the
> IDENTIFIER_NODE and *_CAST_EXPR cases of tsubst_copy_and_build; the rest
> are mechanical adjustments to these functions and their call sites.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
> gcc/cp/ChangeLog:
> 
>   * constexpr.cc (potential_constant_expression_1)
>   : Use
>   cast_valid_in_integral_constant_expression_p instead of
>   open coding it.
>   * constraint.cc (tsubst_valid_expression_requirement): Adjust
>   calls to tsubst_copy_and_build and tsubst_expr.
>   (tsubst_constraint): Likewise.
>   (satisfy_atom): Likewise.
>   (diagnose_trait_expr): Likewise.
>   * cp-tree.h (tsubst_copy_and_build): Remove i_c_e_p parameter.
>   (tsubst_expr): Likewise.
>   * init.cc (get_nsdmi): Adjust calls to tsubst_copy_and_build
>   and tsubst_expr.
>   * pt.cc (expand_integer_pack): Likewise.
>   (instantiate_non_dependent_expr_internal): Likewise.
>   (tsubst_friend_function): Likewise.
>   (tsubst_attribute): Likewise.
>   (instantiate_class_template): Likewise.
>   (tsubst_template_arg): Likewise.
>   (gen_elem_of_pack_expansion_instantiation): Likewise.
>   (tsubst_fold_expr_init): Likewise.
>   (tsubst_pack_expansion): Likewise.
>   (tsubst_default_argument): Likewise.
>   (tsubst_function_decl): Likewise.
>   (tsubst_decl): Likewise.
>   (tsubst_arg_types): Likewise.
>   (tsubst_exception_specification): Likewise.
>   (tsubst): Likewise.
>   (tsubst_init): Likewise.
>   (tsubst_copy): Likewise.
>   (tsubst_omp_clause_decl): Likewise.
>   (tsubst_omp_clauses): Likewise.
>   (tsubst_copy_asm_operands): Likewise.
>   (tsubst_omp_for_iterator): Likewise.
>   (tsubst_expr): Likewise.  Remove i_c_e_p parameter.
>   (tsubst_omp_udr): Likewise.
>   (tsubst_non_call_postfix_expression): Likewise.  Remove i_c_e_p 
> parameter.
>   (tsubst_lambda_expr): Likewise.
>   (tsubst_copy_and_build_call_args): Likewise.
>   (tsubst_copy_and_build): Likewise.  Remove i_c_e_p parameter.
>   : Adjust call to finish_id_expression
>   following removal of i_c_e_p.
>   : Remove C++98-specific cast validity check
>   guarded by i_c_e_p.
>   (maybe_instantiate_noexcept): Adjust calls to
>   tsubst_copy_and_build and tsubst_expr.
>   (instantiate_body): Likewise.
>   (instantiate_decl): Likewise.
>   (tsubst_initializer_list): Likewise.
>   (tsubst_enum): Likewise.
> 
> gcc/objcp/ChangeLog:
> 
>   * objcp-lang.cc (objcp_tsubst_copy_and_build): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/template/crash55.C: Don't expect additional
>   C++98-specific diagnostics.
>   * g++.dg/template/ref3.C: Remove C++98-specific xfail.
> ---
>  gcc/cp/constexpr.cc |   4 +-
>  gcc/cp/constraint.cc|  14 +-
>  gcc/cp/cp-tree.h|   6 +-
>  gcc/cp/init.cc  |   6 +-
>  gcc/cp/pt.cc| 240 
>  gcc/objcp/objcp-lang.cc |   3 +-
>  gcc/testsuite/g++.dg/template/crash55.C |   3 +-
>  gcc/testsuite/g++.dg/template/ref3.C|   3 +-
>  8 files changed, 93 insertions(+), 186 deletions(-)
> 
> diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> index 15b4f2c4a08..e665839f5b1 100644
> --- a/gcc/cp/constexpr.cc
> +++ b/gcc/cp/constexpr.cc
> @@ -9460,9 +9460,7 @@ potential_constant_expression_1 (tree t, bool 
> want_rval, bool strict, bool now,
>  case STATIC_CAST_EXPR:
>  case REINTERPRET_CAST_EXPR:
>  case 

[PATCH 2/2] c++: remove i_c_e_p parm from tsubst_copy_and_build

2022-11-10 Thread Patrick Palka via Gcc-patches
AFAICT the only purpose of tsubst_copy_and_build's
integral_constant_expression_p boolean parameter is to diagnose certain
constructs that aren't allowed to appear in a C++98 integral constant
expression context, specifically casts to a non-integral type (diagnosed
from the *_CAST_EXPR case of tsubst_copy_and_build) or dependent names
that resolve to a non-constant decl (diagnosed from the IDENTIFIER_NODE
case of tsubst_copy_and_build).  The parameter has no effect outside of
C++98 AFAICT.

But diagnosing such constructs should arguably be done by
is_constant_expression after substitution, and doing it during
substitution by way of an additional parameter complicates the API of
this workhorse function for functionality that's specific to C++98.
And it seems is_constant_expression already does a good job of diagnosing
the aforementioned two constructs in C++98 mode, at least as far as our
testsuite is concerned.

So this patch gets rid of this parameter from tsubst_copy_and_build,
tsubst_expr and tsubst_copy_and_build_call_args.  The only interesting
changes are those to potential_constant_expression_1 and the
IDENTIFIER_NODE and *_CAST_EXPR cases of tsubst_copy_and_build; the rest
are mechanical adjustments to these functions and their call sites.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1)
: Use
cast_valid_in_integral_constant_expression_p instead of
open coding it.
* constraint.cc (tsubst_valid_expression_requirement): Adjust
calls to tsubst_copy_and_build and tsubst_expr.
(tsubst_constraint): Likewise.
(satisfy_atom): Likewise.
(diagnose_trait_expr): Likewise.
* cp-tree.h (tsubst_copy_and_build): Remove i_c_e_p parameter.
(tsubst_expr): Likewise.
* init.cc (get_nsdmi): Adjust calls to tsubst_copy_and_build
and tsubst_expr.
* pt.cc (expand_integer_pack): Likewise.
(instantiate_non_dependent_expr_internal): Likewise.
(tsubst_friend_function): Likewise.
(tsubst_attribute): Likewise.
(instantiate_class_template): Likewise.
(tsubst_template_arg): Likewise.
(gen_elem_of_pack_expansion_instantiation): Likewise.
(tsubst_fold_expr_init): Likewise.
(tsubst_pack_expansion): Likewise.
(tsubst_default_argument): Likewise.
(tsubst_function_decl): Likewise.
(tsubst_decl): Likewise.
(tsubst_arg_types): Likewise.
(tsubst_exception_specification): Likewise.
(tsubst): Likewise.
(tsubst_init): Likewise.
(tsubst_copy): Likewise.
(tsubst_omp_clause_decl): Likewise.
(tsubst_omp_clauses): Likewise.
(tsubst_copy_asm_operands): Likewise.
(tsubst_omp_for_iterator): Likewise.
(tsubst_expr): Likewise.  Remove i_c_e_p parameter.
(tsubst_omp_udr): Likewise.
(tsubst_non_call_postfix_expression): Likewise.  Remove i_c_e_p 
parameter.
(tsubst_lambda_expr): Likewise.
(tsubst_copy_and_build_call_args): Likewise.
(tsubst_copy_and_build): Likewise.  Remove i_c_e_p parameter.
: Adjust call to finish_id_expression
following removal of i_c_e_p.
: Remove C++98-specific cast validity check
guarded by i_c_e_p.
(maybe_instantiate_noexcept): Adjust calls to
tsubst_copy_and_build and tsubst_expr.
(instantiate_body): Likewise.
(instantiate_decl): Likewise.
(tsubst_initializer_list): Likewise.
(tsubst_enum): Likewise.

gcc/objcp/ChangeLog:

* objcp-lang.cc (objcp_tsubst_copy_and_build): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/template/crash55.C: Don't expect additional
C++98-specific diagnostics.
* g++.dg/template/ref3.C: Remove C++98-specific xfail.
---
 gcc/cp/constexpr.cc |   4 +-
 gcc/cp/constraint.cc|  14 +-
 gcc/cp/cp-tree.h|   6 +-
 gcc/cp/init.cc  |   6 +-
 gcc/cp/pt.cc| 240 
 gcc/objcp/objcp-lang.cc |   3 +-
 gcc/testsuite/g++.dg/template/crash55.C |   3 +-
 gcc/testsuite/g++.dg/template/ref3.C|   3 +-
 8 files changed, 93 insertions(+), 186 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 15b4f2c4a08..e665839f5b1 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9460,9 +9460,7 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
 case STATIC_CAST_EXPR:
 case REINTERPRET_CAST_EXPR:
 case IMPLICIT_CONV_EXPR:
-  if (cxx_dialect < cxx11
- && !dependent_type_p (TREE_TYPE (t))
- && !INTEGRAL_OR_ENUMERATION_TYPE_P (TREE_TYPE (t)))
+  if (!cast_valid_in_integral_constant_expression_p (TREE_TYPE (t)))
/* In C++98, a conversion to non-integral type 

[PATCH 1/2] c++: remove function_p parm from tsubst_copy_and_build

2022-11-10 Thread Patrick Palka via Gcc-patches
The function_p parameter of tsubst_copy_and_build (added in r69316) is
inspected only in its IDENTIFIER_NODE case, where it controls whether we
diagnose unqualified name lookup failure for the given identifier.  But
I think ever since r173965, we never substitute an IDENTIFIER_NODE with
function_p=true for which the lookup can possibly fail, and therefore
the flag is effectively unneeded.

Before that commit, we would incorrectly repeat unqualified lookup for
an ADL-enabled CALL_EXPR at instantiation time, which naturally could
fail and thus motivated the flag.  Afterwards, we no longer substitute
an IDENTIFIER_NODE callee when koenig_p is true so the flag isn't needed
for its original purpose.  What about when koenig_p=false?  Apparently
we still may have an IDENTIFIER_NODE callee in this case, namely when
unqualified name lookup found a dependent local function declaration,
but repeating that lookup can't fail.  (It also can't fail for USING_DECL
callees.)

So this patch removes this effectively unneeded parameter from
tsubst_copy_and_build.  It also updates an outdated comment in the
CALL_EXPR case about when we may see an IDENTIFIER_NODE callee with
koenig_p=false.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* cp-lang.cc (objcp_tsubst_copy_and_build): Remove
function_p parameter.
* cp-objcp-common.h (objcp_tsubst_copy_and_build):
Likewise.
* cp-tree.h (tsubst_copy_and_build): Likewise.
* init.cc (get_nsdmi): Adjust calls to tsubst_copy_and_build.
* pt.cc (expand_integer_pack): Likewise.
(instantiate_non_dependent_expr_internal): Likewise.
(tsubst_function_decl): Likewise.
(tsubst_arg_types): Likewise.
(tsubst_exception_specification): Likewise.
(tsubst): Likewise.
(tsubst_copy_asm_operands): Likewise.
(tsubst_expr): Likewise.
(tsubst_non_call_postfix_expression): Likewise.
(tsubst_lambda_expr): Likewise.
(tsubst_copy_and_build_call_args): Likewise.
(tsubst_copy_and_build): Remove function_p parameter
and adjust function comment.  Adjust recursive calls.
: Update outdated comment about when
we can see an IDENTIFIER_NODE callee with koenig_p=false.
(maybe_instantiate_noexcept): Adjust calls to
tsubst_copy_and_build.

gcc/objcp/ChangeLog:

* objcp-lang.cc (objcp_tsubst_copy_and_build): Remove
function_p parameter.
---
 gcc/cp/cp-lang.cc|  3 +--
 gcc/cp/cp-objcp-common.h |  3 +--
 gcc/cp/cp-tree.h |  2 +-
 gcc/cp/init.cc   |  2 +-
 gcc/cp/pt.cc | 46 
 gcc/objcp/objcp-lang.cc  |  5 ++---
 6 files changed, 19 insertions(+), 42 deletions(-)

diff --git a/gcc/cp/cp-lang.cc b/gcc/cp/cp-lang.cc
index c3cfde56cc6..a3f29eda0d6 100644
--- a/gcc/cp/cp-lang.cc
+++ b/gcc/cp/cp-lang.cc
@@ -116,8 +116,7 @@ tree
 objcp_tsubst_copy_and_build (tree /*t*/,
 tree /*args*/,
 tsubst_flags_t /*complain*/,
-tree /*in_decl*/,
-bool /*function_p*/)
+tree /*in_decl*/)
 {
   return NULL_TREE;
 }
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 1a67f14d9b3..f4ba0c9e012 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -24,8 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 /* In cp/objcp-common.c, cp/cp-lang.cc and objcp/objcp-lang.cc.  */
 
 extern tree cp_get_debug_type (const_tree);
-extern tree objcp_tsubst_copy_and_build (tree, tree, tsubst_flags_t,
-tree, bool);
+extern tree objcp_tsubst_copy_and_build (tree, tree, tsubst_flags_t, tree);
 
 extern int cp_decl_dwarf_attribute (const_tree, int);
 extern int cp_type_dwarf_attribute (const_tree, int);
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index d13bb3d4c0e..40fd2e1ebb9 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7383,7 +7383,7 @@ extern tree tsubst_default_argument   (tree, 
int, tree, tree,
 tsubst_flags_t);
 extern tree tsubst (tree, tree, tsubst_flags_t, tree);
 extern tree tsubst_copy_and_build  (tree, tree, tsubst_flags_t,
-tree, bool = false, bool = 
false);
+tree, bool = false);
 extern tree tsubst_expr (tree, tree, tsubst_flags_t,
  tree, bool);
 extern tree tsubst_pack_expansion  (tree, tree, tsubst_flags_t, 
tree);
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 3d5d3904944..fee49090de7 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -622,7 +622,7 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t 
complain)
  /* Do deferred 

Re: [PATCH][GCC] arm: Add support for Cortex-X1C CPU.

2022-11-10 Thread Ramana Radhakrishnan via Gcc-patches
On Thu, Nov 10, 2022 at 10:24 AM Srinath Parvathaneni via Gcc-patches
 wrote:
>
> Hi,
>
> This patch adds the -mcpu support for the Arm Cortex-X1C CPU.
>
> Regression tested on arm-none-eabi and bootstrapped on 
> arm-none-linux-gnueabihf.
>
> Ok for GCC master?


Ok
Ramana
>
> Regards,
> Srinath.
>
> gcc/ChangeLog:
>
> 2022-11-09  Srinath Parvathaneni  
>
>* config/arm/arm-cpus.in (cortex-x1c): Define new CPU.
>* config/arm/arm-tables.opt: Regenerate.
>* config/arm/arm-tune.md: Likewise.
>* 
> doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst:
>Document Cortex-X1C CPU.
>
>gcc/testsuite/ChangeLog:
>
> 2022-11-09  Srinath Parvathaneni  
>
>* gcc.target/arm/multilib.exp: Add tests for Cortex-X1C.
>
>
> ### Attachment also inlined for ease of reply
> ###
>
>
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index 
> 5a63bc548e54dbfdce5d1df425bd615d81895d80..5ed4db340bc5d7c9a41e6d1a3f660bf2a97b058b
>  100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -1542,6 +1542,17 @@ begin cpu cortex-x1
>   part d44
>  end cpu cortex-x1
>
> +begin cpu cortex-x1c
> + cname cortexx1c
> + tune for cortex-a57
> + tune flags LDSCHED
> + architecture armv8.2-a+fp16+dotprod
> + option crypto add FP_ARMv8 CRYPTO
> + costs cortex_a57
> + vendor 41
> + part d4c
> +end cpu cortex-x1c
> +
>  begin cpu neoverse-n1
>   cname neoversen1
>   alias !ares
> diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
> index 
> e6461abcc57cd485025f3e18535267c454662cbe..a10a09e36cd004165b6f1efddeb3bfc29d8337ac
>  100644
> --- a/gcc/config/arm/arm-tables.opt
> +++ b/gcc/config/arm/arm-tables.opt
> @@ -255,6 +255,9 @@ Enum(processor_type) String(cortex-a710) Value( 
> TARGET_CPU_cortexa710)
>  EnumValue
>  Enum(processor_type) String(cortex-x1) Value( TARGET_CPU_cortexx1)
>
> +EnumValue
> +Enum(processor_type) String(cortex-x1c) Value( TARGET_CPU_cortexx1c)
> +
>  EnumValue
>  Enum(processor_type) String(neoverse-n1) Value( TARGET_CPU_neoversen1)
>
> diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
> index 
> abc290edd094179379f3856a3f8f64781e0c33f2..8af8c936abe31fb60e3de2fd713f4c6946c2a752
>  100644
> --- a/gcc/config/arm/arm-tune.md
> +++ b/gcc/config/arm/arm-tune.md
> @@ -46,7 +46,7 @@
> cortexa73cortexa53,cortexa55,cortexa75,
> cortexa76,cortexa76ae,cortexa77,
> cortexa78,cortexa78ae,cortexa78c,
> -   cortexa710,cortexx1,neoversen1,
> +   cortexa710,cortexx1,cortexx1c,neoversen1,
> cortexa75cortexa55,cortexa76cortexa55,neoversev1,
> neoversen2,cortexm23,cortexm33,
> cortexm35p,cortexm55,starmc1,
> diff --git 
> a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst 
> b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
> index 
> 3315114969381995d47162b53abeb9bfc442fd28..d531eced20cbb583ecaba2ab3927937faf69b9de
>  100644
> --- 
> a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
> +++ 
> b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
> @@ -594,7 +594,7 @@ These :samp:`-m` options are defined for the ARM port:
>:samp:`cortex-r7`, :samp:`cortex-r8`, :samp:`cortex-r52`, 
> :samp:`cortex-r52plus`,
>:samp:`cortex-m0`, :samp:`cortex-m0plus`, :samp:`cortex-m1`, 
> :samp:`cortex-m3`,
>:samp:`cortex-m4`, :samp:`cortex-m7`, :samp:`cortex-m23`, 
> :samp:`cortex-m33`,
> -  :samp:`cortex-m35p`, :samp:`cortex-m55`, :samp:`cortex-x1`,
> +  :samp:`cortex-m35p`, :samp:`cortex-m55`, :samp:`cortex-x1`, 
> :samp:`cortex-x1c`,
>:samp:`cortex-m1.small-multiply`, :samp:`cortex-m0.small-multiply`,
>:samp:`cortex-m0plus.small-multiply`, :samp:`exynos-m1`, 
> :samp:`marvell-pj4`,
>:samp:`neoverse-n1`, :samp:`neoverse-n2`, :samp:`neoverse-v1`, 
> :samp:`xscale`,
> diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp 
> b/gcc/testsuite/gcc.target/arm/multilib.exp
> index 
> 2fa648c61dafebb663969198bf7849400a7547f6..f903f028a83f884bdc1521f810f7e70e4130a715
>  100644
> --- a/gcc/testsuite/gcc.target/arm/multilib.exp
> +++ b/gcc/testsuite/gcc.target/arm/multilib.exp
> @@ -450,6 +450,9 @@ if {[multilib_config "aprofile"] } {
> {-march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -mthumb} 
> "thumb/v8-a+simd/hard"
> {-march=armv7-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp 
> -mthumb} "thumb/v7-a+simd/softfp"
> {-march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp 
> -mthumb} "thumb/v8-a+simd/softfp"
> +   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=softfp -mthumb} 
> "thumb/v8-a+simd/softfp"
> +   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=hard -mthumb} 
> "thumb/v8-a+simd/hard"
> +   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=soft -mthumb} 
> "thumb/v8-a/nofp"
>  } {
> check_multi_dir $opts $dir
>  }
>
>
>


Re: [Patch Arm] Fix PR 92999

2022-11-10 Thread Ramana Radhakrishnan via Gcc-patches
On Thu, Nov 10, 2022 at 6:03 PM Richard Earnshaw
 wrote:
>
>
>
> On 10/11/2022 17:21, Richard Earnshaw via Gcc-patches wrote:
> >
> >
> > On 08/11/2022 18:20, Ramana Radhakrishnan via Gcc-patches wrote:
> >> PR92999 is a case where the VFP calling convention does not allocate
> >> enough FP registers for a homogeneous aggregate containing FP16 values.
> >> I believe this is the complete fix but would appreciate another set of
> >> eyes on this.
> >>
> >> Could I get a hand with a regression test run on an armhf environment
> >> while I fix my environment ?
> >>
> >> gcc/ChangeLog:
> >>
> >> PR target/92999
> >> *  config/arm/arm.c (aapcs_vfp_allocate_return_reg): Adjust to handle
> >> aggregates with elements smaller than SFmode.
> >>
> >> gcc/testsuite/ChangeLog:
> >>
> >> * gcc.target/arm/pr92999.c: New test.
> >>
> >>
> >> Thanks,
> >> Ramana
> >>
> >> Signed-off-by: Ramana Radhakrishnan 
> >
> > I'm not sure about this.  The AAPCS does not mention a base type of a
> > half-precision FP type as an appropriate homogeneous aggregate for using
> > VFP registers for either calling or returning.

Ooh interesting, thanks for taking a look and poking at the AAPCS;
that's a good catch. BF16 should also have the same behaviour as FP16,
I suspect?

> >
> > So perhaps the bug is that we try to treat this as a homogeneous
> > aggregate at all.

Yep I agree - I'll take a look again tomorrow and see if I can get a fix.

(And thanks Alex for the test run, I might trouble you again while I
still (slowly) get some of my boards back up)

regards,
Ramana


>
> R.


Re: old install to a different folder

2022-11-10 Thread Gerald Pfeifer
On Thu, 10 Nov 2022, Martin Liška wrote:
> We noticed we'll need the old /install to be available for redirect. 
>
> Gerald, can you please put it somewhere under /install-prev, or 
> something similar?

I'm afraid I am confused now. Based on your original request I had removed 
the original /install directory.

Can you help me understand what exactly we need and what for (in the big
picture)? And I'll then see how I can help.

Gerald


Re: [PATCH] range-op: Implement floating point multiplication fold_range [PR107569]

2022-11-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 10, 2022 at 03:50:47PM +0100, Aldy Hernandez wrote:
> > @@ -1908,6 +1910,123 @@ class foperator_minus : public range_ope
> > }
> >   } fop_minus;
> > +/* Wrapper around frange_arithmetics, that computes the result
> > +   if inexact rounded to both directions.  Also, if one of the
> > +   operands is +-0.0 and another +-INF, return +-0.0 rather than
> > +   NAN.  */
> 
> s/frange_arithmetics/frange_arithmetic/
> 
> Also, would you mind writing a little blurb about why it's necessary not to
> compute INF*0.0 as NAN.  I assume it's because you're using it for the cross
> product and you'll set maybe_nan separately, but it's nice to spell it out.

This made me think about it some more, and I'll need to play around with it
further; perhaps the right thing is, similarly to what I've attached for
division, to handle special cases upfront and call frange_arithmetic only
for the safe cases.
E.g. one case which the posted foperator_mult handles pessimistically is
[0.0, 10.0] * [INF, INF].  This should be just [INF, INF] +-NAN IMHO,
because the 0.0 * INF case will result in NAN, while
nextafter (0.0, 1.0) * INF
will be already INF and everything larger as well.
I could in frange_mult be very conservative and for the 0 * INF cases
set result_lb and result_ub to [0.0, INF] range (corresponding signs
depending on the xor of sign of ops), but that would be quite pessimistic as
well.  If one has:
[0.0, 0.0] * [10.0, INF], the result should be just [0.0, 0.0] +-NAN,
because again 0.0 * INF is NAN, but 0.0 * nextafter (INF, 0.0) is already 0.0.
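
To make the boundary cases above concrete, here is a minimal C sketch of the
three products in question.  It assumes IEEE 754 doubles with default
rounding and no flush-to-zero; check_zero_times_inf_boundaries is a made-up
name, and DBL_TRUE_MIN and DBL_MAX stand in for nextafter (0.0, 1.0) and
nextafter (INF, 0.0) so that nothing from libm is needed.

```c
#include <assert.h>
#include <float.h>

#ifndef DBL_TRUE_MIN
/* Pre-C11 fallback (an assumption of this sketch): the smallest normal
   value behaves the same way for this particular check.  */
#define DBL_TRUE_MIN DBL_MIN
#endif

/* Hypothetical helper: returns 1 if the products behave as described.
   The operands are volatile so they are multiplied at run time.  */
static int
check_zero_times_inf_boundaries (void)
{
  volatile double zero = 0.0;
  volatile double inf = __builtin_inf ();
  volatile double tiny = DBL_TRUE_MIN;  /* nextafter (0.0, 1.0) */
  volatile double huge = DBL_MAX;       /* nextafter (INF, 0.0) */

  double zero_inf = zero * inf;   /* 0.0 * INF is NAN */
  double tiny_inf = tiny * inf;   /* already +INF */
  double zero_huge = zero * huge; /* already 0.0 */

  return __builtin_isnan (zero_inf)
	 && __builtin_isinf (tiny_inf) && tiny_inf > 0.0
	 && zero_huge == 0.0;
}
```

That is, only the exact 0.0 * INF corner produces NAN; everything adjacent to
it collapses to INF or 0.0, which is why widening the bounds to [0.0, INF]
for those cases would be pessimistic.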

Note, the is_square case doesn't suffer from all of this mess, the result
is never NAN (unless operand is NAN).

> It'd be nice to have some testcases.  For example, from what I can see, the
> original integer multiplication code came with some tests in
> gcc.dg/tree-ssa/vrp13.c (commit 9983270bec0a18).  It'd be nice to have some
> sanity checks, especially because so many things can go wrong with floats.
> 
> I'll leave it to you to decide what tests to include.

I've tried the following, but it suffers from various issues:
1) we don't handle __builtin_signbit (whatever) == 0 (or != 0) as guarantee
   that in the guarded code whatever has signbit 0 or 1
2) __builtin_isinf (x) > 0 is lowered to x > DBL_MAX, but unfortunately we don't
   infer from that [INF,INF] range, but [DBL_MAX, INF] range
3) what I wrote above, I think we don't handle [0, 2] * [INF, INF] right but
   due to 2) we can't see it

So, maybe for now a selftest will be better than a testcase, or
alternatively a plugin test which acts like a selftest.

/* { dg-do compile { target { ! { vax-*-* powerpc-*-*spe pdp11-*-* } } } } */
/* { dg-options "-O2 -fno-trapping-math -fno-signaling-nans -fsigned-zeros 
-fno-tree-fre -fno-tree-dominator-opts -fno-thread-jumps -fdump-tree-optimized" 
} */
/* { dg-add-options ieee } */

void
foo (double x, double y)
{
  const double inf = __builtin_inf ();
  const double minf = -inf;
  if (__builtin_isnan (x) || __builtin_isnan (y))
return;
#define TEST(n, xl, xu, yl, yu, rl, ru, nan) \
  if ((__builtin_isinf (xl) > 0 \
   ? x > 0.0 && __builtin_isinf (x) \
   : __builtin_isinf (xu) < 0   \
   ? x < 0.0 && __builtin_isinf (x) \
   : x >= xl && x <= xu \
 && (xl != 0.0  \
 || __builtin_signbit (xl)  \
 || !__builtin_signbit (x)) \
 && (xu != 0.0  \
 || !__builtin_signbit (xu) \
 || __builtin_signbit (x))) \
  && (__builtin_isinf (yl) > 0  \
  ? y > 0.0 && __builtin_isinf (y)  \
  : __builtin_isinf (yu) < 0\
  ? y < 0.0 && __builtin_isinf (y)  \
  : y >= yl && y <= yu  \
&& (yl != 0.0   \
|| __builtin_signbit (yl)   \
|| !__builtin_signbit (y))  \
&& (yu != 0.0   \
|| !__builtin_signbit (yu)  \
|| __builtin_signbit (y \
{   \
  double r##n = x * y;  \
  if (nan == 2) \
{   \
  if (!__builtin_isnan (r##n))  \
__builtin_abort (); \
}   \
  else if (nan == 1)\
{   \
  if (!__builtin_isnan (r##n))  \
{   \
   

[PATCHv2] Use toplevel configure for GMP and MPFR for gdb

2022-11-10 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

This patch uses the toplevel configure parts for GMP/MPFR for
gdb. The only thing is that gdb now requires MPFR for building.
Before it was a recommended but not required library.
Also this allows building of GMP and MPFR with the toplevel
directory just like how it is done for GCC.
We now error out in the toplevel configure if the version
of GMP or MPFR is wrong.

OK after GDB 13 branches? Built gdb 3 ways:
with GMP and MPFR in the toplevel (static library used at that point for both)
With only MPFR in the toplevel (GMP distro library used and MPFR built from 
source)
With neither GMP nor MPFR in the toplevel (distro libraries used)

Changes from v1:
* Updated gdb/README and gdb/doc/gdb.texinfo.
* Regenerated using unmodified autoconf-2.69

Thanks,
Andrew Pinski

ChangeLog:
* Makefile.def: Add configure-gdb dependencies
on all-gmp and all-mpfr.
* configure.ac: Split out MPC checking from MPFR.
Require GMP and MPFR if the gdb directory exists.
* Makefile.in: Regenerate.
* configure: Regenerate.

gdb/ChangeLog:

PR bug/28500
* configure.ac: Remove AC_LIB_HAVE_LINKFLAGS
for gmp and mpfr.
Use GMPLIBS and GMPINC which is provided by the
toplevel configure.
* Makefile.in (LIBGMP, LIBMPFR): Remove.
(GMPLIBS, GMPINC): Add definition.
(INTERNAL_CFLAGS_BASE): Add GMPINC.
(CLIBS): Exchange LIBMPFR and LIBGMP
for GMPLIBS.
* target-float.c: Make the code conditional on
HAVE_LIBMPFR unconditional.
* top.c: Remove code checking HAVE_LIBMPFR.
* configure: Regenerate.
* config.in: Regenerate.
* README: Update GMP/MPFR section of the config
options.
* doc/gdb.texinfo: Likewise.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28500
---
 Makefile.def|2 +
 Makefile.in |2 +
 configure   |   67 ++-
 configure.ac|   45 +-
 gdb/Makefile.in |   12 +-
 gdb/README  |   28 +-
 gdb/config.in   |6 -
 gdb/configure   | 1014 +--
 gdb/configure.ac|   31 +-
 gdb/doc/gdb.texinfo |   13 +-
 gdb/target-float.c  |8 -
 gdb/top.c   |8 -
 12 files changed, 142 insertions(+), 1094 deletions(-)

diff --git a/Makefile.def b/Makefile.def
index acdcd625ed6..d5976e61d98 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -418,6 +418,8 @@ dependencies = { module=configure-isl; on=all-gmp; };
 dependencies = { module=all-intl; on=all-libiconv; };
 
 // Host modules specific to gdb.
+dependencies = { module=configure-gdb; on=all-gmp; };
+dependencies = { module=configure-gdb; on=all-mpfr; };
 dependencies = { module=configure-gdb; on=all-intl; };
 dependencies = { module=configure-gdb; on=configure-sim; };
 dependencies = { module=configure-gdb; on=all-bfd; };
diff --git a/Makefile.in b/Makefile.in
index cb39e4790d6..d0666c75b00 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -63748,6 +63748,8 @@ configure-libcc1: maybe-configure-gcc
 all-libcc1: maybe-all-gcc
 all-c++tools: maybe-all-gcc
 all-utils: maybe-all-libiberty
+configure-gdb: maybe-all-gmp
+configure-gdb: maybe-all-mpfr
 configure-gdb: maybe-all-intl
 configure-gdb: maybe-all-bfd
 configure-gdb: maybe-all-libiconv
diff --git a/configure b/configure
index 7bcb894d1fe..a891eeae4c0 100755
--- a/configure
+++ b/configure
@@ -8025,7 +8025,20 @@ _ACEOF
 
 
 # Check for GMP, MPFR and MPC
-gmplibs="-lmpc -lmpfr -lgmp"
+require_gmp=no
+require_mpc=no
+if test -d ${srcdir}/gcc ; then
+  require_gmp=yes
+  require_mpc=yes
+fi
+if test -d ${srcdir}/gdb ; then
+  require_gmp=yes
+fi
+
+gmplibs="-lmpfr -lgmp"
+if test x"$require_mpc" = "xyes" ; then
+  gmplibs="-lmpc $gmplibs"
+fi
 gmpinc=
 have_gmp=no
 
@@ -8160,7 +8173,7 @@ if test "x$with_gmp$with_gmp_include$with_gmp_lib" = x && 
test -d ${srcdir}/gmp;
   have_gmp=yes
 fi
 
-if test -d ${srcdir}/gcc && test "x$have_gmp" = xno; then
+if test "x$require_gmp" = xyes && test "x$have_gmp" = xno; then
   have_gmp=yes
   saved_CFLAGS="$CFLAGS"
   CFLAGS="$CFLAGS $gmpinc"
@@ -8270,7 +8283,7 @@ rm -f core conftest.err conftest.$ac_objext 
conftest.$ac_ext
   fi
 
   # Check for the MPC header version.
-  if test x"$have_gmp" = xyes ; then
+  if test "x$require_mpc" = xyes && test x"$have_gmp" = xyes ; then
 # Check for the recommended and required versions of MPC.
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for the correct version 
of mpc.h" >&5
 $as_echo_n "checking for the correct version of mpc.h... " >&6; }
@@ -8324,18 +8337,17 @@ rm -f core conftest.err conftest.$ac_objext 
conftest.$ac_ext
   if test x"$have_gmp" = xyes; then
 saved_LIBS="$LIBS"
 LIBS="$LIBS $gmplibs"
-{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for the correct version 
of the gmp/mpfr/mpc libraries" >&5
-$as_echo_n "checking for the correct version of the gmp/mpfr/mpc libraries... 
" >&6; }
+{ $as_echo 

[committed] analyzer: new warning: -Wanalyzer-deref-before-check [PR99671]

2022-11-10 Thread David Malcolm via Gcc-patches
This patch implements a new -Wanalyzer-deref-before-check within
-fanalyzer.  It complains about code paths in which a pointer is checked
for NULL after it has already been dereferenced.

For example, for the testcase in PR 77432 the diagnostic emits:
deref-before-check-1.c: In function 'test_from_pr77432':
deref-before-check-1.c:6:8: warning: check of 'a' for NULL after already 
dereferencing it [-Wanalyzer-deref-before-check]
6 | if (a)
  |^
  'test_from_pr77432': events 1-2
|
|5 | int b = *a;
|  | ^
|  | |
|  | (1) pointer 'a' is dereferenced here
|6 | if (a)
|  |~
|  ||
|  |(2) pointer 'a' is checked for NULL here but it was already 
dereferenced at (1)
|

and in PR 77425 we had an instance of this hidden behind a
macro, which the diagnostic complains about as follows:

deref-before-check-pr77425.c: In function 'get_odr_type':
deref-before-check-pr77425.c:35:10: warning: check of 'odr_types_ptr' for NULL 
after already dereferencing it [-Wanalyzer-deref-before-check]
   35 |   if (odr_types_ptr)
  |  ^
  'get_odr_type': events 1-3
|
|   27 |   if (cond)
|  |  ^
|  |  |
|  |  (1) following 'false' branch...
|..
|   31 |   else if (other_cond)
|  |   ~~~
|  |   ||
|  |   |(2) ...to here
|  |   (3) following 'true' branch...
|
  'get_odr_type': event 4
|
|   11 | #define odr_types (*odr_types_ptr)
|  |   ~^~~
|  ||
|  |(4) ...to here
deref-before-check-pr77425.c:33:7: note: in expansion of macro 'odr_types'
|   33 |   odr_types[val->id] = 0;
|  |   ^
|
  'get_odr_type': event 5
|
|   11 | #define odr_types (*odr_types_ptr)
|  |   ~^~~
|  ||
|  |(5) pointer 'odr_types_ptr' is dereferenced here
deref-before-check-pr77425.c:33:7: note: in expansion of macro 'odr_types'
|   33 |   odr_types[val->id] = 0;
|  |   ^
|
  'get_odr_type': event 6
|
|   35 |   if (odr_types_ptr)
|  |  ^
|  |  |
|  |  (6) pointer 'odr_types_ptr' is checked for NULL here but 
it was already dereferenced at (5)
|

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-3884-g5c6546ca7d8cab.
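
For readers who want to try the warning, the pattern it targets can be
sketched as follows.  This is a minimal illustration, not the testcase
from the PR; both function names are hypothetical, and the snippet is
also valid C.

```cpp
// Hypothetical reduction of the pattern: the pointer is dereferenced
// first and only afterwards compared against null.
int deref_then_check(int *a)
{
  int b = *a;   // (1) 'a' is dereferenced here
  if (a)        // (2) 'a' is checked here, after the dereference at (1)
    return b;
  return 0;
}

// One possible fix: test the pointer before the first dereference.
int check_then_deref(int *a)
{
  if (!a)
    return 0;
  return *a;
}
```

The second function is only one way to resolve the diagnostic; if the
pointer can never be null, dropping the check is equally valid.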

gcc/analyzer/ChangeLog:
PR analyzer/99671
* analyzer.opt (Wanalyzer-deref-before-check): New warning.
* diagnostic-manager.cc
(null_assignment_sm_context::set_next_state): Only add state
change events for transition to "null" state.
(null_assignment_sm_context::is_transition_to_null): New.
* engine.cc (impl_region_model_context::on_pop_frame): New.
* exploded-graph.h (impl_region_model_context::on_pop_frame): New
decl.
* program-state.cc (sm_state_map::clear_any_state): New.
(sm_state_map::can_merge_with_p): New.
(program_state::can_merge_with_p): Replace requirement that
sm-states be equal in favor of an attempt to merge them.
* program-state.h (sm_state_map::clear_any_state): New decl.
(sm_state_map::can_merge_with_p): New decl.
* region-model.cc (region_model::eval_condition): Make const.
(region_model::pop_frame): Call ctxt->on_pop_frame.
* region-model.h (region_model::eval_condition): Make const.
(region_model_context::on_pop_frame): New vfunc.
(noop_region_model_context::on_pop_frame): New.
(region_model_context_decorator::on_pop_frame): New.
* sm-malloc.cc (enum resource_state): Add RS_ASSUMED_NON_NULL.
(allocation_state::dump_to_pp): Drop "final".
(struct assumed_non_null_state): New subclass.
(malloc_state_machine::m_assumed_non_null): New.
(assumed_non_null_p): New.
(class deref_before_check): New.
(assumed_non_null_state::dump_to_pp): New.
(malloc_state_machine::get_or_create_assumed_non_null_state_for_frame):
New.
(malloc_state_machine::maybe_assume_non_null): New.
(malloc_state_machine::on_stmt): Transition from start state to
"assumed-non-null" state for pointers passed to
__attribute__((nonnull)) arguments, and for pointers explicitly
dereferenced.  Call maybe_complain_about_deref_before_check for
pointers explicitly compared against NULL.
(malloc_state_machine::maybe_complain_about_deref_before_check):
New.
(malloc_state_machine::on_deallocator_call): Also transition
"assumed-non-null" states to "freed".
(malloc_state_machine::on_pop_frame): New.

Re: [PATCH] c++: Extend -Wdangling-reference for std::minmax

2022-11-10 Thread Marek Polacek via Gcc-patches
On Thu, Nov 10, 2022 at 08:07:25AM -1000, Jason Merrill wrote:
> On 11/9/22 15:56, Marek Polacek wrote:
> > This patch extends -Wdangling-reference to also warn for
> > 
> >auto v = std::minmax(1, 2);
> > 
> > which dangles because this overload of std::minmax returns
> > > a std::pair<const int&, const int&> where the two references are
> > bound to the temporaries created for the arguments of std::minmax.
> > This is a common footgun, also described at
> >  in Notes.
> > 
> > It works by extending do_warn_dangling_reference to also warn when the
> > > function returns a std::pair<const T&, const T&>.  std_pair_ref_ref_p
> > is a new helper to check that.
> > 
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > * call.cc (std_pair_ref_ref_p): New.
> > (do_warn_dangling_reference): Also warn when the function returns
> > > std::pair<const T&, const T&>.  Recurse into TARGET_EXPR_INITIAL.
> > (maybe_warn_dangling_reference): Don't return early if we're
> > initializing a std_pair_ref_ref_p.
> > 
> > gcc/ChangeLog:
> > 
> > * doc/gcc/gcc-command-options/options-controlling-c++-dialect.rst:
> > Extend the description of -Wdangling-reference.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/warn/Wdangling-reference6.C: New test.
> > ---
> >   gcc/cp/call.cc| 52 ---
> >   .../options-controlling-c++-dialect.rst   | 10 
> >   .../g++.dg/warn/Wdangling-reference6.C| 38 ++
> >   3 files changed, 94 insertions(+), 6 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference6.C
> > 
> > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > index 492db9b59ad..bd3b64a7e26 100644
> > --- a/gcc/cp/call.cc
> > +++ b/gcc/cp/call.cc
> > @@ -13527,6 +13527,34 @@ initialize_reference (tree type, tree expr,
> > return expr;
> >   }
> > > +/* Return true if T is std::pair<const T&, const T&>.  */
> > +
> > +static bool
> > +std_pair_ref_ref_p (tree t)
> > +{
> > +  /* First, check if we have std::pair.  */
> > +  if (!NON_UNION_CLASS_TYPE_P (t)
> > +  || !CLASSTYPE_TEMPLATE_INSTANTIATION (t))
> > +return false;
> > +  tree tdecl = TYPE_NAME (TYPE_MAIN_VARIANT (t));
> > +  if (!decl_in_std_namespace_p (tdecl))
> > +return false;
> > +  tree name = DECL_NAME (tdecl);
> > +  if (!name || !id_equal (name, "pair"))
> > +return false;
> > +
> > +  /* Now see if the template arguments are both const T&.  */
> > +  tree args = CLASSTYPE_TI_ARGS (t);
> > +  if (TREE_VEC_LENGTH (args) != 2)
> > +return false;
> > +  for (int i = 0; i < 2; i++)
> > +if (!TYPE_REF_OBJ_P (TREE_VEC_ELT (args, i))
> > > +   || !CP_TYPE_CONST_P (TREE_TYPE (TREE_VEC_ELT (args, i))))
> > +  return false;
> > +
> > +  return true;
> > +}
> > +
> >   /* Helper for maybe_warn_dangling_reference to find a problematic 
> > CALL_EXPR
> >  that initializes the LHS (and at least one of its arguments represents
> >  a temporary, as outlined in maybe_warn_dangling_reference), or 
> > NULL_TREE
> > @@ -13556,11 +13584,6 @@ do_warn_dangling_reference (tree expr)
> > || warning_suppressed_p (fndecl, OPT_Wdangling_reference)
> > || !warning_enabled_at (DECL_SOURCE_LOCATION (fndecl),
> > OPT_Wdangling_reference)
> > -   /* If the function doesn't return a reference, don't warn.  This
> > -  can be e.g.
> > -const int& z = std::min({1, 2, 3, 4, 5, 6, 7});
> > -  which doesn't dangle: std::min here returns an int.  */
> > -   || !TYPE_REF_OBJ_P (TREE_TYPE (TREE_TYPE (fndecl)))
> > /* Don't emit a false positive for:
> > std::vector v = ...;
> > std::vector::const_iterator it = v.begin();
> > @@ -13573,6 +13596,20 @@ do_warn_dangling_reference (tree expr)
> > && DECL_OVERLOADED_OPERATOR_IS (fndecl, INDIRECT_REF)))
> >   return NULL_TREE;
> > +   tree rettype = TREE_TYPE (TREE_TYPE (fndecl));
> > +   /* If the function doesn't return a reference, don't warn.  This
> > +  can be e.g.
> > +const int& z = std::min({1, 2, 3, 4, 5, 6, 7});
> > +  which doesn't dangle: std::min here returns an int.
> > +
> > > +  If the function returns a std::pair<const T&, const T&>, we
> > > +  warn, to detect e.g.
> > > +std::pair<const int&, const int&> v = std::minmax(1, 2);
> > > +  which also creates a dangling reference, because std::minmax
> > > +  returns std::pair<const T&, const T&>(b, a).  */
> > +   if (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype)))
> 
> The patch is OK, but do you want to check reference to const for the single
> ref case as well, while you're changing this?

Thanks.  Yes, I plan to do that soon, but I didn't want to do it in
a single patch, because I want a dedicated test for 'int&' v. 'const int&'
and that felt like a follow-up patch.
 
> > + return NULL_TREE;
> > +
> > /* Here we're looking to see if any of the arguments is a temporary
> >initializing a 

Re: [PATCH] c++: Extend -Wdangling-reference for std::minmax

2022-11-10 Thread Jason Merrill via Gcc-patches

On 11/9/22 15:56, Marek Polacek wrote:

This patch extends -Wdangling-reference to also warn for

   auto v = std::minmax(1, 2);

which dangles because this overload of std::minmax returns
a std::pair<const int&, const int&> where the two references are
bound to the temporaries created for the arguments of std::minmax.
This is a common footgun, also described at
 in Notes.
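
To make the footgun concrete, here is a minimal sketch (not taken from
the patch or its testsuite; the helper name is hypothetical).  The
two-argument overload of std::minmax returns
std::pair<const T&, const T&>, whereas the initializer_list overload
returns std::pair<T, T> by value, so the latter is one way to sidestep
the dangling references:

```cpp
#include <algorithm>
#include <initializer_list>
#include <utility>

// Hypothetical helper: returns min and max by value.  The
// initializer_list overload of std::minmax yields std::pair<T, T>,
// so no reference to a temporary escapes.
std::pair<int, int> minmax_by_value(int a, int b)
{
  return std::minmax({a, b});
}

// The problematic form from the patch description, kept as a comment
// because reading from 'v' afterwards would be undefined behaviour:
//   auto v = std::minmax(1, 2);  // std::pair<const int&, const int&>
```

Passing lvalues that outlive the pair to the two-argument overload is
also fine; the warning is about arguments that are temporaries.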

It works by extending do_warn_dangling_reference to also warn when the
function returns a std::pair<const T&, const T&>.  std_pair_ref_ref_p
is a new helper to check that.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

* call.cc (std_pair_ref_ref_p): New.
(do_warn_dangling_reference): Also warn when the function returns
std::pair<const T&, const T&>.  Recurse into TARGET_EXPR_INITIAL.
(maybe_warn_dangling_reference): Don't return early if we're
initializing a std_pair_ref_ref_p.

gcc/ChangeLog:

* doc/gcc/gcc-command-options/options-controlling-c++-dialect.rst:
Extend the description of -Wdangling-reference.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference6.C: New test.
---
  gcc/cp/call.cc| 52 ---
  .../options-controlling-c++-dialect.rst   | 10 
  .../g++.dg/warn/Wdangling-reference6.C| 38 ++
  3 files changed, 94 insertions(+), 6 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference6.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 492db9b59ad..bd3b64a7e26 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13527,6 +13527,34 @@ initialize_reference (tree type, tree expr,
return expr;
  }
  
+/* Return true if T is std::pair<const T&, const T&>.  */

+
+static bool
+std_pair_ref_ref_p (tree t)
+{
+  /* First, check if we have std::pair.  */
+  if (!NON_UNION_CLASS_TYPE_P (t)
+  || !CLASSTYPE_TEMPLATE_INSTANTIATION (t))
+return false;
+  tree tdecl = TYPE_NAME (TYPE_MAIN_VARIANT (t));
+  if (!decl_in_std_namespace_p (tdecl))
+return false;
+  tree name = DECL_NAME (tdecl);
+  if (!name || !id_equal (name, "pair"))
+return false;
+
+  /* Now see if the template arguments are both const T&.  */
+  tree args = CLASSTYPE_TI_ARGS (t);
+  if (TREE_VEC_LENGTH (args) != 2)
+return false;
+  for (int i = 0; i < 2; i++)
+if (!TYPE_REF_OBJ_P (TREE_VEC_ELT (args, i))
+   || !CP_TYPE_CONST_P (TREE_TYPE (TREE_VEC_ELT (args, i))))
+  return false;
+
+  return true;
+}
+
  /* Helper for maybe_warn_dangling_reference to find a problematic CALL_EXPR
 that initializes the LHS (and at least one of its arguments represents
 a temporary, as outlined in maybe_warn_dangling_reference), or NULL_TREE
@@ -13556,11 +13584,6 @@ do_warn_dangling_reference (tree expr)
|| warning_suppressed_p (fndecl, OPT_Wdangling_reference)
|| !warning_enabled_at (DECL_SOURCE_LOCATION (fndecl),
OPT_Wdangling_reference)
-   /* If the function doesn't return a reference, don't warn.  This
-  can be e.g.
-const int& z = std::min({1, 2, 3, 4, 5, 6, 7});
-  which doesn't dangle: std::min here returns an int.  */
-   || !TYPE_REF_OBJ_P (TREE_TYPE (TREE_TYPE (fndecl)))
/* Don't emit a false positive for:
std::vector v = ...;
std::vector::const_iterator it = v.begin();
@@ -13573,6 +13596,20 @@ do_warn_dangling_reference (tree expr)
&& DECL_OVERLOADED_OPERATOR_IS (fndecl, INDIRECT_REF)))
  return NULL_TREE;
  
+	tree rettype = TREE_TYPE (TREE_TYPE (fndecl));

+   /* If the function doesn't return a reference, don't warn.  This
+  can be e.g.
+const int& z = std::min({1, 2, 3, 4, 5, 6, 7});
+  which doesn't dangle: std::min here returns an int.
+
+  If the function returns a std::pair<const T&, const T&>, we
+  warn, to detect e.g.
+std::pair<const int&, const int&> v = std::minmax(1, 2);
+  which also creates a dangling reference, because std::minmax
+  returns std::pair<const T&, const T&>(b, a).  */
+   if (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype)))


The patch is OK, but do you want to check reference to const for the 
single ref case as well, while you're changing this?



+ return NULL_TREE;
+
/* Here we're looking to see if any of the arguments is a temporary
   initializing a reference parameter.  */
for (int i = 0; i < call_expr_nargs (expr); ++i)
@@ -13614,6 +13651,8 @@ do_warn_dangling_reference (tree expr)
return do_warn_dangling_reference (TREE_OPERAND (expr, 2));
  case PAREN_EXPR:
return do_warn_dangling_reference (TREE_OPERAND (expr, 0));
+case TARGET_EXPR:
+  return do_warn_dangling_reference (TARGET_EXPR_INITIAL (expr));
  default:
return NULL_TREE;
  }
@@ -13640,7 +13679,8 @@ maybe_warn_dangling_reference (const_tree decl, tree 
init)
  {
if 

Re: [Patch Arm] Fix PR 92999

2022-11-10 Thread Richard Earnshaw via Gcc-patches




On 10/11/2022 17:21, Richard Earnshaw via Gcc-patches wrote:



On 08/11/2022 18:20, Ramana Radhakrishnan via Gcc-patches wrote:

PR92999 is a case where the VFP calling convention does not allocate
enough FP registers for a homogeneous aggregate containing FP16 values.
I believe this is the complete fix but would appreciate another set of
eyes on this.

Could I get a hand with a regression test run on an armhf environment
while I fix my environment?

gcc/ChangeLog:

PR target/92999
*  config/arm/arm.c (aapcs_vfp_allocate_return_reg): Adjust to handle
aggregates with elements smaller than SFmode.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr92999.c: New test.


Thanks,
Ramana

Signed-off-by: Ramana Radhakrishnan 


I'm not sure about this.  The AAPCS does not mention a base type of a 
half-precision FP type as an appropriate homogeneous aggregate for using 
VFP registers for either calling or returning.


So perhaps the bug is that we try to treat this as a homogeneous 
aggregate at all.


R.


And clang seems to agree with my opinion: https://godbolt.org/z/ncaYfzebM

R.


Re: [Patch Arm] Fix PR 92999

2022-11-10 Thread Richard Earnshaw via Gcc-patches




On 08/11/2022 18:20, Ramana Radhakrishnan via Gcc-patches wrote:

PR92999 is a case where the VFP calling convention does not allocate
enough FP registers for a homogeneous aggregate containing FP16 values.
I believe this is the complete fix but would appreciate another set of
eyes on this.

Could I get a hand with a regression test run on an armhf environment
while I fix my environment?

gcc/ChangeLog:

PR target/92999
*  config/arm/arm.c (aapcs_vfp_allocate_return_reg): Adjust to handle
aggregates with elements smaller than SFmode.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr92999.c: New test.


Thanks,
Ramana

Signed-off-by: Ramana Radhakrishnan 


I'm not sure about this.  The AAPCS does not mention a base type of a 
half-precision FP type as an appropriate homogeneous aggregate for using 
VFP registers for either calling or returning.


So perhaps the bug is that we try to treat this as a homogeneous 
aggregate at all.


R.


[PATCH (pushed)] docs: move label directly before title

2022-11-10 Thread Martin Liška

Otherwise Sphinx can complain if Intersphinx is unavailable:

gcc/fortran/doc/gfortran/intrinsic-procedures/atand.rst:50: WARNING: Failed to 
create a cross reference. A title or caption not found: 'atan'
gcc/fortran/doc/gfortran/intrinsic-procedures/atan2.rst:55: WARNING: Failed to 
create a cross reference. A title or caption not found: 'atan'
...

gcc/fortran/ChangeLog:

* doc/gfortran/intrinsic-procedures/abs.rst: Move label directly before 
title.
* doc/gfortran/intrinsic-procedures/acos.rst: Likewise.
* doc/gfortran/intrinsic-procedures/acosd.rst: Likewise.
* doc/gfortran/intrinsic-procedures/acosh.rst: Likewise.
* doc/gfortran/intrinsic-procedures/aimag.rst: Likewise.
* doc/gfortran/intrinsic-procedures/aint.rst: Likewise.
* doc/gfortran/intrinsic-procedures/anint.rst: Likewise.
* doc/gfortran/intrinsic-procedures/asin.rst: Likewise.
* doc/gfortran/intrinsic-procedures/asind.rst: Likewise.
* doc/gfortran/intrinsic-procedures/asinh.rst: Likewise.
* doc/gfortran/intrinsic-procedures/atan.rst: Likewise.
* doc/gfortran/intrinsic-procedures/atan2.rst: Likewise.
* doc/gfortran/intrinsic-procedures/atan2d.rst: Likewise.
* doc/gfortran/intrinsic-procedures/atand.rst: Likewise.
* doc/gfortran/intrinsic-procedures/atanh.rst: Likewise.
* doc/gfortran/intrinsic-procedures/besselj0.rst: Likewise.
* doc/gfortran/intrinsic-procedures/besselj1.rst: Likewise.
* doc/gfortran/intrinsic-procedures/besseljn.rst: Likewise.
* doc/gfortran/intrinsic-procedures/bessely0.rst: Likewise.
* doc/gfortran/intrinsic-procedures/bessely1.rst: Likewise.
* doc/gfortran/intrinsic-procedures/besselyn.rst: Likewise.
* doc/gfortran/intrinsic-procedures/btest.rst: Likewise.
* doc/gfortran/intrinsic-procedures/char.rst: Likewise.
* doc/gfortran/intrinsic-procedures/conjg.rst: Likewise.
* doc/gfortran/intrinsic-procedures/cos.rst: Likewise.
* doc/gfortran/intrinsic-procedures/cosd.rst: Likewise.
* doc/gfortran/intrinsic-procedures/cosh.rst: Likewise.
* doc/gfortran/intrinsic-procedures/cotan.rst: Likewise.
* doc/gfortran/intrinsic-procedures/cotand.rst: Likewise.
* doc/gfortran/intrinsic-procedures/dim.rst: Likewise.
* doc/gfortran/intrinsic-procedures/dprod.rst: Likewise.
* doc/gfortran/intrinsic-procedures/erf.rst: Likewise.
* doc/gfortran/intrinsic-procedures/erfc.rst: Likewise.
* doc/gfortran/intrinsic-procedures/exp.rst: Likewise.
* doc/gfortran/intrinsic-procedures/gamma.rst: Likewise.
* doc/gfortran/intrinsic-procedures/iand.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ibclr.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ibits.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ibset.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ichar.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ieor.rst: Likewise.
* doc/gfortran/intrinsic-procedures/index.rst: Likewise.
* doc/gfortran/intrinsic-procedures/int.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ior.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ishft.rst: Likewise.
* doc/gfortran/intrinsic-procedures/ishftc.rst: Likewise.
* doc/gfortran/intrinsic-procedures/len.rst: Likewise.
* doc/gfortran/intrinsic-procedures/lge.rst: Likewise.
* doc/gfortran/intrinsic-procedures/lgt.rst: Likewise.
* doc/gfortran/intrinsic-procedures/lle.rst: Likewise.
* doc/gfortran/intrinsic-procedures/llt.rst: Likewise.
* doc/gfortran/intrinsic-procedures/log.rst: Likewise.
* doc/gfortran/intrinsic-procedures/log10.rst: Likewise.
* doc/gfortran/intrinsic-procedures/loggamma.rst: Likewise.
* doc/gfortran/intrinsic-procedures/max.rst: Likewise.
* doc/gfortran/intrinsic-procedures/min.rst: Likewise.
* doc/gfortran/intrinsic-procedures/mod.rst: Likewise.
* doc/gfortran/intrinsic-procedures/mvbits.rst: Likewise.
* doc/gfortran/intrinsic-procedures/nint.rst: Likewise.
* doc/gfortran/intrinsic-procedures/not.rst: Likewise.
* doc/gfortran/intrinsic-procedures/real.rst: Likewise.
* doc/gfortran/intrinsic-procedures/sign.rst: Likewise.
* doc/gfortran/intrinsic-procedures/sin.rst: Likewise.
* doc/gfortran/intrinsic-procedures/sind.rst: Likewise.
* doc/gfortran/intrinsic-procedures/sinh.rst: Likewise.
* doc/gfortran/intrinsic-procedures/sqrt.rst: Likewise.
* doc/gfortran/intrinsic-procedures/tan.rst: Likewise.
* doc/gfortran/intrinsic-procedures/tand.rst: Likewise.
* doc/gfortran/intrinsic-procedures/tanh.rst: Likewise.
---
 gcc/fortran/doc/gfortran/intrinsic-procedures/abs.rst  | 4 ++--
 gcc/fortran/doc/gfortran/intrinsic-procedures/acos.rst | 4 ++--
 

Re: [PATCH v3] c++: parser - Support for target address spaces in C++

2022-11-10 Thread Georg-Johann Lay




Am 10.11.22 um 15:08 schrieb Paul Iannetta:

On Thu, Nov 03, 2022 at 02:38:39PM +0100, Georg-Johann Lay wrote:

[PATCH v3] c++: parser - Support for target address spaces in C++

2. Will it work with compound literals?
===

Currently, the following C code works for target avr:

const __flash char *pHallo = (const __flash char[]) { "Hallo" };

This is a pointer in RAM (AS0) that holds the address of a string in flash
(AS1) and is initialized with that address. Unfortunately, this does not
work locally:

const __flash char* get_hallo (void)
{
 [static] const __flash char *p2 = (const __flash char[]) { "Hallo2" };
 return p2;
}

foo.c: In function 'get_hallo':
foo.c: error: compound literal qualified by address-space qualifier

Is there any way to make this work now? Would be great!



Currently, I implement the same restrictions as the C front-end, but I
think that this restriction could be lifted.


Hi Paul,

this would be great.  FYI, due to AVR quirks, .rodata is located in RAM.
Reason behind this is that in functions like

char get_letter (const char *c)
{
return *c;
}

there is no means to determine whether get_letter was called with a 
const char* or a char*.  Accessing flash vs. RAM would require different 
instructions, thus .rodata is part of RAM, so that RAM accesses will 
work in either case.


The obvious problem is that this wastes RAM. One way out is to define an 
address space in flash and to pass const __flash char*, where the respective 
objects are located in flash (.progmem.data in case of avr).


This is fine for objects which the application creates, but there are 
also artificial objects like vtables or cswtch tables.



3. Will TARGET_ADDR_SPACE_DIAGNOSE_USAGE still work?


Currently there is target hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE.
I did not see it in your patches, so maybe I just missed it? See
https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gccint/Named-Address-Spaces.html#index-TARGET_005fADDR_005fSPACE_005fDIAGNOSE_005fUSAGE


That was a point I overlooked in my previous patch.  This will be in
my new revision where I also add implicit conversion between
address spaces and also expose TARGET_ADDR_SPACE_CONVERT.


4. Will it be possible to put C++ virtual tables in ASs, and how?
=


Currently, I do not allow the declaration of instances of classes in
an address space, mainly to not have to cope with the handling of the
this pointer.  That is,

   __flash Myclass *t;

does not work.  Nevertheless, I admit that this would be nice to
have.


One big complaint about avr-g++ is that there is no way to put vtables in
flash (address-space 1) and to access them accordingly.  How can this be
achieved with C++ address spaces?


Do you want only the vtables in the flash address space or do you want
to be able to have the whole class content?


My question is about vtables, not the bits that represent some object.
vtables are stored independently of objects, usually in .rodata + 
comdat.  Notice that vtables are read-only and in static storage, even 
if objects are neither.


The problem with vtables is that the user has no handle to specify where 
to locate them -- and even if, due to AVR quirks, the right instruction 
must be used.  Thus just putting vtables in flash by means of some 
section attribute won't work, only address-spaces can do the trick.



1. If you only want the vtables, I think that a target hook called
at vtable creation would do the trick.


Yes, that would be enough, see https://gcc.gnu.org/PR43745

Johann


2. If you want to be able to have pointer to classes in __flash, I
will need to further the support I have currently implemented to
support the pointer this qualified with an address space.
Retrospectively, I think this has to be implemented.

Paul


Would be great if this would work, but I think this can be really 
tricky, because it's already tricky for non-class objects.


A user has to specify __flash explicitly, which is quite different to 
plain objects.  For example, a const int can live in .rodata, but in 
cases like


extern int func();
extern const int ival;
const int ival = func();

ival would live in .bss and be initialized at runtime by a static 
constructor. Consequently,


const __flash int ival = func();

is invalid code that has to be diagnosed [1], because in the avr case, 
__flash means non-volatile memory, which contradicts initialization at 
runtime.
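
The distinction can be sketched in portable C++ as follows.  This is
only an illustration under stated assumptions: the body of func() and
the counter are hypothetical stand-ins for the external function in the
example above, and __flash itself is left out because it is a
target-specific qualifier.

```cpp
// Hypothetical stand-in for the external func() in the example above.
static int calls = 0;
int func() { return ++calls + 41; }

// Not a constant expression: in general the object is initialized at
// run time by a static constructor, so it cannot be placed in
// non-volatile memory.
const int ival = func();

// A constant expression: the value is known at compile time, so the
// object can be emitted directly into a read-only section.
constexpr int kval = 42;
static_assert(kval == 42, "kval is known at compile time");
```

Both objects are const, but only the second one is a candidate for a
read-only address space, which is why checking for const alone is not
sufficient in C++.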


So only objects that are TREE_READONLY can go into AS __flash, 
TREE_CONST is not enough.


How is this problem addressed?  Does this require a new target hook to 
diagnose such cases, and does the compiler know at that stage that an 
object will be TREE_READONLY?


[1] Notice that in C it's enough to check that __flash is always 
accompanied by const, but in C++ this is no longer enough, as shown in the 
example above.


Johann


p.s: I 

Re: [PATCH] Use toplevel configure for GMP and MPFR for gdb

2022-11-10 Thread Tom Tromey
> "Andrew" == apinski--- via Gcc-patches  writes:

Andrew> From: Andrew Pinski 
Andrew> This patch uses the toplevel configure parts for GMP/MPFR for
Andrew> gdb. The only thing is that gdb now requires MPFR for building.
Andrew> Before it was a recommended but not required library.
Andrew> Also this allows building of GMP and MPFR with the toplevel
Andrew> directory just like how it is done for GCC.
Andrew> We now error out in the toplevel configure if the version
Andrew> of GMP and MPFR is wrong.

Thank you for doing this.  It's been on my to-do list to investigate
this for quite a while, but I never got to it... :(

One larger question I have is whether we should land this now, or wait
until after GDB 13 branches.  That is coming soon and maybe it's not
good to add a new dependency just before the release.

My inclination would be to defer it, I suppose out of conservatism, but
I'd appreciate hearing from others.

I think gdb/README and gdb/doc/gdb.texinfo need some minor changes,
because the GMP-related configure options are being renamed.

The commit message should mention "PR build/28500" somewhere so that the
commit is logged to bugzilla.  Also in gdb we've been using a "Bug:"
trailer in the commit message that has the full URL of the bug.
IIUC this does fix that PR.

Tom


Re: [PATCH] Remove SLOW_SHORT_ACCESS from target headers

2022-11-10 Thread Andrew Pinski via Gcc-patches
On Thu, Nov 10, 2022 at 12:47 AM Richard Biener
 wrote:
>
> On Thu, Nov 10, 2022 at 2:21 AM Andrew Pinski via Gcc-patches
>  wrote:
> >
> > On Wed, Nov 9, 2022 at 5:16 PM apinski--- via Gcc-patches
> >  wrote:
> > >
> > > From: Andrew Pinski 
> > >
> > > SLOW_SHORT_ACCESS is defined in bfin and i386 target
> > > headers but the target macro is not used elsewhere.
> > > So let's remove it from those two headers and poison it.
> >
> > Just to add, this target macro was defined in GCC 2.8.0 in i386.h but
> > not used in any other sources.
> > apinski@xeond:~/src/upstream-gcc/gcc$ git grep SLOW_SHORT_ACCESS
> > releases/gcc-2.8.0
> > releases/gcc-2.8.0:gcc/config/i386/i386.h:#define SLOW_SHORT_ACCESS 0
> >
> > So it looks like it was never used for the last 24+ years and it is
> > time to finally remove it.
>
> OK.  I notice you didn't remove any documentation so it was also undocumented?

Yes, it was never documented as far as I can tell.  It was only ever
defined in those 2 headers, and it looks like bfin copied it from the
i386 header.  The define has been in the i386 header since the GCC
2.8.0 release and was not used anywhere else even at that point;
searching git for anything earlier becomes harder, and there was no
reference to it in the ChangeLogs either.
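
For anyone unfamiliar with poisoning, here is a minimal sketch of what
the system.h change relies on.  It is a standalone illustration, not
part of the patch; the function name is hypothetical, and the pragma is
a GCC extension, so other compilers may ignore or reject it.

```cpp
// A macro that is still supported, shown only for contrast.
#define SLOW_BYTE_ACCESS 0

// After this line, any later appearance of the identifier in the
// translation unit is a hard error.
#pragma GCC poison SLOW_SHORT_ACCESS

// Hypothetical accessor so the remaining macro is used somewhere.
int slow_byte_access() { return SLOW_BYTE_ACCESS; }

// Uncommenting the next line would be rejected by GCC as a use of a
// poisoned identifier:
// #define SLOW_SHORT_ACCESS 0
```

This is what stops an out-of-tree port from quietly reintroducing the
macro after it has been removed.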

Thanks,
Andrew

>
> Thanks,
> Richard.
>
> > Thanks,
> > Andrew Pinski
> >
> >
> > >
> > > OK? Built x86_64-linux-gnu and bfin-elf.
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/bfin/bfin.h (SLOW_SHORT_ACCESS): Delete.
> > > * config/i386/i386.h (SLOW_SHORT_ACCESS): Delete.
> > > * system.h: Poison SLOW_SHORT_ACCESS
> > > ---
> > >  gcc/config/bfin/bfin.h | 1 -
> > >  gcc/config/i386/i386.h | 3 ---
> > >  gcc/system.h   | 2 +-
> > >  3 files changed, 1 insertion(+), 5 deletions(-)
> > >
> > > diff --git a/gcc/config/bfin/bfin.h b/gcc/config/bfin/bfin.h
> > > index 4e7753038a8..1d75c655df8 100644
> > > --- a/gcc/config/bfin/bfin.h
> > > +++ b/gcc/config/bfin/bfin.h
> > > @@ -810,7 +810,6 @@ typedef struct {
> > > subsequent accesses occur to other fields in the same word of the
> > > structure, but to different bytes.  */
> > >  #define SLOW_BYTE_ACCESS  0
> > > -#define SLOW_SHORT_ACCESS 0
> > >
> > >  /* Define this if most significant bit is lowest numbered
> > > in instructions that operate on numbered bit-fields. */
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index b32db8da109..a5ad9f387f7 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -1933,9 +1933,6 @@ do {
> > >   \
> > >
> > >  #define SLOW_BYTE_ACCESS 0
> > >
> > > -/* Nonzero if access to memory by shorts is slow and undesirable.  */
> > > -#define SLOW_SHORT_ACCESS 0
> > > -
> > >  /* Define this macro if it is as good or better to call a constant
> > > function address than to call an address kept in a register.
> > >
> > > diff --git a/gcc/system.h b/gcc/system.h
> > > index c192b6c3ce7..de9c5c0d2ef 100644
> > > --- a/gcc/system.h
> > > +++ b/gcc/system.h
> > > @@ -1075,7 +1075,7 @@ extern void fancy_abort (const char *, int, const 
> > > char *)
> > > EH_FRAME_IN_DATA_SECTION TARGET_FLT_EVAL_METHOD_NON_DEFAULT   
> > >  \
> > > JCR_SECTION_NAME TARGET_USE_JCR_SECTION SDB_DEBUGGING_INFO
> > >  \
> > > SDB_DEBUG NO_IMPLICIT_EXTERN_C NOTICE_UPDATE_CC   
> > >  \
> > > -   CC_STATUS_MDEP_INIT CC_STATUS_MDEP CC_STATUS
> > > +   CC_STATUS_MDEP_INIT CC_STATUS_MDEP CC_STATUS SLOW_SHORT_ACCESS
> > >
> > >  /* Hooks that are no longer used.  */
> > >   #pragma GCC poison LANG_HOOKS_FUNCTION_MARK LANG_HOOKS_FUNCTION_FREE  \
> > > --
> > > 2.17.1
> > >


RE: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-10 Thread Kyrylo Tkachov via Gcc-patches
Hi Andre,

> -Original Message-
> From: Andre Vieira (lists) 
> Sent: Thursday, November 10, 2022 11:17 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov ; Richard Earnshaw
> ; Richard Sandiford
> 
> Subject: [PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire
> semantics
> 
> Hello,
> 
> This patch enables the use of LDAPR for load-acquire semantics. After
> some internal investigation based on the work published by Podkopaev et
> al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using
> LDAPR for the C++ load-acquire semantics is a correct relaxation.
> 
> Bootstrapped and regression tested on aarch64-none-linux-gnu.
> 
> OK for trunk?

Thanks for the patch

> 
> 2022-11-09  Andre Vieira  
>      Kyrylo Tkachov  
> 
> gcc/ChangeLog:
> 
>      * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro.
>      (TARGET_RCPC): New Macro.
>      * config/aarch64/atomics.md (atomic_load): Change into
>      an expand.
>      (aarch64_atomic_load_rcpc): New define_insn for ldapr.
>      (aarch64_atomic_load): Rename of old define_insn for ldar.
>      * config/aarch64/iterators.md (UNSPEC_LDAP): New unspec enum
> value.
>      *
> doc/gcc/gcc-command-options/machine-dependent-options/aarch64-
> options.rst
>      (rcpc): Amend documentation to mention the effects on code
> generation.
> 
> gcc/testsuite/ChangeLog:
> 
>      * gcc.target/aarch64/ldapr.c: New test.
>      * lib/target-supports.exp (add_options_for_aarch64_rcpc): New
> options procedure.
>      (check_effective_target_aarch64_rcpc_ok_nocache): New
> check-effective-target.
>      (check_effective_target_aarch64_rcpc_ok): Likewise.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
bc95f6d9d15f190a3e33704b4def2860d5f339bd..801a62bf2ba432f35ae1931beb8c4405b77b36c3
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -657,7 +657,42 @@
   }
 )
 
-(define_insn "atomic_load"
+(define_expand "atomic_load"
+  [(match_operand:ALLI 0 "register_operand" "=r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+   (match_operand:SI   2 "const_int_operand")]
+  ""
+  {
+/* If TARGET_RCPC and this is an ACQUIRE load, then expand to a pattern
+   using UNSPECV_LDAP.  */
+enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+if (TARGET_RCPC
+   && (is_mm_acquire (model)
+   || is_mm_acq_rel (model)))
+{
+  emit_insn (gen_aarch64_atomic_load_rcpc (operands[0], operands[1],
+operands[2]));
+}
+else
+{
+  emit_insn (gen_aarch64_atomic_load (operands[0], operands[1],
+   operands[2]));
+}

No braces needed for single-statement bodies.

diff --git 
a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst 
b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
index 
c2b23a6ee97ef2b7c74119f22c1d3e3d85385f4d..25d609238db7d45845dbc446ac21d12dddcf8eac
 100644
--- 
a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
+++ 
b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
@@ -437,9 +437,9 @@ the following and their inverses no :samp:`{feature}` :
   floating-point instructions. This option is enabled by default for 
:option:`-march=armv8.4-a`. Use of this option with architectures prior to 
Armv8.2-A is not supported.
 
 :samp:`rcpc`
-  Enable the RcPc extension.  This does not change code generation from GCC,
-  but is passed on to the assembler, enabling inline asm statements to use
-  instructions from the RcPc extension.
+  Enable the RcPc extension.  This enables the use of the LDAPR instructions for
+  load-acquire atomic semantics, and passes it on to the assembler, enabling
+  inline asm statements to use instructions from the RcPc extension.

Let's capitalize this consistently throughout the patch as "RCpc".

diff --git a/gcc/testsuite/gcc.target/aarch64/ldapr.c 
b/gcc/testsuite/gcc.target/aarch64/ldapr.c
new file mode 100644
index 
..c36edfcd79a9ee41434ab09ac47d257a692a8606
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldapr.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -std=c99" } */
+/* { dg-require-effective-target aarch64_rcpc_ok } */
+/* { dg-add-options aarch64_rcpc } */

If you're not doing an assemble here you probably don't care much about this 
target business? (it's more important on the arm side with incompatible ABIs, 
Thumb-ness).
I think in this case you can avoid introducing the effective targets and just 
add
#pragma GCC target "+rcpc"
to the body of the testcase (we use it in a few testcases for aarch64)
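
For illustration, a minimal sketch of what a testcase using the pragma could look like. The names `load_acquire` and `check_load_acquire` are hypothetical and not taken from the patch, and the preprocessor guard is only there so the sketch compiles on non-AArch64 hosts.

```c
#include <assert.h>

/* Sketch only: enable RCpc for this file via the pragma rather than an
   effective-target check.  The guard merely keeps the sketch compilable
   on other hosts; a real gcc.target/aarch64 test would not need it.  */
#if defined (__aarch64__) && defined (__GNUC__) && !defined (__clang__)
#pragma GCC target "+rcpc"
#endif

/* Hypothetical helper (not from the patch): an acquire load, which is
   the kind of access the patch expands to LDAPR when RCpc is enabled.  */
int
load_acquire (int *p)
{
  return __atomic_load_n (p, __ATOMIC_ACQUIRE);
}

/* Hypothetical self-check: the load must return the stored value
   regardless of which instruction implements it.  */
void
check_load_acquire (void)
{
  int v = 42;
  assert (load_acquire (&v) == 42);
}
```

In an actual dg test the function would presumably be followed by a scan-assembler directive for the expected instruction; that part is left out here because the exact pattern depends on the patch.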

Otherwise looks good!
Thanks,
Kyrill



Re: [PATCH v4] c++: parser - Support for target address spaces in C++

2022-11-10 Thread Paul Iannetta via Gcc-patches
Hi,

It took a bit of time to rework the rough corners.  I tried to
mirror the C front-end as much as possible, especially when it comes
to implicit conversions and warnings, and to expose the same hooks as
the C front-end does.  The C++ front-end produces as many warnings
as the C front-end; however, they are sometimes a bit less
informative (especially when passing arguments to functions).

I also address the following points:

  1.  The handling of the register storage class is grouped with the
  other variables with automatic storage.  This incurs a slight
  misalignment, since global register variables do not trigger an
  error, only a warning.

  2. In template unification, I maintain that we don't want any
  changes to address spaces whatsoever during the unification process,
  hence ignoring the strict flag.  Nevertheless, we allow implicit
  conversion, and I have verified that, indeed,


template <class T> void f(T **);
struct A {
   template <class T> operator T*__seg_fs*();
};
int main()
{
   f((void* __seg_fs *)0);   // (1): void*__seg_fs* -> void** should be OK
   void (*p)(void * __seg_fs *) = f; // (2): error
}

works as intended. That is, (1) works if we set __seg_fs as a subspace
of the generic address space, and (2) is always an error.

  3. In template unification, when unifying __as1 T = __as2 U we want
  to unify to the __as1 at most, never to __as2 at most, because the
  function requiring __as1 T may want to mix address space from the
  bigger address space, therefore I think that we want to have the
  smaller address-space to be unified into the bigger of the two.

  4. I left same_type_ignoring_top_level_qualifiers_p untouched, even
  though changing it was very convenient and often led to better error
  messages, since errors were caught earlier.

  5. The handling of conversions is done very late in the calling
  chain, because I absolutely want to fold the conversion and force
  the conversion to appear as an ADDR_SPACE_CONV_EXPR after
  gimplification.

  6.  Currently, I do not handle classes. I will see what I can do in a
  further revision and maybe add a target hook to hand down to targets
  the choice of the address space of the vtable.

  7.  This can't be added as a test case, but I checked that:

 // In this test case, __seg_gs is a subset of the generic address
 // space.

 int f (int *);
 int main ()
 {
   int __seg_fs *pa;
   int __seg_gs *pb;
   int *pc;
   pa = (__seg_fs int *) pb; return *pa; // warning: cast to ‘__seg_fs’ address space pointer from disjoint ‘__seg_gs’ address space pointer
   pa = (int __seg_fs *) pc; return *pa; // warning: cast to ‘__seg_fs’ address space pointer from disjoint generic address space pointer
   pa = pb; return *pa; //  error: invalid conversion from ‘__seg_gs int*’ to ‘__seg_fs int*’
   pc = pb; return *pc; // __seg_gs int * -> int * converted through ADDR_SPACE_CONV_EXPR
   pb = pc; return *pb; // error: invalid conversion from ‘int*’ to ‘__seg_gs int*’
   pc = pa; return *pb; //  error: invalid conversion from ‘__seg_fs int*’ to ‘int*’
   return f (pb); // __seg_gs int * -> int * converted through ADDR_SPACE_CONV_EXPR
   // return f (pa); // error: invalid conversion from ‘__seg_fs int*’ to ‘int*’
 // note:   initializing argument 1 of ‘int f(int*)’
 }

Thanks,
Paul

Rebased on current trunk, bootstrapped and regtested.
#  >8 
gcc/cp/ChangeLog:

2022-11-09  Paul Iannetta  

* call.cc (convert_like_internal): Add support for implicit
  conversion between compatible address spaces.  This is done
  here (and not in a higher caller) because we want to force the
  use of cp_fold_convert, which later enables back-end writers to
  tune-up the fine details of the conversion.
* cp-tree.h (enum cp_decl_spec): Add a new decl spec to handle
  address spaces.
(struct cp_decl_specifier_seq): Likewise.
* decl.cc (get_type_quals): Add address space support.
(check_tag_decl): Likewise.
(grokdeclarator): Likewise.
* class.cc (fixed_type_or_null): Add a case for
  ADDR_SPACE_CONVERT_EXPR.
* constexpr.cc (cxx_eval_constant_expression): Likewise.
(potential_constant_expression_1): Likewise.
* cp-gimplify.cc (cp_fold): Likewise.
* error.cc (dump_expr): Likewise.
* expr.cc (mark_use): Likewise.
* tree.cc (cp_stabilize_reference): Likewise.
(strip_typedefs_expr): Likewise.
(cp_tree_equal): Likewise.
(mark_exp_read): Likewise.
* mangle.cc (write_CV_qualifiers_for_type): Mangle address
  spaces using the extended type qualifier scheme.
* parser.cc (cp_lexer_get_preprocessor_token): Add a call to
  targetm.addr_space.diagnose_usage when lexing an address space
  token.
(cp_parser_postfix_expression): Prevent the use of address
spaces with compound 

Re: [PATCH] Do not specify NAN sign in frange::set_nonnegative.

2022-11-10 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 10, 2022 at 04:03:46PM +0100, Aldy Hernandez wrote:
> [Jakub, how's this?  Do you agree?]
> 
> After further reading of the IEEE 754 standard, it has become clear
> that there are no guarantees with regards to the sign of a NAN when it
> comes to any operation other than copy, copysign, abs, and negate.
> 
> Currently, set_nonnegative() is only used in one place in ranger
> applicable to floating point values, when expanding unknown calls.
> Since we already specially handle copy, copysign, abs, and negate, all
> the calls to set_nonnegative() must be NAN-sign agnostic.
> 
> The cleanest solution is to leave the sign unspecified in
> frange::set_nonnegative().  Any special case must be handled by the
> caller.
> 
> gcc/ChangeLog:
> 
>   * value-range.cc (frange::set_nonnegative): Remove NAN sign handling.
>   (range_tests_signed_zeros): Adjust test.

LGTM, thanks.

Jakub



old install to a different folder

2022-11-10 Thread Martin Liška

Hi.

We noticed we'll need the old /install to be available for redirect.
Gerald, can you please put it somewhere under /install-prev, or something 
similar?

Thanks,
Martin


Re: [PATCH] doc: formatting fixes

2022-11-10 Thread Martin Liška

On 11/10/22 14:53, Andreas Schwab via Gcc-patches wrote:

gcc/
* doc/gcc/gcc-command-options/option-summary.rst: Fix formatting.
---
  gcc/doc/gcc/gcc-command-options/option-summary.rst | 8 ++--
  1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/gcc/gcc-command-options/option-summary.rst 
b/gcc/doc/gcc/gcc-command-options/option-summary.rst
index 388da445591..acc70920bad 100644
--- a/gcc/doc/gcc/gcc-command-options/option-summary.rst
+++ b/gcc/doc/gcc/gcc-command-options/option-summary.rst
@@ -1209,10 +1209,8 @@ in the following sections.
:option:`-malign-data=type` |gol|
:option:`-mbig-endian`  :option:`-mlittle-endian` |gol|
:option:`-mstack-protector-guard=guard`  
:option:`-mstack-protector-guard-reg=reg` |gol|
-  :option:`-mstack-protector-guard-offset=offset`
-  -mcsr-check -mno-csr-check
-
-  .. program:: -mcsr-check -mno-csr-check
+  :option:`-mstack-protector-guard-offset=offset` |gol|
+  :option:`-mcsr-check`  :option:`-mno-csr-check`
  
*RL78 Options*
  
@@ -1524,5 +1522,3 @@ in the following sections.

*zSeries Options*
  
See :ref:`s-390-and-zseries-options`.

-
-  .. program:: None


This is not correct as 'program:: None' means we reset to default program after
we define options for specific targets (program::Xtensa, ...).

Martin


[PATCH] Make last DCE remove empty loops

2022-11-10 Thread Richard Biener via Gcc-patches
The following makes the last DCE pass CD-DCE and in turn the
last CD-DCE pass a DCE one.  That ensures we remove empty loops
that become empty between the two.  I've also moved the tail-call
pass after DCE since DCE can only improve things here.

The two testcases were the only ones scanning cddce3 so I've
changed them to scan the dce7 pass that's now in this place.
The testcases scanning dce7 also work when that's in the earlier
position.
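
As a toy illustration of the kind of loop in question (not taken from the PR or the testsuite; the function names are made up): once the dead accumulator updates are removed, the loop body is empty, and only control-dependence-aware DCE can then delete the loop itself, since plain DCE keeps all control statements. In a case this simple earlier passes would already clean things up; the patch is about loops that only become empty late in the pipeline.

```c
#include <assert.h>

/* Hypothetical example: `sum` is never read, so every update to it is
   dead.  After those updates are deleted the loop has an empty body and
   a finite trip count, which makes the whole loop removable.  */
int
count_only (int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += i;   /* dead: the result is never used */
  return n;
}

/* Hypothetical self-check: the observable behaviour does not depend on
   whether the loop survives optimization.  */
void
check_count_only (void)
{
  assert (count_only (0) == 0);
  assert (count_only (5) == 5);
}
```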

Bootstrapped and tested on x86_64-unknown-linux-gnu, I'm going
to push this tomorrow if there are no comments.

Richard.

PR tree-optimization/84646
* tree-ssa-dce.cc (pass_dce::set_pass_param): Add param
whether to run update-address-taken.
(pass_dce::execute): Honor it.
* passes.def: Exchange last DCE and CD-DCE invocations.
Swap pass_tail_calls and the last DCE.

* g++.dg/tree-ssa/pr106922.C: Continue to scan earlier DCE dump.
* gcc.dg/tree-ssa/20030808-1.c: Likewise.
---
 gcc/passes.def |  8 
 gcc/testsuite/g++.dg/tree-ssa/pr106922.C   |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c |  6 +++---
 gcc/tree-ssa-dce.cc| 15 +--
 4 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 193b5794749..462e9afad61 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -346,8 +346,8 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ccp, true /* nonzero_p */);
   NEXT_PASS (pass_warn_restrict);
   NEXT_PASS (pass_dse);
-  NEXT_PASS (pass_cd_dce, true /* update_address_taken_p */);
-  /* After late CD DCE we rewrite no longer addressed locals into SSA
+  NEXT_PASS (pass_dce, true /* update_address_taken_p */);
+  /* After late DCE we rewrite no longer addressed locals into SSA
 form if possible.  */
   NEXT_PASS (pass_forwprop);
   NEXT_PASS (pass_sink_code, true /* unsplit edges */);
@@ -355,12 +355,12 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_fold_builtins);
   NEXT_PASS (pass_optimize_widening_mul);
   NEXT_PASS (pass_store_merging);
-  NEXT_PASS (pass_tail_calls);
   /* If DCE is not run before checking for uninitialized uses,
 we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c).
 However, this also causes us to misdiagnose cases that should be
 real warnings (e.g., testsuite/gcc.dg/pr18501.c).  */
-  NEXT_PASS (pass_dce);
+  NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */);
+  NEXT_PASS (pass_tail_calls);
   /* Split critical edges before late uninit warning to reduce the
  number of false positives from it.  */
   NEXT_PASS (pass_split_crit_edges);
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C 
b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
index 2aec4975aa8..4b6a4ad7f6c 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
@@ -1,5 +1,5 @@
 // { dg-require-effective-target c++20 }
-// { dg-options "-O2 -fdump-tree-cddce3" }
+// { dg-options "-O2 -fdump-tree-dce7" }
 
 template  struct __new_allocator {
   void deallocate(int *, int) { operator delete(0); }
@@ -87,4 +87,4 @@ void testfunctionfoo() {
   }
 }
 
-// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" } }
+// { dg-final { scan-tree-dump-not "m_initialized" "dce7" } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
index 456f6f27128..7d4a1383ca4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030808-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-cddce3" } */
+/* { dg-options "-O1 -fdump-tree-dce7" } */
   
 extern void abort (void);
 
@@ -33,8 +33,8 @@ delete_dead_jumptables ()
 /* There should be no loads of ->code.  If any exist, then we failed to
optimize away all the IF statements and the statements feeding
their conditions.  */
-/* { dg-final { scan-tree-dump-times "->code" 0 "cddce3"} } */
+/* { dg-final { scan-tree-dump-times "->code" 0 "dce7"} } */

 /* There should be no IF statements.  */
-/* { dg-final { scan-tree-dump-times "if " 0 "cddce3"} } */
+/* { dg-final { scan-tree-dump-times "if " 0 "dce7"} } */
 
diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index 54e5d8c2923..187d58bdd37 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -2005,14 +2005,25 @@ class pass_dce : public gimple_opt_pass
 {
 public:
   pass_dce (gcc::context *ctxt)
-: gimple_opt_pass (pass_data_dce, ctxt)
+: gimple_opt_pass (pass_data_dce, ctxt), update_address_taken_p (false)
   {}
 
   /* opt_pass methods: */
   opt_pass * clone () final override { return new pass_dce (m_ctxt); }
+  void set_pass_param (unsigned n, bool param) final override
+{
+  gcc_assert (n == 0);
+  update_address_taken_p = param;
+}
   bool 

Re: [PATCH] RISC-V: costs: support shift-and-add in strength-reduction

2022-11-10 Thread Philipp Tomsich
On Thu, 10 Nov 2022 at 02:46, Palmer Dabbelt  wrote:
>
> On Tue, 08 Nov 2022 11:54:34 PST (-0800), philipp.toms...@vrull.eu wrote:
> > The strength-reduction implementation in expmed.c will assess the
> > profitability of using shift-and-add using a RTL expression that wraps
> > a MULT (with a power-of-2) in a PLUS.  Unless the RISC-V rtx_costs
> > function recognizes this as expressing a sh[123]add instruction, we
> > will return an inflated cost---thus defeating the optimization.
> >
> > This change adds the necessary idiom recognition to provide an
> > accurate cost for this form of expressing sh[123]add.
> >
> > Instead of expanding to
> >   li      a5,200
> >   mulw    a0,a5,a0
> > with this change, the expression 'a * 200' is synthesized as:
> >   sh2add  a0,a0,a0   // *5 = a + 4 * a
> >   sh2add  a0,a0,a0   // *5 = a + 4 * a
> >   slli    a0,a0,3    // *8
>
> That's more instructions, but multiplication is generally expensive.  At
> some point I remember the SiFive cores getting very fast integer
> multipliers, but I don't see that reflected in the cost model anywhere
> so maybe I'm just wrong?  Andrew or Kito might remember...
>
> If the mul-based sequences are still faster on the SiFive cores then we
> should probably find a way to keep emitting them, which may just be a
> matter of adjusting those multiply costs.  Moving to the shift-based
> sequences seems reasonable for a generic target, though.

The cost for a regular MULT is COSTS_N_INSNS(4) for the series-7 (see
the SImode and DImode entries in the int_mul line):
/* Costs to use when optimizing for Sifive 7 Series.  */
static const struct riscv_tune_param sifive_7_tune_info = {
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_add */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (5)},   /* fp_mul */
  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)}, /* fp_div */
  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},   /* int_mul */
  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},   /* int_div */
  2,/* issue_rate */
  4,/* branch_cost */
  3,/* memory_cost */
  8,/* fmv_cost */
  true, /* slow_unaligned_access */
};

So the break-even is at COSTS_N_INSNS(4) + rtx_cost(immediate).

Testing against series-7, we get up to 5 (4 for the mul + 1 for the
li) instructions from strength reduction:

val * 783
=>
sh1add a5,a0,a0
slli a5,a5,4
add a5,a5,a0
slli a5,a5,4
sub a0,a5,a0

but fall back to a mul, once the cost exceeds this:

val * 1574
=>
li a5,1574
mul a0,a0,a5
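
To make the arithmetic behind these sequences easy to check, here is a small C model of them. It is only a sketch: `shnadd`, `mul_783` and `mul_200` are hypothetical names, and the model covers the instruction semantics (rd = rs2 + (rs1 << N)), not anything about costs.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the Zba shNadd instructions: rd = rs2 + (rs1 << n).  */
static uint64_t
shnadd (unsigned n, uint64_t rs1, uint64_t rs2)
{
  return rs2 + (rs1 << n);
}

/* The five-instruction sequence shown for val * 783.  */
uint64_t
mul_783 (uint64_t a0)
{
  uint64_t a5 = shnadd (1, a0, a0);   /* sh1add a5,a0,a0 :   3 * a */
  a5 <<= 4;                           /* slli   a5,a5,4  :  48 * a */
  a5 += a0;                           /* add    a5,a5,a0 :  49 * a */
  a5 <<= 4;                           /* slli   a5,a5,4  : 784 * a */
  return a5 - a0;                     /* sub    a0,a5,a0 : 783 * a */
}

/* The three-instruction sequence from the commit message for a * 200.  */
uint64_t
mul_200 (uint64_t a0)
{
  a0 = shnadd (2, a0, a0);            /* sh2add a0,a0,a0 :   5 * a */
  a0 = shnadd (2, a0, a0);            /* sh2add a0,a0,a0 :  25 * a */
  return a0 << 3;                     /* slli   a0,a0,3  : 200 * a */
}
```

Both functions agree with a plain multiplication for any input, modulo 2^64 wraparound.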

> Either way, it probably warrants a test case to make sure we don't
> regress in the future.

Ack. Will be added for v2.

>
> >
> > gcc/ChangeLog:
> >
> >   * config/riscv/riscv.c (riscv_rtx_costs): Recognize shNadd,
> >   if expressed as a plus and multiplication with a power-of-2.

This will still need to be regenerated (it's referring to a '.c'
extension still).

> >
> > ---
> >
> >  gcc/config/riscv/riscv.cc | 13 +
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index ab6c745c722..0b2c4b3599d 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -2451,6 +2451,19 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
> > outer_code, int opno ATTRIBUTE_UN
> > *total = COSTS_N_INSNS (1);
> > return true;
> >   }
> > +  /* Before strength-reduction, the shNadd can be expressed as the addition
> > +  of a multiplication with a power-of-two.  If this case is not handled,
> > +  the strength-reduction in expmed.c will calculate an inflated cost.  */
> > +  if (TARGET_ZBA
> > +   && mode == word_mode
> > +   && GET_CODE (XEXP (x, 0)) == MULT
> > +   && REG_P (XEXP (XEXP (x, 0), 0))
> > +   && CONST_INT_P (XEXP (XEXP (x, 0), 1))
> > +   && IN_RANGE (pow2p_hwi (INTVAL (XEXP (XEXP (x, 0), 1))), 1, 3))
>
> IIUC the fall-through is biting us here and this matches power-of-2 +1
> and power-of-2 -1.  That looks to be the case for the one below, though,
> so not sure if I'm just missing something?

The strength-reduction in expmed.cc uses "(PLUS (reg) (MULT (reg)
))" to express a shift-then-add.
Here's one of the relevant snippets (from the internal costing in expmed.cc):
  all.shift_mult = gen_rtx_MULT (mode, all.reg, all.reg);
  all.shift_add = gen_rtx_PLUS (mode, all.shift_mult, all.reg);

So while we normally encounter a "(PLUS (reg) (ASHIFT (reg)
))", for the strength-reduction we also need to provide the
cost for the expression with a MULT.
The other idioms (those matching above and below the new one) always
require an ASHIFT for the inner.

>
> > + {
> > +   *total = COSTS_N_INSNS (1);
> > +   return true;
> > + }
> >/* shNadd.uw pattern for zba.
> >[(set 

[PATCH] Do not specify NAN sign in frange::set_nonnegative.

2022-11-10 Thread Aldy Hernandez via Gcc-patches
[Jakub, how's this?  Do you agree?]

After further reading of the IEEE 754 standard, it has become clear
that there are no guarantees with regards to the sign of a NAN when it
comes to any operation other than copy, copysign, abs, and negate.

Currently, set_nonnegative() is only used in one place in ranger
applicable to floating point values, when expanding unknown calls.
Since we already specially handle copy, copysign, abs, and negate, all
the calls to set_nonnegative() must be NAN-sign agnostic.

The cleanest solution is to leave the sign unspecified in
frange::set_nonnegative().  Any special case must be handled by the
caller.
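
A small C sketch of the distinction drawn above, for illustration only (all function names are hypothetical): copysign, abs and negate have a defined effect on the sign bit of a NAN, whereas for a NAN produced by an arithmetic operation nothing should be assumed about the sign.

```c
#include <assert.h>
#include <math.h>

/* copysign: the result takes the sign of the second operand, even when
   the first operand is a NAN.  Returns 1 if the sign bit is set.  */
int
nan_sign_after_copysign (double sign_source)
{
  return signbit (copysign (NAN, sign_source)) != 0;
}

/* abs: the sign bit is cleared, whatever the sign of the input NAN.  */
int
nan_sign_after_abs (double x)
{
  return signbit (fabs (x)) != 0;
}

/* negate: the sign bit is flipped.  Returns 1 if it changed.  */
int
negate_flips_sign (double x)
{
  return (signbit (-x) != 0) != (signbit (x) != 0);
}

/* An arithmetic operation that yields a NAN.  The sign of this result
   is not something a caller should rely on.  */
double
nan_of_unspecified_sign (void)
{
  volatile double zero = 0.0;
  return zero / zero;
}
```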

gcc/ChangeLog:

* value-range.cc (frange::set_nonnegative): Remove NAN sign handling.
(range_tests_signed_zeros): Adjust test.
---
 gcc/value-range.cc | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 852ac09f2c4..d55d85846c1 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -797,14 +797,17 @@ frange::zero_p () const
  && real_iszero (_max));
 }
 
+// Set the range to non-negative numbers, that is [+0.0, +INF].
+//
+// The NAN in the resulting range (if HONOR_NANS) has a varying sign
+// as there are no guarantees in IEEE 754 wrt to the sign of a NAN,
+// except for copy, abs, and copysign.  It is the responsibility of
+// the caller to set the NAN's sign if desired.
+
 void
 frange::set_nonnegative (tree type)
 {
   set (type, dconst0, frange_val_max (type));
-
-  // Set +NAN as the only possibility.
-  if (HONOR_NANS (type))
-update_nan (/*sign=*/false);
 }
 
 // Here we copy between any two irange's.  The ranges can be legacy or
@@ -3923,7 +3926,6 @@ range_tests_signed_zeros ()
 ASSERT_TRUE (r0.undefined_p ());
 
   r0.set_nonnegative (float_type_node);
-  ASSERT_TRUE (r0.signbit_p (signbit) && !signbit);
   if (HONOR_NANS (float_type_node))
 ASSERT_TRUE (r0.maybe_isnan ());
 }
-- 
2.38.1



Re: [PATCH] range-op: Implement floating point multiplication fold_range [PR107569]

2022-11-10 Thread Aldy Hernandez via Gcc-patches




On 11/10/22 14:44, Jakub Jelinek wrote:

Hi!

The following patch implements frange multiplication, including the
special case of x * x.  The callers don't tell us that it is x * x,
just that it is either z = x * x or if (x == y) z = x * y;
For irange that makes no difference, but for frange it can mean
x is -0.0 and y is 0.0 if they have the same range that includes both
signed and unsigned zeros, so we need to assume result could be -0.0.
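
The signed-zero case can be reproduced in plain C. This is only an illustrative sketch with made-up function names: `mul_if_equal` has the "if (x == y) z = x * y" shape, and `square` is the genuine x * x.

```c
#include <assert.h>
#include <math.h>

/* The multiplication is only known to have "equal" operands through the
   x == y comparison, and -0.0 == 0.0 is true.  */
double
mul_if_equal (double x, double y)
{
  if (x == y)
    return x * y;
  return NAN;
}

/* A genuine square: both operands are the same value, so the result is
   never negative.  */
double
square (double x)
{
  return x * x;
}
```

With x = -0.0 and y = 0.0 the comparison succeeds but the product is -0.0, while `square (-0.0)` is +0.0.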

The patch causes one regression:
+FAIL: gcc.dg/fold-overflow-1.c scan-assembler-times 2139095040 2
but that is already tracked in PR107608 and affects not just the newly
added multiplication, but addition and other floating point operations
(and doesn't seem like a ranger bug but dce or whatever else).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Once we have division and the reverse ops for all of these, perhaps
we can do some cleanups to share common code, but the way I have division
now partly written doesn't show up many commonalities.  Multiplication
is simple, division is a nightmare.


Thanks for tackling this.  I'm happy you think multiplication is simple, 
cause all the cross product operators make my head spin.




2022-11-10  Jakub Jelinek  

PR tree-optimization/107569
PR tree-optimization/107591
* range-op.h (range_operator_float::rv_fold): Add relation_kind
argument.
* range-op-float.cc (range_operator_float::fold_range): Name
last argument trio and pass trio.op1_op2 () as last argument to
rv_fold.
(range_operator_float::rv_fold): Add relation_kind argument.
(foperator_plus::rv_fold, foperator_minus::rv_fold): Likewise.
(frange_mult): New function.
(foperator_mult): New class.
(floating_op_table::floating_op_table): Use foperator_mult for
MULT_EXPR.

--- gcc/range-op.h.jj   2022-11-10 00:55:09.430219763 +0100
+++ gcc/range-op.h  2022-11-10 11:30:33.594114939 +0100
@@ -123,7 +123,8 @@ public:
const REAL_VALUE_TYPE _lb,
const REAL_VALUE_TYPE _ub,
const REAL_VALUE_TYPE _lb,
-   const REAL_VALUE_TYPE _ub) const;
+   const REAL_VALUE_TYPE _ub,
+   relation_kind) const;
// Unary operations have the range of the LHS as op2.
virtual bool fold_range (irange , tree type,
   const frange ,
--- gcc/range-op-float.cc.jj2022-11-10 00:55:09.318221259 +0100
+++ gcc/range-op-float.cc   2022-11-10 11:31:29.040359082 +0100
@@ -51,7 +51,7 @@ along with GCC; see the file COPYING3.
  bool
  range_operator_float::fold_range (frange , tree type,
  const frange , const frange ,
- relation_trio) const
+ relation_trio trio) const
  {
if (empty_range_varying (r, type, op1, op2))
  return true;
@@ -65,7 +65,7 @@ range_operator_float::fold_range (frange
bool maybe_nan;
rv_fold (lb, ub, maybe_nan, type,
   op1.lower_bound (), op1.upper_bound (),
-  op2.lower_bound (), op2.upper_bound ());
+  op2.lower_bound (), op2.upper_bound (), trio.op1_op2 ());
  
// Handle possible NANs by saturating to the appropriate INF if only

// one end is a NAN.  If both ends are a NAN, just return a NAN.
@@ -103,8 +103,8 @@ range_operator_float::rv_fold (REAL_VALU
   const REAL_VALUE_TYPE _lb ATTRIBUTE_UNUSED,
   const REAL_VALUE_TYPE _ub ATTRIBUTE_UNUSED,
   const REAL_VALUE_TYPE _lb ATTRIBUTE_UNUSED,
-  const REAL_VALUE_TYPE _ub ATTRIBUTE_UNUSED)
-  const
+  const REAL_VALUE_TYPE _ub ATTRIBUTE_UNUSED,
+  relation_kind) const
  {
lb = dconstninf;
ub = dconstinf;
@@ -1868,7 +1868,8 @@ class foperator_plus : public range_oper
const REAL_VALUE_TYPE _lb,
const REAL_VALUE_TYPE _ub,
const REAL_VALUE_TYPE _lb,
-   const REAL_VALUE_TYPE _ub) const final override
+   const REAL_VALUE_TYPE _ub,
+   relation_kind) const final override
{
  frange_arithmetic (PLUS_EXPR, type, lb, lh_lb, rh_lb, dconstninf);
  frange_arithmetic (PLUS_EXPR, type, ub, lh_ub, rh_ub, dconstinf);
@@ -1892,7 +1893,8 @@ class foperator_minus : public range_ope
const REAL_VALUE_TYPE _lb,
const REAL_VALUE_TYPE _ub,
const REAL_VALUE_TYPE _lb,
-   const REAL_VALUE_TYPE _ub) const final override
+   const REAL_VALUE_TYPE _ub,
+   relation_kind) const final override
{
  frange_arithmetic (MINUS_EXPR, type, lb, lh_lb, rh_ub, dconstninf);
  frange_arithmetic (MINUS_EXPR, type, ub, lh_ub, rh_lb, dconstinf);
@@ -1908,6 +1910,123 @@ class 

[PATCH RESEND] riscv: improve the cost model for loading a 64bit constant in rv32.

2022-11-10 Thread Lin Sinan via Gcc-patches
The motivation of this patch is to correct the wrong estimate of the number
of instructions needed to load a 64-bit constant on rv32 in the current cost
model (riscv_integer_cost).  In the current implementation, if a constant
requires more than 3 instructions (riscv_const_insn and
riscv_legitimate_constant_p), it is put into the constant pool when
expanding gimple to RTL (legitimate_constant_p hook and emit_move_insn).
The inaccurate cost model therefore leads to suboptimal codegen on rv32, and
this fix corrects that estimate.

e.g. the current codegen for loading 0x839290001 in rv32

  lui a5,%hi(.LC0)
  lw  a0,%lo(.LC0)(a5)
  lw  a1,%lo(.LC0+4)(a5)
.LC0:
  .word   958988289
  .word   8

output after this patch

  li a0,958988288
  addi a0,a0,1
  li a1,8
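
The numbers in this example can be checked with a few lines of C. This is a sketch of the arithmetic only, not of riscv_build_integer; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit words of a 64-bit constant (one per register on rv32).  */
struct words32
{
  uint32_t lo;
  uint32_t hi;
};

struct words32
split_const64 (uint64_t value)
{
  struct words32 w;
  w.lo = (uint32_t) value;
  w.hi = (uint32_t) (value >> 32);
  return w;
}

/* Split a 32-bit word into the part loaded first (a multiple of 4096)
   and a signed 12-bit addend, so that upper + addend == word.  */
struct upper_addend
{
  uint32_t upper;
  int32_t addend;
};

struct upper_addend
split_word32 (uint32_t word)
{
  struct upper_addend s;
  /* Sign-extend the low 12 bits.  */
  s.addend = (int32_t) ((word & 0xfff) ^ 0x800) - 0x800;
  s.upper = word - (uint32_t) s.addend;
  return s;
}
```

For 0x839290001 this gives a low word of 958988289, which splits into 958988288 plus 1, and a high word of 8, matching both the .word values and the li/addi/li sequence above.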

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_build_integer): Handle the case of 
loading 64bit constant in rv32.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rv32-load-64bit-constant.c: New test.

Signed-off-by: Lin Sinan 
---
 gcc/config/riscv/riscv.cc | 23 +++
 .../riscv/rv32-load-64bit-constant.c  | 38 +++
 2 files changed, 61 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 32f9ef9ade9..9dffabdc5e3 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -618,6 +618,29 @@ riscv_build_integer (struct riscv_integer_op *codes, 
HOST_WIDE_INT value,
}
 }
 
+  if ((value > INT32_MAX || value < INT32_MIN) && !TARGET_64BIT)
+{
+  unsigned HOST_WIDE_INT loval = sext_hwi (value, 32);
+  unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32);
+  struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS],
+   hicode[RISCV_MAX_INTEGER_OPS];
+  int hi_cost, lo_cost;
+
+  hi_cost = riscv_build_integer_1 (hicode, hival, mode);
+  if (hi_cost < cost)
+   {
+ lo_cost = riscv_build_integer_1 (alt_codes, loval, mode);
+ if (lo_cost + hi_cost < cost)
+   {
+ memcpy (codes, alt_codes,
+ lo_cost * sizeof (struct riscv_integer_op));
+ memcpy (codes + lo_cost, hicode,
+ hi_cost * sizeof (struct riscv_integer_op));
+ cost = lo_cost + hi_cost;
+   }
+   }
+}
+
   return cost;
 }
 
diff --git a/gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c 
b/gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
new file mode 100644
index 000..61d482fb283
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rv32-load-64bit-constant.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32im -mabi=ilp32 -O1" } */
+
+/* This test only applies to RV32. Some of 64bit constants in this test will be put
+into the constant pool in RV64, since RV64 might need one extra instruction to load
+64bit constant. */
+
+unsigned long long
+rv32_mov_64bit_int1 (void)
+{
+  return 0x739290001LL;
+}
+
+unsigned long long
+rv32_mov_64bit_int2 (void)
+{
+  return 0x839290001LL;
+}
+
+unsigned long long
+rv32_mov_64bit_int3 (void)
+{
+  return 0x392900013929LL;
+}
+
+unsigned long long
+rv32_mov_64bit_int4 (void)
+{
+  return 0x392900113929LL;
+}
+
+unsigned long long
+rv32_mov_64bit_int5 (void)
+{
+  return 0x14736def3929LL;
+}
+
+/* { dg-final { scan-assembler-not "lw\t" } } */
-- 
2.36.0



Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-10 Thread Hongyu Wang via Gcc-patches
> Well, with AVX512 v64qi that's 64*64 == 4096 cases to check.  I think
> a lambda function is fine to use.  The alternative (used by the vectorizer
> in some places) is to use sth like
>
>  auto_sbitmap seen (nelts);
>  for (i = 0; i < nelts; i++)
>{
>  if (!bitmap_set_bit (seen, i))
>break;
>  count++;
>}
>  full_perm_p = count == nelts;
>
> I'll note that you should still check .encoding ().encoded_full_vector_p ()
> and only bother to check that case, that's a very simple check.

Thanks for the good example! We also tried using wide_int as a bitmask
but your code looks more simple and reasonable.

Updated the patch accordingly.
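
For readers following along, the linear-time check can be sketched in plain C as follows. This is not the code from the patch: a calloc'ed bool array stands in for auto_sbitmap, `full_perm_p` is a hypothetical name, and the selector indices are assumed to be already reduced to the range [0, nelts).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Return true if sel[0..nelts) selects every index in [0, nelts)
   exactly once, i.e. the permutation covers all elements.  */
bool
full_perm_p (const unsigned *sel, size_t nelts)
{
  bool *seen = calloc (nelts ? nelts : 1, sizeof *seen);
  if (!seen)
    return false;   /* be conservative if allocation fails */

  size_t count = 0;
  for (size_t i = 0; i < nelts; i++)
    {
      unsigned idx = sel[i];
      /* An out-of-range or repeated index means some element is
	 missing from the permutation.  */
      if (idx >= nelts || seen[idx])
	break;
      seen[idx] = true;
      count++;
    }

  free (seen);
  return count == nelts;
}
```

Each selector element is visited once, so the check is O(n) instead of the O(n^2) pairwise comparison discussed earlier in the thread.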

Richard Biener  wrote on Thu, Nov 10, 2022 at 16:56:


>
> On Thu, Nov 10, 2022 at 3:27 AM Hongyu Wang  wrote:
> >
> > Hi Prathamesh and Richard,
> >
> > Thanks for the review and nice suggestions!
> >
> > > > I guess the transform should work as long as mask is same for both
> > > > vectors even if it's
> > > > not constant ?
> > >
> > > Yes, please change accordingly (and maybe push separately).
> > >
> >
> > Removed VECTOR_CST for integer ops.
> >
> > > > If this transform is meant only for VLS vectors, I guess you should
> > > > bail out if TYPE_VECTOR_SUBPARTS is not constant,
> > > > otherwise it will crash for VLA vectors.
> > >
> > > I suppose it's difficult to create a VLA permute that covers all elements
> > > and that is not trivial though.  But indeed add ().is_constant to the
> > > VECTOR_FLOAT_TYPE_P guard.
> >
> > Added.
> >
> > > Meh, that's quadratic!  I suggest to check .encoding 
> > > ().encoded_full_vector_p ()
> > > (as said I can't think of a non-full encoding that isn't trivial
> > > but covers all elements) and then simply .qsort () the vector_builder
> > > (it derives
> > > from vec<>) so the scan is O(n log n).
> >
> > The .qsort () approach requires an extra cmp_func that IMO would not
> > be feasible to be implemented in match.pd (I suppose lambda function
> > would not be a good idea either).
> > Another solution would be using hash_set but it does not work here for
> > int64_t or poly_int64 type.
> > So I kept current O(n^2) simple code here, and I suppose usually the
> > permutation indices would be a small number even for O(n^2)
> > complexity.
>
> Well, with AVX512 v64qi that's 64*64 == 4096 cases to check.  I think
> a lambda function is fine to use.  The alternative (used by the vectorizer
> in some places) is to use sth like
>
>  auto_sbitmap seen (nelts);
>  for (i = 0; i < nelts; i++)
>{
>  if (!bitmap_set_bit (seen, i))
>break;
>  count++;
>}
>  full_perm_p = count == nelts;
>
> I'll note that you should still check .encoding ().encoded_full_vector_p ()
> and only bother to check that case, that's a very simple check.
>
> >
> > Attached updated patch.
> >
> > Richard Biener via Gcc-patches  wrote on Tue, Nov 8, 2022 at 22:38:
> >
> >
> > >
> > > On Fri, Nov 4, 2022 at 7:44 AM Prathamesh Kulkarni via Gcc-patches
> > >  wrote:
> > > >
> > > > On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > This is a follow-up patch for PR98167
> > > > >
> > > > > The sequence
> > > > >  c1 = VEC_PERM_EXPR (a, a, mask)
> > > > >  c2 = VEC_PERM_EXPR (b, b, mask)
> > > > >  c3 = c1 op c2
> > > > > can be optimized to
> > > > >  c = a op b
> > > > >  c3 = VEC_PERM_EXPR (c, c, mask)
> > > > > for all integer vector operation, and float operation with
> > > > > full permutation.
> > > > >
> > > > > Bootstrapped & regrtested on x86_64-pc-linux-gnu.
> > > > >
> > > > > Ok for trunk?
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > PR target/98167
> > > > > * match.pd: New perm + vector op patterns for int and fp 
> > > > > vector.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > PR target/98167
> > > > > * gcc.target/i386/pr98167.c: New test.
> > > > > ---
> > > > >  gcc/match.pd| 49 
> > > > > +
> > > > >  gcc/testsuite/gcc.target/i386/pr98167.c | 44 ++
> > > > >  2 files changed, 93 insertions(+)
> > > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98167.c
> > > > >
> > > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > > index 194ba8f5188..b85ad34f609 100644
> > > > > --- a/gcc/match.pd
> > > > > +++ b/gcc/match.pd
> > > > > @@ -8189,3 +8189,52 @@ and,
> > > > >   (bit_and (negate @0) integer_onep@1)
> > > > >   (if (!TYPE_OVERFLOW_SANITIZED (type))
> > > > >(bit_and @0 @1)))
> > > > > +
> > > > > +/* Optimize
> > > > > +   c1 = VEC_PERM_EXPR (a, a, mask)
> > > > > +   c2 = VEC_PERM_EXPR (b, b, mask)
> > > > > +   c3 = c1 op c2
> > > > > +   -->
> > > > > +   c = a op b
> > > > > +   c3 = VEC_PERM_EXPR (c, c, mask)
> > > > > +   For all integer non-div operations.  */
> > > > > +(for op (plus minus mult bit_and bit_ior bit_xor
> > > > > +lshift rshift)
> > > > > + (simplify
> > > > > +  (op 
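
The bitmap idea quoted at the top of this message (decide in one pass whether
a permutation selector is a full permutation, instead of comparing every pair
of indices) can be sketched in Python as below.  This is only an illustration
under stated assumptions: the function name is hypothetical, a plain list of
booleans stands in for GCC's sbitmap, the bit that gets marked is the lane
named by each selector element, and the indices are assumed to be already
reduced to the range [0, nelts).

```python
def is_full_permutation(selector, nelts):
    """Return True if `selector` names every lane in [0, nelts) exactly once.

    Hypothetical helper, not GCC code: `seen` plays the role of the sbitmap.
    """
    if len(selector) != nelts:
        return False
    seen = [False] * nelts
    for index in selector:
        # An out-of-range or repeated lane means some lane is never selected.
        if not 0 <= index < nelts or seen[index]:
            return False
        seen[index] = True
    return True
```

For a 64-lane vector this touches each selector element once, i.e. 64 steps
rather than the 4096 pairwise comparisons mentioned above.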

[PATCH] better PHI copy propagation for forwprop

2022-11-10 Thread Richard Biener via Gcc-patches
We can handle _1 = PHI <_1, _2> as a copy.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/84646
* tree-ssa-forwprop.cc (pass_forwprop::execute): Improve
copy propagation across PHIs.
---
 gcc/tree-ssa-forwprop.cc | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 4b693ef095c..7c7942600ef 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -3384,7 +3384,12 @@ pass_forwprop::execute (function *fun)
  FOR_EACH_PHI_ARG (use_p, phi, it, SSA_OP_USE)
{
  tree use = USE_FROM_PTR (use_p);
- if (! first)
+ if (use == res)
+   /* The PHI result can also appear on a backedge, if so
+  we can ignore this case for the purpose of determining
+  the singular value.  */
+   ;
+ else if (! first)
first = use;
  else if (! operand_equal_p (first, use, 0))
{
-- 
2.35.3
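
As a rough illustration of the rule the patch adds (a PHI argument that is the
PHI result itself is skipped when looking for the single value being copied),
here is a minimal Python sketch.  It is not the forwprop code: the function
name is hypothetical, SSA names are modelled as strings, and plain equality
stands in for operand_equal_p.

```python
def phi_singular_value(result, args):
    """Return the one value a PHI copies, or None if it is not a copy."""
    first = None
    for use in args:
        if use == result:
            # The PHI result on a backedge cannot introduce a new value.
            continue
        if first is None:
            first = use
        elif use != first:
            # Two different incoming values: not a copy.
            return None
    return first
```

With this rule, `_1 = PHI <_1, _2>` yields `_2`, matching the description at
the top of the message.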


Re: Ping [PATCH] Add condition coverage profiling

2022-11-10 Thread Martin Liška

On 11/2/22 07:16, Jørgen Kvalsvik via Gcc-patches wrote:

Ping. I would like to see this become a part of gcc 13, will we be able to
commit before the window closes?


Hello.

I'm sorry, but I was interrupted by the Sphinx conversion task. Anyway, please
update the coding style and I can return to the patch review in the upcoming
weeks.

About the window closure: we still have time (basically until the end of the
year), as the stage1 closing period means that all patches should be *under
review* by this deadline. And that's true for your patch.

Cheers,
Martin


Re: [PATCH] maintainer-scripts/gcc_release: compress xz in parallel

2022-11-10 Thread Martin Liška

On 11/9/22 03:06, Xi Ruoyao via Gcc-patches wrote:

On Wed, 2022-11-09 at 01:52 +, Joseph Myers wrote:

On Tue, 8 Nov 2022, Xi Ruoyao via Gcc-patches wrote:


I'm wondering if running xz -T0 on different machines (with different
core numbers) may produce different compressed data.  The difference can
cause trouble distributing checksums.


gcc_release definitely doesn't use any options to make the tar file
reproducible (the timestamps, user and group names and ordering of the
files in the tarball, and quite likely permissions other than whether a
file has execute permission, may depend on when the script was run and on
what system as what user - not just on the commit from which the tar file
was built).  So I don't think possible variation of xz output matters here
at present.


OK then.  I'm already using commands like

git archive --format=tar --prefix=gcc-$(git gcc-descr HEAD)/ HEAD | xz -T0 \
  > ../gcc-$(git gcc-descr HEAD).tar.xz

when I generate a GCC snapshot tarball for my own use.




Hi.

We may consider using zstd compression, which also supports multi-threaded
compression (and is stable). Note that zstd decompression is much faster
than xz.

Martin


Re: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality

2022-11-10 Thread Martin Liška

On 11/10/22 15:12, Michael Matz wrote:

Hello,

On Thu, 10 Nov 2022, Martin Liška wrote:


These changes are part of
commit r13-2361-g7e0db0cdf01e9c885a29cb37415f5bc00d90c029
"STABS: remove -gstabs and -gxcoff functionality".  What this does is
remove these identifiers from "poisoning":

  /* As the last action in this file, we poison the identifiers that
 shouldn't be used.
  [...]
  /* Other obsolete target macros, or macros that used to be in target
 headers and were not used, and may be obsolete or may never have
 been used.  */
   #pragma GCC poison [...]

Shouldn't these identifiers actually stay (so that any accidental future
use gets flagged, as I understand this machinery), and instead more
identifiers be added potentially: those where their definition/use got
removed with "STABS: remove -gstabs and -gxcoff functionality"?  (I've
not checked.)


Well, the identifiers are not used any longer, so I don't think we should
poison them. Or do I miss something?


It's the very nature of poisoned identifiers that they aren't used (every
use would get flagged as an error).  The point of poisoning them is to keep
future new uses from creeping in (e.g. via misguided back- or forward-ports,
which can for instance happen easily with backend macros when someone
eventually tries to integrate an out-of-tree port).  Hence, generally
the list of those identifiers is only extended, never reduced.  (There may
be exceptions of course.)


Ahh, ok, makes sense. So Thomas, please put them back to the poisoned list.

Martin




Ciao,
Michael.




Re: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality

2022-11-10 Thread Michael Matz via Gcc-patches
Hello,

On Thu, 10 Nov 2022, Martin Liška wrote:

> > These changes are part of
> > commit r13-2361-g7e0db0cdf01e9c885a29cb37415f5bc00d90c029
> > "STABS: remove -gstabs and -gxcoff functionality".  What this does is
> > remove these identifiers from "poisoning":
> > 
> >  /* As the last action in this file, we poison the identifiers that
> > shouldn't be used.
> >  [...]
> >  /* Other obsolete target macros, or macros that used to be in target
> > headers and were not used, and may be obsolete or may never have
> > been used.  */
> >   #pragma GCC poison [...]
> > 
> > Shouldn't these identifiers actually stay (so that any accidental future
> > use gets flagged, as I understand this machinery), and instead more
> > identifiers be added potentially: those where their definition/use got
> > removed with "STABS: remove -gstabs and -gxcoff functionality"?  (I've
> > not checked.)
> 
> Well, the identifiers are not used any longer, so I don't think we should
> poison them. Or do I miss something?

It's the very nature of poisoned identifiers that they aren't used (every
use would get flagged as an error).  The point of poisoning them is to keep
future new uses from creeping in (e.g. via misguided back- or forward-ports,
which can for instance happen easily with backend macros when someone
eventually tries to integrate an out-of-tree port).  Hence, generally
the list of those identifiers is only extended, never reduced.  (There may
be exceptions of course.)


Ciao,
Michael.


Re: [PATCH v3] c++: parser - Support for target address spaces in C++

2022-11-10 Thread Paul Iannetta via Gcc-patches
On Thu, Nov 03, 2022 at 02:38:39PM +0100, Georg-Johann Lay wrote:
> > [PATCH v3] c++: parser - Support for target address spaces in C++
> 
> First of all, it is great news that GCC is going to implement named address
> spaces for C++.
> 
> I have some questions:
> 
> 1. How is name-mangling going to work?
> ======================================
> 
> Clang supports address spaces in C++, and for address-space 1 it does
> generate code like the following:
> 
> #define __flash __attribute__((__address_space__(1)))
> 
> char get_p (const __flash char *p)
> {
> return *p;
> }
> 
> 
> _Z5get_pPU3AS1Kc:
>...
> 
> I.e. address-space 1 is mangled as "AS1".
> 
> (Notice that Clang's attribute actually works like a qualifier here, one
> could not get this to work with GCC attributes.)
> 

Currently, the __address_space__ attribute does not exist in GCC, and
the definition of address-spaces as well as how they are laid out can
only be modified in the back-end.

I agree that this is a convenient attribute to define address-spaces
on the fly, but this lacks the flexibility of specifying which
address-spaces are subsets of others and how they should be promoted.

The mangling will be done with the vendor extended qualifier extension;
for instance, your example would be mangled as "_Z5get_pPU7__flashKc".
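
To make the difference between the two mangled names concrete, here is a small
Python sketch that assembles them by hand.  It is not GCC's mangler and covers
only this one shape of function (a single parameter of type pointer to const
address-space-qualified char); the helper names are hypothetical.  It relies
on the Itanium C++ ABI encodings: a source name is its length followed by the
identifier, P is a pointer, U introduces a vendor extended qualifier, K is
const and c is char.

```python
def source_name(identifier):
    """Itanium <source-name>: decimal length followed by the identifier."""
    return f"{len(identifier)}{identifier}"


def mangle_const_char_pointer_param(function, address_space_qualifier):
    """Mangle `char function(const <qualifier> char *)` (hypothetical helper).

    The return type is not encoded for an ordinary (non-template) function.
    """
    pointee = "U" + source_name(address_space_qualifier) + "K" + "c"
    return "_Z" + source_name(function) + "P" + pointee
```

Passing "__flash" as the qualifier gives the name stated above, and passing
"AS1" gives the name Clang produces in the quoted example.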

> 
> 2. Will it work with compound literals?
> =======================================
> 
> Currently, the following C code works for target avr:
> 
> const __flash char *pHallo = (const __flash char[]) { "Hallo" };
> 
> This is a pointer in RAM (AS0) that holds the address of a string in flash
> (AS1) and is initialized with that address. Unfortunately, this does not
> work locally:
> 
> const __flash char* get_hallo (void)
> {
> [static] const __flash char *p2 = (const __flash char[]) { "Hallo2" };
> return p2;
> }
> 
> foo.c: In function 'get_hallo':
> foo.c: error: compound literal qualified by address-space qualifier
> 
> Is there any way to make this work now? Would be great!
> 

Currently, I implement the same restrictions as the C front-end, but I
think that this restriction could be lifted.

> 
> 3. Will TARGET_ADDR_SPACE_DIAGNOSE_USAGE still work?
> ====================================================
> 
> Currently there is target hook TARGET_ADDR_SPACE_DIAGNOSE_USAGE.
> I did not see it in your patches, so maybe I just missed it? See
> https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gccint/Named-Address-Spaces.html#index-TARGET_005fADDR_005fSPACE_005fDIAGNOSE_005fUSAGE
> 

That was a point I overlooked in my previous patch.  This will be in
my new revision where I also add implicit conversion between
address spaces and also expose TARGET_ADDR_SPACE_CONVERT.

> 
> 4. Will it be possible to put C++ virtual tables in ASs, and how?
> =================================================================
> 

Currently, I do not allow the declaration of instances of classes in
an address space, mainly to not have to cope with the handling of the
this pointer.  That is,

  __flash Myclass *t;

does not work.  Nevertheless, I admit that this would be nice to
have.

> One big complaint about avr-g++ is that there is no way to put vtables in
> flash (address-space 1) and to access them accordingly.  How can this be
> achieved with C++ address spaces?

Do you want only the vtables in the flash address space, or do you want
to be able to place the whole class content there?

1. If you only want the vtables, I think that a target hook called
at vtable creation would do the trick.

2. If you want to be able to have pointers to classes in __flash, I
will need to extend the support I have currently implemented so that
the this pointer can be qualified with an address space.
In retrospect, I think this has to be implemented.

Paul

> 
> Background: The AVR architecture has non-linear address space, and you
> cannot tell from the numeric value of an address whether it's in RAM or
> flash. You will have to use different instructions depending on the
> location.
> 
> This means that .rodata must be located in RAM, because otherwise one would
> not know whether const char* pointed to RAM or flash, but to de-reference
> you'd need different instructions.
> 
> One way out is named address spaces, so we could finally fix
> 
> https://gcc.gnu.org/PR43745
> 
> 
> Regards,
> 
> Johann
> 






Re: [PATCH] sphinx: support Sphinx in lib*/Makefile.am.

2022-11-10 Thread Michael Matz via Gcc-patches
Hello,

On Thu, 10 Nov 2022, Martin Liška wrote:

> This is a patch which adds support for Sphinx in lib*/Makefile.am where
> I wrongly modified Makefile.in that are generated.
> 
> One thing that's missing is that the generated Makefile.in does not
> contain 'install-info-am' target and thus the created info files
> are not installed with 'make install'. Does anybody know?

The whole generation/processing of '*info*' targets (and dvi,pdf,ps,html 
targets) is triggered by the presence of a 'TEXINFO' primary 
(here in the 'info_TEXINFO' variable), which you removed.  As the sphinx 
result is not appropriate for either TEXINFO or MANS primaries (the only 
ones in automake related specifically to documentation), you probably want 
to include them in the DATA primary.  For backward compatibility you might 
want to add your own {un,}install-info-am targets depending on 
{un,}install-data-am then, though I'm not sure why one would need one.

I currently don't quite see how you make the Sphinx results be installed 
at all, AFAICS there's no mention of them in any of the automake 
variables.  You have to list something somewhere (as said, probably in 
DATA) to enable automake to generate the usual set of Makefile targets.

(beware: I'm not an automake expert, so the above might turn out to be
misleading advice :-) )


Ciao,
Michael.


> 
> Thanks,
> Martin
> 
> ---
>  libgomp/Makefile.am   |  27 ++-
>  libgomp/Makefile.in   | 275 +++---
>  libgomp/testsuite/Makefile.in |   3 +
>  libitm/Makefile.am|  26 ++-
>  libitm/Makefile.in| 278 ++
>  libitm/testsuite/Makefile.in  |   3 +
>  libquadmath/Makefile.am   |  37 ++--
>  libquadmath/Makefile.in   | 307 +++---
>  8 files changed, 208 insertions(+), 748 deletions(-)
> 
> diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
> index 428f7a9dab5..ab5e86b0f98 100644
> --- a/libgomp/Makefile.am
> +++ b/libgomp/Makefile.am
> @@ -11,6 +11,8 @@ config_path = @config_path@
>  search_path = $(addprefix $(top_srcdir)/config/, $(config_path))
> $(top_srcdir) \
> $(top_srcdir)/../include
>  +abs_doc_builddir = @abs_top_builddir@/doc
> +
>  fincludedir =
> $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/finclude
>  libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
>  @@ -100,18 +102,6 @@ fortran.o: libgomp_f.h
>  env.lo: libgomp_f.h
>  env.o: libgomp_f.h
>  -
> -# Automake Documentation:
> -# If your package has Texinfo files in many directories, you can use the
> -# variable TEXINFO_TEX to tell Automake where to find the canonical
> -# `texinfo.tex' for your package. The value of this variable should be
> -# the relative path from the current `Makefile.am' to `texinfo.tex'.
> -TEXINFO_TEX   = ../gcc/doc/include/texinfo.tex
> -
> -# Defines info, dvi, pdf and html targets
> -MAKEINFOFLAGS = -I $(srcdir)/../gcc/doc/include
> -info_TEXINFOS = libgomp.texi
> -
>  # AM_CONDITIONAL on configure option --generated-files-in-srcdir
>  if GENINSRC
>  STAMP_GENINSRC = stamp-geninsrc
> @@ -127,7 +117,7 @@ STAMP_BUILD_INFO =
>  endif
>   -all-local: $(STAMP_GENINSRC)
> +all-local: $(STAMP_GENINSRC) $(STAMP_BUILD_INFO)
>   stamp-geninsrc: libgomp.info
>   cp -p $(top_builddir)/libgomp.info $(srcdir)/libgomp.info
> @@ -135,8 +125,15 @@ stamp-geninsrc: libgomp.info
>   libgomp.info: $(STAMP_BUILD_INFO)
>  -stamp-build-info: libgomp.texi
> - $(MAKEINFO) $(AM_MAKEINFOFLAGS) $(MAKEINFOFLAGS) -I $(srcdir) -o
> libgomp.info $(srcdir)/libgomp.texi
> +RST_FILES:=$(shell find $(srcdir) -name *.rst)
> +SPHINX_CONFIG_FILES:=$(srcdir)/doc/conf.py $(srcdir)/../doc/baseconf.py
> +SPHINX_FILES:=$(RST_FILES) $(SPHINX_CONFIG_FILES)
> +
> +stamp-build-info: $(SPHINX_FILES)
> + + if [ x$(HAS_SPHINX_BUILD) = xhas-sphinx-build ]; then \
> +   make -C $(srcdir)/../doc info SOURCEDIR=$(abs_srcdir)/doc
> BUILDDIR=$(abs_doc_builddir)/info SPHINXBUILD=$(SPHINX_BUILD); \
> +   cp ./doc/info/texinfo/libgomp.info libgomp.info; \
> + else true; fi
>   @touch $@
>   diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
> index 814ccd13dc0..4d0f2184e95 100644
> --- a/libgomp/Makefile.in
> +++ b/libgomp/Makefile.in
> @@ -177,7 +177,7 @@ am__uninstall_files_from_dir = { \
>  || { echo " ( cd '$$dir' && rm -f" $$files ")"; \
>   $(am__cd) "$$dir" && rm -f $$files; }; \
>}
> -am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
> +am__installdirs = "$(DESTDIR)$(toolexeclibdir)" \
>   "$(DESTDIR)$(fincludedir)" "$(DESTDIR)$(libsubincludedir)" \
>   "$(DESTDIR)$(toolexeclibdir)"
>  LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
> @@ -269,16 +269,9 @@ am__v_FCLD_0 = @echo "  FCLD" $@;
>  am__v_FCLD_1 =
>  SOURCES = $(libgomp_plugin_gcn_la_SOURCES) \
>   $(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
> -AM_V_DVIPS = $(am__v_DVIPS_@AM_V@)
> -am__v_DVIPS_ = 

Re: [PATCH 1/3] STABS: remove -gstabs and -gxcoff functionality

2022-11-10 Thread Martin Liška

On 11/4/22 10:32, Thomas Schwinge wrote:

Hi!

On 2022-09-01T12:05:23+0200, Martin Liška  wrote:

gcc/ChangeLog:



--- a/gcc/system.h
+++ b/gcc/system.h
@@ -1009,8 +1009,7 @@ extern void fancy_abort (const char *, int, const char *)
   ASM_OUTPUT_DEFINE_LABEL_DIFFERENCE_SYMBOL HOST_WORDS_BIG_ENDIAN\
   OBJC_PROLOGUE ALLOCATE_TRAMPOLINE HANDLE_PRAGMA ROUND_TYPE_SIZE\
   ROUND_TYPE_SIZE_UNIT CONST_SECTION_ASM_OP CRT_GET_RFIB_TEXT\
- DBX_LBRAC_FIRST DBX_OUTPUT_ENUM DBX_OUTPUT_SOURCE_FILENAME \
- DBX_WORKING_DIRECTORY INSN_CACHE_DEPTH INSN_CACHE_SIZE \
+ INSN_CACHE_DEPTH INSN_CACHE_SIZE   \
   INSN_CACHE_LINE_WIDTH INIT_SECTION_PREAMBLE NEED_ATEXIT ON_EXIT\
   EXIT_BODY OBJECT_FORMAT_ROSE MULTIBYTE_CHARS MAP_CHARACTER \
   LIBGCC_NEEDS_DOUBLE FINAL_PRESCAN_LABEL DEFAULT_CALLER_SAVES   \
@@ -1023,15 +1022,14 @@ extern void fancy_abort (const char *, int, const char *)
   MAX_WCHAR_TYPE_SIZE SHARED_SECTION_ASM_OP INTEGRATE_THRESHOLD  \
   FINAL_REG_PARM_STACK_SPACE MAYBE_REG_PARM_STACK_SPACE  \
   TRADITIONAL_PIPELINE_INTERFACE DFA_PIPELINE_INTERFACE  \
- DBX_OUTPUT_STANDARD_TYPES BUILTIN_SETJMP_FRAME_VALUE   \
+ BUILTIN_SETJMP_FRAME_VALUE \
   SUNOS4_SHARED_LIBRARIES PROMOTE_FOR_CALL_ONLY  \
   SPACE_AFTER_L_OPTION NO_RECURSIVE_FUNCTION_CSE \
   DEFAULT_MAIN_RETURN TARGET_MEM_FUNCTIONS EXPAND_BUILTIN_VA_ARG \
   COLLECT_PARSE_FLAG DWARF2_GENERATE_TEXT_SECTION_LABEL WINNING_GDB  \
   ASM_OUTPUT_FILENAME ASM_OUTPUT_SOURCE_LINE FILE_NAME_JOINER\
- GDB_INV_REF_REGPARM_STABS_LETTER DBX_MEMPARM_STABS_LETTER  \
- PUT_SDB_SRC_FILE STABS_GCC_MARKER DBX_OUTPUT_FUNCTION_END  \
- DBX_OUTPUT_GCC_MARKER DBX_FINISH_SYMBOL SDB_GENERATE_FAKE  \
+ GDB_INV_REF_REGPARM_STABS_LETTER   \
+ PUT_SDB_SRC_FILE STABS_GCC_MARKER SDB_GENERATE_FAKE\
   NON_SAVING_SETJMP TARGET_LATE_RTL_PROLOGUE_EPILOGUE\
   CASE_DROPS_THROUGH TARGET_BELL TARGET_BS TARGET_CR TARGET_DIGIT0   \
  TARGET_ESC TARGET_FF TARGET_NEWLINE TARGET_TAB TARGET_VT\
@@ -1056,8 +1054,8 @@ extern void fancy_abort (const char *, int, const char *)
   PREFERRED_OUTPUT_RELOAD_CLASS SYSTEM_INCLUDE_DIR   \
   STANDARD_INCLUDE_DIR STANDARD_INCLUDE_COMPONENT\
   LINK_ELIMINATE_DUPLICATE_LDIRECTORIES MIPS_DEBUGGING_INFO  \
- IDENT_ASM_OP ALL_COP_ADDITIONAL_REGISTER_NAMES DBX_OUTPUT_LBRAC\
- DBX_OUTPUT_NFUN DBX_OUTPUT_RBRAC RANGE_TEST_NON_SHORT_CIRCUIT  \
+ IDENT_ASM_OP ALL_COP_ADDITIONAL_REGISTER_NAMES \
+ RANGE_TEST_NON_SHORT_CIRCUIT   \
   REAL_VALUE_TRUNCATE REVERSE_CONDEXEC_PREDICATES_P  \
   TARGET_ALIGN_ANON_BITFIELDS TARGET_NARROW_VOLATILE_BITFIELDS   \
   IDENT_ASM_OP UNALIGNED_SHORT_ASM_OP UNALIGNED_INT_ASM_OP   \


These changes are part of
commit r13-2361-g7e0db0cdf01e9c885a29cb37415f5bc00d90c029
"STABS: remove -gstabs and -gxcoff functionality".  What this does is
remove these identifiers from "poisoning":

 /* As the last action in this file, we poison the identifiers that
shouldn't be used.
 [...]
 /* Other obsolete target macros, or macros that used to be in target
headers and were not used, and may be obsolete or may never have
been used.  */
  #pragma GCC poison [...]

Shouldn't these identifiers actually stay (so that any accidental future
use gets flagged, as I understand this machinery), and instead more
identifiers be added potentially: those where their definition/use got
removed with "STABS: remove -gstabs and -gxcoff functionality"?  (I've
not checked.)


Well, the identifiers are not used any longer, so I don't think we should
poison them. Or do I miss something?

Martin




Regards
  Thomas
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: München; Court of registration: München,
HRB 106955




Re: [committed] Add another commit to ignore

2022-11-10 Thread Martin Liška

On 11/7/22 09:25, Jakub Jelinek wrote:

Hi!

We can't handle r13-3652-ge4cba49413ca429dc82f6aa2e88129ecb3fdd943
because that commit removed whole liboffloadmic including its
ChangeLog (I'm surprised that touching ChangeLog worked out together
with removing the files), but gcc-changelog/git_update_version.py
then choked on it because it couldn't add the liboffloadmic
entries.


Hi.

Interesting. I'll handle such a situation for the future.


Wonder if next time such removals shouldn't be committed in 2 steps,
in one step everything but the ChangeLog would be removed, wait for
update_git_version and then in a separate commit just remove the
ChangeLog.


Will take a look.

Martin



Anyway, to restore daily bumps I had to commit the following
patch, run update_git_version manually and then commit in
r13-3705-g89d0a14a1fdf89d38d9db1156ffde8c1b276823c the ChangeLog
entries for the removal manually.

2022-11-06  Jakub Jelinek  

 * gcc-changelog/git_update_version.py: Add
 e4cba49413ca429dc82f6aa2e88129ecb3fdd943 to ignored commits.

--- contrib/gcc-changelog/git_update_version.py.jj
+++ contrib/gcc-changelog/git_update_version.py
@@ -33,7 +33,8 @@ IGNORED_COMMITS = (
  '04a040d907a83af54e0a98bdba5bfabc0ef4f700',
  '2e96b5f14e4025691b57d2301d71aa6092ed44bc',
  '3ab5c8cd03d92bf4ec41e351820349d92fbc40c4',
-'86d8e0c0652ef5236a460b75c25e4f7093cc0651')
+'86d8e0c0652ef5236a460b75c25e4f7093cc0651',
+'e4cba49413ca429dc82f6aa2e88129ecb3fdd943')
  
  FORMAT = '%(asctime)s:%(levelname)s:%(name)s:%(message)s'

  logging.basicConfig(level=logging.INFO, format=FORMAT,


Jakub





Re: [PATCH] RISC-V: Fix selection of pipeline model for sifive-7-series

2022-11-10 Thread Philipp Tomsich
Applied to master, thank you!

On Thu, 10 Nov 2022 at 02:03, Kito Cheng  wrote:
>
> LGTM, thank you for catching that!!
>
> On Wed, Nov 9, 2022 at 3:50 PM Philipp Tomsich wrote:
> >
> > A few of the gcc.target/riscv/mcpu-*.c tests have been failing for a
> > while now, due to the pipeline model for sifive-7-series not being
> > selected despite -mtune=sifive-7-series.  The root cause is that the
> > respective RISCV_TUNE entry points to generic instead.  Fix this.
> >
> > Fixes 97d1ed67fc6 ("RISC-V: Support --target-help for -mcpu/-mtune")
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv-cores.def (RISCV_TUNE): Update
> >   sifive-7-series to point to the sifive_7 pipeline
> >   description.
> >
> > Signed-off-by: Philipp Tomsich 
> > ---
> >
> >  gcc/config/riscv/riscv-cores.def | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
> > index b84ad999ac1..31ad34682c5 100644
> > --- a/gcc/config/riscv/riscv-cores.def
> > +++ b/gcc/config/riscv/riscv-cores.def
> > @@ -36,7 +36,7 @@
> >  RISCV_TUNE("rocket", generic, rocket_tune_info)
> >  RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
> >  RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
> > -RISCV_TUNE("sifive-7-series", generic, sifive_7_tune_info)
> > +RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
> >  RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> >  RISCV_TUNE("size", generic, optimize_size_tune_info)
> >
> > --
> > 2.34.1
> >


[PATCH] doc: formatting fixes

2022-11-10 Thread Andreas Schwab via Gcc-patches
gcc/
* doc/gcc/gcc-command-options/option-summary.rst: Fix formatting.
---
 gcc/doc/gcc/gcc-command-options/option-summary.rst | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/gcc/gcc-command-options/option-summary.rst b/gcc/doc/gcc/gcc-command-options/option-summary.rst
index 388da445591..acc70920bad 100644
--- a/gcc/doc/gcc/gcc-command-options/option-summary.rst
+++ b/gcc/doc/gcc/gcc-command-options/option-summary.rst
@@ -1209,10 +1209,8 @@ in the following sections.
   :option:`-malign-data=type` |gol|
   :option:`-mbig-endian`  :option:`-mlittle-endian` |gol|
   :option:`-mstack-protector-guard=guard`  :option:`-mstack-protector-guard-reg=reg` |gol|
-  :option:`-mstack-protector-guard-offset=offset`
-  -mcsr-check -mno-csr-check
-
-  .. program:: -mcsr-check -mno-csr-check
+  :option:`-mstack-protector-guard-offset=offset` |gol|
+  :option:`-mcsr-check`  :option:`-mno-csr-check`
 
   *RL78 Options*
 
@@ -1524,5 +1522,3 @@ in the following sections.
   *zSeries Options*
 
   See :ref:`s-390-and-zseries-options`.
-
-  .. program:: None
-- 
2.38.1


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-11-10 Thread Richard Biener via Gcc-patches
On Thu, Nov 10, 2022 at 2:05 PM Martin Liška  wrote:
>
> On 11/9/22 18:14, Joseph Myers wrote:
> > On Wed, 9 Nov 2022, Martin Liška wrote:
> >
> >> 1) not synchronized content among lib*/Makefile.in and lib*/Makefile.am.
> >> Apparently, I modified the generated Makefile.in file with the rules like:
> >>
> >> doc/info/texinfo/libitm.info: $(SPHINX_FILES)
> >>  + if [ x$(HAS_SPHINX_BUILD) = xhas-sphinx-build ]; then \
> >>    make -C $(srcdir)/../doc info SOURCEDIR=$(abs_srcdir)/doc BUILDDIR=$(abs_doc_builddir)/info SPHINXBUILD=$(SPHINX_BUILD); \
> >>  else true; fi
> >>
> >> Can you please modify Makefile.am in the corresponding manner and
> >> re-generate Makefile.in?
> >
> > I think someone else had best look at this.
>
> All right, I've got a patch candidate for it, so I'll hopefully be able to
> manage.
>
> >
> >> 2) Adding proper support --enable-generated-files-in-srcdir in gcc_release:
> >
> > It looks like all the GENINSRC rules / conditionals are still present.
> > So maybe there are details where the paths are wrong, or where fixes are
> > needed to ensure the files get installed from the source directory when
> > available in the source directory but not available in the build directory
> > because Sphinx isn't available, but much of the code for the feature is
> > still there.
>
> I can investigate then. Is the option --enable-generated-files-in-srcdir
> supposed to be used when building from a release tarball (that includes
> man/info in src), or to create such a source tarball?

It's to create such a tarball which then can be built with a reduced toolset.

> Cheers,
> Martin
>


[PATCH] range-op: Implement floating point multiplication fold_range [PR107569]

2022-11-10 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch implements frange multiplication, including the
special case of x * x.  The callers don't tell us that it is x * x,
just that it is either z = x * x or if (x == y) z = x * y;
For irange that makes no difference, but for frange it can mean
x is -0.0 and y is 0.0 if they have the same range that includes both
negative and positive zero, so we need to assume the result could be -0.0.
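
The signed-zero point can be checked with ordinary IEEE doubles.  The Python
sketch below is unrelated to the ranger implementation; the helper name is
hypothetical.  It only shows that two operands can compare equal and still
produce a product whose sign bit is set.

```python
import math


def product_is_negative_zero_or_negative(x, y):
    """Return True if the sign bit of x * y is set (hypothetical helper)."""
    return math.copysign(1.0, x * y) < 0


x, y = -0.0, 0.0
# x == y holds, yet x * y is -0.0 while x * x is +0.0.
operands_compare_equal = (x == y)
mixed_product_has_sign_bit = product_is_negative_zero_or_negative(x, y)
square_has_sign_bit = product_is_negative_zero_or_negative(x, x)
```

So knowing only that x == y holds is not enough to conclude the product is
non-negative in the sign-bit sense, which is why the patch keeps -0.0 as a
possible result.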

The patch causes one regression:
+FAIL: gcc.dg/fold-overflow-1.c scan-assembler-times 2139095040 2
but that is already tracked in PR107608 and affects not just the newly
added multiplication, but addition and other floating point operations
(and doesn't seem like a ranger bug but dce or whatever else).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Once we have division and the reverse ops for all of these, perhaps
we can do some cleanups to share common code, but the way I have division
partly written now doesn't reveal many commonalities.  Multiplication
is simple, division is a nightmare.

2022-11-10  Jakub Jelinek  

PR tree-optimization/107569
PR tree-optimization/107591
* range-op.h (range_operator_float::rv_fold): Add relation_kind
argument.
* range-op-float.cc (range_operator_float::fold_range): Name
last argument trio and pass trio.op1_op2 () as last argument to
rv_fold.
(range_operator_float::rv_fold): Add relation_kind argument.
(foperator_plus::rv_fold, foperator_minus::rv_fold): Likewise.
(frange_mult): New function.
(foperator_mult): New class.
(floating_op_table::floating_op_table): Use foperator_mult for
MULT_EXPR.

--- gcc/range-op.h.jj   2022-11-10 00:55:09.430219763 +0100
+++ gcc/range-op.h  2022-11-10 11:30:33.594114939 +0100
@@ -123,7 +123,8 @@ public:
                        const REAL_VALUE_TYPE &lh_lb,
                        const REAL_VALUE_TYPE &lh_ub,
                        const REAL_VALUE_TYPE &rh_lb,
-                       const REAL_VALUE_TYPE &rh_ub) const;
+                       const REAL_VALUE_TYPE &rh_ub,
+                       relation_kind) const;
   // Unary operations have the range of the LHS as op2.
   virtual bool fold_range (irange , tree type,
   const frange ,
--- gcc/range-op-float.cc.jj2022-11-10 00:55:09.318221259 +0100
+++ gcc/range-op-float.cc   2022-11-10 11:31:29.040359082 +0100
@@ -51,7 +51,7 @@ along with GCC; see the file COPYING3.
 bool
 range_operator_float::fold_range (frange &r, tree type,
                                   const frange &op1, const frange &op2,
- relation_trio) const
+ relation_trio trio) const
 {
   if (empty_range_varying (r, type, op1, op2))
 return true;
@@ -65,7 +65,7 @@ range_operator_float::fold_range (frange
   bool maybe_nan;
   rv_fold (lb, ub, maybe_nan, type,
   op1.lower_bound (), op1.upper_bound (),
-  op2.lower_bound (), op2.upper_bound ());
+  op2.lower_bound (), op2.upper_bound (), trio.op1_op2 ());
 
   // Handle possible NANs by saturating to the appropriate INF if only
   // one end is a NAN.  If both ends are a NAN, just return a NAN.
@@ -103,8 +103,8 @@ range_operator_float::rv_fold (REAL_VALU
                               const REAL_VALUE_TYPE &lh_lb ATTRIBUTE_UNUSED,
                               const REAL_VALUE_TYPE &lh_ub ATTRIBUTE_UNUSED,
                               const REAL_VALUE_TYPE &rh_lb ATTRIBUTE_UNUSED,
-                              const REAL_VALUE_TYPE &rh_ub ATTRIBUTE_UNUSED)
-  const
+                              const REAL_VALUE_TYPE &rh_ub ATTRIBUTE_UNUSED,
+                              relation_kind) const
 {
   lb = dconstninf;
   ub = dconstinf;
@@ -1868,7 +1868,8 @@ class foperator_plus : public range_oper
                const REAL_VALUE_TYPE &lh_lb,
                const REAL_VALUE_TYPE &lh_ub,
                const REAL_VALUE_TYPE &rh_lb,
-               const REAL_VALUE_TYPE &rh_ub) const final override
+               const REAL_VALUE_TYPE &rh_ub,
+               relation_kind) const final override
   {
 frange_arithmetic (PLUS_EXPR, type, lb, lh_lb, rh_lb, dconstninf);
 frange_arithmetic (PLUS_EXPR, type, ub, lh_ub, rh_ub, dconstinf);
@@ -1892,7 +1893,8 @@ class foperator_minus : public range_ope
                const REAL_VALUE_TYPE &lh_lb,
                const REAL_VALUE_TYPE &lh_ub,
                const REAL_VALUE_TYPE &rh_lb,
-               const REAL_VALUE_TYPE &rh_ub) const final override
+               const REAL_VALUE_TYPE &rh_ub,
+               relation_kind) const final override
   {
 frange_arithmetic (MINUS_EXPR, type, lb, lh_lb, rh_ub, dconstninf);
 frange_arithmetic (MINUS_EXPR, type, ub, lh_ub, rh_lb, dconstinf);
@@ -1908,6 +1910,123 @@ class foperator_minus : public range_ope
   }
 } fop_minus;
 
+/* Wrapper around frange_arithmetics, that computes the result
+   if inexact rounded to both directions.  Also, if one of the
+   operands is +-0.0 and 

Re: [PATCH] RISC-V: Implement movmisalign to enable SLP

2022-11-10 Thread Philipp Tomsich
On Thu, 10 Nov 2022 at 02:24, Kito Cheng  wrote:
>
> I am not sure if I am missing something, your testcase should rely on
> movmisalignhi pattern, but you defined movmisalign with ANYF
> mode iterator rather than movmisalign with HI, SI, DI?


It was already defined with the ANYI iterator in the patch, but that
seems to be a moot point...

>
> And seems the testcase compile with `-march=rv64gc -mabi=lp64
> -mtune=size -O2` w/o this patch already generated lhu/sh pair?


...as this change is needed on our GCC 12.x tree, but the current
trunk seems to correctly form the lhu on master (at least for the
artificial testcase) even without it.
Thanks for catching this!

I'll put this back to the end of the queue: this has to be looked at
with the original underlying issue in SPEC CPU 2017.
You'll probably not hear more on this specific case until after the
close of phase 1.

—Philipp.

>
> On Wed, Nov 9, 2022 at 3:08 PM Philipp Tomsich wrote:
> >
> > The default implementation of support_vector_misalignment() checks
> > whether movmisalign is present for the requested mode.  This
> > will be used by vect_supportable_dr_alignment() to determine whether a
> > misaligned access of vectorized data is permissible.
> >
> > For RISC-V this is required to convert multiple integer data refs,
> > such as "(c[1] << 8) | c[0]", into a larger access (in this example,
> > a halfword load).
> > We conditionalize on !riscv_slow_unaligned_access_p to allow the
> > misaligned refs, if they are not expected to be slow.
> >
> > This benefits both xalancbmk and blender on SPEC CPU 2017.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.md (movmisalign): Implement.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/movmisalign-1.c: New test.
> > * gcc.target/riscv/movmisalign-2.c: New test.
> > * gcc.target/riscv/movmisalign-3.c: New test.
> >
> > Signed-off-by: Philipp Tomsich 
> > ---
> >
> >  gcc/config/riscv/riscv.md  | 18 ++
> >  gcc/testsuite/gcc.target/riscv/movmisalign-1.c | 12 
> >  gcc/testsuite/gcc.target/riscv/movmisalign-2.c | 12 
> >  gcc/testsuite/gcc.target/riscv/movmisalign-3.c | 12 
> >  4 files changed, 54 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/movmisalign-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/movmisalign-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/movmisalign-3.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index 289ff7470c6..1b357a9c57f 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -1715,6 +1715,24 @@
> >   MAX_MACHINE_MODE, [3], TRUE);
> >  })
> >
> > +;; Misaligned (integer) moves: provide an implementation for
> > +;; movmisalign, so the default support_vector_misalignment() will
> > +;; return the right boolean depending on whether
> > +;; riscv_slow_unaligned_access_p is set or not.
> > +;;
> > +;; E.g., this is needed for SLP to convert "(c[1] << 8) | c[0]" into a
> > +;; HImode load (a good test case will be blender and xalancbmk in SPEC
> > +;; CPU 2017).
> > +;;
> > +(define_expand "movmisalign"
> > +  [(set (match_operand:ANYI 0 "")
> > +   (match_operand:ANYI 1 ""))]
> > +  "!riscv_slow_unaligned_access_p"
> > +{
> > +  if (riscv_legitimize_move (mode, operands[0], operands[1]))
> > +DONE;
> > +})
> > +
> >  ;; 64-bit integer moves
> >
> >  (define_expand "movdi"
> > diff --git a/gcc/testsuite/gcc.target/riscv/movmisalign-1.c b/gcc/testsuite/gcc.target/riscv/movmisalign-1.c
> > new file mode 100644
> > index 000..791a3d63335
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/movmisalign-1.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gc -mabi=lp64 -mtune=size" } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O1" } } */
> > +
> > +void f(unsigned short *sink, unsigned char *arr)
> > +{
> > +  *sink = (arr[1] << 8) | arr[0];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "lhu\t" 1 } } */
> > +/* { dg-final { scan-assembler-not "lbu\t" } } */
> > +
> > diff --git a/gcc/testsuite/gcc.target/riscv/movmisalign-2.c b/gcc/testsuite/gcc.target/riscv/movmisalign-2.c
> > new file mode 100644
> > index 000..ef73dcb2d9d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/movmisalign-2.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gc -mabi=lp64 -mtune=size -mstrict-align" } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-O1" } } */
> > +
> > +void f(unsigned short *sink, unsigned char *arr)
> > +{
> > +  *sink = (arr[1] << 8) | arr[0];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "lbu\t" 2 } } */
> > +/* { dg-final { scan-assembler-not "lhu\t" } } */
> > +
> > diff --git a/gcc/testsuite/gcc.target/riscv/movmisalign-3.c b/gcc/testsuite/gcc.target/riscv/movmisalign-3.c
> > new file 
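
For readers following the thread, the equivalence the test cases rely on can be sketched in plain C. This is only an illustration, not part of the patch: the function names are made up here, and the halfword variant matches the byte-combining idiom only on a little-endian target such as RISC-V.

```c
#include <stdint.h>
#include <string.h>

/* The idiom from the test case: assemble a 16-bit value from two bytes.  */
uint16_t load_bytes(const unsigned char *arr)
{
  return (uint16_t)((arr[1] << 8) | arr[0]);
}

/* What a single (possibly misaligned) halfword load computes.
   memcpy keeps the access well-defined in C regardless of alignment.  */
uint16_t load_halfword(const unsigned char *arr)
{
  uint16_t v;
  memcpy(&v, arr, sizeof v);
  return v;
}

/* Run-time endianness probe; a helper for this sketch only.  */
int is_little_endian(void)
{
  const uint16_t probe = 1;
  unsigned char first;
  memcpy(&first, &probe, 1);
  return first == 1;
}
```

Whether the compiler actually merges the two byte loads into one lhu depends on the target's unaligned-access settings, which is what the -mstrict-align variant of the test exercises.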

[PATCH] Restore CCP copy propagation

2022-11-10 Thread Richard Biener via Gcc-patches
The following restores copy propagation in CCP for the case where the
lattice value was a constant before trying to transition to a copy.  At
some point we changed to using the meet operator to handle
integer constant -> integer constant transitions, but that screws
up the const -> copy lattice transition.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/84646
* tree-ssa-ccp.cc (set_lattice_value): Make sure we
allow a const -> copy transition and avoid using meet
in that case.

* gcc.dg/tree-ssa/ssa-ccp-42.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-42.c | 26 ++
 gcc/tree-ssa-ccp.cc|  7 +-
 2 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-42.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-42.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-42.c
new file mode 100644
index 000..b4e5c0f73f2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-ccp-42.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-fgimple -O -fdump-tree-ccp1" } */
+
+__GIMPLE (ssa,startwith("ccp")) int foo (int n)
+{
+  int i;
+  int j;
+
+  __BB(2):
+i_1 = 0;
+goto __BB3;
+
+  __BB(3):
+i_2 = __PHI (__BB2: i_1, __BB3: i_4);
+j_3 = i_2;
+i_4 = i_2 + 1;
+if (i_4 < n_5(D))
+  goto __BB3;
+else
+  goto __BB4;
+
+  __BB(4):
+return j_3;
+}
+
+/* { dg-final { scan-tree-dump "return i_2;" "ccp1" } } */
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 2bcd90646f6..69fd7f1d11d 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -532,7 +532,12 @@ set_lattice_value (tree var, ccp_prop_value_t *new_val)
  use the meet operator to retain a conservative value.
  Missed optimizations like PR65851 makes this necessary.
  It also ensures we converge to a stable lattice solution.  */
-  if (old_val->lattice_val != UNINITIALIZED)
+  if (old_val->lattice_val != UNINITIALIZED
+  /* But avoid using meet for constant -> copy transitions.  */
+  && !(old_val->lattice_val == CONSTANT
+  && CONSTANT_CLASS_P (old_val->value)
+  && new_val->lattice_val == CONSTANT
+  && TREE_CODE (new_val->value) == SSA_NAME))
 ccp_lattice_meet (new_val, old_val);
 
   gcc_checking_assert (valid_lattice_transition (*old_val, *new_val));
-- 
2.35.3
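
The effect of the added guard can be illustrated with a toy model of the lattice. This is a simplified sketch, not the code in tree-ssa-ccp.cc: the struct, the `transition` function and its `guard` flag are invented for illustration, and the meet here is much cruder than the real `ccp_lattice_meet`. As in the patch, a copy is modelled as a CONSTANT whose value is an SSA name.

```c
#include <stdbool.h>

enum lattice { UNINITIALIZED, CONSTANT, VARYING };

/* Toy lattice value: a copy is a CONSTANT whose payload names an SSA
   variable instead of holding an integer literal.  */
struct value {
  enum lattice state;
  bool is_ssa_name;   /* true: payload is an SSA name id; false: an integer */
  int payload;
};

static bool same_value(struct value a, struct value b)
{
  return a.state == b.state
         && (a.state != CONSTANT
             || (a.is_ssa_name == b.is_ssa_name && a.payload == b.payload));
}

/* Simplified meet: UNINITIALIZED is the identity, equal values are kept,
   anything else drops to VARYING.  */
static struct value meet(struct value a, struct value b)
{
  if (a.state == UNINITIALIZED) return b;
  if (b.state == UNINITIALIZED) return a;
  if (same_value(a, b)) return a;
  return (struct value){ VARYING, false, 0 };
}

/* Compute the value recorded for a variable.  With guard == true the
   constant -> copy transition bypasses meet, mirroring the condition
   the patch adds.  */
struct value transition(struct value old_val, struct value new_val, bool guard)
{
  bool const_to_copy = old_val.state == CONSTANT && !old_val.is_ssa_name
                       && new_val.state == CONSTANT && new_val.is_ssa_name;
  if (old_val.state != UNINITIALIZED && !(guard && const_to_copy))
    new_val = meet(new_val, old_val);
  return new_val;
}
```

In this model, moving from the constant 0 to a copy of i_2 ends up VARYING without the guard and stays a copy with it, which corresponds to the `return i_2;` the new test case scans for.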


Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-11-10 Thread Martin Liška

On 11/9/22 18:14, Joseph Myers wrote:
> On Wed, 9 Nov 2022, Martin Liška wrote:
>
> > 1) not synchronized content among lib*/Makefile.in and lib*/Makefile.am.
> > Apparently, I modified the generated Makefile.in file with the rules like:
> >
> > doc/info/texinfo/libitm.info: $(SPHINX_FILES)
> > + if [ x$(HAS_SPHINX_BUILD) = xhas-sphinx-build ]; then \
> >   make -C $(srcdir)/../doc info SOURCEDIR=$(abs_srcdir)/doc BUILDDIR=$(abs_doc_builddir)/info SPHINXBUILD=$(SPHINX_BUILD); \
> > else true; fi
> >
> > Can you please modify Makefile.am in the corresponding manner and re-generate
> > Makefile.in?
>
> I think someone else had best look at this.

All right, I've got a patch candidate for it, so hopefully I'll be able to
manage.

> > 2) Adding proper support --enable-generated-files-in-srcdir in gcc_release:
>
> It looks like all the GENINSRC rules / conditionals are still present.
> So maybe there are details where the paths are wrong, or where fixes are
> needed to ensure the files get installed from the source directory when
> available in the source directory but not available in the build directory
> because Sphinx isn't available, but much of the code for the feature is
> still there.

I can investigate then. Is the option --enable-generated-files-in-srcdir supposed
to be used when building from a release tarball (that includes man/info in src),
or to create such a source tarball?

Cheers,
Martin



[PATCH (pushed)] sphinx: add missing newline for conf.py files.

2022-11-10 Thread Martin Liška

gcc/d/ChangeLog:

* doc/conf.py: Add newline at last line.

gcc/ChangeLog:

* doc/cpp/conf.py: Add newline at last line.
* doc/cppinternals/conf.py: Add newline at last line.
* doc/gcc/conf.py: Add newline at last line.
* doc/gccint/conf.py: Add newline at last line.
* doc/install/conf.py: Add newline at last line.

gcc/fortran/ChangeLog:

* doc/gfc-internals/conf.py: Add newline at last line.
* doc/gfortran/conf.py: Add newline at last line.

gcc/go/ChangeLog:

* doc/conf.py: Add newline at last line.

libgomp/ChangeLog:

* doc/conf.py: Add newline at last line.

libiberty/ChangeLog:

* doc/conf.py: Add newline at last line.

libitm/ChangeLog:

* doc/conf.py: Add newline at last line.

libquadmath/ChangeLog:

* doc/conf.py: Add newline at last line.
---
 gcc/d/doc/conf.py | 2 +-
 gcc/doc/cpp/conf.py   | 2 +-
 gcc/doc/cppinternals/conf.py  | 2 +-
 gcc/doc/gcc/conf.py   | 2 +-
 gcc/doc/gccint/conf.py| 2 +-
 gcc/doc/install/conf.py   | 2 +-
 gcc/fortran/doc/gfc-internals/conf.py | 2 +-
 gcc/fortran/doc/gfortran/conf.py  | 2 +-
 gcc/go/doc/conf.py| 2 +-
 libgomp/doc/conf.py   | 2 +-
 libiberty/doc/conf.py | 2 +-
 libitm/doc/conf.py| 2 +-
 libquadmath/doc/conf.py   | 2 +-
 13 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/d/doc/conf.py b/gcc/d/doc/conf.py
index c33f28a2f7f..180b8351bdd 100644
--- a/gcc/d/doc/conf.py
+++ b/gcc/d/doc/conf.py
@@ -27,4 +27,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/doc/cpp/conf.py b/gcc/doc/cpp/conf.py
index 29d3aed4558..2abfb353a6a 100644
--- a/gcc/doc/cpp/conf.py
+++ b/gcc/doc/cpp/conf.py
@@ -27,4 +27,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/doc/cppinternals/conf.py b/gcc/doc/cppinternals/conf.py
index d9ec1a69125..bee71cd10ab 100644
--- a/gcc/doc/cppinternals/conf.py
+++ b/gcc/doc/cppinternals/conf.py
@@ -21,4 +21,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/doc/gcc/conf.py b/gcc/doc/gcc/conf.py
index 7987f4d885b..6905e5521a5 100644
--- a/gcc/doc/gcc/conf.py
+++ b/gcc/doc/gcc/conf.py
@@ -34,4 +34,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/doc/gccint/conf.py b/gcc/doc/gccint/conf.py
index 466261dbeb1..bd4bc748841 100644
--- a/gcc/doc/gccint/conf.py
+++ b/gcc/doc/gccint/conf.py
@@ -21,4 +21,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/doc/install/conf.py b/gcc/doc/install/conf.py
index ebc1b40482b..e69dfa15b84 100644
--- a/gcc/doc/install/conf.py
+++ b/gcc/doc/install/conf.py
@@ -21,4 +21,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/fortran/doc/gfc-internals/conf.py b/gcc/fortran/doc/gfc-internals/conf.py
index 176e6310b89..f69edb7c290 100644
--- a/gcc/fortran/doc/gfc-internals/conf.py
+++ b/gcc/fortran/doc/gfc-internals/conf.py
@@ -21,4 +21,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/fortran/doc/gfortran/conf.py b/gcc/fortran/doc/gfortran/conf.py
index 8be1e53d872..f1b042a6177 100644
--- a/gcc/fortran/doc/gfortran/conf.py
+++ b/gcc/fortran/doc/gfortran/conf.py
@@ -27,4 +27,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/gcc/go/doc/conf.py b/gcc/go/doc/conf.py
index 9157fba79ee..7c6ffb4eacb 100644
--- a/gcc/go/doc/conf.py
+++ b/gcc/go/doc/conf.py
@@ -27,4 +27,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 
-set_common(name, globals())
\ No newline at end of file
+set_common(name, globals())
diff --git a/libgomp/doc/conf.py b/libgomp/doc/conf.py
index 27e3131fae8..f89b6f31def 100644
--- a/libgomp/doc/conf.py
+++ b/libgomp/doc/conf.py
@@ -21,4 +21,4 @@ texinfo_documents = [
   ('index', name, project, authors, None, None, None, True)
 ]
 

Re: [RFC] docs: remove documentation for unsupported releases

2022-11-10 Thread Alexander Monakov via Gcc-patches


On Thu, 10 Nov 2022, Martin Liška wrote:

> On 11/10/22 08:29, Gerald Pfeifer wrote:
> > On Wed, 9 Nov 2022, Alexander Monakov wrote:
> >> For this I would suggest using the  tag to neatly fold links
> >> for old releases. Please see the attached patch.
> > 
> > Loving it, Alexander!
> > 
> > What do you guys think about unfolding all releases we, the GCC project,
> > currently support (per https://gcc.gnu.org that'd be 12.x, 11.x, and 10.x
> > at this point)?
> 
> Works for me!
> 
> > 
> > Either way: yes, please (aka approved). :-)
> 
> Alexander, can you please install such change?

Yes, pushed: https://gcc.gnu.org/onlinedocs/

Alexander


[PATCH] sphinx: support Sphinx in lib*/Makefile.am.

2022-11-10 Thread Martin Liška

Hi.

This patch adds Sphinx support to lib*/Makefile.am; previously I had
wrongly modified the generated Makefile.in files directly.

One thing that's missing is that the generated Makefile.in does not
contain an 'install-info-am' target, so the created info files
are not installed by 'make install'. Does anybody know how to fix that?

Thanks,
Martin

---
 libgomp/Makefile.am   |  27 ++-
 libgomp/Makefile.in   | 275 +++---
 libgomp/testsuite/Makefile.in |   3 +
 libitm/Makefile.am|  26 ++-
 libitm/Makefile.in| 278 ++
 libitm/testsuite/Makefile.in  |   3 +
 libquadmath/Makefile.am   |  37 ++--
 libquadmath/Makefile.in   | 307 +++---
 8 files changed, 208 insertions(+), 748 deletions(-)

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 428f7a9dab5..ab5e86b0f98 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -11,6 +11,8 @@ config_path = @config_path@
 search_path = $(addprefix $(top_srcdir)/config/, $(config_path)) $(top_srcdir) \
  $(top_srcdir)/../include
 
+abs_doc_builddir = @abs_top_builddir@/doc
+
 fincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/finclude
 libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 
@@ -100,18 +102,6 @@ fortran.o: libgomp_f.h
 env.lo: libgomp_f.h
 env.o: libgomp_f.h
 
-
-# Automake Documentation:
-# If your package has Texinfo files in many directories, you can use the
-# variable TEXINFO_TEX to tell Automake where to find the canonical
-# `texinfo.tex' for your package. The value of this variable should be
-# the relative path from the current `Makefile.am' to `texinfo.tex'.
-TEXINFO_TEX   = ../gcc/doc/include/texinfo.tex
-
-# Defines info, dvi, pdf and html targets
-MAKEINFOFLAGS = -I $(srcdir)/../gcc/doc/include
-info_TEXINFOS = libgomp.texi
-
 # AM_CONDITIONAL on configure option --generated-files-in-srcdir
 if GENINSRC
 STAMP_GENINSRC = stamp-geninsrc
@@ -127,7 +117,7 @@ STAMP_BUILD_INFO =
 endif
 
 
-all-local: $(STAMP_GENINSRC)
+all-local: $(STAMP_GENINSRC) $(STAMP_BUILD_INFO)
 
 stamp-geninsrc: libgomp.info
cp -p $(top_builddir)/libgomp.info $(srcdir)/libgomp.info
@@ -135,8 +125,15 @@ stamp-geninsrc: libgomp.info
 
 libgomp.info: $(STAMP_BUILD_INFO)
 
-stamp-build-info: libgomp.texi
-   $(MAKEINFO) $(AM_MAKEINFOFLAGS) $(MAKEINFOFLAGS) -I $(srcdir) -o libgomp.info $(srcdir)/libgomp.texi
+RST_FILES:=$(shell find $(srcdir) -name *.rst)
+SPHINX_CONFIG_FILES:=$(srcdir)/doc/conf.py $(srcdir)/../doc/baseconf.py
+SPHINX_FILES:=$(RST_FILES) $(SPHINX_CONFIG_FILES)
+
+stamp-build-info: $(SPHINX_FILES)
+   + if [ x$(HAS_SPHINX_BUILD) = xhas-sphinx-build ]; then \
+ make -C $(srcdir)/../doc info SOURCEDIR=$(abs_srcdir)/doc BUILDDIR=$(abs_doc_builddir)/info SPHINXBUILD=$(SPHINX_BUILD); \
+ cp ./doc/info/texinfo/libgomp.info libgomp.info; \
+   else true; fi
@touch $@
 
 
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 814ccd13dc0..4d0f2184e95 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -177,7 +177,7 @@ am__uninstall_files_from_dir = { \
 || { echo " ( cd '$$dir' && rm -f" $$files ")"; \
  $(am__cd) "$$dir" && rm -f $$files; }; \
   }
-am__installdirs = "$(DESTDIR)$(toolexeclibdir)" "$(DESTDIR)$(infodir)" \
+am__installdirs = "$(DESTDIR)$(toolexeclibdir)" \
"$(DESTDIR)$(fincludedir)" "$(DESTDIR)$(libsubincludedir)" \
"$(DESTDIR)$(toolexeclibdir)"
 LTLIBRARIES = $(toolexeclib_LTLIBRARIES)
@@ -269,16 +269,9 @@ am__v_FCLD_0 = @echo "  FCLD" $@;
 am__v_FCLD_1 =
 SOURCES = $(libgomp_plugin_gcn_la_SOURCES) \
$(libgomp_plugin_nvptx_la_SOURCES) $(libgomp_la_SOURCES)
-AM_V_DVIPS = $(am__v_DVIPS_@AM_V@)
-am__v_DVIPS_ = $(am__v_DVIPS_@AM_DEFAULT_V@)
-am__v_DVIPS_0 = @echo "  DVIPS   " $@;
-am__v_DVIPS_1 =
-INFO_DEPS = doc/info/texinfo/libgomp.info
-PDFS = doc/pdf/latex/libgomp.pdf
-HTMLS = doc/html/html/index.html
 RECURSIVE_TARGETS = all-recursive check-recursive cscopelist-recursive \
-   ctags-recursive html-recursive info-recursive \
-   install-data-recursive \
+   ctags-recursive dvi-recursive html-recursive info-recursive \
+   install-data-recursive install-dvi-recursive \
install-exec-recursive install-html-recursive \
install-info-recursive install-pdf-recursive \
install-ps-recursive install-recursive installcheck-recursive \
@@ -332,6 +325,7 @@ AWK = @AWK@
 CC = @CC@
 CCDEPMODE = @CCDEPMODE@
 CFLAGS = @CFLAGS@
+CONFIGURE_SPHINX_BUILD = @CONFIGURE_SPHINX_BUILD@
 CPP = @CPP@
 CPPFLAGS = @CPPFLAGS@
 CPU_COUNT = @CPU_COUNT@
@@ -350,6 +344,7 @@ FC = @FC@
 FCFLAGS = @FCFLAGS@
 FGREP = @FGREP@
 GREP = @GREP@
+HAS_SPHINX_BUILD = @HAS_SPHINX_BUILD@
 INSTALL = @INSTALL@
 INSTALL_DATA = @INSTALL_DATA@
 INSTALL_PROGRAM = @INSTALL_PROGRAM@
@@ -365,6 +360,7 @@ LIPO = @LIPO@
 LN_S = @LN_S@
 LTLIBOBJS = @LTLIBOBJS@
 MAINT = @MAINT@
+MAKEINFO = 

Re: [wwwdocs] Add httpd redirects for texinfo trunk docs and for each release series

2022-11-10 Thread Jonathan Wakely via Gcc-patches

On 09/11/22 15:41 +, Jonathan Wakely wrote:

I've tested that the redirects work on my own httpd server, and have
verified that no new sphinx-generated docs match these patterns, and no
old texinfo docs fail to match them (except for cases like index.html
where a new file exists with the same name anyway so we don't need a
redirect).


This has now been reported as PR web/107610. As I said there:

I've (temporarily) installed the redirects in a .htaccess on my own
site so you can test them.

Just replace "gcc.gnu.org" with "kayari.org" to check that it
redirects to a valid page of the gcc-12.2.0 docs:

https://kayari.org/onlinedocs/gccint/Test-Directives.html#Test-Directives
https://kayari.org/onlinedocs/gcc/Function-Attributes.html

This can be used to confirm that the regexes are working as intended.




For example, on gcc.gnu.org:

cd htdocs/onlinedocs/gcc-12.2.0

# All "missing" URLs are matched by these patterns:

for i in {gcc,cpp}/*.html ; do test -f ../$i || echo $i ; done | grep -E -v \
'^(gcc|cpp)/([[:upper:]].*|_00(5f|40).*|aarch64-feature-modifiers|c99-like-fast-enumeration-syntax|compatibility_005f.*|dashMF|eBPF-Options|fdollars-in-identifiers|lto-dump-Intro|(m68k|msp430|picoChip|x86|zSeries).*|trigraphs).*\.html$'

for i in gccint/*.html ; do test -f ../$i || echo $i ; done | grep -E -v \
'^gccint/([[:upper:]].*|(arm|define|input|poly|stack|window)_005f.*|compat-Testing|(epi|pro)logue-instruction-pattern|gcc-Directory|gcov-Testing|loop-iv|profopt-Testing|real-RTL-SSA-insns|shift-patterns|wi-arith.*)\.html$'

for i in cppinternals/*.html ; do test -f ../$i || echo $i ; done | grep -E -v \
'^cppinternals/([[:upper:]].*)\.html$'


# No still-valid URLs are matched:

for i in {gcc,cpp}/*.html ; do test -f ../$i && echo $i ; done | grep -E \
'^(gcc|cpp)/([[:upper:]].*|_00(5f|40).*|aarch64-feature-modifiers|c99-like-fast-enumeration-syntax|compatibility_005f.*|dashMF|eBPF-Options|fdollars-in-identifiers|lto-dump-Intro|(m68k|msp430|picoChip|x86|zSeries).*|trigraphs).*\.html$'

for i in gccint/*.html ; do test -f ../$i && echo $i ; done | grep -E \
'^gccint/([[:upper:]].*|(arm|define|input|poly|stack|window)_005f.*|compat-Testing|(epi|pro)logue-instruction-pattern|gcc-Directory|gcov-Testing|loop-iv|profopt-Testing|real-RTL-SSA-insns|shift-patterns|wi-arith.*)\.html$'

for i in cppinternals/*.html ; do test -f ../$i && echo $i ; done | grep -E \
'^cppinternals/([[:upper:]].*)\.html$'


I haven't added redirects for other sub-dirs such as gccgo, gfortran,
libgomp etc. so if somebody cares about those, they should deal with
them.

OK for wwwdocs?

-- >8 --

Add redirects from /onlinedocs/gcc-X to the latest gcc-X.Y.0 release
(which will need to be updated when a release is made).

Also add redirects from URLs for old trunk docs such as
https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html
to the gcc-12 equivalent of that page.
---
htdocs/.htaccess | 14 ++
1 file changed, 14 insertions(+)

diff --git a/htdocs/.htaccess b/htdocs/.htaccess
index 18997d63..bf7124ea 100644
--- a/htdocs/.htaccess
+++ b/htdocs/.htaccess
@@ -79,3 +79,17 @@ Redirect   /onlinedocs/libc  https://www.gnu.org/software/libc/manual/ht
Redirect   /onlinedocs/standards https://www.gnu.org/prep/standards/html_node/

Redirect   /onlinedocs/ref  https://gcc.gnu.org/onlinedocs/gcc-4.3.2/
+
+Redirect   /onlinedocs/gcc-5/   https://gcc.gnu.org/onlinedocs/gcc-5.5.0/
+Redirect   /onlinedocs/gcc-6/   https://gcc.gnu.org/onlinedocs/gcc-6.5.0/
+Redirect   /onlinedocs/gcc-7/   https://gcc.gnu.org/onlinedocs/gcc-7.5.0/
+Redirect   /onlinedocs/gcc-8/   https://gcc.gnu.org/onlinedocs/gcc-8.5.0/
+Redirect   /onlinedocs/gcc-9/   https://gcc.gnu.org/onlinedocs/gcc-9.5.0/
+Redirect   /onlinedocs/gcc-10/  https://gcc.gnu.org/onlinedocs/gcc-10.4.0/
+Redirect   /onlinedocs/gcc-11/  https://gcc.gnu.org/onlinedocs/gcc-11.3.0/
+Redirect   /onlinedocs/gcc-12/  https://gcc.gnu.org/onlinedocs/gcc-12.2.0/
+
+# Redirect URLs for old texinfo trunk docs to gcc-12
+RedirectMatch permanent /onlinedocs/(gcc|cpp)/([[:upper:]].*|_00(5f|40).*|aarch64-feature-modifiers|c99-like-fast-enumeration-syntax|compatibility_005f.*|dashMF|eBPF-Options|fdollars-in-identifiers|lto-dump-Intro|(m68k|msp430|picoChip|x86|zSeries).*|trigraphs).*\.html$ https://gcc.gnu.org/onlinedocs/gcc-12/$1/$2.html
+RedirectMatch permanent /onlinedocs/gccint/([[:upper:]].*|(arm|define|input|poly|stack|window)_005f.*|compat-Testing|(epi|pro)logue-instruction-pattern|gcc-Directory|gcov-Testing|loop-iv|profopt-Testing|real-RTL-SSA-insns|shift-patterns|wi-arith.*)\.html$ https://gcc.gnu.org/onlinedocs/gcc-12/gccint/$1.html
+RedirectMatch permanent /onlinedocs/cppinternals/([[:upper:]].*)\.html$ https://gcc.gnu.org/onlinedocs/gcc-12/gccint/$1.html
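
As a side note, the shortest of these patterns can also be checked programmatically. The sketch below uses POSIX regcomp/regexec with extended syntax, which is what grep -E uses; Apache's RedirectMatch uses its own regex engine, so this only approximates the server's behaviour. The helper name `path_matches` and the variable `cppinternals_pattern` are made up for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if PATH matches the POSIX extended regex PATTERN, 0 if it
   does not, and -1 if the pattern fails to compile.  */
int path_matches(const char *pattern, const char *path)
{
  regex_t re;
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec(&re, path, 0, NULL, 0);
  regfree(&re);
  return rc == 0;
}

/* The cppinternals pattern from the grep commands in the message.  */
const char cppinternals_pattern[] = "^cppinternals/([[:upper:]].*)\\.html$";
```

Old texinfo page names start with an upper-case letter, so a name like cppinternals/Lexer.html should match while cppinternals/index.html should not (assuming the default C locale for `[[:upper:]]`).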




[PATCH 2/2] aarch64: Add support for widening LDAPR instructions

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This patch adds support for the widening LDAPR instructions.

Bootstrapped and regression tested on aarch64-none-linux-gnu.

OK for trunk?

2022-11-09  Andre Vieira  
    Kyrylo Tkachov  

gcc/ChangeLog:

    * config/aarch64/atomics.md
    (*aarch64_atomic_load_rcpc_zext): New pattern.
    (*aarch64_atomic_load_rcpc_sext): Likewise.

gcc/testsuite/ChangeLog:

    * gcc.target/aarch64/ldapr-ext.c: New test.

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 9a9a30945c6e482a81a1bf446fe05d5efc462d32..77e5b29ad2c41215aa1ca904efb990b087010cef 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -691,6 +691,28 @@
   }
 )
 
+(define_insn "*aarch64_atomic_load_rcpc_zext"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+(zero_extend:GPI
+  (unspec_volatile:ALLX
+[(match_operand:ALLX 1 "aarch64_sync_memory_operand" "Q")
+ (match_operand:SI 2 "const_int_operand")] ;; model
+   UNSPECV_LDAP)))]
+  "TARGET_RCPC"
+  "ldapr\t%0, %1"
+)
+
+(define_insn "*aarch64_atomic_load_rcpc_sext"
+  [(set (match_operand:GPI  0 "register_operand" "=r")
+(sign_extend:GPI
+  (unspec_volatile:ALLX
+[(match_operand:ALLX 1 "aarch64_sync_memory_operand" "Q")
+ (match_operand:SI 2 "const_int_operand")] ;; model
+   UNSPECV_LDAP)))]
+  "TARGET_RCPC"
+  "ldaprs\t%0, %1"
+)
+
 (define_insn "atomic_store"
   [(set (match_operand:ALLI 0 "aarch64_rcpc_memory_operand" "=Q,Ust")
 (unspec_volatile:ALLI
diff --git a/gcc/testsuite/gcc.target/aarch64/ldapr-ext.c b/gcc/testsuite/gcc.target/aarch64/ldapr-ext.c
new file mode 100644
index 
..5a788ffb8787291d43fe200d1d7803b901186912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ldapr-ext.c
@@ -0,0 +1,94 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -std=c99" } */
+/* { dg-require-effective-target aarch64_rcpc_ok } */
+/* { dg-add-options aarch64_rcpc } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+#include <stdatomic.h>
+
+atomic_ullong u64;
+atomic_llong s64;
+atomic_uint u32;
+atomic_int s32;
+atomic_ushort u16;
+atomic_short s16;
+atomic_uchar u8;
+atomic_schar s8;
+
+#define TEST(name, ldsize, rettype)\
+rettype\
+test_##name (void) \
+{  \
+  return atomic_load_explicit (&ldsize, memory_order_acquire); \
+}
+
+/*
+**test_u8_u64:
+**...
+** ldaprb  x0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(u8_u64, u8, unsigned long long)
+
+/*
+**test_s8_s64:
+**...
+** ldaprsb x0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(s8_s64, s8, long long)
+
+/*
+**test_u16_u64:
+**...
+** ldaprh  x0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(u16_u64, u16, unsigned long long)
+
+/*
+**test_s16_s64:
+**...
+** ldaprsh x0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(s16_s64, s16, long long)
+
+/*
+**test_u8_u32:
+**...
+** ldaprb  w0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(u8_u32, u8, unsigned)
+
+/*
+**test_s8_s32:
+**...
+** ldaprsb w0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(s8_s32, s8, int)
+
+/*
+**test_u16_u32:
+**...
+** ldaprh  w0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(u16_u32, u16, unsigned)
+
+/*
+**test_s16_s32:
+**...
+** ldaprsh w0, \[x[0-9]+\]
+** ret
+*/
+
+TEST(s16_s32, s16, int)
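
For reference, the kind of source the widening patterns target can be written out without the macro. This is a hand-written sketch, not taken from the patch: the variable and function names are invented, and whether a given compiler emits a single extending LDAPR for these functions depends on the target options in use.

```c
#include <stdatomic.h>

atomic_uchar flag_u8;
atomic_schar flag_s8;

/* Acquire-load an unsigned byte; the return type zero-extends it
   to 64 bits.  */
unsigned long long load_u8_as_u64(void)
{
  return atomic_load_explicit(&flag_u8, memory_order_acquire);
}

/* Acquire-load a signed byte; the return type sign-extends it
   to 64 bits.  */
long long load_s8_as_s64(void)
{
  return atomic_load_explicit(&flag_s8, memory_order_acquire);
}
```

The extension is implied purely by the conversion to the wider return type, which is why the patterns wrap the unspec in zero_extend or sign_extend.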


[PATCH 1/2] aarch64: Enable the use of LDAPR for load-acquire semantics

2022-11-10 Thread Andre Vieira (lists) via Gcc-patches

Hello,

This patch enables the use of LDAPR for load-acquire semantics. After 
some internal investigation based on the work published by Podkopaev et 
al. (https://dl.acm.org/doi/10.1145/3290382) we can confirm that using 
LDAPR for the C++ load-acquire semantics is a correct relaxation.
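
As an illustration of the source-level construct in question, here is a minimal C11 sketch of the usual publish/consume idiom. The names `payload`, `ready`, `publish` and `try_consume` are hypothetical and not from the patch; the acquire load in `try_consume` is the kind of operation that would be affected.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* plain data, published via `ready` */
static atomic_bool ready;    /* zero-initialized, i.e. false */

/* Producer: write the data, then release-store the flag.  */
void publish(int value)
{
  payload = value;
  atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: acquire-load the flag; if it is set, the earlier write to
   payload is guaranteed to be visible.  */
bool try_consume(int *out)
{
  if (!atomic_load_explicit(&ready, memory_order_acquire))
    return false;
  *out = payload;
  return true;
}
```

The guarantee the consumer relies on is only that accesses after the acquire load are not reordered before it, which is the property the cited work is used to justify for LDAPR.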


Bootstrapped and regression tested on aarch64-none-linux-gnu.

OK for trunk?

2022-11-09  Andre Vieira  
    Kyrylo Tkachov  

gcc/ChangeLog:

    * config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New Macro.
    (TARGET_RCPC): New Macro.
    * config/aarch64/atomics.md (atomic_load): Change into
    an expand.
    (aarch64_atomic_load_rcpc): New define_insn for ldapr.
    (aarch64_atomic_load): Rename of old define_insn for ldar.
    * config/aarch64/iterators.md (UNSPEC_LDAP): New unspec enum value.
    * doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
    (rcpc): Amend documentation to mention the effects on code generation.


gcc/testsuite/ChangeLog:

    * gcc.target/aarch64/ldapr.c: New test.
    * lib/target-supports.exp (add_options_for_aarch64_rcpc): New
    options procedure.
    (check_effective_target_aarch64_rcpc_ok_nocache): New
    check-effective-target.
    (check_effective_target_aarch64_rcpc_ok): Likewise.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 
e60f9bce023b2cd5e7233ee9b8c61fc93c1494c2..51a8aa02a5850d5c79255dbf7e0764ffdec73ccd
 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -221,6 +221,7 @@ enum class aarch64_feature : unsigned char {
 #define AARCH64_ISA_V9_3A  (aarch64_isa_flags & AARCH64_FL_V9_3A)
 #define AARCH64_ISA_MOPS  (aarch64_isa_flags & AARCH64_FL_MOPS)
 #define AARCH64_ISA_LS64  (aarch64_isa_flags & AARCH64_FL_LS64)
+#define AARCH64_ISA_RCPC   (aarch64_isa_flags & AARCH64_FL_RCPC)
 
 /* Crypto is an optional extension to AdvSIMD.  */
 #define TARGET_CRYPTO (AARCH64_ISA_CRYPTO)
@@ -328,6 +329,9 @@ enum class aarch64_feature : unsigned char {
 /* SB instruction is enabled through +sb.  */
 #define TARGET_SB (AARCH64_ISA_SB)
 
+/* RCPC loads from Armv8.3-a.  */
+#define TARGET_RCPC (AARCH64_ISA_RCPC)
+
 /* Apply the workaround for Cortex-A53 erratum 835769.  */
 #define TARGET_FIX_ERR_A53_835769  \
   ((aarch64_fix_a53_err835769 == 2)\
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
bc95f6d9d15f190a3e33704b4def2860d5f339bd..801a62bf2ba432f35ae1931beb8c4405b77b36c3
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -657,7 +657,42 @@
   }
 )
 
-(define_insn "atomic_load"
+(define_expand "atomic_load"
+  [(match_operand:ALLI 0 "register_operand" "=r")
+   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+   (match_operand:SI   2 "const_int_operand")]
+  ""
+  {
+/* If TARGET_RCPC and this is an ACQUIRE load, then expand to a pattern
+   using UNSPECV_LDAP.  */
+enum memmodel model = memmodel_from_int (INTVAL (operands[2]));
+if (TARGET_RCPC
+   && (is_mm_acquire (model)
+   || is_mm_acq_rel (model)))
+{
+  emit_insn (gen_aarch64_atomic_load_rcpc (operands[0], operands[1],
+operands[2]));
+}
+else
+{
+  emit_insn (gen_aarch64_atomic_load (operands[0], operands[1],
+   operands[2]));
+}
+DONE;
+  }
+)
+
+(define_insn "aarch64_atomic_load_rcpc"
+  [(set (match_operand:ALLI 0 "register_operand" "=r")
+(unspec_volatile:ALLI
+  [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
+   (match_operand:SI 2 "const_int_operand")]   ;; model
+  UNSPECV_LDAP))]
+  "TARGET_RCPC"
+  "ldapr\t%0, %1"
+)
+
+(define_insn "aarch64_atomic_load"
   [(set (match_operand:ALLI 0 "register_operand" "=r")
 (unspec_volatile:ALLI
   [(match_operand:ALLI 1 "aarch64_sync_memory_operand" "Q")
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 
a8ad4e5ff215ade06c3ca13a24ef18d259afcb6c..d8c2f9d6c32d6f188d584c2e9d8fb36511624de6
 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -988,6 +988,7 @@
 UNSPECV_LX ; Represent a load-exclusive.
 UNSPECV_SX ; Represent a store-exclusive.
 UNSPECV_LDA; Represent an atomic load or 
load-acquire.
+UNSPECV_LDAP   ; Represent an atomic acquire load with RCpc 
semantics.
 UNSPECV_STL; Represent an atomic store or 
store-release.
 UNSPECV_ATOMIC_CMPSW   ; Represent an atomic compare swap.
 UNSPECV_ATOMIC_EXCHG   ; Represent an atomic exchange.
diff --git 
a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst 
b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/aarch64-options.rst
index 

Re: Rust frontend patches v3

2022-11-10 Thread Richard Biener via Gcc-patches
On Wed, Oct 26, 2022 at 10:16 AM  wrote:
>
> This is the fixed version of our previous patch set for gccrs - We've addressed
> the comments raised in our previous emails.
>
> This patch set does not contain any work that was not previously included, 
> such
> as closure support, the constant evaluator port, or the better implementation
> of target hooks by Iain Buclaw. They will follow up in subsequent patch sets.
>
> Thanks again to Open Source Security, inc and Embecosm who have accompanied us
> for this work.
>
> Many thanks to all of the contributors and our community, who made this
> possible.
>
> A very special thanks to Philip Herron, without whose mentoring I would have
> never been in a position to send these patches.
>
> You can see the current status of our work on our branch:
> https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/devel/rust/master
>
> The patch set contains the following:

Can you mark the patches that have been reviewed/approved?  Can you
maybe either split the series or organize it in a way to separate the
pieces touching common parts of GCC from the gcc/rust/ parts?
Can you separate testsuite infrastructure from actual tests, can
you mark/separate target specific changes?  And for those (then small)
changes CC the appropriate maintainers?

Thanks,
Richard.

> [PATCH Rust front-end v3 01/46] Use DW_ATE_UTF for the Rust 'char'
> [PATCH Rust front-end v3 02/46] gccrs: Add nessecary hooks for a Rust
> [PATCH Rust front-end v3 03/46] gccrs: Add Debug info testsuite
> [PATCH Rust front-end v3 04/46] gccrs: Add link cases testsuite
> [PATCH Rust front-end v3 05/46] gccrs: Add general compilation test
> [PATCH Rust front-end v3 06/46] gccrs: Add execution test cases
> [PATCH Rust front-end v3 07/46] gccrs: Add gcc-check-target
> [PATCH Rust front-end v3 08/46] gccrs: Add Rust front-end base AST
> [PATCH Rust front-end v3 09/46] gccrs: Add definitions of Rust Items
> [PATCH Rust front-end v3 10/46] gccrs: Add full definitions of Rust
> [PATCH Rust front-end v3 11/46] gccrs: Add Rust AST visitors
> [PATCH Rust front-end v3 12/46] gccrs: Add Lexer for Rust front-end
> [PATCH Rust front-end v3 13/46] gccrs: Add Parser for Rust front-end
> [PATCH Rust front-end v3 14/46] gccrs: Add Parser for Rust front-end
> [PATCH Rust front-end v3 15/46] gccrs: Add expansion pass for the
> [PATCH Rust front-end v3 16/46] gccrs: Add name resolution pass to
> [PATCH Rust front-end v3 17/46] gccrs: Add declarations for Rust HIR
> [PATCH Rust front-end v3 18/46] gccrs: Add HIR definitions and
> [PATCH Rust front-end v3 19/46] gccrs: Add AST to HIR lowering pass
> [PATCH Rust front-end v3 20/46] gccrs: Add wrapper for make_unique
> [PATCH Rust front-end v3 21/46] gccrs: Add port of FNV hash used
> [PATCH Rust front-end v3 22/46] gccrs: Add Rust ABI enum helpers
> [PATCH Rust front-end v3 23/46] gccrs: Add Base62 implementation
> [PATCH Rust front-end v3 24/46] gccrs: Add implementation of Optional
> [PATCH Rust front-end v3 25/46] gccrs: Add attributes checker
> [PATCH Rust front-end v3 26/46] gccrs: Add helpers mappings canonical
> [PATCH Rust front-end v3 27/46] gccrs: Add type resolution and trait
> [PATCH Rust front-end v3 28/46] gccrs: Add Rust type information
> [PATCH Rust front-end v3 29/46] gccrs: Add remaining type system
> [PATCH Rust front-end v3 30/46] gccrs: Add unsafe checks for Rust
> [PATCH Rust front-end v3 31/46] gccrs: Add const checker
> [PATCH Rust front-end v3 32/46] gccrs: Add privacy checks
> [PATCH Rust front-end v3 33/46] gccrs: Add dead code scan on HIR
> [PATCH Rust front-end v3 34/46] gccrs: Add unused variable scan
> [PATCH Rust front-end v3 35/46] gccrs: Add metadata ouptput pass
> [PATCH Rust front-end v3 36/46] gccrs: Add base for HIR to GCC
> [PATCH Rust front-end v3 37/46] gccrs: Add HIR to GCC GENERIC
> [PATCH Rust front-end v3 38/46] gccrs: Add HIR to GCC GENERIC
> [PATCH Rust front-end v3 39/46] gccrs: These are wrappers ported from
> [PATCH Rust front-end v3 40/46] gccrs: Add GCC Rust front-end
> [PATCH Rust front-end v3 41/46] gccrs: Add config-lang.in
> [PATCH Rust front-end v3 42/46] gccrs: Add lang-spec.h
> [PATCH Rust front-end v3 43/46] gccrs: Add lang.opt
> [PATCH Rust front-end v3 44/46] gccrs: Add compiler driver
> [PATCH Rust front-end v3 45/46] gccrs: Compiler proper interface
> [PATCH Rust front-end v3 46/46] gccrs: Add README, CONTRIBUTING and
>


[PATCH][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2022-11-10 Thread Srinath Parvathaneni via Gcc-patches
Hi,

This patch adds support for the Arm frame unwinding instruction "0xb5" [1].
When an exception is taken and the "0xb5" instruction is encountered during
runtime stack unwinding, the effective vsp is used as the modifier in pointer
authentication. If stack unwinding completes without encountering "0xb5",
the CFA is used as the modifier in pointer authentication.
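
The modifier selection described above can be sketched with a small stand-alone model. This is a simplified, hypothetical illustration and not the libgcc code: `model_pac_modifier` is an invented name, only three opcodes are modelled, and the requirement in the patch that "0xb5" follows a PAC pop is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of which value ends up as the PAC modifier.
   Opcodes modelled:
     0x00..0x3f  vsp = vsp + ((op & 0x3f) << 2) + 4
     0xb5        use the current vsp as modifier
     0xb0        finish
   Everything else is ignored in this sketch.  */
uint32_t model_pac_modifier (const uint8_t *ops, size_t n, uint32_t vsp)
{
  uint32_t modifier = 0;
  int have_modifier = 0;
  size_t i;

  assert (ops != NULL);
  for (i = 0; i < n; i++)
    {
      uint8_t op = ops[i];

      if (op <= 0x3f)
        vsp += ((uint32_t) (op & 0x3f) << 2) + 4;
      else if (op == 0xb5)
        {
          /* Snapshot the vsp at the point where 0xb5 is seen.  */
          modifier = vsp;
          have_modifier = 1;
        }
      else if (op == 0xb0)
        break;
    }

  /* Without 0xb5, fall back to the vsp at the end of unwinding,
     which plays the role of the CFA in this model.  */
  return have_modifier ? modifier : vsp;
}
```

For a start vsp of 0x1000, the sequence 0x01, 0xb5, 0x00, 0xb0 yields 0x1008 in this model, while 0x01, 0x00, 0xb0 yields 0x100c.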

[1] https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2022-11-09  Srinath Parvathaneni  

* libgcc/config/arm/pr-support.c (__gnu_unwind_execute): Decode opcode
"0xb5".


### Attachment also inlined for ease of reply###


diff --git a/libgcc/config/arm/pr-support.c b/libgcc/config/arm/pr-support.c
index 
e48854587c667a959aa66ccc4982231f6ecc..73e4942a39b34a83c2da85def6b13e82ec501552
 100644
--- a/libgcc/config/arm/pr-support.c
+++ b/libgcc/config/arm/pr-support.c
@@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
   _uw op;
   int set_pc;
   int set_pac = 0;
+  int set_pac_sp = 0;
   _uw reg;
+  _uw sp;
 
   set_pc = 0;
   for (;;)
@@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
 #if defined(TARGET_HAVE_PACBTI)
  if (set_pac)
{
- _uw sp;
  _uw lr;
  _uw pac;
- _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32, &sp);
+ if (!set_pac_sp)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+ &sp);
  _Unwind_VRS_Get (context, _UVRSC_CORE, R_LR, _UVRSD_UINT32, &lr);
  _Unwind_VRS_Get (context, _UVRSC_PAC, R_IP,
   _UVRSD_UINT32, &pac);
@@ -259,7 +262,19 @@ __gnu_unwind_execute (_Unwind_Context * context, 
__gnu_unwind_state * uws)
  continue;
}
 
- if ((op & 0xfc) == 0xb4)  /* Obsolete FPA.  */
+ /* Use current VSP as modifier in PAC validation.  */
+ if (op == 0xb5)
+   {
+ if (set_pac)
+   _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP, _UVRSD_UINT32,
+ &sp);
+ else
+   return _URC_FAILURE;
+ set_pac_sp = 1;
+ continue;
+   }
+
+ if ((op & 0xfd) == 0xb6)  /* Obsolete FPA.  */
return _URC_FAILURE;
 
  /* op & 0xf8 == 0xb8.  */








Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Tobias Burnus

Hi,

On 10.11.22 11:03, Gerald Pfeifer wrote:
> On Thu, 10 Nov 2022, Martin Liška wrote:
>>> https://gcc.gnu.org/install/ is back with a new face.
>> But it's not working properly due to some Content Security Policy:
>
> Hmm, it worked in my testing before and I just tried again:
> Firefox 106.0.1 (64-bit)


Did you open the console (F12)? If I do, I see the errors:

Content Security Policy: The page’s settings blocked the loading of a
resource at inline (“default-src”). That's for line 18, which is

Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Martin Liška

On 11/10/22 11:03, Gerald Pfeifer wrote:
> On Thu, 10 Nov 2022, Martin Liška wrote:
>>> https://gcc.gnu.org/install/ is back with a new face.
>> But it's not working properly due to some Content Security Policy:
>
> Hmm, it worked in my testing before and I just tried again:
>
> Firefox 106.0.1 (64-bit) and now also Chrome 106.0.5249.119
> and w3m.
>
> Which browser are you using? Any particular add-ons or special security
> settings?
>
>> Refused to apply inline style because it violates the following Content
>> Security Policy directive: "default-src 'self' http: https:". Either the
>> 'unsafe-inline' keyword, a hash
>> ('sha256-wAI2VKPX8IUBbq55XacEljWEKQc4Xc1nmwVsAjAplNU='), or a nonce
>> ('nonce-...') is required to enable inline execution. Note also that
>> 'style-src' was not explicitly set, so 'default-src' is used as a fallback.
>
> That looks like it's related to some Javascript fun? Does sphinx pull in
> something? O, it does. A lot.
>
> I'm not using any Javascript blocker, though, so not sure why I am not
> seeing any such warnings?
>
> Searching for "+sphinx" and this message did not result in anything.
>
> (It feels a bit curious how the position in the web server's file system
> or a symlink could trigger something like that?)
>
>
> Looking at the source code of index.html I am wondering about
>
>
>
> versus all the .js inclusions later on.
>
> And https://validator.w3.org/nu/?doc=https%3A%2F%2Fgcc.gnu.org%2Finstall%2F
> and
> https://validator.w3.org/nu/?doc=https%3A%2F%2Fgcc.gnu.org%2Fonlinedocs%2Finstall%2F
> appear equally (un)happy.
>
> Gerald


Well, I can also reproduce it on my mobile phone.

Anyway, the difference is:

$ curl https://gcc.gnu.org/install/index.html -v &> bad.txt
$ curl https://gcc.gnu.org/onlinedocs/install/index.html -v &> good.txt

$ diff -u good.txt bad.txt
--- good.txt2022-11-10 11:33:45.293631904 +0100
+++ bad.txt 2022-11-10 11:33:37.813669264 +0100
@@ -32,31 +32,32 @@
 *  subjectAltName: host "gcc.gnu.org" matched cert's "gcc.gnu.org"
 *  issuer: C=US; O=Let's Encrypt; CN=R3
 *  SSL certificate verify ok.
  0 00 00 0  0  0 --:--:-- --:--:-- --:--:-- 0* 
Using HTTP2, server supports multiplexing
+* Using HTTP2, server supports multiplexing
 * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: 
len=0
 } [5 bytes data]
 * h2h3 [:method: GET]
-* h2h3 [:path: /onlinedocs/install/index.html]
+* h2h3 [:path: /install/index.html]
 * h2h3 [:scheme: https]
 * h2h3 [:authority: gcc.gnu.org]
 * h2h3 [user-agent: curl/7.86.0]
 * h2h3 [accept: */*]
 * Using Stream ID: 1 (easy handle 0x555bf890)
 } [5 bytes data]
-> GET /onlinedocs/install/index.html HTTP/2
+> GET /install/index.html HTTP/2
 > Host: gcc.gnu.org
 > user-agent: curl/7.86.0
 > accept: */*
 >
 { [5 bytes data]
 < HTTP/2 200
-< date: Thu, 10 Nov 2022 10:33:45 GMT
+< date: Thu, 10 Nov 2022 10:33:37 GMT
 < server: Apache/2.4.37 (Red Hat Enterprise Linux) OpenSSL/1.1.1k 
mod_qos/11.70 mod_wsgi/4.6.4 Python/3.6 mod_perl/2.0.12 Perl/v5.26.3
 < last-modified: Wed, 09 Nov 2022 18:51:10 GMT
 < etag: "8232-5ed0e23e07250"
 < accept-ranges: bytes
 < content-length: 0
 < vary: Accept-Encoding
+< content-security-policy: default-src 'self' http: https:
 < strict-transport-security: max-age=16070400
 < content-type: text/html; charset=utf-8
 <
@@ -485,7 +486,7 @@
   
   
 

100 0  100 00 0  61514  0 --:--:-- --:--:-- --:--:-- 61494
100 0  100 00 0  62652  0 --:--:-- --:--:-- --:--:-- 62768
 * Connection #0 to host gcc.gnu.org left intact
 v>
 

===

Note that the problematic one for some reason uses "content-security-policy: default-src 
'self' http: https:".
It also prints 'Using HTTP2, server supports multiplexing'.

Martin


[PATCH][GCC] arm: Add support for Cortex-X1C CPU.

2022-11-10 Thread Srinath Parvathaneni via Gcc-patches
Hi,

This patch adds the -mcpu support for the Arm Cortex-X1C CPU.

Regression tested on arm-none-eabi and bootstrapped on arm-none-linux-gnueabihf.

Ok for GCC master?

Regards,
Srinath.

gcc/ChangeLog:

2022-11-09  Srinath Parvathaneni  

   * config/arm/arm-cpus.in (cortex-x1c): Define new CPU.
   * config/arm/arm-tables.opt: Regenerate.
   * config/arm/arm-tune.md: Likewise.
   * doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst:
   Document Cortex-X1C CPU.

   gcc/testsuite/ChangeLog:

2022-11-09  Srinath Parvathaneni  

   * gcc.target/arm/multilib.exp: Add tests for Cortex-X1C.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 
5a63bc548e54dbfdce5d1df425bd615d81895d80..5ed4db340bc5d7c9a41e6d1a3f660bf2a97b058b
 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1542,6 +1542,17 @@ begin cpu cortex-x1
  part d44
 end cpu cortex-x1
 
+begin cpu cortex-x1c
+ cname cortexx1c
+ tune for cortex-a57
+ tune flags LDSCHED
+ architecture armv8.2-a+fp16+dotprod
+ option crypto add FP_ARMv8 CRYPTO
+ costs cortex_a57
+ vendor 41
+ part d4c
+end cpu cortex-x1c
+
 begin cpu neoverse-n1
  cname neoversen1
  alias !ares
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 
e6461abcc57cd485025f3e18535267c454662cbe..a10a09e36cd004165b6f1efddeb3bfc29d8337ac
 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -255,6 +255,9 @@ Enum(processor_type) String(cortex-a710) Value( 
TARGET_CPU_cortexa710)
 EnumValue
 Enum(processor_type) String(cortex-x1) Value( TARGET_CPU_cortexx1)
 
+EnumValue
+Enum(processor_type) String(cortex-x1c) Value( TARGET_CPU_cortexx1c)
+
 EnumValue
 Enum(processor_type) String(neoverse-n1) Value( TARGET_CPU_neoversen1)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index 
abc290edd094179379f3856a3f8f64781e0c33f2..8af8c936abe31fb60e3de2fd713f4c6946c2a752
 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -46,7 +46,7 @@
cortexa73cortexa53,cortexa55,cortexa75,
cortexa76,cortexa76ae,cortexa77,
cortexa78,cortexa78ae,cortexa78c,
-   cortexa710,cortexx1,neoversen1,
+   cortexa710,cortexx1,cortexx1c,neoversen1,
cortexa75cortexa55,cortexa76cortexa55,neoversev1,
neoversen2,cortexm23,cortexm33,
cortexm35p,cortexm55,starmc1,
diff --git 
a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst 
b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
index 
3315114969381995d47162b53abeb9bfc442fd28..d531eced20cbb583ecaba2ab3927937faf69b9de
 100644
--- a/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
+++ b/gcc/doc/gcc/gcc-command-options/machine-dependent-options/arm-options.rst
@@ -594,7 +594,7 @@ These :samp:`-m` options are defined for the ARM port:
   :samp:`cortex-r7`, :samp:`cortex-r8`, :samp:`cortex-r52`, 
:samp:`cortex-r52plus`,
   :samp:`cortex-m0`, :samp:`cortex-m0plus`, :samp:`cortex-m1`, 
:samp:`cortex-m3`,
   :samp:`cortex-m4`, :samp:`cortex-m7`, :samp:`cortex-m23`, :samp:`cortex-m33`,
-  :samp:`cortex-m35p`, :samp:`cortex-m55`, :samp:`cortex-x1`,
+  :samp:`cortex-m35p`, :samp:`cortex-m55`, :samp:`cortex-x1`, 
:samp:`cortex-x1c`,
   :samp:`cortex-m1.small-multiply`, :samp:`cortex-m0.small-multiply`,
   :samp:`cortex-m0plus.small-multiply`, :samp:`exynos-m1`, :samp:`marvell-pj4`,
   :samp:`neoverse-n1`, :samp:`neoverse-n2`, :samp:`neoverse-v1`, 
:samp:`xscale`,
diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp 
b/gcc/testsuite/gcc.target/arm/multilib.exp
index 
2fa648c61dafebb663969198bf7849400a7547f6..f903f028a83f884bdc1521f810f7e70e4130a715
 100644
--- a/gcc/testsuite/gcc.target/arm/multilib.exp
+++ b/gcc/testsuite/gcc.target/arm/multilib.exp
@@ -450,6 +450,9 @@ if {[multilib_config "aprofile"] } {
{-march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard -mthumb} 
"thumb/v8-a+simd/hard"
{-march=armv7-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -mthumb} 
"thumb/v7-a+simd/softfp"
{-march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=softfp -mthumb} 
"thumb/v8-a+simd/softfp"
+   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=softfp -mthumb} 
"thumb/v8-a+simd/softfp"
+   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=hard -mthumb} 
"thumb/v8-a+simd/hard"
+   {-mcpu=cortex-x1c -mfpu=auto -mfloat-abi=soft -mthumb} "thumb/v8-a/nofp"
 } {
check_multi_dir $opts $dir
 }




[PATCH 2/2] Fortran: Add attribute flatten

2022-11-10 Thread Bernhard Reutner-Fischer via Gcc-patches
Bootstrapped and regtested cleanly on x86_64-unknown-linux.
The document bits will be rewritten for rst.
Ok for trunk if the prerequisite target_clones patch is approved?

gcc/fortran/ChangeLog:

* decl.cc (gfc_match_gcc_attributes): Handle flatten.
* f95-lang.cc (gfc_attribute_table): Add flatten.
* gfortran.texi: Document attribute flatten.

gcc/testsuite/ChangeLog:

* gfortran.dg/attr_flatten-1.f90: New test.
---
 gcc/fortran/decl.cc  |  8 +++-
 gcc/fortran/f95-lang.cc  |  2 +
 gcc/fortran/gfortran.texi|  8 
 gcc/testsuite/gfortran.dg/attr_flatten-1.f90 | 41 
 4 files changed, 57 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/attr_flatten-1.f90

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index d312d4812b6..3d210c26eb5 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -11841,6 +11841,7 @@ gfc_match_gcc_attributes (void)
   for(;;)
 {
   char ch;
+  bool known_attr0args = false;
 
   if (gfc_match_name (name) != MATCH_YES)
return MATCH_ERROR;
@@ -11849,7 +11850,9 @@ gfc_match_gcc_attributes (void)
if (strcmp (name, ext_attr_list[id].name) == 0)
  break;
 
-  if (id == EXT_ATTR_LAST)
+  if (strcmp (name, "flatten") == 0)
+   known_attr0args = true; /* Handled below.  We do not need a bit.  */
+  else if (id == EXT_ATTR_LAST)
{
  gfc_error ("Unknown attribute in !GCC$ ATTRIBUTES statement at %C");
  return MATCH_ERROR;
@@ -11864,7 +11867,8 @@ gfc_match_gcc_attributes (void)
   || id == EXT_ATTR_DLLEXPORT
   || id == EXT_ATTR_CDECL
   || id == EXT_ATTR_STDCALL
-  || id == EXT_ATTR_FASTCALL)
+  || id == EXT_ATTR_FASTCALL
+  || known_attr0args)
attr.ext_attr_args
  = chainon (attr.ext_attr_args,
 build_tree_list (get_identifier (name), NULL_TREE));
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index 7154568aec5..ddb5b686cf6 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -101,6 +101,8 @@ static const struct attribute_spec gfc_attribute_table[] =
  gfc_handle_omp_declare_target_attribute, NULL },
   { "target_clones",  1, -1, true, false, false, false,
  gfc_handle_omp_declare_target_attribute, NULL },
+  { "flatten",0, 0, true,  false, false, false,
+ gfc_handle_omp_declare_target_attribute, NULL },
   { NULL,0, 0, false, false, false, false, NULL, NULL }
 };
 
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 06e4c8c00a1..be650f28b62 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -3280,6 +3280,14 @@ contains
 end module mymod
 @end smallexample
 
+@node flatten
+
+Procedures annotated with the @code{flatten} attribute have their
+callees inlined, if possible.
+Please refer to
+@ref{Top,,Common Function Attributes,gcc,Using the GNU Compiler Collection 
(GCC)}
+for details about the respective attribute.
+
 The attributes are specified using the syntax
 
 @code{!GCC$ ATTRIBUTES} @var{attribute-list} @code{::} @var{variable-list}
diff --git a/gcc/testsuite/gfortran.dg/attr_flatten-1.f90 
b/gcc/testsuite/gfortran.dg/attr_flatten-1.f90
new file mode 100644
index 000..0b72f1ba17c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/attr_flatten-1.f90
@@ -0,0 +1,41 @@
+! { dg-do compile }
+! { dg-additional-options "-fdump-tree-optimized" }
+! Test __attribute__((flatten))
+!
+module attr_flttn_1_a
+  implicit none
+contains
+  subroutine sub1(i)
+integer, intent(in) :: i
+integer :: n
+do n = 1, i
+  print *, "marker1 ", i, i+n;
+enddo
+  end
+  subroutine sub2(i)
+integer, intent(in) :: i
+integer :: n
+do n = 1, i
+  print *, "marker2 ", i, i*i-n;
+enddo
+  end
+end module
+module attr_flttn_1_b
+  use attr_flttn_1_a
+contains
+  subroutine sub3
+!GCC$ ATTRIBUTES flatten :: sub3
+print *, "marker3 "
+call sub2(4711)
+call sub1(42)
+  end
+end module
+! Without the attribute flatten we would have 1 character write for each 
marker.
+! That would be 3 _gfortran_transfer_character_write.*marker
+! With the attribute, we have one for each sub plus marker1 and marker2
+! which were inlined into sub3.
+! So this gives 5 _gfortran_transfer_character_write.*marker
+! and there should be no calls to sub1 (); nor sub2 ();
+! { dg-final { scan-tree-dump-times { _gfortran_transfer_character_write 
.*?marker} 5 "optimized" } }
+! { dg-final { scan-tree-dump-not { sub1 \([^\)][^\)]*\);} "optimized" } }
+! { dg-final { scan-tree-dump-not { sub2 \([^\)][^\)]*\);} "optimized" } }
-- 
2.38.1
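
For comparison with the Fortran test above, a minimal sketch of the same attribute in its C spelling, `__attribute__ ((flatten))`. The functions `square`, `add_one` and `combined` are hypothetical and only serve as an illustration; whether the callees are actually inlined still depends on the compiler and on the optimization options in use.

```c
#include <assert.h>

/* Two small hypothetical helpers that are candidates for inlining.  */
static int square (int x) { return x * x; }
static int add_one (int x) { return x + 1; }

/* With flatten, the compiler is asked to inline the callees of this
   function into its body, if possible.  */
__attribute__ ((flatten))
int combined (int x)
{
  assert (x >= 0 && x < 1000);  /* keep the arithmetic far from overflow */
  return add_one (square (x));
}
```

The attribute does not change the result, only how the call tree is compiled; `combined (3)` evaluates to 10 with or without it.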



[PATCH 1/2] Fortran: Cleanup struct ext_attr_t

2022-11-10 Thread Bernhard Reutner-Fischer via Gcc-patches
Tiny cleanup opportunity since we now have ext_attr_args in
struct symbol_attribute.
Bootstrapped and regtested on x86_64-unknown-linux with no new
regressions.
Ok for trunk if the prerequisite was approved ([PATCH 2/2] Fortran: add
attribute target_clones) ?

gcc/fortran/ChangeLog:

* gfortran.h (struct ext_attr_t): Remove middle_end_name.
* trans-decl.cc (add_attributes_to_decl): Move building
tree_list to ...
* decl.cc (gfc_match_gcc_attributes): ... here. Add the attribute to
the tree_list for the middle end.

Cc: gfortran ML 
---
 gcc/fortran/decl.cc   | 35 +++
 gcc/fortran/gfortran.h|  1 -
 gcc/fortran/trans-decl.cc | 13 +
 3 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/gcc/fortran/decl.cc b/gcc/fortran/decl.cc
index 3a619dbdd34..d312d4812b6 100644
--- a/gcc/fortran/decl.cc
+++ b/gcc/fortran/decl.cc
@@ -11802,15 +11802,15 @@ gfc_match_gcc_attribute_args (bool require_string, 
bool allow_multiple)
 }
 
 const ext_attr_t ext_attr_list[] = {
-  { "dllimport",EXT_ATTR_DLLIMPORT,"dllimport" },
-  { "dllexport",EXT_ATTR_DLLEXPORT,"dllexport" },
-  { "cdecl",EXT_ATTR_CDECL,"cdecl" },
-  { "stdcall",  EXT_ATTR_STDCALL,  "stdcall"   },
-  { "fastcall", EXT_ATTR_FASTCALL, "fastcall"  },
-  { "no_arg_check", EXT_ATTR_NO_ARG_CHECK, NULL},
-  { "deprecated",   EXT_ATTR_DEPRECATED,   NULL   },
-  { "target_clones",EXT_ATTR_TARGET_CLONES,NULL   },
-  { NULL,   EXT_ATTR_LAST, NULL}
+  { "dllimport",EXT_ATTR_DLLIMPORT },
+  { "dllexport",EXT_ATTR_DLLEXPORT },
+  { "cdecl",EXT_ATTR_CDECL },
+  { "stdcall",  EXT_ATTR_STDCALL   },
+  { "fastcall", EXT_ATTR_FASTCALL, },
+  { "no_arg_check", EXT_ATTR_NO_ARG_CHECK  },
+  { "deprecated",   EXT_ATTR_DEPRECATED},
+  { "target_clones",EXT_ATTR_TARGET_CLONES },
+  { NULL,   EXT_ATTR_LAST  }
 };
 
 /* Match a !GCC$ ATTRIBUTES statement of the form:
@@ -11854,6 +11854,20 @@ gfc_match_gcc_attributes (void)
  gfc_error ("Unknown attribute in !GCC$ ATTRIBUTES statement at %C");
  return MATCH_ERROR;
}
+
+  /* Check for errors.
+If everything is fine, add attributes the middle-end has to know about.
+   */
+  if (!gfc_add_ext_attribute (&attr, (ext_attr_id_t)id, 
&gfc_current_locus))
+   return MATCH_ERROR;
+  else if (id == EXT_ATTR_DLLIMPORT
+  || id == EXT_ATTR_DLLEXPORT
+  || id == EXT_ATTR_CDECL
+  || id == EXT_ATTR_STDCALL
+  || id == EXT_ATTR_FASTCALL)
+   attr.ext_attr_args
+ = chainon (attr.ext_attr_args,
+build_tree_list (get_identifier (name), NULL_TREE));
   else if (id == EXT_ATTR_TARGET_CLONES)
{
  attr_args
@@ -11864,9 +11878,6 @@ gfc_match_gcc_attributes (void)
 build_tree_list (get_identifier (name), attr_args));
}
 
-  if (!gfc_add_ext_attribute (&attr, (ext_attr_id_t)id, 
&gfc_current_locus))
-   return MATCH_ERROR;
-
   gfc_gobble_whitespace ();
   ch = gfc_next_ascii_char ();
   if (ch == ':')
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index ce0cb61e647..c4deec0d5b8 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -847,7 +847,6 @@ typedef struct
 {
   const char *name;
   unsigned id;
-  const char *middle_end_name;
 }
 ext_attr_t;
 
diff --git a/gcc/fortran/trans-decl.cc b/gcc/fortran/trans-decl.cc
index 24cbd4cda28..7d5d2bdbb37 100644
--- a/gcc/fortran/trans-decl.cc
+++ b/gcc/fortran/trans-decl.cc
@@ -1436,18 +1436,7 @@ gfc_add_assign_aux_vars (gfc_symbol * sym)
 static tree
 add_attributes_to_decl (symbol_attribute sym_attr, tree list)
 {
-  unsigned id;
-  tree attr;
-
-  for (id = 0; id < EXT_ATTR_NUM; id++)
-if (sym_attr.ext_attr & (1 << id) && ext_attr_list[id].middle_end_name)
-  {
-   attr = build_tree_list (
-get_identifier (ext_attr_list[id].middle_end_name),
-NULL_TREE);
-   list = chainon (list, attr);
-  }
-  /* Add attribute args.  */
+  /* Add attributes and their arguments.  */
   if (sym_attr.ext_attr_args != NULL_TREE)
 list = chainon (list, sym_attr.ext_attr_args);
 
-- 
2.38.1



[PATCH 0/2] Fortran: Add attribute flatten

2022-11-10 Thread Bernhard Reutner-Fischer via Gcc-patches
Hi!

I could imagine that the flatten attribute might be useful.
Do we want to add support for it for gcc-13?

Bernhard Reutner-Fischer (2):
  Fortran: Cleanup struct ext_attr_t
  Fortran: Add attribute flatten

 gcc/fortran/decl.cc  | 41 +---
 gcc/fortran/f95-lang.cc  |  2 +
 gcc/fortran/gfortran.h   |  1 -
 gcc/fortran/gfortran.texi|  8 
 gcc/fortran/trans-decl.cc| 13 +--
 gcc/testsuite/gfortran.dg/attr_flatten-1.f90 | 41 
 6 files changed, 80 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/attr_flatten-1.f90

-- 
2.38.1



Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Gerald Pfeifer
On Thu, 10 Nov 2022, Martin Liška wrote:
>> https://gcc.gnu.org/install/ is back with a new face.
> But it's not working properly due to some Content Security Policy:

Hmm, it worked in my testing before and I just tried again: 

Firefox 106.0.1 (64-bit) and now also Chrome 106.0.5249.119 
and w3m.

Which browser are you using? Any particular add-ons or special security 
settings?

> Refused to apply inline style because it violates the following Content
> Security Policy directive: "default-src 'self' http: https:". Either the
> 'unsafe-inline' keyword, a hash
> ('sha256-wAI2VKPX8IUBbq55XacEljWEKQc4Xc1nmwVsAjAplNU='), or a nonce
> ('nonce-...') is required to enable inline execution. Note also that
> 'style-src' was not explicitly set, so 'default-src' is used as a fallback.

That looks like it's related to some Javascript fun? Does sphinx pull in 
something? O, it does. A lot. 

I'm not using any Javascript blocker, though, so not sure why I am not
seeing any such warnings?

Searching for "+sphinx" and this message did not result in anything.

(It feels a bit curious how the position in the web server's file system 
or a symlink could trigger something like that?)


Looking at the source code of index.html I am wondering about

  

versus all the .js inclusions later on.

And https://validator.w3.org/nu/?doc=https%3A%2F%2Fgcc.gnu.org%2Finstall%2F
and 
https://validator.w3.org/nu/?doc=https%3A%2F%2Fgcc.gnu.org%2Fonlinedocs%2Finstall%2F
appear equally (un)happy.

Gerald


[committed] wwwdocs: readings: Remove linux-c6x.org

2022-11-10 Thread Gerald Pfeifer
I pushed this now.

Is this an indication we should deprecate the port, Bernd, or would you 
like to re-add a link at a later point?

Gerald


linux-c6x.org has been dead for at least two-and-a-half months and a web
search did not reveal a good alternate site.
---
 htdocs/readings.html | 1 -
 1 file changed, 1 deletion(-)

diff --git a/htdocs/readings.html b/htdocs/readings.html
index df89bc9c..5e3db8c2 100644
--- a/htdocs/readings.html
+++ b/htdocs/readings.html
@@ -105,7 +105,6 @@ names.
  C6X
Manufacturer: Texas Instruments
Exact chip name: TMS320C6X
-   http://linux-c6x.org/;>Site for the Linux on C6X project
  
 
  CR16
-- 
2.38.0


Re: [PATCH] i386: Fix up ix86_expand_int_sse_cmp [PR107585]

2022-11-10 Thread Uros Bizjak via Gcc-patches
On Thu, Nov 10, 2022 at 10:29 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The following patch fixes ICE on the testcase.  I've used GEN_INT
> incorrectly thinking the code punts on the problematic boundaries.
> It does, but only for LE and GE, i.e. signed comparisons, for unsigned
> the boundaries are 0 and unsigned maximum, so when say unsigned char
> adds one to 127 or subtracts one from 128 we need to canonicalize it.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
> ok for trunk?
>
> 2022-11-10  Jakub Jelinek  
>
> PR target/107585
> PR target/107546
> * config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): Use
> gen_int_mode rather than GEN_INT.
>
> * gcc.dg/pr107585.c: New test.

OK as an obvious patch.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-expand.cc.jj   2022-11-08 12:21:48.704047171 +0100
> +++ gcc/config/i386/i386-expand.cc  2022-11-09 14:40:12.157012775 +0100
> @@ -4540,7 +4540,8 @@ ix86_expand_int_sse_cmp (rtx dest, enum
>   rtvec v = rtvec_alloc (n_elts);
>   for (i = 0; i < n_elts; ++i)
> RTVEC_ELT (v, i)
> - = GEN_INT (INTVAL (CONST_VECTOR_ELT (cop1, i)) + 1);
> + = gen_int_mode (INTVAL (CONST_VECTOR_ELT (cop1, i)) + 1,
> + eltmode);
>   cop1 = gen_rtx_CONST_VECTOR (mode, v);
>   std::swap (cop0, cop1);
>   code = code == LE ? GT : GTU;
> @@ -4584,7 +4585,8 @@ ix86_expand_int_sse_cmp (rtx dest, enum
>   rtvec v = rtvec_alloc (n_elts);
>   for (i = 0; i < n_elts; ++i)
> RTVEC_ELT (v, i)
> - = GEN_INT (INTVAL (CONST_VECTOR_ELT (cop1, i)) - 1);
> + = gen_int_mode (INTVAL (CONST_VECTOR_ELT (cop1, i)) - 1,
> + eltmode);
>   cop1 = gen_rtx_CONST_VECTOR (mode, v);
>   code = code == GE ? GT : GTU;
>   break;
> --- gcc/testsuite/gcc.dg/pr107585.c.jj  2022-11-09 14:52:37.554779118 +0100
> +++ gcc/testsuite/gcc.dg/pr107585.c 2022-11-09 14:48:24.063258991 +0100
> @@ -0,0 +1,13 @@
> +/* PR target/107585 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +typedef unsigned char __attribute__((__vector_size__ (16))) V;
> +char c;
> +void bar (int);
> +
> +void
> +foo (void)
> +{
> +  bar (((V) (c <= (V){127}))[2]);
> +}
>
> Jakub
>


Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Martin Liška

On 11/10/22 10:35, Gerald Pfeifer wrote:
> On Thu, 10 Nov 2022, Martin Liška wrote:
>>> What do you think of keeping the latest under this shorter and simpler
>>> URL (too), though?
>> Works for me.
> :
>>> I believe a symlink (in the file system) on gcc.gnu.org could pull that
>>> off.
>> Yep, please do so.
>
> Done.
>
> https://gcc.gnu.org/install/ is back with a new face.


But it's not working properly due to some Content Security Policy:

Refused to apply inline style because it violates the following Content Security Policy 
directive: "default-src 'self' http: https:". Either the 'unsafe-inline' 
keyword, a hash ('sha256-wAI2VKPX8IUBbq55XacEljWEKQc4Xc1nmwVsAjAplNU='), or a nonce 
('nonce-...') is required to enable inline execution. Note also that 'style-src' was not 
explicitly set, so 'default-src' is used as a fallback.

gcc.gnu.org/:42 Refused to execute inline script because it violates the following 
Content Security Policy directive: "default-src 'self' http: https:". Either 
the 'unsafe-inline' keyword, a hash 
('sha256-ySvT2PEZeueHGC1y2crNuNTfphBynFPP7i+U21fEgX0='), or a nonce ('nonce-...') is 
required to enable inline execution. Note also that 'script-src' was not explicitly set, 
so 'default-src' is used as a fallback.

gcc.gnu.org/:47 Refused to apply inline style because it violates the following Content 
Security Policy directive: "default-src 'self' http: https:". Either the 
'unsafe-inline' keyword, a hash ('sha256-biLFinpqYMtWHmXfkA1BPeCY0/fNt46SAZ+BBk5YUog='), 
or a nonce ('nonce-...') is required to enable inline execution. Note that hashes do not 
apply to event handlers, style attributes and javascript: navigations unless the 
'unsafe-hashes' keyword is present. Note also that 'style-src' was not explicitly set, so 
'default-src' is used as a fallback.

gcc.gnu.org/:202 Refused to load the image 'data:image/svg+xml;charset=utf-8,http://www.w3.org/2000/svg; viewBox="0 0 24 24" stroke-width="1.5" stroke="currentColor" fill="none" 
stroke-linecap="round" stroke-linejoin="round">' because it violates the following Content Security Policy directive: "default-src 'self' http: https:". Note that 'img-src' was not explicitly set, so 'default-src' is used as a fallback.


Can you please take a look at it?



> Will you be reverting the link adjustments back from /onlinedocs/install/
> to plain /install/ ?

Yes, I can do that.

>>> Note how in style.mhtml we have some special provisions for install/.
>>>
>>> Over the last years I have reduced those to a large extent. There is still
>>> a little bit post-processing going on right now including setting our CSS
>>> and our favicon.
>> Well, the entire content of gcc.gnu.org/onlinedocs/install/ is *one* of
>> our documentations and there should not be anything special about it.
>> Does it make sense?
>
> Yes, things have evolved historically and there was a time we
> needed/wanted to treat /install especially, for example to retain
> the same (white) background color across.
>
> By now, if we are to make changes, we probably should rather make them
> across all of /onlinedocs - favicon and our CSS being two such changes.
> Not a critical priority, though, I guess.
>
> Gerald




Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Gerald Pfeifer
On Thu, 10 Nov 2022, Martin Liška wrote:
>> What do you think of keeping the latest under this shorter and simpler
>> URL (too), though?
> Works for me.
:
>> I believe a symlink (in the file system) on gcc.gnu.org could pull that
>> off.
> Yep, please do so.

Done.

https://gcc.gnu.org/install/ is back with a new face.

Will you be reverting the link adjustments back from /onlinedocs/install/
to plain /install/ ?


>> Note how in style.mhtml we have some special provisions for install/.
>> 
>> Over the last years I have reduced those to a large extent. There is still
>> a little bit post-processing going on right now including setting our CSS
>> and our favicon.
> Well, the entire content of gcc.gnu.org/onlinedocs/install/ is *one* of 
> our documentations and there should not be anything special about it. 
> Does it make sense?

Yes, things have evolved historically and there was a time we 
needed/wanted to treat /install especially, for example to retain 
the same (white) background color across.

By now, if we are to make changes, we probably should rather make them 
across all of /onlinedocs - favicon and our CSS being two such changes.
Not a critical priority, though, I guess.

Gerald


[PATCH] i386: Fix up ix86_expand_int_sse_cmp [PR107585]

2022-11-10 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch fixes ICE on the testcase.  I've used GEN_INT
incorrectly thinking the code punts on the problematic boundaries.
It does, but only for LE and GE, i.e. signed comparisons, for unsigned
the boundaries are 0 and unsigned maximum, so when say unsigned char
adds one to 127 or subtracts one from 128 we need to canonicalize it.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?
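
To illustrate the canonicalization issue described above: an RTL integer
constant is expected to be sign-extended from the precision of its mode,
which is what gen_int_mode ensures and a bare GEN_INT does not. The
following is a minimal C sketch of that truncation step, not GCC code;
the helper name trunc_to_bits is hypothetical and only stands in for the
sign-extension that gen_int_mode performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the canonicalization gen_int_mode applies:
   keep the low BITS bits of VALUE and sign-extend them to 64 bits.  */
static int64_t trunc_to_bits(int64_t value, unsigned bits)
{
    if (bits >= 64)
        return value;

    uint64_t mask = (UINT64_C(1) << bits) - 1;   /* low BITS bits set */
    uint64_t sign = UINT64_C(1) << (bits - 1);   /* sign bit of the narrow type */
    uint64_t low = (uint64_t)value & mask;

    /* Classic sign-extension trick: flip the sign bit, then subtract it.  */
    return (int64_t)((low ^ sign) - sign);
}
```

For an 8-bit element, 127 + 1 becomes -128 after this step and 128 - 1
stays 127, while values already in the signed range are unchanged.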

2022-11-10  Jakub Jelinek  

PR target/107585
PR target/107546
* config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): Use
gen_int_mode rather than GEN_INT.

* gcc.dg/pr107585.c: New test.

--- gcc/config/i386/i386-expand.cc.jj   2022-11-08 12:21:48.704047171 +0100
+++ gcc/config/i386/i386-expand.cc  2022-11-09 14:40:12.157012775 +0100
@@ -4540,7 +4540,8 @@ ix86_expand_int_sse_cmp (rtx dest, enum
  rtvec v = rtvec_alloc (n_elts);
  for (i = 0; i < n_elts; ++i)
RTVEC_ELT (v, i)
- = GEN_INT (INTVAL (CONST_VECTOR_ELT (cop1, i)) + 1);
+ = gen_int_mode (INTVAL (CONST_VECTOR_ELT (cop1, i)) + 1,
+ eltmode);
  cop1 = gen_rtx_CONST_VECTOR (mode, v);
  std::swap (cop0, cop1);
  code = code == LE ? GT : GTU;
@@ -4584,7 +4585,8 @@ ix86_expand_int_sse_cmp (rtx dest, enum
  rtvec v = rtvec_alloc (n_elts);
  for (i = 0; i < n_elts; ++i)
RTVEC_ELT (v, i)
- = GEN_INT (INTVAL (CONST_VECTOR_ELT (cop1, i)) - 1);
+ = gen_int_mode (INTVAL (CONST_VECTOR_ELT (cop1, i)) - 1,
+ eltmode);
  cop1 = gen_rtx_CONST_VECTOR (mode, v);
  code = code == GE ? GT : GTU;
  break;
--- gcc/testsuite/gcc.dg/pr107585.c.jj  2022-11-09 14:52:37.554779118 +0100
+++ gcc/testsuite/gcc.dg/pr107585.c 2022-11-09 14:48:24.063258991 +0100
@@ -0,0 +1,13 @@
+/* PR target/107585 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+typedef unsigned char __attribute__((__vector_size__ (16))) V;
+char c;
+void bar (int);
+
+void
+foo (void)
+{
+  bar (((V) (c <= (V){127}))[2]);
+}

Jakub



Re: [PATCH] match.pd: rewrite select to branchless expression

2022-11-10 Thread Richard Biener via Gcc-patches
On Wed, Nov 9, 2022 at 10:06 PM Michael Collison  wrote:
>
> Richard,
>
> Thanks for your feedback. I want to make sure I am following what you
> are recommending. Are you suggesting changing:
>
> (for op (bit_xor bit_ior)
> (simplify
> (cond (eq (bit_and @0 integer_onep@1)
> integer_zerop)
> @2
> (op:c @3 @2))
> (if (INTEGRAL_TYPE_P (type)
> && (INTEGRAL_TYPE_P (TREE_TYPE (@0
> (op (bit_and (negate (convert:type (bit_and @0 @1))) @3) @2
>
>
> to
>
> (for op (bit_xor bit_ior)
>   (simplify
>(cond (eq zero_one_valued_p@0
>  integer_zerop)
>  @1
>  (op:c @2 @1))
>(if (INTEGRAL_TYPE_P (type)
> && (INTEGRAL_TYPE_P (TREE_TYPE (@0
> (op (bit_and (negate (convert:type (bit_and @0 { build_one_cst
> (type); }))) @2) @1

in the replacement you'd simply use

 (op (bit_and (negate (convert:type @0)) @2) @1)))

that is, convert the [0,1] valued @0 to 'type' directly.  At least I can't see
how that can go wrong?
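
The simplified replacement can be sanity-checked outside the compiler.
The following is a minimal C sketch, not part of the patch; all function
names are made up for illustration. It compares the branchy form with the
branchless form for a condition that is known to be 0 or 1, for both '^'
and '|'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy forms matched by the pattern; b is known to be 0 or 1.  */
static uint32_t xor_branchy(uint32_t b, uint32_t y, uint32_t z)
{
    return (b == 0) ? y : (z ^ y);
}

static uint32_t ior_branchy(uint32_t b, uint32_t y, uint32_t z)
{
    return (b == 0) ? y : (z | y);
}

/* Branchless forms: negating b yields all-zeros for 0 and all-ones
   for 1, so the mask either drops z or keeps it unchanged.  */
static uint32_t xor_branchless(uint32_t b, uint32_t y, uint32_t z)
{
    return ((uint32_t)(0u - b) & z) ^ y;
}

static uint32_t ior_branchless(uint32_t b, uint32_t y, uint32_t z)
{
    return ((uint32_t)(0u - b) & z) | y;
}

/* Returns 1 if both forms agree on a small set of sample inputs.  */
static int forms_agree(void)
{
    static const uint32_t samples[] = { 0u, 1u, 0x55aa55aau, 0xffffffffu };
    const size_t n = sizeof samples / sizeof samples[0];

    for (uint32_t b = 0; b <= 1; b++)
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++) {
                uint32_t y = samples[i], z = samples[j];
                if (xor_branchy(b, y, z) != xor_branchless(b, y, z))
                    return 0;
                if (ior_branchy(b, y, z) != ior_branchless(b, y, z))
                    return 0;
            }
    return 1;
}
```

The sketch uses a 32-bit unsigned type, so the negation is well defined
and the mask is the full width of the operands.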

>
>
> On 11/9/22 02:41, Richard Biener wrote:
> > On Tue, Nov 8, 2022 at 9:02 PM Michael Collison  
> > wrote:
> >> This patch transforms (cond (and (x , 0x1) == 0), y, (z op y)) into
> >> (-(and (x , 0x1)) & z ) op y, where op is a '^' or a '|'. It also
> >> transforms (cond (and (x , 0x1) != 0), (z op y), y ) into (-(and (x ,
> >> 0x1)) & z ) op y.
> >>
> >> Matching these patterns allows GCC to generate branchless code for one of
> >> the functions in coremark.
> >>
> >> Bootstrapped and tested on x86 and RISC-V. Okay?
> >>
> >> Michael.
> >>
> >> 2022-11-08  Michael Collison  
> >>
> >>   * match.pd ((cond (and (x , 0x1) == 0), y, (z op y) )
> >>   -> (-(and (x , 0x1)) & z ) op y)
> >>
> >> 2022-11-08  Michael Collison  
> >>
> >>   * gcc.dg/tree-ssa/branchless-cond.c: New test.
> >>
> >> ---
> >>gcc/match.pd  | 22 
> >>.../gcc.dg/tree-ssa/branchless-cond.c | 26 +++
> >>2 files changed, 48 insertions(+)
> >>create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> >>
> >> diff --git a/gcc/match.pd b/gcc/match.pd
> >> index 194ba8f5188..722f517ac6d 100644
> >> --- a/gcc/match.pd
> >> +++ b/gcc/match.pd
> >> @@ -3486,6 +3486,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >>  (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
> >>  (max @2 @1))
> >>
> >> +/* (cond (and (x , 0x1) == 0), y, (z ^ y) ) -> (-(and (x , 0x1)) & z )
> >> ^ y */
> > Please write the match as a C expression in the comment, as present
> > it's a weird mix.  So x & 0x1 == 0 ? y : z <op> y -> (-(typeof(y))(x &
> > 0x1) & z) <op> y
> >
> >> +(for op (bit_xor bit_ior)
> >> + (simplify
> >> +  (cond (eq (bit_and @0 integer_onep@1)
> >> +integer_zerop)
> >> +@2
> >> +(op:c @3 @2))
> >> +  (if (INTEGRAL_TYPE_P (type)
> >> +   && (INTEGRAL_TYPE_P (TREE_TYPE (@0
> >> +   (op (bit_and (negate (convert:type (bit_and @0 @1))) @3) @2
> > Since you are literally keeping (bit_and @0 @1) and not matching @0 with
> > anything I suspect you could instead use
> >
> >   (simplify (cond (eq zero_one_valued_p@0 integer_zerop) ...
> >
> > eventually extending that to cover bit_and with one.  Do you need to guard
> > this against 'type' being a signed/unsigned 1-bit precision integer?
> >
> >> +
> >> +/* (cond (and (x , 0x1) != 0), (z ^ y), y ) -> (-(and (x , 0x1)) & z )
> >> ^ y */
> >> +(for op (bit_xor bit_ior)
> >> + (simplify
> >> +  (cond (ne (bit_and @0 integer_onep@1)
> >> +integer_zerop)
> >> +(op:c @3 @2)
> >> +@2)
> >> +  (if (INTEGRAL_TYPE_P (type)
> >> +   && (INTEGRAL_TYPE_P (TREE_TYPE (@0
> >> +   (op (bit_and (negate (convert:type (bit_and @0 @1))) @3) @2
> >> +
> >>/* Simplifications of shift and rotates.  */
> >>
> >>(for rotate (lrotate rrotate)
> >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> >> b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> >> new file mode 100644
> >> index 000..68087ae6568
> >> --- /dev/null
> >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/branchless-cond.c
> >> @@ -0,0 +1,26 @@
> >> +/* { dg-do compile } */
> >> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> >> +
> >> +int f1(unsigned int x, unsigned int y, unsigned int z)
> >> +{
> >> +  return ((x & 1) == 0) ? y : z ^ y;
> >> +}
> >> +
> >> +int f2(unsigned int x, unsigned int y, unsigned int z)
> >> +{
> >> +  return ((x & 1) != 0) ? z ^ y : y;
> >> +}
> >> +
> >> +int f3(unsigned int x, unsigned int y, unsigned int z)
> >> +{
> >> +  return ((x & 1) == 0) ? y : z | y;
> >> +}
> >> +
> >> +int f4(unsigned int x, unsigned int y, unsigned int z)
> >> +{
> >> +  return ((x & 1) != 0) ? z | y : y;
> >> +}
> >> +
> >> +/* { dg-final { scan-tree-dump-times " -" 4 "optimized" } } */
> >> +/* { dg-final { scan-tree-dump-times " & " 8 "optimized" } } */
> >> +/* { dg-final { scan-tree-dump-not "if" "optimized" } } */
> >> --
> >> 2.34.1
> >>
> >>
> >>
> >>


Re: [PATCH] Optimize VEC_PERM_EXPR with same permutation index and operation [PR98167]

2022-11-10 Thread Richard Biener via Gcc-patches
On Thu, Nov 10, 2022 at 3:27 AM Hongyu Wang  wrote:
>
> Hi Prathamesh and Richard,
>
> Thanks for the review and nice suggestions!
>
> > > I guess the transform should work as long as mask is same for both
> > > vectors even if it's
> > > not constant ?
> >
> > Yes, please change accordingly (and maybe push separately).
> >
>
> Removed VECTOR_CST for integer ops.
>
> > > If this transform is meant only for VLS vectors, I guess you should
> > > bail out if TYPE_VECTOR_SUBPARTS is not constant,
> > > otherwise it will crash for VLA vectors.
> >
> > I suppose it's difficult to create a VLA permute that covers all elements
> > and that is not trivial though.  But indeed add ().is_constant to the
> > VECTOR_FLOAT_TYPE_P guard.
>
> Added.
>
> > Meh, that's quadratic!  I suggest to check .encoding 
> > ().encoded_full_vector_p ()
> > (as said I can't think of a non-full encoding that isn't trivial
> > but covers all elements) and then simply .qsort () the vector_builder
> > (it derives
> > from vec<>) so the scan is O(n log n).
>
> The .qsort () approach requires an extra cmp_func that IMO would not
> be feasible to be implemented in match.pd (I suppose lambda function
> would not be a good idea either).
> Another solution would be using hash_set but it does not work here for
> int64_t or poly_int64 type.
> So I kept current O(n^2) simple code here, and I suppose usually the
> permutation indices would be a small number even for O(n^2)
> complexity.

Well, with AVX512 v64qi that's 64*64 == 4096 cases to check.  I think
a lambda function is fine to use.  The alternative (used by the vectorizer
in some places) is to use sth like

 auto_sbitmap seen (nelts);
 for (i = 0; i < nelts; i++)
   {
     if (!bitmap_set_bit (seen, i))
       break;
     count++;
   }
 full_perm_p = count == nelts;

I'll note that you should still check .encoding ().encoded_full_vector_p ()
and only bother to check that case, that's a very simple check.
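
The linear-time idea behind the bitmap suggestion can be sketched in
plain C. This is not GCC code: a bool array stands in for auto_sbitmap,
the names and the MAX_ELTS bound are hypothetical, and it assumes the
intent is to mark each *selected* index (already reduced to the range
[0, nelts)) rather than the loop counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ELTS 64  /* hypothetical bound, e.g. a 64-element byte vector */

/* Returns true when SEL[0..NELTS) names every index in [0, NELTS)
   exactly once, i.e. the permutation covers all vector elements.
   Runs in O(NELTS) instead of comparing every pair of indices.  */
static bool covers_all_elements(const unsigned *sel, size_t nelts)
{
    bool seen[MAX_ELTS];

    if (nelts > MAX_ELTS)
        return false;
    memset(seen, 0, sizeof seen);

    for (size_t i = 0; i < nelts; i++) {
        unsigned idx = sel[i];
        /* An out-of-range or repeated index means some element is missed.  */
        if (idx >= nelts || seen[idx])
            return false;
        seen[idx] = true;
    }
    return true;
}
```

A reversed selector such as {3, 2, 1, 0} passes, while {0, 0, 1, 2}
fails because element 3 is never selected.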

>
> Attached updated patch.
>
> On Tue, Nov 8, 2022 at 22:38, Richard Biener via Gcc-patches wrote:
>
>
> >
> > On Fri, Nov 4, 2022 at 7:44 AM Prathamesh Kulkarni via Gcc-patches
> >  wrote:
> > >
> > > On Fri, 4 Nov 2022 at 05:36, Hongyu Wang via Gcc-patches
> > >  wrote:
> > > >
> > > > Hi,
> > > >
> > > > This is a follow-up patch for PR98167
> > > >
> > > > The sequence
> > > >  c1 = VEC_PERM_EXPR (a, a, mask)
> > > >  c2 = VEC_PERM_EXPR (b, b, mask)
> > > >  c3 = c1 op c2
> > > > can be optimized to
> > > >  c = a op b
> > > >  c3 = VEC_PERM_EXPR (c, c, mask)
> > > > for all integer vector operation, and float operation with
> > > > full permutation.
> > > >
> > > > Bootstrapped & regtested on x86_64-pc-linux-gnu.
> > > >
> > > > Ok for trunk?
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/98167
> > > > * match.pd: New perm + vector op patterns for int and fp vector.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/98167
> > > > * gcc.target/i386/pr98167.c: New test.
> > > > ---
> > > >  gcc/match.pd| 49 +
> > > >  gcc/testsuite/gcc.target/i386/pr98167.c | 44 ++
> > > >  2 files changed, 93 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr98167.c
> > > >
> > > > diff --git a/gcc/match.pd b/gcc/match.pd
> > > > index 194ba8f5188..b85ad34f609 100644
> > > > --- a/gcc/match.pd
> > > > +++ b/gcc/match.pd
> > > > @@ -8189,3 +8189,52 @@ and,
> > > >   (bit_and (negate @0) integer_onep@1)
> > > >   (if (!TYPE_OVERFLOW_SANITIZED (type))
> > > >(bit_and @0 @1)))
> > > > +
> > > > +/* Optimize
> > > > +   c1 = VEC_PERM_EXPR (a, a, mask)
> > > > +   c2 = VEC_PERM_EXPR (b, b, mask)
> > > > +   c3 = c1 op c2
> > > > +   -->
> > > > +   c = a op b
> > > > +   c3 = VEC_PERM_EXPR (c, c, mask)
> > > > +   For all integer non-div operations.  */
> > > > +(for op (plus minus mult bit_and bit_ior bit_xor
> > > > +lshift rshift)
> > > > + (simplify
> > > > +  (op (vec_perm @0 @0 VECTOR_CST@2) (vec_perm @1 @1 VECTOR_CST@2))
> > > > +(if (VECTOR_INTEGER_TYPE_P (type))
> > > > + (vec_perm (op @0 @1) (op @0 @1) @2
> > > Just wondering, why should mask be CST here ?
> > > I guess the transform should work as long as mask is same for both
> > > vectors even if it's
> > > not constant ?
> >
> > Yes, please change accordingly (and maybe push separately).
> >
> > > > +
> > > > +/* Similar for float arithmetic when permutation constant covers
> > > > +   all vector elements.  */
> > > > +(for op (plus minus mult)
> > > > + (simplify
> > > > +  (op (vec_perm @0 @0 VECTOR_CST@2) (vec_perm @1 @1 VECTOR_CST@2))
> > > > +(if (VECTOR_FLOAT_TYPE_P (type))
> > > > + (with
> > > > +  {
> > > > +   tree perm_cst = @2;
> > > > +   vec_perm_builder builder;
> > > > +   bool full_perm_p = false;
> > > > +   if (tree_to_vec_perm_builder (, perm_cst))
> > > > + {
> > > > +   /* Create a 

Re: [DOCS] sphinx: use new Sphinx links

2022-11-10 Thread Martin Liška

On 11/10/22 09:28, Gerald Pfeifer wrote:

Hi Martin,

On Wed, 9 Nov 2022, Martin Liška wrote:

Gerald I would like to ask you for further server actions related
to the Sphinx documentation:


sure, happy to help!


1) https://gcc.gnu.org/install/ - for the future we will use
https://gcc.gnu.org/onlinedocs/install/


That's a (fair) bit longer and more complex URL. I understand the point
about cross referencing older GCC releases and see benefits with that.

What do you think of keeping the latest under this shorter and simpler
URL (too), though?


Hello.

Works for me.



I believe a symlink (in the file system) on gcc.gnu.org could pull that
off.


Yep, please do so.





So please remove content of /www/gcc/htdocs-preformatted/install


That'll make some things simpler.

Note how in style.mhtml we have some special provisions for install/.

Over the last years I have reduced those to a large extent. There is still
a little bit post-processing going on right now including setting our CSS
and our favicon.

Should we see how to move those over to the new setup, or would you drop
that?


Well, the entire content of gcc.gnu.org/onlinedocs/install/ is *one* of our
documentations and there should not be anything special about it.
Does it make sense?

Cheers,
Martin



Gerald




Re: [PATCH v2 2/4] LoongArch: Add ftint{,rm,rp}.{w,l}.{s,d} instructions

2022-11-10 Thread Xi Ruoyao via Gcc-patches
On Thu, 2022-11-10 at 14:41 +0800, Lulu Cheng wrote:
> 
> On 2022/11/9 at 9:53 PM, Xi Ruoyao wrote:
> > +;; Convert floating-point numbers to integers
> > +(define_insn "2"
> > +  [(set (match_operand:ANYFI 0 "register_operand" "=f")
> > +   (unspec:ANYFI [(match_operand:ANYF 1 "register_operand"
> > "f")]
> > + LRINT))]
> > +  "TARGET_HARD_FLOAT &&
> > +   (
> > +    || flag_fp_int_builtin_inexact
> > +    || !flag_trapping_math)"
> > 
> +    || !flag_trapping_math
> 
> I think this condition is backwards.

I copied the logic from aarch64.md:6702.

Joseph: can you confirm that -ftrapping-math allows floor and ceil to
raise inexact exception?  The man page currently says:

The default is -ffp-int-builtin-inexact, allowing the exception to be
raised, unless C2X or a later C standard is selected.  This option does 
   ^^^
nothing unless -ftrapping-math is in effect.

To me it's not very clear whether "this option" stands for
-fno-fp-int-builtin-inexact or -ffp-int-builtin-inexact.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [RFC] docs: remove documentation for unsupported releases

2022-11-10 Thread Martin Liška

On 11/10/22 08:29, Gerald Pfeifer wrote:

On Wed, 9 Nov 2022, Alexander Monakov wrote:

For this I would suggest using the  tag to neatly fold links
for old releases. Please see the attached patch.


Loving it, Alexander!

What do you guys think about unfolding all releases we, the GCC project,
currently support (per https://gcc.gnu.org that'd be 12.x, 11.x, and 10.x
at this point)?


Works for me!



Either way: yes, please (aka approved). :-)


Alexander, can you please install such change?

Thanks,
Martin



On Wed, 9 Nov 2022, Martin Liška wrote:

I do support the patch. It should help with the Google indexing (maybe).


And thank you for raising this, Martin!

Gerald



