[PATCH 1/9, revised] PowerPC: Map long double built-in functions if IEEE 128-bit long double.

2020-10-08 Thread Michael Meissner via Gcc-patches
PowerPC: Map long double built-in functions if IEEE 128-bit long double.

This patch is revised from the first version of the patch posted.  It uses the
names that are not in the user's namespace (i.e. __sinieee128 instead of
sinf128) that Joseph Myers suggested.

This patch goes through the built-in functions and changes the name of the
math, scanf, and printf built-in functions to use the functions that GLIBC
provides when long double uses the IEEE 128-bit representation.

Normally the mapping is done in the math.h and stdio.h files.  However, not
everybody uses these files, which means we also need to change the external
name for the built-in function within the compiler.

In addition, changing the name in GCC allows the Fortran compiler to
automatically use the correct name.

To map the math functions, typically this patch changes <name>l to
__<name>ieee128.  However there are some exceptions that are handled with this
patch.

To map the printf functions, <name> is mapped to __<name>ieee128.

To map the scanf functions, <name> is mapped to __isoc99<name>ieee128.
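
The usual rule for the math functions (drop the trailing l and wrap the stem as __<stem>ieee128) can be sketched on plain strings.  This is only an illustration, not the code in rs6000.c: the helper name map_long_double_math_name is hypothetical, and only three of the exceptions from the patch's switch statement are reproduced.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of GCC): map a long double math function
// name to the name used when long double is IEEE 128-bit.
std::string
map_long_double_math_name (const std::string &name)
{
  // A few of the irregular cases listed in the patch's switch statement.
  if (name == "dreml")
    return "__remainderieee128";
  if (name == "gammal")
    return "__lgammaieee128";
  if (name == "pow10l")
    return "__exp10ieee128";

  // In this sketch, names that do not end in 'l' are left untouched.
  if (name.size () < 2 || name.back () != 'l')
    return name;

  // Usual rule: <name>l -> __<name>ieee128.
  return "__" + name.substr (0, name.size () - 1) + "ieee128";
}
```

With this sketch, "sinl" becomes "__sinieee128", which matches the example name given earlier in the message.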

gcc/
2020-10-09  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_mangle_decl_assembler_name): Add
support for mapping built-in function names for long double
built-in functions if long double is IEEE 128-bit.

gcc/testsuite/
2020-10-09  Michael Meissner  

* gcc.target/powerpc/float128-longdouble-math.c: New test.
* gcc.target/powerpc/float128-longdouble-stdio.c: New test.
* gcc.target/powerpc/float128-math.c: Adjust test for new name
being generated.
---
 gcc/config/rs6000/rs6000.c| 162 -
 .../powerpc/float128-longdouble-math.c| 567 ++
 .../powerpc/float128-longdouble-stdio.c   |  37 ++
 .../gcc.target/powerpc/float128-math.c|   6 +-
 4 files changed, 738 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-math.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-stdio.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index b28f4adf464..4c141c9f276 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -26897,56 +26897,156 @@ rs6000_globalize_decl_name (FILE * stream, tree decl)
library before you can switch the real*16 type at compile time.
 
We use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change this name.  We
-   only do this if the default is that long double is IBM extended double, and
-   the user asked for IEEE 128-bit.  */
+   only do this transformation if the __float128 type is enabled.  This
+   prevents us from doing the transformation on older 32-bit ports that might
+   have enabled using IEEE 128-bit floating point as the default long double
+   type.  */
 
 static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
-  if (!TARGET_IEEEQUAD_DEFAULT && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-  && TREE_CODE (decl) == FUNCTION_DECL && DECL_IS_BUILTIN (decl) )
+  if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
+  && TREE_CODE (decl) == FUNCTION_DECL
+  && fndecl_built_in_p (decl, BUILT_IN_NORMAL))
 {
   size_t len = IDENTIFIER_LENGTH (id);
   const char *name = IDENTIFIER_POINTER (id);
+  const char *newname = NULL;
 
-  if (name[len - 1] == 'l')
+  /* See if it is one of the built-in functions with an unusual name.  */
+  switch (DECL_FUNCTION_CODE (decl))
{
- bool uses_ieee128_p = false;
- tree type = TREE_TYPE (decl);
- machine_mode ret_mode = TYPE_MODE (type);
+   default:
+ break;
 
- /* See if the function returns a IEEE 128-bit floating point type or
-complex type.  */
- if (ret_mode == TFmode || ret_mode == TCmode)
-   uses_ieee128_p = true;
- else
+   case BUILT_IN_DREML:
+ newname = "__remainderieee128";
+ break;
+
+   case BUILT_IN_GAMMAL:
+ newname = "__lgammaieee128";
+ break;
+
+   case BUILT_IN_GAMMAL_R:
+   case BUILT_IN_LGAMMAL_R:
+ newname = "__lgammaieee128_r";
+ break;
+
+   case BUILT_IN_NEXTTOWARD:
+ newname = "__nexttoward_to_ieee128";
+ break;
+
+   case BUILT_IN_NEXTTOWARDF:
+ newname = "__nexttowardf_to_ieee128";
+ break;
+
+   case BUILT_IN_NEXTTOWARDL:
+ newname = "__nexttowardieee128";
+ break;
+
+   case BUILT_IN_POW10L:
+ newname = "__exp10ieee128";
+ break;
+
+   case BUILT_IN_SCALBL:
+ newname = "__scalbnieee128";
+ break;
+
+   case BUILT_IN_SIGNIFICANDL:
+ newname = "__significandieee128";
+ break;
+
+   case BUILT_IN_SINCOSL:
+ newname = "__sincosieee128";
+ break;
+   }
+
+  /* Update the __builtin_*printf && __builtin_*scanf functions.  */
+  if (!newname)
+   {
+ const size_t printf_len = sizeof 

Re: [ Preprocessor ] [ Common ] Feature: Macros for identifying the wide and narrow execution string literal encoding

2020-10-08 Thread JeanHeyd Meneide via Gcc-patches
Dear Joseph,

On Thu, Oct 8, 2020 at 1:36 PM Joseph Myers  wrote:
>
> This documentation doesn't seem sufficient to use the macros.  Do they
> expand to (narrow) string literals?  To an unquoted sequence of
> characters?  I think from the implementation that the answer is strings
> (so, in particular, not usable for testing anything in #if conditionals),
> but the documentation ought to say so.  The test ought to verify the form
> of the expansion as well (even if it can't do anything useful at execution
> time, because if you make the macros reflect the command-line options they
> are character set names that are meaningful on the host, and any
> conversion functionality on the target may not use the same names as the
> host).

 You're right; sorry about that, I should have been more thorough!
I thought about adding a test to check the name itself (e.g., for
"UTF-8"), but that might make tests fail on platforms where the
default SOURCE_CHARSET from the dev files is not, in fact, UTF-8. I
could also try to pass some options but then I'd have to guarantee
that the encoding was available on all testable platforms, too...!

In the end, for the tests, I just initialize two "const char[]"
directly from the macro expansions to make sure we are getting
strings. It seems to work okay. Attached is the revised patch with
better docs and test!
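
Because the macros expand to string literals, they cannot be tested in #if conditionals, but they can initialize arrays and be compared at run time.  A minimal sketch, assuming a compiler that may or may not define the macro; the names exec_charset_name and exec_charset_is, and the "unknown" fallback, are made up for illustration:

```cpp
#include <cassert>
#include <cstring>

// Use the proposed macro when the compiler provides it; otherwise fall
// back to a placeholder string (the fallback is an assumption of this sketch).
#if defined(__GNUC_EXECUTION_CHARSET_NAME)
constexpr char exec_charset_name[] = __GNUC_EXECUTION_CHARSET_NAME;
#else
constexpr char exec_charset_name[] = "unknown";
#endif

// Hypothetical helper: run-time comparison against a charset name.
bool
exec_charset_is (const char *name)
{
  return std::strcmp (exec_charset_name, name) == 0;
}
```

The comparison is an exact string match, so a caller would have to know which spelling of the charset name the compiler reports.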

Sincerely,
JeanHeyd

2020-10-08  JeanHeyd "ThePhD" Meneide  

* gcc/c-family/c-cppbuiltin.c: Add predefined macro
definitions for charsets
* gcc/doc/cpp.texi: Document new predefined macro.
* gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c (new):
  New test for macro definitions to always exist.
* libcpp/include/cpplib.h: Add functions declarations for
  retrieving charset names
* libcpp/directives.c: Add function definitions to retrieve charset
  names.
* libcpp/internal.h: Add to/from name preservations
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 74ecca8de8e..8de25786592 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -866,6 +866,13 @@ c_cpp_builtins (cpp_reader *pfile)
 
   define_language_independent_builtin_macros (pfile);
 
+  /* encoding definitions used by users and libraries  */
+  builtin_define_with_value ("__GNUC_EXECUTION_CHARSET_NAME",
+cpp_get_narrow_charset_name (pfile), 1);
+  builtin_define_with_value ("__GNUC_WIDE_EXECUTION_CHARSET_NAME",
+cpp_get_wide_charset_name (pfile), 1);
+
+
   if (c_dialect_cxx ())
   {
 int major;
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 33f876ab706..90f1162add1 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2451,6 +2451,15 @@ features are supported by GCC.
 @item __NO_MATH_ERRNO__
 This macro is defined if @option{-fno-math-errno} is used, or enabled
 by another option such as @option{-ffast-math} or by default.
+
+@item __GNUC_EXECUTION_CHARSET_NAME
+@itemx __GNUC_WIDE_EXECUTION_CHARSET_NAME
+These macros are defined to expand to a narrow string literal of
+the name of the narrow and wide compile-time execution character
+set used.  It directly reflects the name passed to the options
+@option{-fexec-charset} and @option{-fwide-exec-charset}, or the defaults
+documented for those options (that is, it can expand to something like 
+@code{"UTF-8"}).  @xref{Invocation}.
 @end table
 
 @node System-specific Predefined Macros
diff --git a/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c b/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c
new file mode 100644
index 000..d5440f8a61e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c
@@ -0,0 +1,13 @@
+/*
+   { dg-do compile }
+ */
+
+#if !defined(__GNUC_EXECUTION_CHARSET_NAME)
+#error "Required implementation macro for compile-time charset name is not present"
+#endif
+#if !defined(__GNUC_WIDE_EXECUTION_CHARSET_NAME)
+#error "Required implementation macro for wide compile-time charset name is not present"
+#endif
+
+const char narrow_name[] = __GNUC_EXECUTION_CHARSET_NAME;
+const char wide_name[] = __GNUC_WIDE_EXECUTION_CHARSET_NAME;
diff --git a/libcpp/charset.c b/libcpp/charset.c
index 28b81c9c864..3e5578b1390 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -638,6 +638,9 @@ init_iconv_desc (cpp_reader *pfile, const char *to, const char *from)
   char *pair;
   size_t i;
 
+  ret.to = to;
+  ret.from = from;
+
   if (!strcasecmp (to, from))
 {
   ret.func = convert_no_conversion;
diff --git a/libcpp/directives.c b/libcpp/directives.c
index f59718708e4..ad540872581 100644
--- a/libcpp/directives.c
+++ b/libcpp/directives.c
@@ -2571,6 +2571,20 @@ cpp_set_callbacks (cpp_reader *pfile, cpp_callbacks *cb)
   pfile->cb = *cb;
 }
 
+/* The narrow character set identifier.  */
+const char *
+cpp_get_narrow_charset_name (cpp_reader *pfile)
+{
+  return pfile->narrow_cset_desc.to;
+}
+
+/* The wide character set identifier.  */
+const 

[r11-3717 Regression] FAIL: gcc.dg/pr97315-1.c (test for excess errors) on Linux/x86_64

2020-10-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

214d514fafcd78cd54e4a4aa9ae08c89abf9cc57 is the first bad commit
commit 214d514fafcd78cd54e4a4aa9ae08c89abf9cc57
Author: Aldy Hernandez 
Date:   Thu Oct 8 11:15:23 2020 +0200

Fix PR97315 (part 1 of 2)

caused

FAIL: gcc.dg/pr97315-1.c (test for excess errors)

with GCC configured with

Configured with: ../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3717/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97315-1.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97315-1.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97315-1.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr97315-1.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


[pushed] c++: Fix member alias template in C++17 and up. [PR96805]

2020-10-08 Thread Jason Merrill via Gcc-patches
Here we're trying to push into a::c in order to instantiate t, but
were building a TYPENAME_TYPE for it because a isn't open yet.  Don't
do that when we know we're trying to enter the scope.

Tested x86_64-pc-linux-gnu, applying to trunk and 10.

gcc/cp/ChangeLog:

PR c++/96805
PR c++/96199
* pt.c (tsubst_aggr_type): Don't build a TYPENAME_TYPE when
entering_scope.
(tsubst_template_decl): Use tsubst_aggr_type.

gcc/testsuite/ChangeLog:

PR c++/96805
* g++.dg/cpp0x/alias-decl-pr96805.C: New test.
---
 gcc/cp/pt.c | 9 +++--
 gcc/testsuite/g++.dg/cpp0x/alias-decl-pr96805.C | 9 +
 2 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-pr96805.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9a83810b67c..555dc47b464 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13451,7 +13451,8 @@ tsubst_aggr_type (tree t,
 complain, in_decl);
  if (argvec == error_mark_node)
r = error_mark_node;
- else if (cxx_dialect >= cxx17 && dependent_scope_p (context))
+ else if (!entering_scope
+  && cxx_dialect >= cxx17 && dependent_scope_p (context))
{
  /* See maybe_dependent_member_ref.  */
  tree name = TYPE_IDENTIFIER (t);
@@ -14160,7 +14161,11 @@ tsubst_template_decl (tree t, tree args, tsubst_flags_t complain,
  class_p = true;
  inner = TREE_TYPE (inner);
}
-  inner = tsubst (inner, args, complain, in_decl);
+  if (class_p)
+   inner = tsubst_aggr_type (inner, args, complain,
+ in_decl, /*entering*/1);
+  else
+   inner = tsubst (inner, args, complain, in_decl);
 }
   --processing_template_decl;
   if (inner == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-pr96805.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-pr96805.C
new file mode 100644
index 000..c784f273441
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-pr96805.C
@@ -0,0 +1,9 @@
+// PR c++/96805
+// { dg-do compile { target c++11 } }
+
+template  class a {
+  template  struct c {
+template  using t = int;
+t m;
+  };
+};

base-commit: a500588aa523f1f04ae51809e476e3ebfcb081d3
-- 
2.18.1



Re: [PATCH] libstdc++: Make ranges::construct_at constexpr-friendly [PR95788]

2020-10-08 Thread Jonathan Wakely via Gcc-patches

On 07/10/20 12:10 -0400, Patrick Palka via Libstdc++ wrote:

On Wed, 30 Sep 2020, Patrick Palka wrote:


This rewrites ranges::construct_at in terms of std::construct_at so
that we can piggy back on the compiler's existing support for
recognizing placement new within std::construct_at during constexpr
evaluation instead of having to additionally teach the compiler about
ranges::construct_at.

While we're here, we should also make ranges::construct_at conditionally
noexcept like std::construct_at.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

libstdc++-v3/ChangeLog:

PR libstdc++/95788
* include/bits/ranges_uninitialized.h:
(__construct_at_fn::operator()): Just call std::construct_at.
Declare it conditionally noexcept.
* testsuite/20_util/specialized_algorithms/construct_at/95788.cc:
New test.


Here's an updated patch that std::-qualifies the calls to declval in its
function signature and adjusts formatting accordingly:


OK for trunk and gcc-10, thanks.



[PATCH] c++: Handle RANGE_EXPR indexes in init_subob_ctx [PR97328]

2020-10-08 Thread Patrick Palka via Gcc-patches
In the testcase below, we're ICEing during constexpr evaluation of the
CONSTRUCTOR {.data={{}, [1 ... 7]={}}} of type 'vector'.  The apparently
unique thing about this CONSTRUCTOR is that it has a RANGE_EXPR index
whose corresponding sub-aggregate initializer doesn't satisfy
reduced_constant_expression_p (because its field 't' is uninitialized).

This is a problem because init_subob_ctx currently punts if the
constructor index is a RANGE_EXPR, so when cxx_eval_bare_aggregate
recurses into this sub-aggregate initializer we trip over the
same_type_p assert in verify_ctor_sanity.

Fix this by making init_subob_ctx set up an appropriate sub-aggregate
initialization context even when the index is a RANGE_EXPR.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk and the 10 branch?

gcc/cp/ChangeLog:

PR c++/97328
* constexpr.c (init_subob_ctx): Don't punt if the index is a
RANGE_EXPR, instead build a sub-aggregate initialization context
with no subobject.

gcc/testsuite/ChangeLog:

PR c++/97328
* g++.dg/cpp2a/constexpr-init19.C: New test.
* g++.dg/cpp2a/constexpr-init20.C: New test.
---
 gcc/cp/constexpr.c| 13 +++--
 gcc/testsuite/g++.dg/cpp2a/constexpr-init19.C | 15 +++
 gcc/testsuite/g++.dg/cpp2a/constexpr-init20.C | 15 +++
 3 files changed, 37 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-init19.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-init20.C

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index a118f8a810b..e50a2a220cb 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3953,11 +3953,6 @@ init_subob_ctx (const constexpr_ctx *ctx, constexpr_ctx &new_ctx,
 {
   new_ctx = *ctx;
 
-  if (index && TREE_CODE (index) != INTEGER_CST
-  && TREE_CODE (index) != FIELD_DECL)
-/* This won't have an element in the new CONSTRUCTOR.  */
-return;
-
   tree type = initialized_type (value);
   if (!AGGREGATE_TYPE_P (type) && !VECTOR_TYPE_P (type))
 /* A non-aggregate member doesn't get its own CONSTRUCTOR.  */
@@ -3967,7 +3962,13 @@ init_subob_ctx (const constexpr_ctx *ctx, constexpr_ctx &new_ctx,
  update object to refer to the subobject and ctor to refer to
  the (newly created) sub-initializer.  */
   if (ctx->object)
-new_ctx.object = build_ctor_subob_ref (index, type, ctx->object);
+{
+  if (index == NULL_TREE || TREE_CODE (index) == RANGE_EXPR)
+   /* There's no well-defined subobject for this index.  */
+   new_ctx.object = NULL_TREE;
+  else
+   new_ctx.object = build_ctor_subob_ref (index, type, ctx->object);
+}
   tree elt = build_constructor (type, NULL);
   CONSTRUCTOR_NO_CLEARING (elt) = true;
   new_ctx.ctor = elt;
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-init19.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-init19.C
new file mode 100644
index 000..d354c5ad609
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-init19.C
@@ -0,0 +1,15 @@
+// PR c++/97328
+// { dg-do compile { target c++20 } }
+
+struct vector {
+  struct storage {
+int t;
+constexpr storage() {}
+  } data[8];
+};
+
+constexpr auto foo() {
+  vector i;
+  return i;
+}
+auto val = foo();
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-init20.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-init20.C
new file mode 100644
index 000..1a6ed8d86dd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-init20.C
@@ -0,0 +1,15 @@
+// PR c++/97328
+// { dg-do compile { target c++20 } }
+
+struct vector {
+  union storage {
+int t;
+constexpr storage() {}
+  } data[8];
+};
+
+constexpr auto foo() {
+  vector i;
+  return i;
+}
+auto val = foo();
-- 
2.29.0.rc0



Re: [PING][PATCH v2] combine: Don't turn (mult (extend x) 2^n) into extract [PR96998]

2020-10-08 Thread Segher Boessenkool
On Thu, Oct 08, 2020 at 11:21:26AM +0100, Alex Coplan wrote:
> Ping. The kernel is still broken on AArch64.

You *cannot* fix a correctness bug with a combine addition.  So please
fix the target bug first.

I haven't had time to look at your patch yet, sorry.


Segher


Re: make sincos take type from intrinsic formal, not from result assignment

2020-10-08 Thread Alexandre Oliva
On Oct  8, 2020, Richard Biener  wrote:

> OK with a minor nit, see below

>> I'm a little unhappy with the duplication of the CASE_MATHFN* sequence,
>> that ought to be kept in sync, and considered turning that whole
>> sequence into a #define used in both places, but that would bloat the
>> patch further and make it less readable, so I figured I'd propose this
>> one first.  Please let me know if you agree this additional change would
>> make it better.

> Yeah, I guess so.

Thanks, I folded that change into the patch.  Bloated version below :-)

> use !types_compatible_p (type, t) here if we don't want to go
> with the stricter type == t (both yours and types_compatible_p
> would treat long double and double the same in case they
> are mapped to the same FP mode).  types_compatible_p
> is the appropriate GIMPLE predicate here.

Thanks, fixed, regstrapped.

Here's what I'm installing.


take type from intrinsic in sincos pass

From: Alexandre Oliva 

This is a first step towards enabling the sincos optimization in Ada.

The issue this patch solves is that sincos takes the type to be looked
up with mathfn_built_in from variables or temporaries passed as
arguments to SIN and COS intrinsics.  In Ada, different float types
may be used but, despite their representation equivalence, their
distinctness causes the optimization to be skipped, because they are
not the types that mathfn_built_in expects.

This patch introduces a function that maps intrinsics to the type
they're associated with, and uses that type, obtained from the
intrinsics used in calls to be optimized, to look up the corresponding
CEXPI intrinsic.

For the sake of defensive programming, when using the type obtained
from the intrinsic, it now checks that, if different types are found
for the used argument, or for other calls that use it, the types
are interchangeable.
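
The kind of source the sincos pass looks for is a sine and a cosine of the same argument, which it can replace with a single CEXPI computation.  A minimal sketch of such a pair of calls follows; the names sin_cos_pair and unit_vector are hypothetical, and whether the calls are actually combined depends on the target and the optimization options:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type for this sketch.
struct sin_cos_pair
{
  double sin_value;
  double cos_value;
};

// Both calls take the same argument, so an optimizing compiler may
// compute the two results together instead of calling sin and cos
// separately.
sin_cos_pair
unit_vector (double angle)
{
  sin_cos_pair result;
  result.sin_value = std::sin (angle);
  result.cos_value = std::cos (angle);
  return result;
}
```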


for  gcc/ChangeLog

* builtins.c (mathfn_built_in_type): New.
* builtins.h (mathfn_built_in_type): Declare.
* tree-ssa-math-opts.c (execute_cse_sincos_1): Use it to
obtain the type expected by the intrinsic.
---
 gcc/builtins.c   |  236 +-
 gcc/builtins.h   |1 
 gcc/tree-ssa-math-opts.c |   17 +++
 3 files changed, 164 insertions(+), 90 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index f91266e4..284926f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2160,95 +2160,98 @@ mathfn_built_in_2 (tree type, combined_fn fn)
 
   switch (fn)
 {
-CASE_MATHFN (ACOS)
-CASE_MATHFN (ACOSH)
-CASE_MATHFN (ASIN)
-CASE_MATHFN (ASINH)
-CASE_MATHFN (ATAN)
-CASE_MATHFN (ATAN2)
-CASE_MATHFN (ATANH)
-CASE_MATHFN (CBRT)
-CASE_MATHFN_FLOATN (CEIL)
-CASE_MATHFN (CEXPI)
-CASE_MATHFN_FLOATN (COPYSIGN)
-CASE_MATHFN (COS)
-CASE_MATHFN (COSH)
-CASE_MATHFN (DREM)
-CASE_MATHFN (ERF)
-CASE_MATHFN (ERFC)
-CASE_MATHFN (EXP)
-CASE_MATHFN (EXP10)
-CASE_MATHFN (EXP2)
-CASE_MATHFN (EXPM1)
-CASE_MATHFN (FABS)
-CASE_MATHFN (FDIM)
-CASE_MATHFN_FLOATN (FLOOR)
-CASE_MATHFN_FLOATN (FMA)
-CASE_MATHFN_FLOATN (FMAX)
-CASE_MATHFN_FLOATN (FMIN)
-CASE_MATHFN (FMOD)
-CASE_MATHFN (FREXP)
-CASE_MATHFN (GAMMA)
-CASE_MATHFN_REENT (GAMMA) /* GAMMA_R */
-CASE_MATHFN (HUGE_VAL)
-CASE_MATHFN (HYPOT)
-CASE_MATHFN (ILOGB)
-CASE_MATHFN (ICEIL)
-CASE_MATHFN (IFLOOR)
-CASE_MATHFN (INF)
-CASE_MATHFN (IRINT)
-CASE_MATHFN (IROUND)
-CASE_MATHFN (ISINF)
-CASE_MATHFN (J0)
-CASE_MATHFN (J1)
-CASE_MATHFN (JN)
-CASE_MATHFN (LCEIL)
-CASE_MATHFN (LDEXP)
-CASE_MATHFN (LFLOOR)
-CASE_MATHFN (LGAMMA)
-CASE_MATHFN_REENT (LGAMMA) /* LGAMMA_R */
-CASE_MATHFN (LLCEIL)
-CASE_MATHFN (LLFLOOR)
-CASE_MATHFN (LLRINT)
-CASE_MATHFN (LLROUND)
-CASE_MATHFN (LOG)
-CASE_MATHFN (LOG10)
-CASE_MATHFN (LOG1P)
-CASE_MATHFN (LOG2)
-CASE_MATHFN (LOGB)
-CASE_MATHFN (LRINT)
-CASE_MATHFN (LROUND)
-CASE_MATHFN (MODF)
-CASE_MATHFN (NAN)
-CASE_MATHFN (NANS)
-CASE_MATHFN_FLOATN (NEARBYINT)
-CASE_MATHFN (NEXTAFTER)
-CASE_MATHFN (NEXTTOWARD)
-CASE_MATHFN (POW)
-CASE_MATHFN (POWI)
-CASE_MATHFN (POW10)
-CASE_MATHFN (REMAINDER)
-CASE_MATHFN (REMQUO)
-CASE_MATHFN_FLOATN (RINT)
-CASE_MATHFN_FLOATN (ROUND)
-CASE_MATHFN_FLOATN (ROUNDEVEN)
-CASE_MATHFN (SCALB)
-CASE_MATHFN (SCALBLN)
-CASE_MATHFN (SCALBN)
-CASE_MATHFN (SIGNBIT)
-CASE_MATHFN (SIGNIFICAND)
-CASE_MATHFN (SIN)
-CASE_MATHFN (SINCOS)
-CASE_MATHFN (SINH)
-CASE_MATHFN_FLOATN (SQRT)
-CASE_MATHFN (TAN)
-CASE_MATHFN (TANH)
-CASE_MATHFN (TGAMMA)
-CASE_MATHFN_FLOATN (TRUNC)
-CASE_MATHFN (Y0)
-CASE_MATHFN (Y1)
+#define SEQ_OF_CASE_MATHFN \
+CASE_MATHFN (ACOS) \
+CASE_MATHFN (ACOSH)\
+

Re: [PING][PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)

2020-10-08 Thread Jason Merrill via Gcc-patches

On 10/8/20 3:18 PM, Martin Sebor wrote:

On 10/7/20 3:01 PM, Jason Merrill wrote:

On 10/7/20 4:11 PM, Martin Sebor wrote:

...

For the various member functions, please include the comments 
with the definition as well as the in-class declaration.


Only one access_ref member function is defined out-of-line:
offset_bounded().  I've adjusted the comment and copied it above
the function definition.


And size_remaining, as quoted above?


I have this in my tree:

/* Return the maximum amount of space remaining and if non-null, set
    argument to the minimum.  */

I'll add it when I commit the patch.



I also don't see a comment above the definition of offset_bounded in 
the new patch?


There is a comment in the latest patch.

...

The goal of conditionals is to avoid overwhelming the user with
excessive numbers that may not be meaningful or even relevant
to the warning.  I've corrected the function body, tweaked and
renamed the get_range function to get_offset_range to do a better
job of extracting ranges from the types of some nonconstant
expressions the front end passes it, and added a new test for
all this.  Attached is the new revision.


offset_bounded looks unchanged in the new patch.  It still returns 
true iff either the range is a single value or one of the bounds is 
unrepresentable in ptrdiff_t.  I'm still unclear how this corresponds 
to "Return true if OFFRNG is bounded to a subrange of possible offset 
values."


I don't think you're looking at the latest patch.  It has this:

+/* Return true if OFFRNG is bounded to a subrange of offset values
+   valid for the largest possible object.  */
+
  bool
  access_ref::offset_bounded () const
  {
-  if (offrng[0] == offrng[1])
-    return false;
-
    tree min = TYPE_MIN_VALUE (ptrdiff_type_node);
    tree max = TYPE_MAX_VALUE (ptrdiff_type_node);
-  return offrng[0] <= wi::to_offset (min) || offrng[1] >= wi::to_offset (max);
+  return wi::to_offset (min) <= offrng[0] && offrng[1] <= wi::to_offset (max);

  }

Here's a link to it in the archive:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/555019.html
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200928/9026783a/attachment-0003.bin 


Ah, yes, there are two patches in that email; the first introduces the 
broken offset_bounded, and the second one fixes it without mentioning 
that in the ChangeLog.  How about moving the fix to the first patch?
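
The difference between the two versions of offset_bounded discussed here can be sketched outside of GCC.  This is an illustration under stated assumptions, not the GCC code: a 64-bit integer stands in for offset_int, the limits of a 32-bit integer stand in for the ptrdiff_t bounds, and the names wide_offset, access_ref_sketch, offset_bounded_old and offset_bounded_new are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-ins chosen for this sketch only: a wide type for offsets and a
// narrower type whose limits play the role of the ptrdiff_t bounds.
using wide_offset = std::int64_t;
constexpr wide_offset narrow_min = std::numeric_limits<std::int32_t>::min ();
constexpr wide_offset narrow_max = std::numeric_limits<std::int32_t>::max ();

struct access_ref_sketch
{
  wide_offset offrng[2];

  // Earlier version: false for a single value, otherwise true when a
  // bound falls on or outside the narrow limits.
  bool offset_bounded_old () const
  {
    if (offrng[0] == offrng[1])
      return false;
    return offrng[0] <= narrow_min || offrng[1] >= narrow_max;
  }

  // Corrected version: true when the whole range lies within the limits.
  bool offset_bounded_new () const
  {
    return narrow_min <= offrng[0] && offrng[1] <= narrow_max;
  }
};
```

For a small range such as [0, 16] the corrected predicate returns true while the earlier one returns false, which is the mismatch with the comment that the review points out.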


Jason



Re: [PATCH] adjust BB vectorization dump scanning

2020-10-08 Thread Thomas Schwinge
Hi Richard!

On 2020-10-08T13:34:02+0200, Richard Biener  wrote:
> It might be interesting to work on adding sth like
> dg-warning to look for -fopt-info-{optimized,missing} so
> we could directly annotate (not) vectorized loops instead of
> relying on fragile counts.

I'm maybe (likely?) misunderstanding what you're looking for, but just in
case I'm not, the following works already:

--- gcc/testsuite/gcc.dg/vect/bb-slp-1.c
+++ gcc/testsuite/gcc.dg/vect/bb-slp-1.c
@@ -1,4 +1,5 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-fopt-info-optimized-vec" } */

 #include 
 #include "tree-vect.h"
@@ -17,7 +18,7 @@ main1 (int dummy)

   for (i = 0; i < N; i++)
 {
-  *pout++ = *pin++;
+  *pout++ = *pin++; /* { dg-message "optimized: basic block part vectorized" } */
   *pout++ = *pin++;
   *pout++ = *pin++;
   *pout++ = *pin++;
@@ -55,4 +56,3 @@ int main (void)
 }

 /* { dg-final { scan-tree-dump-not "can't force alignment" "slp1" } } */
-/* { dg-final { scan-tree-dump-times "basic block vectorized" 1 "slp1" } } */


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2020-10-08 Thread François Dumont via Gcc-patches
I eventually took up your last remark about using weak symbols to
inject libbacktrace calls when _GLIBCXX_DEBUG_BACKTRACE is defined.


    libstdc++: [_GLIBCXX_DEBUG] Integrate libbacktrace

    Add _GLIBCXX_DEBUG_BACKTRACE macro to ask for a backtrace on _GLIBCXX_DEBUG
    assertions using libbacktrace.

    * config/abi/pre/gnu.ver: Add new symbols.
    * include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]:
    Include .
    [_GLIBCXX_DEBUG_BACKTRACE && BACKTRACE_SUPPORTED]:
    Include .
    [(!_GLIBCXX_DEBUG_BACKTRACE || !BACKTRACE_SUPPORTED) &&
    _GLIBCXX_USE_C99_STDINT_TR1]: Include .
    [BACKTRACE_SUPPORTED || _GLIBCXX_USE_C99_STDINT_TR1]
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__create_backtrace_state): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__render_backtrace): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (_Error_formatter::_M_print_backtrace): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (_Error_formatter::_M_backtrace_state): New.
    * src/c++11/debug.cc: Include .
    (_Print_func_t): New.
    (print_word): Use '%.*s' format in fprintf to render only expected
    number of chars.
    (print_raw(PrintContext&, const char*, ptrdiff_t)): New.
    (print_function(PrintContext&, const char*, _Print_func_t)): New.
    (print_type): Use latter.
    (print_string(PrintContext&, const char*, const _Parameter*, size_t)):
    Change signature to...
    (print_string(PrintContext&, const char*, ptrdiff_t, const _Parameter*,
    size_t)): ...this and adapt. Remove intermediate buffer to render input
    string.
    (print_string(PrintContext&, const char*, ptrdiff_t)): New.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (print_backtrace(void*, uintptr_t, const char*, int, const char*)): New.
    (_Error_formatter::_M_error()): Adapt.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__create_backtrace_state): New, weak symbol.
    [_GLIBCXX_DEBUG_USE_LIBBACKTRACE]
    (__gnu_debug::__render_backtrace): New, weak symbol.
    * doc/xml/manual/debug_mode.xml: Document _GLIBCXX_DEBUG_BACKTRACE.
    * doc/xml/manual/using.xml: Likewise.
Tested under Linux x86_64.

Ok to commit ?

François

On 13/06/19 10:46 pm, François Dumont wrote:

Here is a new proposal which I think takes into account all your remarks.

I discovered the great "%.*s" printf format so I was able to do some 
cleanup on the function name without any allocation.
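
The "%.*s" conversion takes an int precision followed by the string and writes at most that many characters, so part of a longer name can be rendered without copying it into a temporary string first.  A minimal sketch; the helper name render_prefix is hypothetical and is not taken from debug.cc:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: write at most LEN characters of S into OUT using
// the "%.*s" format.  Returns the value returned by snprintf.
int
render_prefix (char *out, std::size_t out_size, const char *s, int len)
{
  return std::snprintf (out, out_size, "%.*s", len, s);
}
```

Calling render_prefix with the string "__gnu_debug::basic_string" and a length of 11 writes only "__gnu_debug" to the output buffer.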


I also agree that counting the '>' or '<' is not reliable so I removed
this and limited the cleanup to the __cxx1998 namespace and the __
uglification; it is still better than nothing.


I completed the doc as advised. I also added a note about making sure 
that _GLIBCXX_DEBUG_BACKTRACE is defined consistently throughout the 
application otherwise it would break the famous ODR rule.


I introduced a _GLIBCXX_DEBUG_USE_LIBBACKTRACE to know when the system
is able to handle libbacktrace even if the user didn't activate it. I
could undef it when I am not building the library but I don't remember
if there is a macro to signal that the library is being built?


Here is an output sample now:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:177: 


In function:
    gnu_debug::basic_string<_CharT, _Traits, 
_Allocator>::basic_string(const

    _CharT*, gnu_debug::basic_string<_CharT, _Traits,
    _Allocator>::size_type, const _Allocator&) [with _CharT = char; 
_Traits

    = std::char_traits; _Allocator = std::allocator;
    gnu_debug::basic_string<_CharT, _Traits, _Allocator>::size_type = 
long

    unsigned int]

Backtrace:
    0x400afd char const* gnu_debug::check_stringlong>(char const*, unsigned long, char const*, unsigned int, char const*)
/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:56 

    0x400afd char const* gnu_debug::check_stringlong>(char const*, unsigned long, char const*, unsigned int, char const*)
/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:49 

    0x400afd gnu_debug::basic_string, 
std::allocator >::basic_string(char const*, unsigned long, 
std::allocator const&)
/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/string:177 


    0x400afd test01()
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/21_strings/basic_string/debug/1_neg.cc:27 


    0x4009c8 main
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/21_strings/basic_string/debug/1_neg.cc:32 



Error: __s != 0 || __n == 0.
XFAIL: 21_strings/basic_string/debug/1_neg.cc execution test

    * include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]:
    Include .
    [_GLIBCXX_DEBUG_BACKTRACE && BACKTRACE_SUPPORTED]:
    Include .
    [(!_GLIBCXX_DEBUG_BACKTRACE || 

[r11-3729 Regression] FAIL: gcc.c-torture/execute/loop-13.c -O3 -g (test for excess errors) on Linux/x86_64

2020-10-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

181702ef8ab76afbf5d2cd4d7bc0cef613397d6e is the first bad commit
commit 181702ef8ab76afbf5d2cd4d7bc0cef613397d6e
Author: Richard Biener 
Date:   Tue Oct 6 15:47:15 2020 +0200

SLP vectorize multiple BBs at once

caused

FAIL: gcc.c-torture/execute/loop-13.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (internal compiler error)
FAIL: gcc.c-torture/execute/loop-13.c   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.c-torture/execute/loop-13.c   -O3 -g  (internal compiler error)
FAIL: gcc.c-torture/execute/loop-13.c   -O3 -g  (test for excess errors)

with GCC configured with

Configured with: ../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3729/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/loop-13.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [PING][PATCH] correct handling of indices into arrays with elements larger than 1 (PR c++/96511)

2020-10-08 Thread Martin Sebor via Gcc-patches

On 10/7/20 3:01 PM, Jason Merrill wrote:

On 10/7/20 4:11 PM, Martin Sebor wrote:

...

For the various member functions, please include the comments 
with the definition as well as the in-class declaration.


Only one access_ref member function is defined out-of-line:
offset_bounded().  I've adjusted the comment and copied it above
the function definition.


And size_remaining, as quoted above?


I have this in my tree:

/* Return the maximum amount of space remaining and if non-null, set
   argument to the minimum.  */

I'll add it when I commit the patch.



I also don't see a comment above the definition of offset_bounded in the 
new patch?


There is a comment in the latest patch.

...

The goal of conditionals is to avoid overwhelming the user with
excessive numbers that may not be meaningful or even relevant
to the warning.  I've corrected the function body, tweaked and
renamed the get_range function to get_offset_range to do a better
job of extracting ranges from the types of some nonconstant
expressions the front end passes it, and added a new test for
all this.  Attached is the new revision.


offset_bounded looks unchanged in the new patch.  It still returns true
iff either the range is a single value or one of the bounds is
unrepresentable in ptrdiff_t.  I'm still unclear how this corresponds to
"Return true if OFFRNG is bounded to a subrange of possible offset values."


I don't think you're looking at the latest patch.  It has this:

+/* Return true if OFFRNG is bounded to a subrange of offset values
+   valid for the largest possible object.  */
+
 bool
 access_ref::offset_bounded () const
 {
-  if (offrng[0] == offrng[1])
-    return false;
-
   tree min = TYPE_MIN_VALUE (ptrdiff_type_node);
   tree max = TYPE_MAX_VALUE (ptrdiff_type_node);
-  return offrng[0] <= wi::to_offset (min) || offrng[1] >= wi::to_offset (max);
+  return wi::to_offset (min) <= offrng[0] && offrng[1] <= wi::to_offset (max);
 }

Here's a link to it in the archive:

https://gcc.gnu.org/pipermail/gcc-patches/2020-September/555019.html
https://gcc.gnu.org/pipermail/gcc-patches/attachments/20200928/9026783a/attachment-0003.bin
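
To make the difference between the removed and the added lines concrete, here is a small standalone sketch.  It is only a model and not the GCC code: offsets are held in a long long, and the ptrdiff_t range is stood in for by the 32-bit limits so that out-of-range offsets can be represented.  The struct and member names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the predicate discussed above.  INT32_MIN and
// INT32_MAX stand in for TYPE_MIN_VALUE/TYPE_MAX_VALUE of ptrdiff_t.
struct offset_range_model
{
  long long offrng[2];

  // Behaviour of the lines removed by the diff: a single-value range is
  // never "bounded", and otherwise the result is true only when a bound
  // reaches or exceeds the limits.
  bool offset_bounded_old () const
  {
    if (offrng[0] == offrng[1])
      return false;
    return offrng[0] <= INT32_MIN || offrng[1] >= INT32_MAX;
  }

  // Behaviour of the line added by the diff: true when both bounds lie
  // within the limits, including the single-value case.
  bool offset_bounded_new () const
  {
    return INT32_MIN <= offrng[0] && offrng[1] <= INT32_MAX;
  }
};
```

With this model a range such as [0, 8] or the single value [4, 4] is reported as bounded only by the new form, while a range wider than the limits is reported as bounded only by the old form.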

Is this version okay to commit?

Martin



It still looks like the middle case is unreachable: if offrng[0] == 
offrng[1], offset_bounded returns false, so we don't get as far as 
testing it for the middle case.


Jason





libcpp: Directly peek for initial line marker

2020-10-08 Thread Nathan Sidwell

Using the tokenizer to sniff for an initial line marker for
preprocessed input is a little brittle, particularly with
-fdirectives-only.  If there is no marker we'll happily munch initial
comments.  This patch directly sniffs the buffer.  This is safe
because the initial line marker was machine generated and must be
right at the beginning of the file.  Anything else is not such a line
marker.  The same is true for the initial directory marker.  For that
tokenizing the string is simplest, but at that point it's either a
regular line marker or a directory marker.  If it's a regular marker,
unwinding tokens is fine.
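
As a rough illustration of the direct peek (a sketch only; the function name and the plain char buffer are assumptions, the real code works on pfile->buffer), the test on the first few characters can be written as:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if the buffer begins with "# 0 " or "# 1 "
// and has more characters after that, which is the shape of a
// machine-generated initial line marker.
bool has_initial_line_marker (const char *buf, std::size_t len)
{
  return len > 4
	 && buf[0] == '#'
	 && buf[1] == ' '
	 && (buf[2] == '0' || buf[2] == '1')
	 && buf[3] == ' ';
}

// Convenience overload for NUL-terminated text.
bool has_initial_line_marker (const char *text)
{
  return has_initial_line_marker (text, std::strlen (text));
}
```

A buffer that starts with a comment or with white space fails the test immediately, so nothing is consumed before the regular lexer sees it.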

libcpp/
* internal.h (enum include_type): Rename IT_MAIN_INJECT to
IT_PRE_MAIN.
* init.c (cpp_read_main_file): If there is no line marker, adjust
the initial line marker.
(read_original_filename): Return bool, peek the buffer directly
before trying to tokenize.
(read_original_directory): Likewise.  Directly prod the string
literal.
* files.c (_cpp_stack_file): Adjust for IT_PRE_MAIN change.

pushing to trunk,

nathan
--
Nathan Sidwell
diff --git i/libcpp/files.c w/libcpp/files.c
index b890b8ebf1e..5af41364d0a 100644
--- i/libcpp/files.c
+++ w/libcpp/files.c
@@ -948,10 +948,12 @@ _cpp_stack_file (cpp_reader *pfile, _cpp_file *file, include_type type,
 
   /* Add line map and do callbacks.  */
   _cpp_do_file_change (pfile, LC_ENTER, file->path,
-		   /* With preamble injection, start on line zero, so
-			  the preamble doesn't appear to have been
-			  included from line 1.  */
-		   type == IT_MAIN_INJECT ? 0 : 1, sysp);
+		   /* With preamble injection, start on line zero,
+			  so the preamble doesn't appear to have been
+			  included from line 1.  Likewise when
+			  starting preprocessed, we expect an initial
+			  locating line.  */
+		   type == IT_PRE_MAIN ? 0 : 1, sysp);
 
   return true;
 }
diff --git i/libcpp/init.c w/libcpp/init.c
index aba5854d357..84c0a9efa74 100644
--- i/libcpp/init.c
+++ w/libcpp/init.c
@@ -36,7 +36,7 @@ along with this program; see the file COPYING3.  If not see
 
 static void init_library (void);
 static void mark_named_operators (cpp_reader *, int);
-static void read_original_filename (cpp_reader *);
+static bool read_original_filename (cpp_reader *);
 static void read_original_directory (cpp_reader *);
 static void post_options (cpp_reader *);
 
@@ -681,94 +681,114 @@ cpp_read_main_file (cpp_reader *pfile, const char *fname, bool injecting)
 return NULL;
 
   _cpp_stack_file (pfile, pfile->main_file,
-		   injecting ? IT_MAIN_INJECT : IT_MAIN, 0);
+		   injecting || CPP_OPTION (pfile, preprocessed)
+		   ? IT_PRE_MAIN : IT_MAIN, 0);
 
   /* For foo.i, read the original filename foo.c now, for the benefit
  of the front ends.  */
   if (CPP_OPTION (pfile, preprocessed))
-read_original_filename (pfile);
+if (!read_original_filename (pfile))
+  {
+	/* We're on line 1 after all.  */
+	auto *last = linemap_check_ordinary
+	  (LINEMAPS_LAST_MAP (pfile->line_table, false));
+	last->to_line = 1;
+	/* Inform of as-if a file change.  */
+	_cpp_do_file_change (pfile, LC_RENAME_VERBATIM, LINEMAP_FILE (last),
+			 LINEMAP_LINE (last), LINEMAP_SYSP (last));
+  }
 
   return ORDINARY_MAP_FILE_NAME (LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table));
 }
 
-/* For preprocessed files, if the first tokens are of the form # NUM.
-   handle the directive so we know the original file name.  This will
-   generate file_change callbacks, which the front ends must handle
-   appropriately given their state of initialization.  */
-static void
+/* For preprocessed files, if the very first characters are
+   '#[01]', then handle a line directive so we know the
+   original file name.  This will generate file_change callbacks,
+   which the front ends must handle appropriately given their state of
+   initialization.  We peek directly into the character buffer, so
+   that we're not confused by otherwise-skipped white space &
+   comments.  We can be very picky, because this should have been
+   machine-generated text (by us, no less).  This way we do not
+   interfere with the module directive state machine.  */
+
+static bool
 read_original_filename (cpp_reader *pfile)
 {
-  const cpp_token *token, *token1;
-
-  /* Lex ahead; if the first tokens are of the form # NUM, then
- process the directive, otherwise back up.  */
-  token = _cpp_lex_direct (pfile);
-  if (token->type == CPP_HASH)
+  auto *buf = pfile->buffer->next_line;
+
+  if (pfile->buffer->rlimit - buf > 4
+  && buf[0] == '#'
+  && buf[1] == ' '
+  // Also permit '1', as that's what used to be here
+  && (buf[2] == '0' || buf[2] == '1')
+  && buf[3] == ' ')
 {
-  pfile->state.in_directive = 1;
-  token1 = _cpp_lex_direct (pfile);
-  _cpp_backup_tokens (pfile, 1);
-  pfile->state.in_directive = 0;
-
-  /* If it's a #line directive, handle it.  */
-  if 

Re: [RS6000] rotate and mask constants

2020-10-08 Thread will schmidt via Gcc-patches
On Thu, 2020-10-08 at 09:36 +1030, Alan Modra via Gcc-patches wrote:
> Implement more two insn constants.  rotate_and_mask_constant covers
> 64-bit constants that can be formed by rotating a 16-bit signed
> constant, rotating a 16-bit signed constant masked on left or right
> (rldicl and rldicr), rotating a 16-bit signed constant masked by
> rldic, and unusual "lis; rldicl" and "lis; rldicr" patterns.  All the
> values possible for DImode rs6000_is_valid_and_mask are covered.

lgtm, 

Just a couple cosmetic nits, since I was reading through.. :-)

> 
> Bootstrapped and regression tested powerpc64le-linux.
> 
>   PR 94393


PR target/94393

(unless the hooks currently handle that for us? )


> gcc/

gcc/ChangeLog:

>   * config/rs6000/rs6000.c (rotate_and_mask_constant): New function.
>   (num_insns_constant_multi, rs6000_emit_set_long_const): Use it here.
>   * config/rs6000/rs6000.md (*movdi_internal64+1 splitter): Delete.
> gcc/testsuite/
>   * gcc.target/powerpc/rot_cst.h,
>   * gcc.target/powerpc/rot_cst1.c,
>   * gcc.target/powerpc/rot_cst2.c: New tests.
> 



> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 14ecbad5df4..9809d11f47a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1129,6 +1129,8 @@ static tree rs6000_handle_altivec_attribute (tree *, 
> tree, tree, int, bool *);
>  static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
>  static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree);
>  static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
> +static bool rotate_and_mask_constant (unsigned HOST_WIDE_INT, HOST_WIDE_INT 
> *,
> +   int *, unsigned HOST_WIDE_INT *);
>  static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool);
>  static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, 
> bool);
>  static int rs6000_debug_address_cost (rtx, machine_mode, addr_space_t,
> @@ -5877,7 +5879,7 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
>  }
> 
>  /* Helper for num_insns_constant.  Allow constants formed by the
> -   num_insns_constant_gpr sequences, plus li -1, rldicl/rldicr/rlwinm,
> +   num_insns_constant_gpr sequences, and li/lis+rldicl/rldicr/rldic/rlwinm,
> and handle modes that require multiple gprs.  */
> 
>  static int
> @@ -5892,8 +5894,8 @@ num_insns_constant_multi (HOST_WIDE_INT value, 
> machine_mode mode)
>if (insns > 2
> /* We won't get more than 2 from num_insns_constant_gpr
>except when TARGET_POWERPC64 and mode is DImode or
> -  wider, so the register mode must be DImode.  */
> -   && rs6000_is_valid_and_mask (GEN_INT (low), DImode))
> +  wider.  */
> +   && rotate_and_mask_constant (low, NULL, NULL, NULL))
>   insns = 2;
>total += insns;
>/* If BITS_PER_WORD is the number of bits in HOST_WIDE_INT, doing
> @@ -9524,6 +9526,244 @@ rs6000_emit_set_const (rtx dest, rtx source)
>return true;
>  }
> 
> +/* Rotate DImode word, being careful to handle the case where
> +   HOST_WIDE_INT is larger than DImode.  */
> +
> +static inline unsigned HOST_WIDE_INT
> +rotate_di (unsigned HOST_WIDE_INT x, unsigned int shift)
> +{
> +  unsigned HOST_WIDE_INT mask_hi, mask_lo;
> +
> +  mask_hi = (HOST_WIDE_INT_1U << 63 << 1) - (HOST_WIDE_INT_1U << shift);
> +  mask_lo = (HOST_WIDE_INT_1U << shift) - 1;
> +  x = ((x << shift) & mask_hi) | ((x >> (64 - shift)) & mask_lo);
> +  x = (x ^ (HOST_WIDE_INT_1U << 63)) - (HOST_WIDE_INT_1U << 63);
> +  return x;
> +}
> +
> +/* Can C be formed by rotating a 16-bit positive value left by C16LSB?  */
> +
> +static inline bool
> +is_rotate_positive_constant (unsigned HOST_WIDE_INT c, int c16lsb,
> +  HOST_WIDE_INT *val, int *shift,
> +  unsigned HOST_WIDE_INT *mask)
> +{
> +  if ((c & ~(HOST_WIDE_INT_UC (0x7fff) << c16lsb)) == 0)
> +{
> +  /* eg. c = 1100   ... 
> +  -> val = 0x3000, shift = 49, mask = -1ull.  */
> +  if (val)
> + {
> +   c >>= c16lsb;
> +   /* Make the value and shift canonical in the sense of
> +  selecting the smallest value.  For the example above
> +  -> val = 3, shift = 61.  */
> +   int trail_zeros = ctz_hwi (c);
> +   c >>= trail_zeros;
> +   c16lsb += trail_zeros;
> +   *val = c;
> +   *shift = c16lsb;
> +   *mask = HOST_WIDE_INT_M1U;
> + }
> +  return true;
> +}
> +  return false;
> +}
> +
> +/* Can C be formed by rotating a 16-bit negative value left by C16LSB?  */
> +
> +static inline bool
> +is_rotate_negative_constant (unsigned HOST_WIDE_INT c, int c16lsb,
> +  HOST_WIDE_INT *val, int *shift,
> +  unsigned HOST_WIDE_INT *mask)
> +{
> +  if ((c | (HOST_WIDE_INT_UC (0x7fff) << c16lsb)) == HOST_WIDE_INT_M1U)
> +{
> +  if (val)
> + {
> +   c >>= c16lsb;
> +   
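
For readers who want to experiment with the idea outside of GCC, the following is a brute-force sketch of "can this 64-bit constant be formed by rotating a 16-bit signed constant".  It is not the algorithm in the patch (which inspects bit positions directly and also handles masking); the function names are hypothetical and uint64_t stands in for a DImode value.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by SHIFT bits; SHIFT is reduced modulo 64.
std::uint64_t rotl64 (std::uint64_t x, unsigned shift)
{
  shift &= 63;
  return shift == 0 ? x : (x << shift) | (x >> (64 - shift));
}

// Hypothetical brute-force check: try every rotation count and see
// whether undoing it leaves a value that fits in a signed 16-bit
// immediate.  On success *VAL and *SHIFT satisfy
// rotl64 (sign-extended *VAL, *SHIFT) == C.
bool is_rotated_16bit_constant (std::uint64_t c, std::int16_t *val,
				unsigned *shift)
{
  for (unsigned s = 0; s < 64; ++s)
    {
      // Rotating left by 64 - S undoes a left rotate by S.
      std::int64_t r = static_cast<std::int64_t> (rotl64 (c, 64 - s));
      if (r >= INT16_MIN && r <= INT16_MAX)
	{
	  *val = static_cast<std::int16_t> (r);
	  *shift = s;
	  return true;
	}
    }
  return false;
}
```

Such a pair corresponds to a two-instruction sequence of a load-immediate followed by a rotate; a constant like 0x0001000100010001 has no such representation and is rejected.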

[PATCH] PR fortran/95979 - [10/11 Regression] ICE in get_kind, at fortran/simplify.c:129

2020-10-08 Thread Harald Anlauf
Dear all,

the present PR turned out to be fixable rather easily, once Paul had the
idea to add another attempt at simplifying elemental intrinsics
with array-valued arguments.  There was some fallout which required only
small adjustments, see commit message below.

Regtested cleanly on x86_64-pc-linux-gnu.

OK for master / 10-branch?

Thanks,
Harald


PR fortran/95979 - ICE in get_kind, at fortran/simplify.c:129

Simplification of the elemental intrinsic INDEX with constant array-valued
arguments failed with an ICE or did not reduce to a constant array, depending
also on the presence of the optional KIND argument.  Add a further attempt at
simplification in the case of elemental intrinsics, and make sure the KIND
argument is not removed prematurely during simplification of INDEX.

gcc/fortran/ChangeLog:

PR fortran/95979
* expr.c (gfc_check_init_expr): Fix check of return code from
gfc_intrinsic_func_interface.
* intrinsic.c (gfc_intrinsic_func_interface): Add further attempt
of simplification of elemental intrinsics with array arguments.
* iresolve.c (gfc_resolve_index_func): Keep optional KIND argument
for simplification of elemental use of INDEX.

gcc/testsuite/ChangeLog:

PR fortran/95979
* gfortran.dg/index_4.f90: New test.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index b87ae3d72a1..32d905ad179 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -2904,7 +2904,7 @@ gfc_check_init_expr (gfc_expr *e)
 		   && (e->value.function.isym->conversion == 1);

 	if (!conversion && (!gfc_is_intrinsic (sym, 0, e->where)
-	|| (m = gfc_intrinsic_func_interface (e, 0)) != MATCH_YES))
+	|| (m = gfc_intrinsic_func_interface (e, 0)) == MATCH_NO))
 	  {
 	gfc_error ("Function %qs in initialization expression at %L "
 		   "must be an intrinsic function",
diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index ef33587a774..f4dfcf77e0b 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -5038,6 +5038,11 @@ got_specific:
   if (!sym->module)
 gfc_intrinsic_symbol (sym);

+  /* Have another stab at simplification since elemental intrinsics with array
+ actual arguments would be missed by the calls above to do_simplify.  */
+  if (isym->elemental)
+gfc_simplify_expr (expr, 1);
+
   return MATCH_YES;
 }

diff --git a/gcc/fortran/iresolve.c b/gcc/fortran/iresolve.c
index c2a4865f28f..994a9af4eb8 100644
--- a/gcc/fortran/iresolve.c
+++ b/gcc/fortran/iresolve.c
@@ -1296,11 +1296,7 @@ gfc_resolve_index_func (gfc_expr *f, gfc_actual_arglist *a)

   f->ts.type = BT_INTEGER;
   if (kind)
-{
-  f->ts.kind = mpz_get_si ((kind)->value.integer);
-  a_back->next = NULL;
-  gfc_free_actual_arglist (a_kind);
-}
+f->ts.kind = mpz_get_si ((kind)->value.integer);
   else
 f->ts.kind = gfc_default_integer_kind;

diff --git a/gcc/testsuite/gfortran.dg/index_4.f90 b/gcc/testsuite/gfortran.dg/index_4.f90
new file mode 100644
index 000..09093784c8c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/index_4.f90
@@ -0,0 +1,19 @@
+! { dg-do run }
+! { dg-options "-fdump-tree-original" }
+! { dg-final { scan-tree-dump-times "string_index" 0 "original" } }
+! PR fortran/95979
+
+program p
+  implicit none
+  integer, parameter :: i0= index( 'abcd',  'b' , .true. , kind=4)
+  integer, parameter :: i1(*) = index(['abcd'], 'b' , .true. , kind=4)
+  integer, parameter :: i2(*) = index( 'abcd' ,['b'], .true. )
+  integer, parameter :: i3(*) = index( 'abcd' , 'b' ,[.true.])
+  integer, parameter :: i4(*) = index(['abcd'],['b'],[.true.], kind=8)
+  if (size (i1) /= 1) stop 1
+  if (size (i2) /= 1) stop 2
+  if (size (i3) /= 1) stop 3
+  if (size (i4) /= 1) stop 4
+  if (i0 /= 2)stop 5
+  if (i1(1) /= 2 .or. i2(1) /= 2 .or. i3(1) /= 2 .or. i4(1) /= 2) stop 6
+end
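
For readers less familiar with Fortran, the value the test expects from INDEX can be modelled in a few lines.  This is only a sketch of the intrinsic's documented semantics for a single string (1-based result, 0 when the substring is absent, BACK selecting the last occurrence); the function name is hypothetical and nothing here reflects how gfortran simplifies the call.

```cpp
#include <cassert>
#include <string>

// Hypothetical C++ model of Fortran INDEX (STRING, SUBSTRING, BACK).
int fortran_index (const std::string &string, const std::string &substring,
		   bool back = false)
{
  std::string::size_type pos
    = back ? string.rfind (substring) : string.find (substring);
  // Fortran positions start at 1; 0 means "not found".
  return pos == std::string::npos ? 0 : static_cast<int> (pos) + 1;
}
```

With this model, fortran_index ("abcd", "b", true) yields 2, the value the test compares i0 through i4 against.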


Re: [PATCH 7/8] [RS6000] rs6000_rtx_costs reduce cost for SETs

2020-10-08 Thread will schmidt via Gcc-patches
On Thu, 2020-10-08 at 09:27 +1030, Alan Modra via Gcc-patches wrote:
> The aim of this patch is to make rtx_costs for SETs closer to
> insn_cost for SETs.  One visible effect on powerpc code is increased
> if-conversion.
> 
>   * config/rs6000/rs6000.c (rs6000_rtx_costs): Reduce cost of SET
>   operands.
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 76aedbfae6f..d455aa52427 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -21684,6 +21684,35 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
>   }
>return false;
> 
> +case SET:
> +  /* On entry the value in *TOTAL is the number of general purpose
> +  regs being set, multiplied by COSTS_N_INSNS (1).  Handle
> +  costing of set operands specially since in most cases we have
> +  an instruction rather than just a piece of RTL and should
> +  return a cost comparable to insn_cost.  That's a little
> +  complicated because in some cases the cost of SET operands is
> +  non-zero, see point 5 above and cost of PLUS for example, and
> +  in others it is zero, for example for (set (reg) (reg)).
> +  But (set (reg) (reg)) has the same insn_cost as
> +  (set (reg) (plus (reg) (reg))).  Hack around this by
> +  subtracting COSTS_N_INSNS (1) from the operand cost in cases
> +  were we add at least COSTS_N_INSNS (1) for some operation.

s/were/where/   :-)

> +  However, don't do so for constants.  Constants might cost
> +  more than zero when they require more than one instruction,
> +  and we do want the cost of extra instructions.  */
> +  {
> + rtx_code src_code = GET_CODE (SET_SRC (x));
> + if (src_code == CONST_INT
> + || src_code == CONST_DOUBLE
> + || src_code == CONST_WIDE_INT)
> +   return false;
> + int set_cost = (rtx_cost (SET_SRC (x), mode, SET, 1, speed)
> + + rtx_cost (SET_DEST (x), mode, SET, 0, speed));
> + if (set_cost >= COSTS_N_INSNS (1))
> +   *total += set_cost - COSTS_N_INSNS (1);
> + return true;
> +  }
> +
>  default:
>return false;
>  }

lgtm, 
thanks
-Will
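
The arithmetic in the quoted SET case can be tried in isolation with the sketch below.  It assumes COSTS_N_INSNS (N) is N * 4, as in GCC's rtl.h; the function names are hypothetical, the operand costs are passed in rather than computed by rtx_cost, and the early return for constant sources is not modelled.

```cpp
#include <cassert>

// Model of GCC's COSTS_N_INSNS macro (assumed to be N * 4).
constexpr int costs_n_insns (int n) { return n * 4; }

// Hypothetical model of the adjustment: TOTAL_ON_ENTRY is the value in
// *TOTAL on entry, SRC_COST and DEST_COST are the operand costs.
constexpr int adjusted_set_total (int total_on_entry, int src_cost,
				  int dest_cost)
{
  // Drop one instruction's worth of operand cost when the operands
  // account for at least one instruction.
  return (src_cost + dest_cost >= costs_n_insns (1)
	  ? total_on_entry + src_cost + dest_cost - costs_n_insns (1)
	  : total_on_entry);
}
```

Under this model a register copy (operand cost 0) and a single PLUS (operand cost of one instruction) both end up at one instruction, while an operand costing two instructions yields two.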



[r11-3723 Regression] FAIL: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized: basic block" 2 on Linux/x86_64

2020-10-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

532e882f8872b1b4437e3a0fa8c61d2af2d999d4 is the first bad commit
commit 532e882f8872b1b4437e3a0fa8c61d2af2d999d4
Author: Richard Biener 
Date:   Thu Oct 8 11:53:51 2020 +0200

adjust BB vectorization dump scanning

caused

FAIL: gcc.dg/vect/bb-slp-pr78205.c -flto -ffat-lto-objects  
scan-tree-dump-times slp2 "optimized: basic block" 3
FAIL: gcc.dg/vect/bb-slp-pr78205.c scan-tree-dump-times slp2 "optimized: basic 
block" 3
FAIL: gcc.dg/vect/bb-slp-subgroups-3.c -flto -ffat-lto-objects  
scan-tree-dump-times slp2 "optimized: basic block" 2
FAIL: gcc.dg/vect/bb-slp-subgroups-3.c scan-tree-dump-times slp2 "optimized: 
basic block" 2

with GCC configured with

Configured with: ../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3723/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-pr78205.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-pr78205.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-subgroups-3.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/bb-slp-subgroups-3.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


[r11-3705 Regression] FAIL: gcc.dg/vect/pr65947-3.c scan-tree-dump-times vect "LOOP VECTORIZED" 1 on Linux/x86_64

2020-10-08 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

dae673abd37d400408959497e50fe1f3fbef5533 is the first bad commit
commit dae673abd37d400408959497e50fe1f3fbef5533
Author: Richard Biener 
Date:   Wed Oct 7 10:42:12 2020 +0200

tree-optimization/97307 - improve sinking of loads

caused

FAIL: gcc.dg/vect/pr65947-3.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "LOOP VECTORIZED" 1
FAIL: gcc.dg/vect/pr65947-3.c scan-tree-dump-times vect "LOOP VECTORIZED" 1

with GCC configured with

Configured with: ../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-3705/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/pr65947-3.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="vect.exp=gcc.dg/vect/pr65947-3.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [ Preprocessor ] [ Common ] Feature: Macros for identifying the wide and narrow execution string literal encoding

2020-10-08 Thread Joseph Myers
On Thu, 8 Oct 2020, JeanHeyd Meneide via Gcc-patches wrote:

> * gcc/doc/cpp.texi: Document new predefined macro.

This documentation doesn't seem sufficient to use the macros.  Do they 
expand to (narrow) string literals?  To an unquoted sequence of 
characters?  I think from the implementation that the answer is strings 
(so, in particular, not usable for testing anything in #if conditionals), 
but the documentation ought to say so.  The test ought to verify the form 
of the expansion as well (even if it can't do anything useful at execution 
time, because if you make the macros reflect the command-line options they 
are character set names that are meaningful on the host, and any 
conversion functionality on the target may not use the same names as the 
host).
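
To illustrate the point about #if, here is a small sketch.  The macro name and its value are made up for the example (the message above does not name the proposed macros); the only assumption carried over is that the expansion is a narrow string literal.

```cpp
#include <cassert>

// Hypothetical stand-in for a predefined macro that expands to a
// narrow string literal naming a character set.
#define EXAMPLE_EXEC_CHARSET_NAME "UTF-8"

// A string literal cannot be used in a preprocessor condition, e.g.
//   #if EXAMPLE_EXEC_CHARSET_NAME == "UTF-8"
// is rejected, but it can be compared in a constant expression.
constexpr bool same_text (const char *a, const char *b)
{
  while (*a && *a == *b)
    {
      ++a;
      ++b;
    }
  return *a == *b;
}

static_assert (same_text (EXAMPLE_EXEC_CHARSET_NAME, "UTF-8"),
	       "this sketch assumes the stand-in macro names UTF-8");
```

So a string-valued macro can still drive compile-time decisions, just not conditional inclusion by the preprocessor, which is why the documentation should state the form of the expansion.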

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH][GCC 9][AArch64] Implement ACLE intrinsics for FRINT[32,64][Z,X]

2020-10-08 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

I'd like to backport this patch to the GCC 9 branch to implement these 
Armv8.5-a intrinsics that should have been there.
The backport is fairly simple.

Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to GCC 9 branch.

This patch implements the ACLE intrinsics to access the FRINT[32,64][Z,X] 
scalar[1] and vector[2][3] instructions
from Armv8.5-a. These are enabled when the __ARM_FEATURE_FRINT macro is defined.

They're added in a fairly standard way through builtins and unspecs at the RTL 
level.
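
A minimal usage sketch for one of the scalar intrinsics follows.  The wrapper name is hypothetical, and the fallback branch is an assumption added so the example builds on targets without the feature; it only matches the instruction for inputs whose truncated value fits in a 32-bit integer.

```cpp
#include <cassert>
#include <cmath>
#if defined (__ARM_FEATURE_FRINT)
# include <arm_acle.h>
#endif

// Hypothetical wrapper: round X toward zero to an integral value,
// using the FRINT32Z intrinsic when the feature macro is defined.
float round_to_int32_toward_zero (float x)
{
#if defined (__ARM_FEATURE_FRINT)
  return __rint32zf (x);
#else
  // Portable fallback, not equivalent for out-of-range inputs.
  return std::trunc (x);
#endif
}
```

Guarding on __ARM_FEATURE_FRINT keeps the same source usable whether or not the target architecture provides the instructions.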

* config/aarch64/aarch64.md ("unspec"): Add UNSPEC_FRINT32Z,
UNSPEC_FRINT32X, UNSPEC_FRINT64Z, UNSPEC_FRINT64X.
(aarch64_): New define_insn.
* config/aarch64/aarch64.h (TARGET_FRINT): Define.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_FRINT when appropriate.
* config/aarch64/aarch64-simd-builtins.def: Add builtins for frint32z,
frint32x, frint64z, frint64x.
* config/aarch64/arm_acle.h (__rint32zf, __rint32z, __rint64zf,
__rint64z, __rint32xf, __rint32x, __rint64xf, __rint64x): Define.
* config/aarch64/arm_neon.h (vrnd32z_f32, vrnd32zq_f32, vrnd32z_f64,
vrnd32zq_f64, vrnd32x_f32, vrnd32xq_f32, vrnd32x_f64, vrnd32xq_f64,
vrnd64z_f32, vrnd64zq_f32, vrnd64z_f64, vrnd64zq_f64, vrnd64x_f32,
vrnd64xq_f32, vrnd64x_f64, vrnd64xq_f64): Define.
* config/aarch64/iterators.md (VSFDF): Define.
(FRINTNZX): Likewise.
(frintnzs_op): Likewise.

* gcc.target/aarch64/acle/rintnzx_1.c: New test.
* gcc.target/aarch64/simd/vrndnzx_1.c: Likewise.


frint-9.patch
Description: frint-9.patch


Re: libstdc++: std::iterator is deprecated since C++17

2020-10-08 Thread Andreas Schwab
On Oct 08 2020, korel ka via Gcc-patches wrote:

> diff --git a/libstdc++-v3/include/bits/stl_iterator.h
> b/libstdc++-v3/include/bits/stl_iterator.h
> index 2259f7c..13d5dbb 100644
> --- a/libstdc++-v3/include/bits/stl_iterator.h
> +++ b/libstdc++-v3/include/bits/stl_iterator.h

> @@ -625,8 +650,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>*/
>template
>  class back_insert_iterator
> +#if __cplusplus < 201703L
>  : public iterator
> +#endif
>  {
> +#if __cplusplus >= 201703L // C++17
> +public:
> +  // BEGIN: Iterator base types
> +  /// One of the @link iterator_tags tag types@endlink.
> +  using iterator_category  = output_iterator_tag;
> +  /// The type "pointed to" by the iterator.
> +  using value_type = void;
> +#if __cplusplus <= 201703L

This uses <=, but the occurrences in stream[buf]_iterator.h use ==.  I
think the latter is less confusing.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] libstdc++: Implement C++20 features for

2020-10-08 Thread Jonathan Wakely via Gcc-patches

On 07/10/20 18:15 -0700, Thomas Rodgers wrote:

@@ -500,6 +576,40 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
  }
#endif

+#if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
+  basic_istringstream(ios_base::openmode __mode, const allocator_type& __a)
+  : __istream_type(), _M_stringbuf(__mode | ios_base::in, __a)
+  { this->init(&_M_stringbuf); }


All these & operators need to be std::__addressof(_M_stringbuf)
instead. _M_stringbuf potentially depends on program-defined types
(the traits and allocator classes) which means user namespaces are
considered for ADL and they could define an operator& that gets used.
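
A standalone sketch of the hazard, using the public std::addressof rather than the internal std::__addressof, and with made-up type names:

```cpp
#include <cassert>
#include <memory>

namespace user
{
  // Hypothetical program-defined traits class.
  struct traits_like { };

  // An overly generic operator& in the user's namespace.  ADL finds it
  // for any type that has namespace user as an associated namespace.
  template<typename T>
    T *operator& (T &) { return nullptr; }
}

// Hypothetical library class template parameterised on a traits type.
template<typename Traits>
  struct buf_like { int data = 0; };

// True if plain & on OBJ does not yield its real address.
template<typename T>
  bool ampersand_is_hijacked (T &obj)
  {
    return &obj != std::addressof (obj);
  }
```

For buf_like<user::traits_like> the template argument makes namespace user an associated namespace, so the plain & picks up user::operator&; std::addressof is unaffected.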



+
+  explicit basic_istringstream(__string_type&& __str,
+  ios_base::openmode __mode = ios_base::in )
+  : __istream_type(), _M_stringbuf(std::move(__str), __mode | ios_base::in)
+  { this->init(&_M_stringbuf); }
+
+  template
+   basic_istringstream(const basic_string<_CharT, _Traits, _SAlloc>& __str,
+   const allocator_type& __a)
+   : basic_istringstream(__str, ios_base::in, __a)
+   { }
+
+  using __sv_type = basic_string_view;


This typedef seems to only be used once. Might as well just use
basic_string_view directly in the return type
of view().

Similarly in basic_ostringstream and basic_stringstream.


diff --git a/libstdc++-v3/src/c++20/Makefile.in 
b/libstdc++-v3/src/c++20/Makefile.in
new file mode 100644
index 000..0e2de19ae59
diff --git a/libstdc++-v3/src/c++20/sstream-inst.cc 
b/libstdc++-v3/src/c++20/sstream-inst.cc
new file mode 100644
index 000..c419176ae8e
--- /dev/null
+++ b/libstdc++-v3/src/c++20/sstream-inst.cc
@@ -0,0 +1,111 @@
+// Explicit instantiation file.
+
+// Copyright (C) 1997-2020 Free Software Foundation, Inc.


Just 2020 here.


+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+//
+// ISO C++ 14882:
+//
+
+#ifndef _GLIBCXX_USE_CXX11_ABI
+// Instantiations in this file use the new SSO std::string ABI unless included
+// by another file which defines _GLIBCXX_USE_CXX11_ABI=0.


This copied comment is misleading now if we're not actually going
to include it from another file to generate the old ABI symbols.

I think just define it unconditionally and add a comment saying that
these new symbols are only defined for the SSO string ABI.


+# define _GLIBCXX_USE_CXX11_ABI 1
+#endif
+#include 
+
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+template basic_stringbuf::basic_stringbuf(const allocator_type&);
+template basic_stringbuf::basic_stringbuf(ios_base::openmode,
+const allocator_type&);
+template basic_stringbuf::basic_stringbuf(__string_type&&,
+ios_base::openmode);
+template basic_stringbuf::basic_stringbuf(basic_stringbuf&&,
+const allocator_type&);
+template basic_stringbuf::allocator_type
+basic_stringbuf::get_allocator() const noexcept;
+template basic_stringbuf::__sv_type


Looks like this would be a bit simpler if it just used string_view
here, not basic_stringbuf::__sv_type, and wstring_view below
for the wchar_t specializations.

And you could use allocator instead of
basic_stringbuf::allocator_type.

That looks a little cleaner to me, but it's a matter of opinion.

That would be necessary anyway for the basic_*stringstream types if
they don't have the __sv_type any more.



diff --git a/libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/1.cc 
b/libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/1.cc
new file mode 100644
index 000..d93141fc232
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/1.cc
@@ -0,0 +1,85 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free 

libstdc++: std::iterator is deprecated since C++17

2020-10-08 Thread korel ka via Gcc-patches
std::iterator has been deprecated since C++17 and shouldn't be used.
This patch marks std::iterator as deprecated using the deprecated
attribute, and replaces its uses with the required member types
inside each class.

libstdc++-v3/include/bits/ChangeLog:
stl_iterator_base_types.h: Add deprecated attribute to std::iterator.
stl_bvector.h: Replace std::iterator usage with required member types.
stl_iterator.h: Replace std::iterator usage with required member types.
stl_raw_storage_iter.h: Replace std::iterator usage with
required member types.
stream_iterator.h: Replace std::iterator usage with required
member types.
streambuf_iterator.h: Replace std::iterator usage with
required member types.

libstdc++-v3/include/ext/ChangeLog:
rope: Replace std::iterator usage with required member types.
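
As an illustration of the pattern applied throughout the patch (a sketch with a made-up class, not code from libstdc++), an output iterator can declare the five member types itself instead of inheriting them from std::iterator:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical output iterator that appends to a container.
template<typename Container>
  class push_back_writer
  {
  public:
    // Member types formerly supplied by the std::iterator base class.
    using iterator_category = std::output_iterator_tag;
    using value_type = void;
    using difference_type = std::ptrdiff_t;
    using pointer = void;
    using reference = void;

    explicit push_back_writer (Container &c)
    : container (std::addressof (c)) { }

    push_back_writer &operator= (const typename Container::value_type &v)
    {
      container->push_back (v);
      return *this;
    }

    push_back_writer &operator* () { return *this; }
    push_back_writer &operator++ () { return *this; }
    push_back_writer operator++ (int) { return *this; }

  private:
    Container *container;
  };
```

std::iterator_traits picks up the nested types directly, so generic code sees the same category and value type as it would with the base class.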

---

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index d6f5435..e92b895 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -140,8 +140,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #endif

   struct _Bit_iterator_base
+#if __cplusplus < 201703L
   : public std::iterator
+#endif
   {
+#if __cplusplus >= 201703L // C++17
+// BEGIN: Iterator base types
+/// One of the @link iterator_tags tag types@endlink.
+using iterator_category = std::random_access_iterator_tag;
+/// The type "pointed to" by the iterator.
+using value_type   = bool;
+/// Distance between iterators is represented as this type.
+using difference_type  = ptrdiff_t;
+/// This type represents a pointer-to-value_type.
+using pointer  = bool*;
+/// This type represents a reference-to-value_type.
+using reference= bool&;
+// END: Iterator base types
+#endif
+
 _Bit_type * _M_p;
 unsigned int _M_offset;

diff --git a/libstdc++-v3/include/bits/stl_iterator.h
b/libstdc++-v3/include/bits/stl_iterator.h
index 2259f7c..13d5dbb 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -124,12 +124,35 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   template
 class reverse_iterator
+#if __cplusplus < 201703L
 : public iterator::iterator_category,
  typename iterator_traits<_Iterator>::value_type,
  typename iterator_traits<_Iterator>::difference_type,
  typename iterator_traits<_Iterator>::pointer,
   typename iterator_traits<_Iterator>::reference>
+#endif
 {
+#if __cplusplus >= 201703L // C++17
+protected:
+  typedef iterator_traits<_Iterator> __traits_type;
+
+public:
+  // BEGIN: Iterator base types
+#if !__cpp_lib_concepts
+  /// One of the @link iterator_tags tag types@endlink.
+  using iterator_category  = typename __traits_type::iterator_category;
+#endif
+  /// The type "pointed to" by the iterator.
+  using value_type = typename
__traits_type::value_type;
+  /// Distance between iterators is represented as this type.
+  using difference_type= typename __traits_type::difference_type;
+  /// This type represents a pointer-to-value_type.
+  using pointer= typename __traits_type::pointer;
+  /// This type represents a reference-to-value_type.
+  using reference  = typename __traits_type::reference;
+  // END: Iterator base types
+   private:
+#endif
   template
friend class reverse_iterator;

@@ -143,14 +166,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

 protected:
   _Iterator current;
-
+#if __cplusplus < 201703L
   typedef iterator_traits<_Iterator>   __traits_type;
-
+#endif
 public:
   typedef _Iteratoriterator_type;
-  typedef typename __traits_type::difference_type  difference_type;
+#if __cplusplus < 201703L
+ typedef typename __traits_type::difference_type   difference_type;
   typedef typename __traits_type::pointer  pointer;
   typedef typename __traits_type::referencereference;
+#endif

 #if __cplusplus > 201703L && __cpp_lib_concepts
   using iterator_concept
@@ -625,8 +650,28 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   template
 class back_insert_iterator
+#if __cplusplus < 201703L
 : public iterator
+#endif
 {
+#if __cplusplus >= 201703L // C++17
+public:
+  // BEGIN: Iterator base types
+  /// One of the @link iterator_tags tag types@endlink.
+  using iterator_category  = output_iterator_tag;
+  /// The type "pointed to" by the iterator.
+  using value_type = void;
+#if __cplusplus <= 201703L
+  /// Distance between iterators is represented as this type.
+ 

Re: libstdc++: Attempt to resolve PR83562

2020-10-08 Thread Liu Hao via Gcc-patches
On 2020/10/8 22:56, Jason Merrill wrote:
> On 10/7/20 10:52 PM, Liu Hao via Gcc-patches wrote:
>> [Please CC me as I am not subscribed to this list.]
> 
> Hmm, why isn't the mingw implementation used for all programs?  That would 
> avoid the bug.
> 

I am afraid the libstdc++ implementation has to be kept for compatibility, as 
the mingw-w64 one was only added two years
ago. Neither am I clear about MinGW.org.

> 
> This patch is a good start but won't actually fix the bug: the calling 
> convention only makes a difference when we actually
> call the function, at the line
> 
>     e->destructor (e->object);
> 
> in atexit_thread.cc, and your patch doesn't change the calling convention for 
> elt::destructor.
> 
> I'd think we should add the attribute to __cxa_cdtor_type, and use it more 
> consistently for __cxa_*atexit and __cxa_throw.
> 

This patch contains a hunk

```diff
@@ -52,7 +57,7 @@ namespace {
   // One element in a singly-linked stack of cleanups.
   struct elt
   {
-void (*destructor)(void *);
+void (_GLIBCXX_CDTOR_CALLABI *destructor)(void *);
 void *object;
 elt *next;
 #ifdef _GLIBCXX_THREAD_ATEXIT_WIN32
```

which should suffice. I am not against introduction of another macro for this 
calling convention thing, as the name
`__cxa_thread_atexit()` doesn't suggest its first argument is a non-static 
member function. (Technically I think it should
not be, so GCC's use of it to register the destructor looks like a bug. The 
call should go through a thunk.)
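
To illustrate why the annotation has to sit on the stored pointer type,
here is a simplified, self-contained sketch of such a cleanup stack.
All names prefixed with demo_ or DEMO_ are hypothetical;
DEMO_CDTOR_CALLABI only stands in for the macro in the hunk above, and
the condition used to define it is an assumption, not the libstdc++
definition.

```cpp
#include <cassert>
#include <vector>

// Assumption: only 32-bit x86 MinGW needs a non-default convention.
#if defined(__i386__) && defined(__MINGW32__)
# define DEMO_CDTOR_CALLABI __attribute__((__thiscall__))
#else
# define DEMO_CDTOR_CALLABI
#endif

// The convention is part of the pointer type, so the indirect call in
// demo_run below uses the same convention as the registered function.
using demo_cdtor_type = void (DEMO_CDTOR_CALLABI *)(void *);

// One element in a singly-linked stack of cleanups.
struct demo_elt
{
  demo_cdtor_type destructor;
  void *object;
  demo_elt *next;
};

// Push a cleanup on top of the stack.
inline void demo_register(demo_elt *&head, demo_cdtor_type dtor, void *obj)
{
  head = new demo_elt{dtor, obj, head};
}

// Pop and run all cleanups, most recently registered first.
inline void demo_run(demo_elt *&head)
{
  while (head)
    {
      demo_elt *e = head;
      head = e->next;
      e->destructor(e->object);
      delete e;
    }
}

// Test payload: records the order in which cleanups run.
struct demo_obj
{
  std::vector<int> *log;
  int id;
};

inline void DEMO_CDTOR_CALLABI demo_destroy(void *p)
{
  demo_obj *o = static_cast<demo_obj *>(p);
  o->log->push_back(o->id);
}
```

On targets where the macro expands to nothing this is an ordinary
function pointer, so the sketch builds unchanged elsewhere.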




-- 
Best regards,
LH_Mouse





Re: [PATCH 5/5] Conversions between 128-bit integer and floating point values.

2020-10-08 Thread will schmidt via Gcc-patches
On Mon, 2020-10-05 at 11:52 -0700, Carl Love wrote:
> Will, Segher:
> 
> This patch adds support for converting to/from 128-bit integers and
> 128-bit decimal floating point formats using the new P10 instructions
> dcffixqq and dctfixqq.  The new instructions are only used on P10 HW,
> otherwise the conversions continue to use the existing SW routines.
> 
> The changes from the previous version include:
> 
> Fixed up the change log entry issues noted by Will.
> 
> Regression tests reran on Power 9 LE with no regression errors.
> 
> Please let me know if it looks OK to commit to mainline.
> 
>   Carl 
> -
> 
> gcc/ChangeLog
> 
> 2020-10-05  Carl Love  
>   * config/rs6000/rs6000.md (floatti2, floatunsti2,
>   fix_truncti2, fixuns_truncti2): Add
>   define_insn for mode IEEE 128.
>   * libgcc/config/rs6000/fixkfti.c: Renamed to fixkfti-sw.c.
>   Update source function name.  White space fixes.

I'd move the generic 'Update source ... White space...' bits to the
patch description.  In addition to the 'Rename' statement, some form of
'Change calls of __fixkfti to __fixkfti_sw' would be more useful here.

>   * libgcc/config/rs6000/fixunskfti.c: Renamed to fixunskfti-sw.c.
>   Update source function name.  White space fixes.
>   * libgcc/config/rs6000/float128-hw.c (__floattikf_hw,
>   __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw):
>   New functions.
ok
>   * libgcc/config/rs6000/float128-ifunc.c (SW_OR_HW_ISA3_1):
>   New macro.
>   (__floattikf_resolve, __floatuntikf_resolve, __fixkfti_resolve,
>   __fixunskfti_resolve): Add resolve functions.
>   (__floattikf, __floatuntikf, __fixkfti, __fixunskfti): New
>   functions.
ok
>   * libgcc/config/rs6000/float128-sed (floattitf, __floatuntitf,
>   __fixtfti, __fixunstfti): Add editor commands to change
>   names.
>   * libgcc/config/rs6000/float128-sed-hw (__floattitf,
>   __floatuntitf, __fixtfti, __fixunstfti): Add editor commands
>   to change names.
ok

>   * libgcc/config/rs6000/floattikf.c: Renamed to floattikf-sw.c.
>   * libgcc/config/rs6000/floatuntikf.c: Renamed to floatuntikf-sw.c.
>   * libgcc/config/rs6000/quad-float128.h (__floattikf_sw,
>   __floatuntikf_sw, __fixkfti_sw, __fixunskfti_sw, __floattikf_hw,
>   __floatuntikf_hw, __fixkfti_hw, __fixunskfti_hw, __floattikf,
>   __floatuntikf, __fixkfti, __fixunskfti): New extern declarations.
>   * libgcc/config/rs6000/t-float128 (floattikf, floatuntikf,
>   fixkfti, fixunskfti): Remove file names from fp128_ppc_funcs.
>   (floattikf-sw, floatuntikf-sw, fixkfti-sw, fixunskfti-sw): Add
>   file names to fp128_ppc_funcs.

> 
> gcc/testsuite/ChangeLog
> 
> 2020-10-05  Carl Love  
>   * gcc.target/powerpc/fl128_conversions.c: New file.

fp128_conversions.c



> ---
>  gcc/config/rs6000/rs6000.md   |  36 +++
>  .../gcc.target/powerpc/fp128_conversions.c| 286 ++
>  .../config/rs6000/{fixkfti.c => fixkfti-sw.c} |   4 +-
>  .../rs6000/{fixunskfti.c => fixunskfti-sw.c}  |   7 +-
>  libgcc/config/rs6000/float128-hw.c|  24 ++
>  libgcc/config/rs6000/float128-ifunc.c |  44 ++-
>  libgcc/config/rs6000/float128-sed |   4 +
>  libgcc/config/rs6000/float128-sed-hw  |   4 +
>  .../rs6000/{floattikf.c => floattikf-sw.c}|   4 +-
>  .../{floatuntikf.c => floatuntikf-sw.c}   |   4 +-
>  libgcc/config/rs6000/quad-float128.h  |  17 +-
>  libgcc/config/rs6000/t-float128   |   3 +-
>  12 files changed, 417 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/fp128_conversions.c
>  rename libgcc/config/rs6000/{fixkfti.c => fixkfti-sw.c} (96%)
>  rename libgcc/config/rs6000/{fixunskfti.c => fixunskfti-sw.c} (90%)
>  rename libgcc/config/rs6000/{floattikf.c => floattikf-sw.c} (96%)
>  rename libgcc/config/rs6000/{floatuntikf.c => floatuntikf-sw.c} (96%)
> 
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 694ff70635e..5db5d0b4505 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6390,6 +6390,42 @@
> xscvsxddp %x0,%x1"
>[(set_attr "type" "fp")])
> 
> +(define_insn "floatti2"
> +  [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v")
> +   (float:IEEE128 (match_operand:TI 1 "vsx_register_operand" "v")))]
> +  "TARGET_POWER10"
> +{
> +  return  "xscvsqqp %0,%1";
> +}
> +  [(set_attr "type" "fp")])
> +
> +(define_insn "floatunsti2"
> +  [(set (match_operand:IEEE128 0 "vsx_register_operand" "=v")
> +   (unsigned_float:IEEE128 (match_operand:TI 1 "vsx_register_operand" 
> "v")))]
> +  "TARGET_POWER10"
> +{
> +  return  "xscvuqqp %0,%1";
> +}
> +  [(set_attr "type" "fp")])
> +
> +(define_insn "fix_truncti2"
> +  [(set (match_operand:TI 0 "vsx_register_operand" "=v")
> +   (fix:TI (match_operand:IEEE128 1 

Re: [PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Aldy Hernandez via Gcc-patches




On 10/8/20 5:08 PM, Jakub Jelinek wrote:

On Thu, Oct 08, 2020 at 04:55:07PM +0200, Aldy Hernandez via Gcc-patches wrote:

Yes, for max == 0 aka [0, 0] I wanted:
1) if mini == -1, i.e. the DEFINED_VALUE_AT_ZERO == 2 VALUE is -1, return [-1, 
-1]
2) if maxi == prec, i.e. DEFINED_VALUE_AT_ZERO == 2 VALUE is prec, return 
[prec, prec]


Ah, I see.  Do you mind commenting that?  Or perhaps you could spell it out
obviously like:

if (max == 0) {
...
if (DEFINED_VALUE_AT_ZERO)
// do special things
...
}

But whatever is fine.  I hope to never look at these bits ever again :).


Added several comments now (but just in gimple-range.cc, I assume
vr-values.c code is what you want to kill eventually).


Thanks, and yes... vr-values is on life support :).

Aldy



Re: [PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 04:55:07PM +0200, Aldy Hernandez via Gcc-patches wrote:
> > Yes, for max == 0 aka [0, 0] I wanted:
> > 1) if mini == -1, i.e. the DEFINED_VALUE_AT_ZERO == 2 VALUE is -1, return 
> > [-1, -1]
> > 2) if maxi == prec, i.e. DEFINED_VALUE_AT_ZERO == 2 VALUE is prec, return 
> > [prec, prec]
> 
> Ah, I see.  Do you mind commenting that?  Or perhaps you could spell it out
> obviously like:
> 
> if (max == 0) {
>   ...
>   if (DEFINED_VALUE_AT_ZERO)
>   // do special things
>   ...
> }
> 
> But whatever is fine.  I hope to never look at these bits ever again :).

Added several comments now (but just in gimple-range.cc, I assume
vr-values.c code is what you want to kill eventually).
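
As a reading aid for the [0, 0] handling discussed above, here is a
standalone sketch of the CTZ result-range logic.  It is not the GCC code:
demo_ctz_range and demo_floor_log2 are hypothetical helpers, the
std::optional parameter stands in for "CTZ_DEFINED_VALUE_AT_ZERO == 2
with this VALUE", and only the two values -1 and prec are modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// floor(log2(x)) for x > 0.
inline int demo_floor_log2(std::uint64_t x)
{
  int r = -1;
  while (x)
    {
      x >>= 1;
      ++r;
    }
  return r;
}

// Result range [mini, maxi] of ctz for an unsigned argument known to lie
// in [lo, hi], for a type with PREC bits.  VALUE_AT_ZERO is empty when
// ctz (0) is undefined.
inline std::pair<int, int>
demo_ctz_range(std::uint64_t lo, std::uint64_t hi, int prec,
               std::optional<int> value_at_zero)
{
  assert(lo <= hi);
  int mini = 0;
  int maxi = prec - 1;
  if (value_at_zero == -1)
    mini = -1;
  else if (value_at_zero == prec)
    maxi = prec;

  // A non-zero argument always yields [0, prec - 1].
  if (lo != 0)
    {
      mini = 0;
      maxi = prec - 1;
    }

  if (hi == 0)
    {
      // Argument is [0, 0]: return [-1, -1] or [prec, prec] when the
      // value at zero is defined; otherwise leave the range alone.
      if (mini == -1)
        maxi = -1;
      else if (maxi == prec)
        mini = prec;
    }
  else if (maxi != prec)
    // High bits known to be zero lower the maximum.
    maxi = demo_floor_log2(hi);

  return {mini, maxi};
}
```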

2020-10-08  Jakub Jelinek  

PR tree-optimization/94801
PR target/97312
* vr-values.c (vr_values::extract_range_basic) : When stmt is not an internal-fn call or
C?Z_DEFINED_VALUE_AT_ZERO is not 2, assume argument is not zero
and thus use [0, prec-1] range unless it can be further improved.
For CTZ, don't update maxi from upper bound if it was previously prec.
* gimple-range.cc (gimple_ranger::range_of_builtin_call) : Likewise.

* gcc.dg/tree-ssa/pr94801.c: New test.

--- gcc/vr-values.c.jj  2020-10-07 10:47:47.065983121 +0200
+++ gcc/vr-values.c 2020-10-08 15:23:56.042631592 +0200
@@ -1208,34 +1208,42 @@ vr_values::extract_range_basic (value_ra
  mini = 0;
  maxi = 1;
  goto bitop_builtin;
- /* __builtin_c[lt]z* return [0, prec-1], except for
+ /* __builtin_clz* return [0, prec-1], except for
 when the argument is 0, but that is undefined behavior.
-On many targets where the CLZ RTL or optab value is defined
-for 0 the value is prec, so include that in the range
-by default.  */
+Always handle __builtin_clz* which can be only written
+by user as UB on 0 and so [0, prec-1] range, and the internal-fn
+calls depending on how CLZ_DEFINED_VALUE_AT_ZERO is defined.  */
CASE_CFN_CLZ:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
- maxi = prec;
+ maxi = prec - 1;
  mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg));
- if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
- && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov)
- /* Handle only the single common value.  */
- && zerov != prec)
-   /* Magic value to give up, unless vr0 proves
-  arg is non-zero.  */
-   mini = -2;
+ if (gimple_call_internal_p (stmt))
+   {
+ if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
+ && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
+   {
+ /* Handle only the single common value.  */
+ if (zerov == prec)
+   maxi = prec;
+ /* Magic value to give up, unless vr0 proves
+arg is non-zero.  */
+ else
+   mini = -2;
+   }
+   }
  if (TREE_CODE (arg) == SSA_NAME)
{
  const value_range_equiv *vr0 = get_value_range (arg);
  /* From clz of VR_RANGE minimum we can compute
 result maximum.  */
  if (vr0->kind () == VR_RANGE
- && TREE_CODE (vr0->min ()) == INTEGER_CST)
+ && TREE_CODE (vr0->min ()) == INTEGER_CST
+ && integer_nonzerop (vr0->min ()))
{
  maxi = prec - 1 - tree_floor_log2 (vr0->min ());
- if (maxi != prec)
+ if (mini == -2)
mini = 0;
}
  else if (vr0->kind () == VR_ANTI_RANGE
@@ -1251,9 +1259,14 @@ vr_values::extract_range_basic (value_ra
  if (vr0->kind () == VR_RANGE
  && TREE_CODE (vr0->max ()) == INTEGER_CST)
{
- mini = prec - 1 - tree_floor_log2 (vr0->max ());
- if (mini == prec)
-   break;
+ int newmini = prec - 1 - tree_floor_log2 (vr0->max ());
+ if (newmini == prec)
+   {
+ if (maxi == prec)
+   mini = prec;
+   }
+ else
+   mini = newmini;
}
}
  if (mini == -2)
@@ -1261,27 +1274,30 @@ vr_values::extract_range_basic (value_ra
  goto bitop_builtin;
  /* __builtin_ctz* return [0, prec-1], except for
 when the argument is 0, but that is undefined behavior.
-If there is a ctz optab for this mode and
-CTZ_DEFINED_VALUE_AT_ZERO, include that in the range,
-otherwise just assume 0 won't be seen.  */
+Always 

Re: [PATCH v3] c, c++: Implement -Wsizeof-array-div [PR91741]

2020-10-08 Thread Marek Polacek via Gcc-patches
Ping for the C parts.

On Mon, Sep 28, 2020 at 02:15:41PM -0400, Marek Polacek via Gcc-patches wrote:
> On Tue, Sep 22, 2020 at 04:07:41PM -0400, Jason Merrill wrote:
> > On 9/22/20 1:29 PM, Marek Polacek wrote:
> > > Ping.
> > 
> > The C++ change is OK.
> 
> Ping for the C parts.
> 
> > > On Tue, Sep 15, 2020 at 04:33:05PM -0400, Marek Polacek via Gcc-patches 
> > > wrote:
> > > > On Tue, Sep 15, 2020 at 09:04:41AM +0200, Jakub Jelinek via Gcc-patches 
> > > > wrote:
> > > > > On Mon, Sep 14, 2020 at 09:30:44PM -0400, Marek Polacek via 
> > > > > Gcc-patches wrote:
> > > > > > --- a/gcc/c/c-tree.h
> > > > > > +++ b/gcc/c/c-tree.h
> > > > > > @@ -147,6 +147,11 @@ struct c_expr
> > > > > >etc), so we stash a copy here.  */
> > > > > > source_range src_range;
> > > > > > +  /* True iff the sizeof expression was enclosed in parentheses.
> > > > > > + NB: This member is currently only initialized when 
> > > > > > .original_code
> > > > > > + is a SIZEOF_EXPR.  ??? Add a default constructor to this 
> > > > > > class.  */
> > > > > > +  bool parenthesized_p;
> > > > > > +
> > > > > > /* Access to the first and last locations within the source 
> > > > > > spelling
> > > > > >of this expression.  */
> > > > > > location_t get_start () const { return src_range.m_start; }
> > > > > 
> > > > > I think a magic tree code would be better, c_expr is used in too many 
> > > > > places
> > > > > and returned by many functions, so it is copied over and over.
> > > > > Even if you must add it, it would be better to change the struct 
> > > > > layout,
> > > > > because right now there are fields: tree, location_t, tree, 
> > > > > 2xlocation_t,
> > > > > which means 32-bit gap on 64-bit hosts before the second tree, so the 
> > > > > new
> > > > > field would fit in there.  But, if it is mostly uninitialized, it is 
> > > > > kind of
> > > > > unclean.
> > > > 
> > > > Ok, here's a version with PAREN_SIZEOF_EXPR.  It doesn't require 
> > > > changes to
> > > > c_expr, but adding a new tree code is always a pain...
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > > This patch implements a new warning, -Wsizeof-array-div.  It warns about
> > > > code like
> > > > 
> > > >int arr[10];
> > > >sizeof (arr) / sizeof (short);
> > > > 
> > > > where we have a division of two sizeof expressions, where the first
> > > > argument is an array, and the second sizeof does not equal the size
> > > > of the array element.  See e.g. 
> > > > .
> > > > 
> > > > Clang makes it possible to suppress the warning by parenthesizing the
> > > > second sizeof like this:
> > > > 
> > > >sizeof (arr) / (sizeof (short));
> > > > 
> > > > so I followed suit.  In the C++ FE this was rather easy, because
> > > > finish_parenthesized_expr already set TREE_NO_WARNING.  In the C FE
> > > > I've added a new tree code, PAREN_SIZEOF_EXPR, to discern between the
> > > > non-() and () versions.
> > > > 
> > > > This warning is enabled by -Wall.  An example of the output:
> > > > 
> > > > x.c:5:23: warning: expression does not compute the number of elements 
> > > > in this array; element type is ‘int’, not ‘short int’ 
> > > > [-Wsizeof-array-div]
> > > >  5 |   return sizeof (arr) / sizeof (short);
> > > >|  ~^~~~
> > > > x.c:5:25: note: add parentheses around ‘sizeof (short int)’ to silence 
> > > > this warning
> > > >  5 |   return sizeof (arr) / sizeof (short);
> > > >| ^~
> > > >| ( )
> > > > x.c:4:7: note: array ‘arr’ declared here
> > > >  4 |   int arr[10];
> > > >|   ^~~
> > > > 
> > > > gcc/c-family/ChangeLog:
> > > > 
> > > > PR c++/91741
> > > > * c-common.c (verify_tree): Handle PAREN_SIZEOF_EXPR.
> > > > (c_common_init_ts): Likewise.
> > > > * c-common.def (PAREN_SIZEOF_EXPR): New tree code.
> > > > * c-common.h (maybe_warn_sizeof_array_div): Declare.
> > > > * c-warn.c (sizeof_pointer_memaccess_warning): Unwrap NOP_EXPRs.
> > > > (maybe_warn_sizeof_array_div): New function.
> > > > * c.opt (Wsizeof-array-div): New option.
> > > > 
> > > > gcc/c/ChangeLog:
> > > > 
> > > > PR c++/91741
> > > > * c-parser.c (c_parser_binary_expression): Implement 
> > > > -Wsizeof-array-div.
> > > > (c_parser_postfix_expression): Set PAREN_SIZEOF_EXPR.
> > > > (c_parser_expr_list): Handle PAREN_SIZEOF_EXPR like SIZEOF_EXPR.
> > > > * c-tree.h (char_type_p): Declare.
> > > > * c-typeck.c (char_type_p): No longer static.
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > PR c++/91741
> > > > * typeck.c (cp_build_binary_op): Implement -Wsizeof-array-div.
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > PR c++/91741
> > > > * 

Re: libstdc++: Attempt to resolve PR83562

2020-10-08 Thread Jason Merrill via Gcc-patches

On 10/7/20 10:52 PM, Liu Hao via Gcc-patches wrote:

[Please CC me as I am not subscribed to this list.]
[This patch is only a draft and has not been tested at all.]



Some details have been discussed in [1]. mingw-w64 has got an implementation 
[2] [3] for static libraries, but it takes a
destructor using the `__thiscall` convention. As both functions have C linkage, 
they should agree with each other and should
behave similarly.


Hmm, why isn't the mingw implementation used for all programs?  That 
would avoid the bug.



Considerations:

0) This is going to be an ABI breakage for i?86. I am not sure whether people 
would expect the old, broken behavior.


I'm sure they wouldn't.


1) GCC doesn't provide `__cxa_atexit()`, but it would suffer from the same 
problem. At the moment GCC calls `atexit()` to
register destructors so it appears to work.


Yes.


2) There is an explicit reference to `__cxa_atexit` in 'gcc/cp/decl.c' and it 
should probably be changed, unless it is not used by
i?86-w64-mingw32 targets.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83562
[2] 
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-crt/crt/cxa_thread_atexit.c
[3] 
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/f3e0fbb40cbc9f8821db8bd8a0c4dae8ff671e9f/
[4] https://github.com/msys2/MINGW-packages/issues/7071


This patch is a good start but won't actually fix the bug: the calling 
convention only makes a difference when we actually call the function, 
at the line


e->destructor (e->object);

in atexit_thread.cc, and your patch doesn't change the calling 
convention for elt::destructor.


I'd think we should add the attribute to __cxa_cdtor_type, and use it 
more consistently for __cxa_*atexit and __cxa_throw.


Jason



Re: [PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Aldy Hernandez via Gcc-patches




On 10/8/20 4:39 PM, Jakub Jelinek wrote:

On Thu, Oct 08, 2020 at 04:28:37PM +0200, Aldy Hernandez wrote:

On 10/8/20 3:54 PM, Jakub Jelinek wrote:

On Thu, Oct 08, 2020 at 12:22:11PM +0200, Jakub Jelinek via Gcc-patches wrote:

Perhaps another way out of this would be document and enforce that
__builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2


Huh, that magic 2 is not obvious.  I guess we should document the values of
this macro in the source somewhere:


The 2 is documented in gccint documentation.


BTW.  There's no reason why the vr-values can't just call the
gimple_ranger::range_of_builtin_call.  In the original implementation we had
vr-values call the ranger version and trap if they differed.  I'm pretty
sure you can have vr-values call range_of_builtin_call with a value_range,
and things will get squashed down appropriately.  We should really only have
one version of this.  I'm not suggesting you do it, but I wouldn't object to
it ;-).


Will defer that to you or Andrew ;).


I can do it once the dust settles.




--- gcc/gimple-range.cc.jj  2020-10-08 11:55:25.498313173 +0200
+++ gcc/gimple-range.cc 2020-10-08 15:36:14.926945183 +0200



@@ -714,8 +730,14 @@ gimple_ranger::range_of_builtin_call (ir
  // the maximum.
  wide_int max = r.upper_bound ();
  if (max == 0)
-   break;
- maxi = wi::floor_log2 (max);
+   {
+ if (mini == -1)
+   maxi = -1;
+ else if (maxi == prec)
+   mini = prec;
+   }
+ else if (maxi != prec)
+   maxi = wi::floor_log2 (max);


Hmmm, if max == 0, that means the range is [0, 0], because if the highest
bound of r is 0, there's nothing left on the bottom but another 0 since R is
unsigned.  Is that what you meant?


Yes, for max == 0 aka [0, 0] I wanted:
1) if mini == -1, i.e. the DEFINED_VALUE_AT_ZERO == 2 VALUE is -1, return [-1, 
-1]
2) if maxi == prec, i.e. DEFINED_VALUE_AT_ZERO == 2 VALUE is prec, return 
[prec, prec]


Ah, I see.  Do you mind commenting that?  Or perhaps you could spell it 
out obviously like:


if (max == 0) {
...
if (DEFINED_VALUE_AT_ZERO)
// do special things
...
}

But whatever is fine.  I hope to never look at these bits ever again :).

Aldy



Re: [PATCH 4/5] Test 128-bit shifts for just the int128 type.

2020-10-08 Thread will schmidt via Gcc-patches
On Mon, 2020-10-05 at 11:52 -0700, Carl Love wrote:
> Will, Segher:
> 
> Patch 4 adds the vector 128-bit integer shift instruction support for
> the V1TI type.
> 
> The changes from the previous version include:
> 
> Fixed up the change log entry issues noted by Will.
> 
> Regression tests reran on Power 9 LE with no regression errors.
> 
> Please let me know if it looks OK to commit to mainline.
> 
>   Carl 
> -
> 
> gcc/ChangeLog
> 
> 2020-10-05  Carl Love  
>   * config/rs6000/altivec.md (altivec_vslq, altivec_vsrq):
>   Rename to altivec_vslq_, altivec_vsrq_, mode VEC_TI.
>   * config/rs6000/vector.md (VEC_TI): New name for VSX_TI iterator.

What was the old name?   (Maybe just 'New iterator' ?)
OK, back from below.  This is the new name and location for what was
previously named VSX_TI in vsx.md.
Wouldn't hurt to have a statement in the description to clarify that.
"This patch renames the VSX_TI iterator to VEC_TI, and updates the
users." 


>   (vashlv1ti3): Change to vashl3, mode VEC_TI.
>   (vlshrv1ti3): Change to vlshr3, mode VEC_TI.
>   * config/rs6000/vsx.md (VSX_TI): Remove define_mode_iterator.

> 
> gcc/testsuite/ChangeLog
> 
> 2020-10-05  Carl Love  
>   gcc.target/powerpc/int_128bit-runnable.c: Add shift_right, shift_left
>   tests.
> ---
>  gcc/config/rs6000/altivec.md  | 16 -
>  gcc/config/rs6000/vector.md   | 27 ---
>  gcc/config/rs6000/vsx.md  | 33 +--
>  .../gcc.target/powerpc/int_128bit-runnable.c  | 16 +++--
>  4 files changed, 52 insertions(+), 40 deletions(-)
> 

ok

> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 34a4731342a..5db3de3cc9f 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -2219,10 +2219,10 @@
>"vsl %0,%1,%2"
>[(set_attr "type" "vecsimple")])
> 
> -(define_insn "altivec_vslq"
> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> -  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> +(define_insn "altivec_vslq_"
> +  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
> + (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v")
> +  (match_operand:VEC_TI 2 "vsx_register_operand" "v")))]
>"TARGET_POWER10"
>/* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
>"vslq %0,%1,%2"
> @@ -2236,10 +2236,10 @@
>"vsr %0,%1,%2"
>[(set_attr "type" "vecsimple")])
> 
> -(define_insn "altivec_vsrq"
> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> -(match_operand:V1TI 2 "vsx_register_operand" "v")))]
> +(define_insn "altivec_vsrq_"
> +  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
> + (lshiftrt:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand" "v")
> +(match_operand:VEC_TI 2 "vsx_register_operand" 
> "v")))]
>"TARGET_POWER10"
>/* Shift amount in needs to be in bits[57:63] of 128-bit operand. */
>"vsrq %0,%1,%2"

ok


> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
> index 0cca4232619..3ea3a91845a 100644
> --- a/gcc/config/rs6000/vector.md
> +++ b/gcc/config/rs6000/vector.md
> @@ -26,6 +26,9 @@
>  ;; Vector int modes
>  (define_mode_iterator VEC_I [V16QI V8HI V4SI V2DI])
> 
> +;; 128-bit int modes
> +(define_mode_iterator VEC_TI [V1TI TI])
> +
>  ;; Vector int modes for parity
>  (define_mode_iterator VEC_IP [V8HI
> V4SI
> @@ -1627,17 +1630,17 @@
>"")
> 
>  ;; No immediate version of this 128-bit instruction
> -(define_expand "vashlv1ti3"
> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> - (ashift:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> -  (match_operand:V1TI 2 "vsx_register_operand" "v")))]
> +(define_expand "vashl3"
> +  [(set (match_operand:VEC_TI 0 "vsx_register_operand" "=v")
> + (ashift:VEC_TI (match_operand:VEC_TI 1 "vsx_register_operand")
> +  (match_operand:VEC_TI 2 "vsx_register_operand")))]
>"TARGET_POWER10"
>  {
>/* Shift amount in needs to be put in bits[57:63] of 128-bit operand2. */
> -  rtx tmp = gen_reg_rtx (V1TImode);
> +  rtx tmp = gen_reg_rtx (mode);
> 
>emit_insn(gen_xxswapd_v1ti (tmp, operands[2]));
> -  emit_insn(gen_altivec_vslq (operands[0], operands[1], tmp));
> +  emit_insn(gen_altivec_vslq_ (operands[0], operands[1], tmp));
>DONE;
>  })
> 
> @@ -1650,17 +1653,17 @@
>"")
> 
>  ;; No immediate version of this 128-bit instruction
> -(define_expand "vlshrv1ti3"
> -  [(set (match_operand:V1TI 0 "vsx_register_operand" "=v")
> - (lshiftrt:V1TI (match_operand:V1TI 1 "vsx_register_operand" "v")
> -

Re: [PATCH] issue -Wstring-compare in more case (PR 95673)

2020-10-08 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555225.html

On 9/30/20 6:14 PM, Martin Sebor wrote:

-Wstring-compare triggers under the same strict conditions as
the strcmp/strncmp call is folded into a constant: only when
all the uses of the result are [in]equality expressions with
zero.  However, even when the call cannot be folded into
a constant because the result is in addition used in other
expressions besides equality to zero, GCC still sets the range
of the result to nonzero.  So in more complex functions where
some of the uses of the same result are in tests for equality
to zero and others in other expressions, the warning fails to
point out the very mistake it's designed to detect.

The attached change enhances the function that determines how
the strcmp/strncmp is used to also make it possible to detect
the mistakes in the multi-use situations.

Tested on x86_64-linux & by building Glibc and Binutils/GDB
and confirming it triggers no new warnings.

Martin




[committed] libstdc++: Add assertions for preconditions in sampling distributions [PR 82584]

2020-10-08 Thread Jonathan Wakely via Gcc-patches
These three distributions all require 0 < S where S is the sum of the
weights. When the sum is zero there's an undefined FP division by zero.
Add assertions to help users diagnose the problem.
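
For user code that wants to reject bad weights itself rather than rely on
the assertion, a minimal sketch follows.  The helper demo_make_discrete is
hypothetical and not part of libstdc++; it only shows checking 0 < S
before constructing the distribution.

```cpp
#include <cassert>
#include <numeric>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical helper: build a discrete_distribution only if the
// weights have a positive sum.  An empty weight list is passed through,
// since the standard gives it a defined meaning (single outcome 0).
inline std::discrete_distribution<int>
demo_make_discrete(const std::vector<double>& weights)
{
  const double sum = std::accumulate(weights.begin(), weights.end(), 0.0);
  // Written as !(sum > 0) so that a NaN sum is rejected as well.
  if (!weights.empty() && !(sum > 0))
    throw std::invalid_argument("weights must have a positive sum");
  return std::discrete_distribution<int>(weights.begin(), weights.end());
}
```

With weights {1.0, 3.0} the normalized probabilities are 0.25 and 0.75;
with {0.0, 0.0} the helper throws instead of reaching the division.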

libstdc++-v3/ChangeLog:

PR libstdc++/82584
* include/bits/random.tcc
(discrete_distribution::param_type::_M_initialize)
(piecewise_constant_distribution::param_type::_M_initialize)
(piecewise_linear_distribution::param_type::_M_initialize):
Add assertions for positive sums.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line.

Tested powerpc64le-linux. Committed to trunk.

commit b2a96bf9dce41ee777b1669e4a8b4c6df3ff3613
Author: Jonathan Wakely 
Date:   Thu Oct 8 15:14:58 2020

libstdc++: Add assertions for preconditions in sampling distributions [PR 82584]

These three distributions all require 0 < S where S is the sum of the
weights. When the sum is zero there's an undefined FP division by zero.
Add assertions to help users diagnose the problem.

libstdc++-v3/ChangeLog:

PR libstdc++/82584
* include/bits/random.tcc
(discrete_distribution::param_type::_M_initialize)
(piecewise_constant_distribution::param_type::_M_initialize)
(piecewise_linear_distribution::param_type::_M_initialize):
Add assertions for positive sums.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line.

diff --git a/libstdc++-v3/include/bits/random.tcc 
b/libstdc++-v3/include/bits/random.tcc
index f67b2fe48f4..a921b9bf815 100644
--- a/libstdc++-v3/include/bits/random.tcc
+++ b/libstdc++-v3/include/bits/random.tcc
@@ -2630,6 +2630,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   const double __sum = std::accumulate(_M_prob.begin(),
   _M_prob.end(), 0.0);
+  __glibcxx_assert(__sum > 0);
   // Now normalize the probabilites.
   __detail::__normalize(_M_prob.begin(), _M_prob.end(), _M_prob.begin(),
__sum);
@@ -2794,6 +2795,7 @@ namespace __detail
 
   const double __sum = std::accumulate(_M_den.begin(),
   _M_den.end(), 0.0);
+  __glibcxx_assert(__sum > 0);
 
   __detail::__normalize(_M_den.begin(), _M_den.end(), _M_den.begin(),
__sum);
@@ -3016,6 +3018,7 @@ namespace __detail
  _M_cp.push_back(__sum);
  _M_m.push_back((_M_den[__k + 1] - _M_den[__k]) / __delta);
}
+  __glibcxx_assert(__sum > 0);
 
   //  Now normalize the densities...
   __detail::__normalize(_M_den.begin(), _M_den.end(), _M_den.begin(),
diff --git a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc 
b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
index 139abbb3051..9cffc3d06f9 100644
--- a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
+++ b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
@@ -12,4 +12,4 @@ auto x = std::generate_canonical

Re: [PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 04:28:37PM +0200, Aldy Hernandez wrote:
> On 10/8/20 3:54 PM, Jakub Jelinek wrote:
> > On Thu, Oct 08, 2020 at 12:22:11PM +0200, Jakub Jelinek via Gcc-patches 
> > wrote:
> > > Perhaps another way out of this would be to document and enforce that
> > > __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
> > > calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2
> 
> Huh, that magic 2 is not obvious.  I guess we should document the values of
> this macro in the source somewhere:

The 2 is documented in gccint documentation.

> BTW.  There's no reason why the vr-values can't just call the
> gimple_ranger::range_of_builtin_call.  In the original implementation we had
> vr-values call the ranger version and trap if they differed.  I'm pretty
> sure you can have vr-values call range_of_builtin_call with a value_range,
> and things will get squashed down appropriately.  We should really only have
> one version of this.  I'm not suggesting you do it, but I wouldn't object to
> it ;-).

Will defer that to you or Andrew ;).

> > --- gcc/gimple-range.cc.jj  2020-10-08 11:55:25.498313173 +0200
> > +++ gcc/gimple-range.cc 2020-10-08 15:36:14.926945183 +0200
> 
> > @@ -714,8 +730,14 @@ gimple_ranger::range_of_builtin_call (ir
> >   // the maximum.
> >   wide_int max = r.upper_bound ();
> >   if (max == 0)
> > -   break;
> > - maxi = wi::floor_log2 (max);
> > +   {
> > + if (mini == -1)
> > +   maxi = -1;
> > + else if (maxi == prec)
> > +   mini = prec;
> > +   }
> > + else if (maxi != prec)
> > +   maxi = wi::floor_log2 (max);
> 
> Hmmm, if max == 0, that means the range is [0, 0], because if the highest
> bound of r is 0, there's nothing left on the bottom but another 0 since R is
> unsigned.  Is that what you meant?

Yes, for max == 0 aka [0, 0] I wanted:
1) if mini == -1, i.e. the DEFINED_VALUE_AT_ZERO == 2 VALUE is -1, return [-1, -1]
2) if maxi == prec, i.e. DEFINED_VALUE_AT_ZERO == 2 VALUE is prec, return [prec, prec]
3) otherwise it is an UB case, ignore the argument range, so either [0, prec-1] or
   VARYING (the latter for the mini == -2 case)
The 1) and 2) cases would be well defined, and for 3) I'm worried that e.g.
during VRP iteration if at one point we see range of argument say [0, 1] and
determine for that [0, prec-1] range, then in another iteration the argument
range is narrowed to just [0, 0] and all of a sudden the result would become
VARYING, I'd be afraid that would be against the rules.  Perhaps the right
thing is to set range to UNDEFINED in the 3) case.
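
The three cases above can be sketched as a small standalone function. This is an illustrative stand-in rather than GCC code: range_for_zero_argument is a hypothetical name, plain int pairs replace the real range classes, and case 3) is reported as "no value" so that the open question (a [0, prec-1] range, VARYING, or UNDEFINED) is left undecided.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Illustrative stand-in, not GCC code.  On entry mini/maxi follow the
// conventions used in the thread: mini == -1 or maxi == prec mean the
// value at zero is defined (as -1 or prec respectively).
// Returns the result range for an argument known to be [0, 0], or
// std::nullopt for the undefined-at-zero case 3).
inline std::optional<std::pair<int, int>>
range_for_zero_argument(int mini, int maxi, int prec)
{
  if (mini == -1)
    return std::make_pair(-1, -1);      // case 1): result is [-1, -1]
  if (maxi == prec)
    return std::make_pair(prec, prec);  // case 2): result is [prec, prec]
  return std::nullopt;                  // case 3): UB, left to the caller
}
```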

Jakub



Re: [PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Aldy Hernandez via Gcc-patches




On 10/8/20 3:54 PM, Jakub Jelinek wrote:

On Thu, Oct 08, 2020 at 12:22:11PM +0200, Jakub Jelinek via Gcc-patches wrote:

Perhaps another way out of this would be to document and enforce that
__builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2


Huh, that magic 2 is not obvious.  I guess we should document the values 
of this macro in the source somewhere:


defaults.h:
/* Indicate that CLZ and CTZ are undefined at zero.  */
#ifndef CLZ_DEFINED_VALUE_AT_ZERO
#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
#endif
#ifndef CTZ_DEFINED_VALUE_AT_ZERO
#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE)  0
#endif



The following patch implements that, i.e. __builtin_c?z* now take full
advantage of them being UB at zero, while the ifns are well defined at zero
if *_DEFINED_VALUE_AT_ZERO (*) == 2.  That is what fixes PR94801.

Furthermore, to fix PR97312, if it is well defined at zero and the value at
zero is prec, we don't lower the maximum unless the argument is known to be
non-zero.


Heh.  I was just fixing the gimple-range.cc version.  Thanks.

BTW.  There's no reason why the vr-values can't just call the 
gimple_ranger::range_of_builtin_call.  In the original implementation we 
had vr-values call the ranger version and trap if they differed.  I'm 
pretty sure you can have vr-values call range_of_builtin_call with a 
value_range, and things will get squashed down appropriately.  We should 
really only have one version of this.  I'm not suggesting you do it, but 
I wouldn't object to it ;-).



--- gcc/gimple-range.cc.jj  2020-10-08 11:55:25.498313173 +0200
+++ gcc/gimple-range.cc 2020-10-08 15:36:14.926945183 +0200



@@ -714,8 +730,14 @@ gimple_ranger::range_of_builtin_call (ir
  // the maximum.
  wide_int max = r.upper_bound ();
  if (max == 0)
-   break;
- maxi = wi::floor_log2 (max);
+   {
+ if (mini == -1)
+   maxi = -1;
+ else if (maxi == prec)
+   mini = prec;
+   }
+ else if (maxi != prec)
+   maxi = wi::floor_log2 (max);


Hmmm, if max == 0, that means the range is [0, 0], because if the 
highest bound of r is 0, there's nothing left on the bottom but another 
0 since R is unsigned.  Is that what you meant?


I think there was a bug in translation here:  It looks like the original 
code did:


  maxi = tree_floor_log2 (vr0->max ());
  /* For vr0 [0, 0] give up.  */
  if (maxi == -1)
break;

so perhaps the above (prior to your change) should have been:

wide_int max = r.upper_bound ();
maxi = wi::floor_log2 (max);
// For 0 give up.
if (maxi == -1)
  break;

You may want to adjust.
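
For readers unfamiliar with the -1 convention relied on above, here is a minimal sketch of a floor-log2 helper that returns -1 for zero, which is what makes the "compute first, then test for -1" ordering work. floor_log2_sketch is a hypothetical stand-in and is not the wi::floor_log2 or tree_floor_log2 implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: index of the highest set bit of x,
// or -1 when x is zero.
inline int floor_log2_sketch(std::uint64_t x)
{
  int result = -1;
  while (x != 0)
    {
      x >>= 1;
      ++result;
    }
  return result;
}

// Small self-check of the behaviour described above.
inline void floor_log2_examples()
{
  assert(floor_log2_sketch(0) == -1);  // the "give up" case for [0, 0]
  assert(floor_log2_sketch(1) == 0);
  assert(floor_log2_sketch(137) == 7);
}
```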

Again, thanks for working on this.
Aldy



Re: [PATCH] libstdc++: Diagnose visitors with different return types [PR95904]

2020-10-08 Thread Jonathan Wakely via Gcc-patches

On 05/10/20 22:35 +0300, Ville Voutilainen via Libstdc++ wrote:

On Mon, 5 Oct 2020 at 01:15, Ville Voutilainen
 wrote:

The patch is borked, doesn't pass tests, fixing...


Unborked, ok for trunk if full testsuite passes?


Assuming it has passed by now, OK. Thanks.



Re: [PATCH 3/5] Add TI to TD (128-bit DFP) and TD to TI support

2020-10-08 Thread will schmidt via Gcc-patches
On Mon, 2020-10-05 at 11:52 -0700, Carl Love wrote:
> Will, Segher:
> 
> Add support for converting to/from 128-bit integers and 128-bit 
> decimal floating point formats.
> 
> The updates from the previous version of the patch:
> 
> Just a fix for the change log per Will's comments.
> 
> No regression failures were found when run on a P9.
> 
> Please let me know if this is ready for mainline. 
> 
>Carl
> 
> --
> 
> 
> gcc/ChangeLog
> 
> 2020-10-05  Carl Love  
>   * config/rs6000/dfp.md (floattitd2, fixtdti2): New define_insns.
>   * config/rs6000/rs6000-call.c (P10V_BUILTIN_VCMPNET_P, 
> P10V_BUILTIN_VCMPAET_P):
>   New overloaded definitions.
> 
> gcc/testsuite/ChangeLog
> 
> 2020-10-05  Carl Love  
>   * gcc.target/powerpc/int_128bit-runnable.c:  Update test.


Maybe 'Add 128-bit DFP conversion tests' to give it better meaning.


> ---
>  gcc/config/rs6000/dfp.md  | 14 +
>  gcc/config/rs6000/rs6000-call.c   |  4 ++
>  .../gcc.target/powerpc/int_128bit-runnable.c  | 62 +++
>  3 files changed, 80 insertions(+)
> 
> diff --git a/gcc/config/rs6000/dfp.md b/gcc/config/rs6000/dfp.md
> index 8f822732bac..0e82e315fee 100644
> --- a/gcc/config/rs6000/dfp.md
> +++ b/gcc/config/rs6000/dfp.md
> @@ -222,6 +222,13 @@
>"dcffixq %0,%1"
>[(set_attr "type" "dfp")])
> 
> +(define_insn "floattitd2"
> +  [(set (match_operand:TD 0 "gpc_reg_operand" "=d")
> + (float:TD (match_operand:TI 1 "gpc_reg_operand" "v")))]
> +  "TARGET_POWER10"
> +  "dcffixqq %0,%1"
> +  [(set_attr "type" "dfp")])
> +
>  ;; Convert a decimal64/128 to a decimal64/128 whose value is an integer.
>  ;; This is the first stage of converting it to an integer type.
> 
> @@ -241,6 +248,13 @@
>"TARGET_DFP"
>"dctfix %0,%1"
>[(set_attr "type" "dfp")])
> +
> +(define_insn "fixtdti2"
> +  [(set (match_operand:TI 0 "gpc_reg_operand" "=v")
> + (fix:TI (match_operand:TD 1 "gpc_reg_operand" "d")))]
> +  "TARGET_POWER10"
> +  "dctfixqq %0,%1"
> +  [(set_attr "type" "dfp")])
> 
>  ;; Decimal builtin support


ok

> 
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 87fff5c1c80..8d00a25d806 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -4967,6 +4967,8 @@ const struct altivec_builtin_types 
> altivec_overloaded_builtins[] = {
>  RS6000_BTI_bool_V2DI, 0 },
>{ P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P,
>  RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> +  { P9V_BUILTIN_VEC_VCMPNE_P, P10V_BUILTIN_VCMPNET_P,
> +RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 
> },
> 
>{ P9V_BUILTIN_VEC_VCMPNE_P, P9V_BUILTIN_VCMPNEFP_P,
>  RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
> @@ -5074,6 +5076,8 @@ const struct altivec_builtin_types 
> altivec_overloaded_builtins[] = {
>  RS6000_BTI_bool_V2DI, 0 },
>{ P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P,
>  RS6000_BTI_INTSI, RS6000_BTI_V1TI, RS6000_BTI_V1TI, 0 },
> +  { P9V_BUILTIN_VEC_VCMPAE_P, P10V_BUILTIN_VCMPAET_P,
> +RS6000_BTI_INTSI, RS6000_BTI_unsigned_V1TI, RS6000_BTI_unsigned_V1TI, 0 
> },
>{ P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEFP_P,
>  RS6000_BTI_INTSI, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
>{ P9V_BUILTIN_VEC_VCMPAE_P, P9V_BUILTIN_VCMPAEDP_P,

ok

> diff --git a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c 
> b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> index 85ad544e22b..ec3dcf3dff1 100644
> --- a/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> +++ b/gcc/testsuite/gcc.target/powerpc/int_128bit-runnable.c
> @@ -38,6 +38,7 @@
>  #if DEBUG
>  #include 
>  #include 
> +#include 
> 
> 
>  void print_i128(__int128_t val)
> @@ -59,6 +60,13 @@ int main ()
>__int128_t arg1, result;
>__uint128_t uarg2;
> 
> +  _Decimal128 arg1_dfp128, result_dfp128, expected_result_dfp128;
> +
> +  struct conv_t {
> +__uint128_t u128;
> +_Decimal128 d128;
> +  } conv, conv2;
> +
>vector signed long long int vec_arg1_di, vec_arg2_di;
>vector unsigned long long int vec_uarg1_di, vec_uarg2_di, vec_uarg3_di;
>vector unsigned long long int vec_uresult_di;
> @@ -2249,6 +2257,60 @@ int main ()
>  abort();
>  #endif
>}
> +  
> +  /* DFP to __int128 and __int128 to DFP conversions */
> +  /* Can't get printing of DFP values to work.  Print the DFP value as an
> + unsigned int so we can see the bit patterns.  */

Drop 'Can't get ...', just 'Print the DFP...' should be sufficient.

> +  conv.u128 = 0x2208ULL;
> +  conv.u128 = (conv.u128 << 64) | 0x4ULL;   //DFP bit pattern for integer 4
> +  expected_result_dfp128 = conv.d128;
> 
> +  arg1 = 4;
> +
> +  conv.d128 = (_Decimal128) arg1;
> +
> +  result_dfp128 = (_Decimal128) arg1;
> +  if (((conv.u128 >>64) != 0x2208ULL) &&
> +  ((conv.u128 & 

Re: Ping: [PATCH][Arm] Enable MVE SIMD modes for vectorization

2020-10-08 Thread Christophe Lyon via Gcc-patches
On Thu, 8 Oct 2020 at 16:08, Dennis Zhang  wrote:
>
> Hi Christophe,
>
> On 08/10/2020 14:14, Christophe Lyon wrote:
> > Hi,
> >
> >
> > On Tue, 6 Oct 2020 at 15:37, Dennis Zhang via Gcc-patches
> >  wrote:
> >>
> >> On 9/16/20 4:00 PM, Dennis Zhang wrote:
> >>> Hi all,
> >>>
> >>> This patch enables SIMD modes for MVE auto-vectorization.
> >>> In this patch, the integer and float MVE SIMD modes are returned by
> >>> arm_preferred_simd_mode (TARGET_VECTORIZE_PREFERRED_SIMD_MODE hook) when
> >>> MVE or MVE_FLOAT is enabled.
> >>> Then the expanders for auto-vectorization can be used for generating MVE
> >>> SIMD code.
> >>>
> >>> This patch also fixes bugs in MVE vreinterpretq_*.c tests which are
> >>> revealed by the enabled MVE SIMD modes.
> >>> The tests are for checking the MVE reinterpret intrinsics.
> >>> There are two functions in each of the tests. The two functions contain
> >>> the pattern of identical code so that they are folded in icf pass.
> >>> Because of icf, the instruction count only checks one function which is 8.
> >>> However when the SIMD modes are enabled, the estimation of the code size
> >>> becomes smaller so that inlining is applied after icf, then the
> >>> instruction count becomes 16 which causes failure of the tests.
> >>> Because the icf is not the expected pattern to be tested but causes
> >>> above issues, -fno-ipa-icf is applied to the tests to avoid unstable
> >>> instruction count.
> >>>
> >>> This patch is separated from
> >>> https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552104.html
> >>> because this part is not strongly connected to the aim of that one and
> >>> was causing confusion.
> >>>
> >>> Regtested and bootstrapped.
> >>>
> >>> Is it OK for trunk please?
> >>>
> >>> Thanks
> >>> Dennis
> >>>
> >>> gcc/ChangeLog:
> >>>
> >>> 2020-09-15  Dennis Zhang  
> >>>
> >>>* config/arm/arm.c (arm_preferred_simd_mode): Enable MVE SIMD 
> >>> modes.
> >>>
> >
> > Since toolchain builds work again after Jakub's divmod fix, I'm now
> > facing another build error likely caused by this patch:
> > In file included from
> > /tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/coretypes.h:449:0,
> >   from
> > /tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28:
> > /tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:
> > In function 'machine_mode arm_preferred_simd_mode(scalar_mode)':
> > ./insn-modes.h:196:71: error: temporary of non-literal type
> > 'scalar_int_mode' in a constant expression
> >   #define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode))
> > ^
> > /tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28970:12:
> > note: in expansion of macro 'QImode'
> > case QImode:
> >
> > and similarly for the other cases.
> >
> > Does the build work for you?
> >
> > Thanks,
> >
> > Christophe
> >
>
> Thanks for the report. Sorry to see the error.
> I tested it for arm-none-eabi and arm-none-linux-gnueabi targets. I
> didn't get this error.
> Could you please help to show the configuration you use for your build?
> I will test and fix at once.
>

It fails on all of them for me. Does it work for you with current
master? (r11-3720-gf18eeb6b958acd5e1590ca4a73231486b749be9b)


> Thanks
> Dennis


RE: [PATCH] arm: [MVE] Add missing __arm_vcvtnq_u32_f32 intrinsic (PR 96914)

2020-10-08 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Gcc-patches  On Behalf Of
> Christophe Lyon via Gcc-patches
> Sent: 05 October 2020 10:58
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: [MVE] Add missing __arm_vcvtnq_u32_f32 intrinsic
> (PR 96914)
> 
> __arm_vcvtnq_u32_f32 was missing from arm_mve.h, although the s32_f32
> and
> [su]16_f16 versions were present.
> 
> This patch adds the missing version and testcase, which are
> cut-and-paste from the other versions.

Ok.
Thanks,
Kyrill

> 
> 2020-10-05  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm_mve.h (__arm_vcvtnq_u32_f32): New.
> 
>   gcc/testsuite/
>   * gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c: New test.
> ---
>  gcc/config/arm/arm_mve.h|  8 
>  .../gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c  | 13
> +
>  2 files changed, 21 insertions(+)
>  create mode 100644
> gcc/testsuite/gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c
> 
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index 99cff41..ecff3d1 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -643,6 +643,7 @@
>  #define vcvtpq_u16_f16(__a) __arm_vcvtpq_u16_f16(__a)
>  #define vcvtpq_u32_f32(__a) __arm_vcvtpq_u32_f32(__a)
>  #define vcvtnq_u16_f16(__a) __arm_vcvtnq_u16_f16(__a)
> +#define vcvtnq_u32_f32(__a) __arm_vcvtnq_u32_f32(__a)
>  #define vcvtmq_u16_f16(__a) __arm_vcvtmq_u16_f16(__a)
>  #define vcvtmq_u32_f32(__a) __arm_vcvtmq_u32_f32(__a)
>  #define vcvtaq_u16_f16(__a) __arm_vcvtaq_u16_f16(__a)
> @@ -17012,6 +17013,13 @@ __arm_vcvtnq_u16_f16 (float16x8_t __a)
>return __builtin_mve_vcvtnq_uv8hi (__a);
>  }
> 
> +__extension__ extern __inline uint32x4_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vcvtnq_u32_f32 (float32x4_t __a)
> +{
> +  return __builtin_mve_vcvtnq_uv4si (__a);
> +}
> +
>  __extension__ extern __inline uint16x8_t
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vcvtmq_u16_f16 (float16x8_t __a)
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c
> new file mode 100644
> index 000..b6d5eb9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcvtnq_u32_f32.c
> @@ -0,0 +1,13 @@
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint32x4_t
> +foo (float32x4_t a)
> +{
> +  return vcvtnq_u32_f32 (a);
> +}
> +
> +/* { dg-final { scan-assembler "vcvtn.u32.f32"  }  } */
> --
> 2.7.4



Re: Ping: [PATCH][Arm] Enable MVE SIMD modes for vectorization

2020-10-08 Thread Dennis Zhang via Gcc-patches

Hi Christophe,

On 08/10/2020 14:14, Christophe Lyon wrote:

Hi,


On Tue, 6 Oct 2020 at 15:37, Dennis Zhang via Gcc-patches
 wrote:


On 9/16/20 4:00 PM, Dennis Zhang wrote:

Hi all,

This patch enables SIMD modes for MVE auto-vectorization.
In this patch, the integer and float MVE SIMD modes are returned by
arm_preferred_simd_mode (TARGET_VECTORIZE_PREFERRED_SIMD_MODE hook) when
MVE or MVE_FLOAT is enabled.
Then the expanders for auto-vectorization can be used for generating MVE
SIMD code.

This patch also fixes bugs in MVE vreinterpretq_*.c tests which are
revealed by the enabled MVE SIMD modes.
The tests are for checking the MVE reinterpret intrinsics.
There are two functions in each of the tests. The two functions contain
the pattern of identical code so that they are folded in icf pass.
Because of icf, the instruction count only checks one function which is 8.
However when the SIMD modes are enabled, the estimation of the code size
becomes smaller so that inlining is applied after icf, then the
instruction count becomes 16 which causes failure of the tests.
Because the icf is not the expected pattern to be tested but causes
above issues, -fno-ipa-icf is applied to the tests to avoid unstable
instruction count.
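
To make the icf interaction described above concrete, here is a tiny sketch of the shape involved: two functions whose bodies are identical. The function names and bodies are invented for illustration and are not taken from the MVE tests. With identical code folding enabled a compiler may emit one body for both, so an instruction count taken from the assembly can depend on whether folding or inlining happened.

```cpp
#include <cassert>

// Two functions with identical bodies, similar in spirit to the paired
// functions in the tests.  An optimizer with identical code folding
// enabled may keep only one copy of the body.
int reinterpret_first(int x) { return (x << 1) ^ 0x55; }
int reinterpret_second(int x) { return (x << 1) ^ 0x55; }

// The observable behaviour is the same whether or not the bodies are
// folded; only the emitted instruction count can differ.
inline void folding_example()
{
  assert(reinterpret_first(3) == reinterpret_second(3));
}
```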

This patch is separated from
https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552104.html
because this part is not strongly connected to the aim of that one and
was causing confusion.

Regtested and bootstrapped.

Is it OK for trunk please?

Thanks
Dennis

gcc/ChangeLog:

2020-09-15  Dennis Zhang  

   * config/arm/arm.c (arm_preferred_simd_mode): Enable MVE SIMD modes.



Since toolchain builds work again after Jakub's divmod fix, I'm now
facing another build error likely caused by this patch:
In file included from
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/coretypes.h:449:0,
  from
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28:
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:
In function 'machine_mode arm_preferred_simd_mode(scalar_mode)':
./insn-modes.h:196:71: error: temporary of non-literal type
'scalar_int_mode' in a constant expression
  #define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode))
^
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28970:12:
note: in expansion of macro 'QImode'
case QImode:

and similarly for the other cases.
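
The diagnostic above is about a class-type temporary appearing in a case label. The following sketch reproduces the general situation with invented miniature types (mode_code, mode_wrapper, and bytes_for_mode are hypothetical names, not GCC's) and shows one way to avoid it: switch on the underlying enumerators, in the way the E_QImode enumerator sits underneath the QImode macro. Whether this is the fix eventually applied to arm.c is not stated in this thread.

```cpp
#include <cassert>

// Hypothetical miniature of the mode types; names are illustrative only.
enum mode_code { E_QI, E_HI, E_SI };

// A wrapper whose constructor is not constexpr, so objects of this type
// cannot appear in constant expressions such as case labels.
struct mode_wrapper
{
  mode_wrapper(mode_code c) : code(c) {}
  operator mode_code() const { return code; }
  mode_code code;
};

// Using the plain enumerators in the case labels avoids needing the
// wrapper to be a literal type.
inline int bytes_for_mode(mode_wrapper m)
{
  switch (static_cast<mode_code>(m))
    {
    case E_QI: return 1;
    case E_HI: return 2;
    case E_SI: return 4;
    }
  return 0;
}

// Small self-check of the mapping above.
inline void mode_switch_example()
{
  assert(bytes_for_mode(E_QI) == 1);
}
```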

Does the build work for you?

Thanks,

Christophe



Thanks for the report. Sorry to see the error.
I tested it for arm-none-eabi and arm-none-linux-gnueabi targets. I 
didn't get this error.

Could you please help to show the configuration you use for your build?
I will test and fix at once.

Thanks
Dennis


Re: Disable tbaa for some call to call_may_clobber_ref_p

2020-10-08 Thread Richard Biener
On Thu, 8 Oct 2020, Jan Hubicka wrote:

> Hi,
> as discussed on IRC, I am testing the attached patch to disable TBAA for
> some uses of call_may_clobber_ref_p.
> 
> OK if testing passes?

OK.

Thanks,
Richard.

> Honza
> 
> gcc/ChangeLog:
> 
> 2020-10-08  Jan Hubicka  
> 
>   * tree-nrv.c (dest_safe_for_nrv_p): Disable tbaa in 
>   call_may_clobber_ref_p and ref_maybe_used_by_stmt_p.
>   * tree-tailcall.c (find_tail_calls): Likewise.
>   * tree-ssa-alias.c (call_may_clobber_ref_p): Add tbaa_p parameter.
>   * tree-ssa-alias.h (call_may_clobber_ref_p): Update prototype.
>   * tree-ssa-sccvn.c (vn_reference_lookup_3): Pass data->tbaa_p
>   to call_may_clobber_ref_p_1.
> 
> diff --git a/gcc/tree-nrv.c b/gcc/tree-nrv.c
> index 9d17a104f34..df757f70e71 100644
> --- a/gcc/tree-nrv.c
> +++ b/gcc/tree-nrv.c
> @@ -315,8 +315,8 @@ dest_safe_for_nrv_p (gcall *call)
>if (TREE_CODE (dest) == SSA_NAME)
>  return true;
>  
> -  if (call_may_clobber_ref_p (call, dest)
> -  || ref_maybe_used_by_stmt_p (call, dest))
> +  if (call_may_clobber_ref_p (call, dest, false)
> +  || ref_maybe_used_by_stmt_p (call, dest, false))
>  return false;
>  
>return true;
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 97255987e79..ac7e020c5e2 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -3320,12 +3320,12 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref, 
> bool tbaa_p)
> return true, otherwise return false.  */
>  
>  bool
> -call_may_clobber_ref_p (gcall *call, tree ref)
> +call_may_clobber_ref_p (gcall *call, tree ref, bool tbaa_p)
>  {
>bool res;
>ao_ref r;
>    ao_ref_init (&r, ref);
> -  res = call_may_clobber_ref_p_1 (call, &r, true);
> +  res = call_may_clobber_ref_p_1 (call, &r, tbaa_p);
>if (res)
>  ++alias_stats.call_may_clobber_ref_p_may_alias;
>else
> diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
> index 1dd02c0ea62..1561ead2941 100644
> --- a/gcc/tree-ssa-alias.h
> +++ b/gcc/tree-ssa-alias.h
> @@ -128,7 +128,7 @@ extern bool ref_maybe_used_by_stmt_p (gimple *, ao_ref *, 
> bool = true);
>  extern bool stmt_may_clobber_global_p (gimple *);
>  extern bool stmt_may_clobber_ref_p (gimple *, tree, bool = true);
>  extern bool stmt_may_clobber_ref_p_1 (gimple *, ao_ref *, bool = true);
> -extern bool call_may_clobber_ref_p (gcall *, tree);
> +extern bool call_may_clobber_ref_p (gcall *, tree, bool = true);
>  extern bool call_may_clobber_ref_p_1 (gcall *, ao_ref *, bool = true);
>  extern bool stmt_kills_ref_p (gimple *, tree);
>  extern bool stmt_kills_ref_p (gimple *, ao_ref *);
> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index 014b7bdfd01..c139adb6130 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -2586,7 +2586,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
> *data_,
>if (valueized_anything)
>   {
> bool res = call_may_clobber_ref_p_1 (as_a <gcall *> (def_stmt),
> -ref);
> +ref, data->tbaa_p);
> for (unsigned i = 0; i < gimple_call_num_args (def_stmt); ++i)
>   gimple_call_set_arg (def_stmt, i, oldargs[i]);
> if (!res)
> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
> index d81e9213460..44643bef5f1 100644
> --- a/gcc/tree-tailcall.c
> +++ b/gcc/tree-tailcall.c
> @@ -523,7 +523,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>tree result_decl = DECL_RESULT (cfun->decl);
>if (result_decl
>&& may_be_aliased (result_decl)
> -  && ref_maybe_used_by_stmt_p (call, result_decl))
> +  && ref_maybe_used_by_stmt_p (call, result_decl, false))
>  return;
>  
>/* We found the call, check whether it is suitable.  */
> @@ -597,8 +597,8 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>if (TREE_CODE (var) != PARM_DECL
> && auto_var_in_fn_p (var, cfun->decl)
> && may_be_aliased (var)
> -   && (ref_maybe_used_by_stmt_p (call, var)
> -   || call_may_clobber_ref_p (call, var)))
> +   && (ref_maybe_used_by_stmt_p (call, var, false)
> +   || call_may_clobber_ref_p (call, var, false)))
>   {
> if (!VAR_P (var))
>   {
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


[PATCH] vrp: Fix up gcc.target/aarch64/pr90838.c [PR97312, PR94801]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 12:22:11PM +0200, Jakub Jelinek via Gcc-patches wrote:
> Perhaps another way out of this would be to document and enforce that
> __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
> calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2

The following patch implements that, i.e. __builtin_c?z* now take full
advantage of them being UB at zero, while the ifns are well defined at zero
if *_DEFINED_VALUE_AT_ZERO (*) == 2.  That is what fixes PR94801.

Furthermore, to fix PR97312, if it is well defined at zero and the value at
zero is prec, we don't lower the maximum unless the argument is known to be
non-zero.
For gimple-range.cc I guess we could improve it if needed e.g. by returning
a [0,7][32,32] range for .CTZ of e.g. [0,137], but for now it (roughly)
matches what vr-values.c does.
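
As a worked check of the [0,7][32,32] example above, the sketch below enumerates count-trailing-zeros results over an argument range by brute force, treating the value at zero as a parameter. The helper names are hypothetical, and the enumeration is only meant for small ranges; it is not how either range implementation computes its result.

```cpp
#include <cassert>
#include <set>

// Count trailing zeros of a nonzero value with a portable loop.
inline int ctz_nonzero(unsigned x)
{
  assert(x != 0);
  int n = 0;
  while ((x & 1u) == 0)
    {
      x >>= 1;
      ++n;
    }
  return n;
}

// Hypothetical helper: collect every result a ctz operation can produce
// for arguments in [lo, hi], where an argument of 0 yields value_at_zero.
inline std::set<int> ctz_results(unsigned lo, unsigned hi, int value_at_zero)
{
  assert(lo <= hi);
  std::set<int> out;
  for (unsigned x = lo; ; ++x)
    {
      out.insert(x == 0 ? value_at_zero : ctz_nonzero(x));
      if (x == hi)
        break;
    }
  return out;
}
```

For the argument range [0, 137] with a value at zero of 32, the nonzero arguments contribute 0 through 7 (128 is the largest power of two in range) and zero contributes 32, which is the two-part range mentioned above.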

Ok for trunk if it passes bootstrap/regtest?

2020-10-08  Jakub Jelinek  

PR tree-optimization/94801
PR target/97312
* vr-values.c (vr_values::extract_range_basic) : When stmt is not an internal-fn call or
C?Z_DEFINED_VALUE_AT_ZERO is not 2, assume argument is not zero
and thus use [0, prec-1] range unless it can be further improved.
For CTZ, don't update maxi from upper bound if it was previously prec.
* gimple-range.cc (gimple_ranger::range_of_builtin_call) : Likewise.

* gcc.dg/tree-ssa/pr94801.c: New test.

--- gcc/vr-values.c.jj  2020-10-07 10:47:47.065983121 +0200
+++ gcc/vr-values.c 2020-10-08 15:23:56.042631592 +0200
@@ -1208,34 +1208,42 @@ vr_values::extract_range_basic (value_ra
  mini = 0;
  maxi = 1;
  goto bitop_builtin;
- /* __builtin_c[lt]z* return [0, prec-1], except for
+ /* __builtin_clz* return [0, prec-1], except for
 when the argument is 0, but that is undefined behavior.
-On many targets where the CLZ RTL or optab value is defined
-for 0 the value is prec, so include that in the range
-by default.  */
+Always handle __builtin_clz* which can be only written
+by user as UB on 0 and so [0, prec-1] range, and the internal-fn
+calls depending on how CLZ_DEFINED_VALUE_AT_ZERO is defined.  */
CASE_CFN_CLZ:
  arg = gimple_call_arg (stmt, 0);
  prec = TYPE_PRECISION (TREE_TYPE (arg));
  mini = 0;
- maxi = prec;
+ maxi = prec - 1;
  mode = SCALAR_INT_TYPE_MODE (TREE_TYPE (arg));
- if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
- && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov)
- /* Handle only the single common value.  */
- && zerov != prec)
-   /* Magic value to give up, unless vr0 proves
-  arg is non-zero.  */
-   mini = -2;
+ if (gimple_call_internal_p (stmt))
+   {
+ if (optab_handler (clz_optab, mode) != CODE_FOR_nothing
+ && CLZ_DEFINED_VALUE_AT_ZERO (mode, zerov) == 2)
+   {
+ /* Handle only the single common value.  */
+ if (zerov == prec)
+   maxi = prec;
+ /* Magic value to give up, unless vr0 proves
+arg is non-zero.  */
+ else
+   mini = -2;
+   }
+   }
  if (TREE_CODE (arg) == SSA_NAME)
{
  const value_range_equiv *vr0 = get_value_range (arg);
  /* From clz of VR_RANGE minimum we can compute
 result maximum.  */
  if (vr0->kind () == VR_RANGE
- && TREE_CODE (vr0->min ()) == INTEGER_CST)
+ && TREE_CODE (vr0->min ()) == INTEGER_CST
+ && integer_nonzerop (vr0->min ()))
{
  maxi = prec - 1 - tree_floor_log2 (vr0->min ());
- if (maxi != prec)
+ if (mini == -2)
mini = 0;
}
  else if (vr0->kind () == VR_ANTI_RANGE
@@ -1251,9 +1259,14 @@ vr_values::extract_range_basic (value_ra
  if (vr0->kind () == VR_RANGE
  && TREE_CODE (vr0->max ()) == INTEGER_CST)
{
- mini = prec - 1 - tree_floor_log2 (vr0->max ());
- if (mini == prec)
-   break;
+ int newmini = prec - 1 - tree_floor_log2 (vr0->max ());
+ if (newmini == prec)
+   {
+ if (maxi == prec)
+   mini = prec;
+   }
+ else
+   mini = newmini;
}
}
  if (mini == -2)
@@ -1261,27 +1274,30 @@ vr_values::extract_range_basic (value_ra
  goto bitop_builtin;
  /* __builtin_ctz* return [0, prec-1], except for
 when the argument is 0, but that is undefined behavior.
-If there is a ctz optab 

Disable tbaa for some call to call_may_clobber_ref_p

2020-10-08 Thread Jan Hubicka
Hi,
as discussed on IRC, I am testing the attached patch to disable TBAA for
some uses of call_may_clobber_ref_p.

OK if testing passes?

Honza

gcc/ChangeLog:

2020-10-08  Jan Hubicka  

* tree-nrv.c (dest_safe_for_nrv_p): Disable tbaa in 
call_may_clobber_ref_p and ref_maybe_used_by_stmt_p.
* tree-tailcall.c (find_tail_calls): Likewise.
* tree-ssa-alias.c (call_may_clobber_ref_p): Add tbaa_p parameter.
* tree-ssa-alias.h (call_may_clobber_ref_p): Update prototype.
* tree-ssa-sccvn.c (vn_reference_lookup_3): Pass data->tbaa_p
to call_may_clobber_ref_p_1.
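
The interface change listed above follows a common pattern: add a trailing boolean with a default of true so existing callers keep their behaviour, and pass false only where type-based disambiguation must be ignored. The sketch below shows that pattern with invented placeholder types and logic (ref_sketch and may_clobber_sketch are hypothetical and do not reflect GCC's alias oracle).

```cpp
#include <cassert>

// Hypothetical stand-in for a memory reference; not GCC's ao_ref.
struct ref_sketch
{
  int type_id;       // placeholder for the reference's type
  bool may_overlap;  // placeholder for a location-based overlap answer
};

// The trailing parameter defaults to true, so existing two-argument
// callers keep type-based disambiguation.
inline bool may_clobber_sketch(const ref_sketch& call_ref,
                               const ref_sketch& ref,
                               bool tbaa_p = true)
{
  // With TBAA, differently typed references are assumed not to alias.
  if (tbaa_p && call_ref.type_id != ref.type_id)
    return false;
  return call_ref.may_overlap && ref.may_overlap;
}

// Small self-check: the same pair of references gives a more
// conservative answer once TBAA is switched off.
inline void tbaa_default_example()
{
  ref_sketch a{1, true}, b{2, true};
  assert(!may_clobber_sketch(a, b));
  assert(may_clobber_sketch(a, b, false));
}
```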

diff --git a/gcc/tree-nrv.c b/gcc/tree-nrv.c
index 9d17a104f34..df757f70e71 100644
--- a/gcc/tree-nrv.c
+++ b/gcc/tree-nrv.c
@@ -315,8 +315,8 @@ dest_safe_for_nrv_p (gcall *call)
   if (TREE_CODE (dest) == SSA_NAME)
 return true;
 
-  if (call_may_clobber_ref_p (call, dest)
-  || ref_maybe_used_by_stmt_p (call, dest))
+  if (call_may_clobber_ref_p (call, dest, false)
+  || ref_maybe_used_by_stmt_p (call, dest, false))
 return false;
 
   return true;
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 97255987e79..ac7e020c5e2 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -3320,12 +3320,12 @@ call_may_clobber_ref_p_1 (gcall *call, ao_ref *ref, 
bool tbaa_p)
return true, otherwise return false.  */
 
 bool
-call_may_clobber_ref_p (gcall *call, tree ref)
+call_may_clobber_ref_p (gcall *call, tree ref, bool tbaa_p)
 {
   bool res;
   ao_ref r;
   ao_ref_init (&r, ref);
-  res = call_may_clobber_ref_p_1 (call, &r, true);
+  res = call_may_clobber_ref_p_1 (call, &r, tbaa_p);
   if (res)
 ++alias_stats.call_may_clobber_ref_p_may_alias;
   else
diff --git a/gcc/tree-ssa-alias.h b/gcc/tree-ssa-alias.h
index 1dd02c0ea62..1561ead2941 100644
--- a/gcc/tree-ssa-alias.h
+++ b/gcc/tree-ssa-alias.h
@@ -128,7 +128,7 @@ extern bool ref_maybe_used_by_stmt_p (gimple *, ao_ref *, 
bool = true);
 extern bool stmt_may_clobber_global_p (gimple *);
 extern bool stmt_may_clobber_ref_p (gimple *, tree, bool = true);
 extern bool stmt_may_clobber_ref_p_1 (gimple *, ao_ref *, bool = true);
-extern bool call_may_clobber_ref_p (gcall *, tree);
+extern bool call_may_clobber_ref_p (gcall *, tree, bool = true);
 extern bool call_may_clobber_ref_p_1 (gcall *, ao_ref *, bool = true);
 extern bool stmt_kills_ref_p (gimple *, tree);
 extern bool stmt_kills_ref_p (gimple *, ao_ref *);
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 014b7bdfd01..c139adb6130 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -2586,7 +2586,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
   if (valueized_anything)
{
  bool res = call_may_clobber_ref_p_1 (as_a <gcall *> (def_stmt),
-  ref);
+  ref, data->tbaa_p);
  for (unsigned i = 0; i < gimple_call_num_args (def_stmt); ++i)
gimple_call_set_arg (def_stmt, i, oldargs[i]);
  if (!res)
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index d81e9213460..44643bef5f1 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -523,7 +523,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
   tree result_decl = DECL_RESULT (cfun->decl);
   if (result_decl
   && may_be_aliased (result_decl)
-  && ref_maybe_used_by_stmt_p (call, result_decl))
+  && ref_maybe_used_by_stmt_p (call, result_decl, false))
 return;
 
   /* We found the call, check whether it is suitable.  */
@@ -597,8 +597,8 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
   if (TREE_CODE (var) != PARM_DECL
  && auto_var_in_fn_p (var, cfun->decl)
  && may_be_aliased (var)
- && (ref_maybe_used_by_stmt_p (call, var)
- || call_may_clobber_ref_p (call, var)))
+ && (ref_maybe_used_by_stmt_p (call, var, false)
+ || call_may_clobber_ref_p (call, var, false)))
{
  if (!VAR_P (var))
{


Re: [committed] libstdc++: Fix divide by zero in default template argument

2020-10-08 Thread Jonathan Wakely via Gcc-patches

On 08/10/20 00:45 +0100, Jonathan Wakely wrote:

libstdc++-v3/ChangeLog:

* include/bits/random.h (__detail::_Mod): Avoid divide by zero.
* testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc:
New test.



That didn't work properly in all cases. Here's a better fix.

Tested powerpc64le-linux. Committed to trunk.


commit c06617a79b41da37d80d7e88a3dbc56818f3e706
Author: Jonathan Wakely 
Date:   Thu Oct 8 14:01:00 2020

libstdc++: Avoid divide by zero in default template arguments

My previous attempt to fix this only worked when m is a power of two.
There is still a bug when a==0 and !has_single_bit(m).

Instead of trying to make _Mod work for a==0 this change ensures that we
never instantiate it with a==0. For C++17 we can use if-constexpr, but
otherwise we need to use a different multiplier. It doesn't matter what
we use, as it won't actually be called, only instantiated.
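The idea described above can be sketched in isolation. This is only an illustration, not the libstdc++ code: ModSketch, mod_sketch and SKETCH_IF_CONSTEXPR are hypothetical names, the sketch assumes m != 0 and c < m, and it uses plain 64-bit arithmetic in place of the real overflow handling.

```cpp
#include <cassert>
#include <cstdint>

// Plays the role of _GLIBCXX17_CONSTEXPR: expands to `constexpr` only for
// C++17 and later, so the same source also builds as C++11/14.
#if __cplusplus >= 201703L
# define SKETCH_IF_CONSTEXPR constexpr
#else
# define SKETCH_IF_CONSTEXPR
#endif

// Hypothetical stand-in for __detail::_Mod.  The default template argument
// divides by A, so this template must never be instantiated with A == 0.
// Assumes M != 0 and that A * x + C does not overflow 64 bits.
template<std::uint64_t M, std::uint64_t A, std::uint64_t C,
         bool SchrageOk = (M % A < M / A)>
struct ModSketch
{
  static std::uint64_t calc(std::uint64_t x) { return (A * x + C) % M; }
};

// Hypothetical stand-in for __detail::__mod.
template<std::uint64_t M, std::uint64_t A, std::uint64_t C>
inline std::uint64_t mod_sketch(std::uint64_t x)
{
  if SKETCH_IF_CONSTEXPR (A == 0)
    return C;  // with a == 0 the result does not depend on x (assumes C < M)
  else
    {
      // Before C++17 this branch is still instantiated when A == 0, so a
      // nonzero multiplier is substituted; the call is never reached then.
      constexpr std::uint64_t A1 = A ? A : 1;
      return ModSketch<M, A1, C>::calc(x);
    }
}
```

With A == 0 the helper never names ModSketch<M, 0, C>, so the division in the default template argument is never evaluated with a zero divisor, whichever language standard is in use.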

libstdc++-v3/ChangeLog:

* include/bits/random.h (__detail::_Mod): Revert last change.
(__detail::__mod): Do not use _Mod for a==0 case.
* testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc:
Check other cases with a==0. Also check runtime results.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line.

diff --git a/libstdc++-v3/include/bits/random.h b/libstdc++-v3/include/bits/random.h
index 920f3d91513..0be1191e07d 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -109,7 +109,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template= __m - 1),
- bool __schrage_ok = __a != 0 && __m % __a < __m / __a>
+ bool __schrage_ok = __m % __a < __m / __a>
   struct _Mod
   {
 	typedef typename _Select_uint_least_t
   inline _Tp
   __mod(_Tp __x)
-  { return _Mod<_Tp, __m, __a, __c>::__calc(__x); }
+  {
+	if _GLIBCXX17_CONSTEXPR (__a == 0)
+	  return __c;
+	else
+	  {
+	// _Mod must not be instantiated with a == 0
+	constexpr _Tp __a1 = __a ? __a : 1;
+	return _Mod<_Tp, __m, __a1, __c>::__calc(__x);
+	  }
+  }
 
 /*
  * An adaptor class for converting the output of any Generator into
diff --git a/libstdc++-v3/testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc b/libstdc++-v3/testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc
index d1fff6d0a5d..aa2402f 100644
--- a/libstdc++-v3/testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc
+++ b/libstdc++-v3/testsuite/26_numerics/random/linear_congruential_engine/operators/call.cc
@@ -15,13 +15,50 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do compile { target c++11 } }
+// { dg-do run { target c++11 } }
 
 #include <random>
+#include <testsuite_hooks.h>
 
-unsigned
+void
 test01()
 {
   std::linear_congruential_engine l;
-  return l(); // this used to result in divide by zero
+  auto r = l(); // this used to result in divide by zero
+  VERIFY( r == 0 );
+  l.seed(2);
+  r = l();
+  VERIFY( r == 0 );
+  VERIFY( l() == 0 );
+}
+
+void
+test02()
+{
+  std::linear_congruential_engine l;
+  auto r = l(); // this used to result in a different divide by zero
+  VERIFY( r == 0 );
+  l.seed(2);
+  r = l();
+  VERIFY( r == 0 );
+  VERIFY( l() == 0 );
+}
+
+void
+test03()
+{
+  std::linear_congruential_engine l;
+  auto r = l();
+  VERIFY( r == 2 );
+  l.seed(4);
+  r = l();
+  VERIFY( r == 2 );
+  VERIFY( l() == 2 );
+}
+
+int main()
+{
+  test01();
+  test02();
+  test03();
 }
diff --git a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
index f808132e9ea..139abbb3051 100644
--- a/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
+++ b/libstdc++-v3/testsuite/26_numerics/random/pr60037-neg.cc
@@ -10,6 +10,6 @@ std::__detail::_Adaptor aurng(urng);
 auto x = std::generate_canonical::digits>(urng);
 
-// { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 158 }
+// { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 167 }
 
 // { dg-error "static assertion failed: template argument must be a floating point type" "" { target *-*-* } 3281 }


[committed] libstdc++: Add C++11 member functions for ios::failure in old ABI

2020-10-08 Thread Jonathan Wakely via Gcc-patches
The new constructors that C++11 added to std::ios_base::failure were
missing for the old ABI. This adds them, but just ignores the
std::error_code argument (because there's nowhere to store it).

This also adds a code() member, which should be provided by the
std::system_error base class, but that base class isn't present in the
old ABI.

This allows the old ios::failure to be used in code that expects the new
API, although with reduced functionality.
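A small sketch of what this means for calling code. The helper name failure_code_matches_abi is hypothetical; the only library features assumed are the C++11 std::ios_base::failure constructor taking an error_code, its code() member, and libstdc++'s _GLIBCXX_USE_CXX11_ABI macro.

```cpp
#include <cassert>
#include <ios>
#include <system_error>

// Hypothetical helper: constructs a failure with an explicit error_code and
// reports whether code() behaves as described for the ABI in use.
inline bool failure_code_matches_abi()
{
  const std::error_code ec
    = std::make_error_code(std::errc::executable_format_error);
  std::ios_base::failure e("bad stream", ec);
#if defined(_GLIBCXX_USE_CXX11_ABI) && !_GLIBCXX_USE_CXX11_ABI
  // Old libstdc++ ABI: the argument is accepted but cannot be stored.
  return e.code() == std::error_code{};
#else
  // New ABI (or another standard library): code() returns the argument.
  return e.code() == ec;
#endif
}
```

Code that only constructs and throws the exception is unaffected by the difference; code that inspects code() needs to allow for the default-constructed value when built with the old ABI.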

libstdc++-v3/ChangeLog:

* include/bits/ios_base.h (ios_base::failure): Add constructors
taking error_code argument. Add code() member function.
* testsuite/27_io/ios_base/failure/cxx11.cc: Allow test to
run for the old ABI but do not check for derivation from
std::system_error.
* testsuite/27_io/ios_base/failure/error_code.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit f997b67550144c6c0562f94c9b9cb932125d0444
Author: Jonathan Wakely 
Date:   Thu Oct 8 14:03:52 2020

libstdc++: Add C++11 member functions for ios::failure in old ABI

The new constructors that C++11 added to std::ios_base::failure were
missing for the old ABI. This adds them, but just ignores the
std::error_code argument (because there's nowhere to store it).

This also adds a code() member, which should be provided by the
std::system_error base class, but that base class isn't present in the
old ABI.

This allows the old ios::failure to be used in code that expects the new
API, although with reduced functionality.

libstdc++-v3/ChangeLog:

* include/bits/ios_base.h (ios_base::failure): Add constructors
taking error_code argument. Add code() member function.
* testsuite/27_io/ios_base/failure/cxx11.cc: Allow test to
run for the old ABI but do not check for derivation from
std::system_error.
* testsuite/27_io/ios_base/failure/error_code.cc: New test.

diff --git a/libstdc++-v3/include/bits/ios_base.h b/libstdc++-v3/include/bits/ios_base.h
index 4ac026547b4..729ebd1768e 100644
--- a/libstdc++-v3/include/bits/ios_base.h
+++ b/libstdc++-v3/include/bits/ios_base.h
@@ -289,6 +289,24 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   virtual const char*
   what() const throw();
 
+#if __cplusplus >= 201103L
+  // Define the new members required by C++11,
+  // even though the error_code cannot be stored.
+
+  explicit
+  failure(const string& __s, const error_code&) noexcept
+  : failure(__s)
+  { }
+
+  explicit
+  failure(const char* __s, const error_code& = error_code{})
+  : failure(string(__s))
+  { }
+
+  // Stand-in for system_error::code() but returning by value.
+  error_code code() const noexcept { return error_code{}; }
+#endif
+
 private:
   string _M_msg;
 };
diff --git a/libstdc++-v3/testsuite/27_io/ios_base/failure/cxx11.cc b/libstdc++-v3/testsuite/27_io/ios_base/failure/cxx11.cc
index b78db49ed5a..06827f662c0 100644
--- a/libstdc++-v3/testsuite/27_io/ios_base/failure/cxx11.cc
+++ b/libstdc++-v3/testsuite/27_io/ios_base/failure/cxx11.cc
@@ -16,14 +16,15 @@
 // <http://www.gnu.org/licenses/>.
 
 // { dg-do run { target c++11 } }
-// { dg-require-effective-target cxx11-abi }
 
 #include 
 #include 
 
 using test_type = std::ios_base::failure;
 
+#if _GLIBCXX_USE_CXX11_ABI
 static_assert( std::is_base_of<std::system_error, test_type>::value, "base" );
+#endif
 
 void
 test01()
diff --git a/libstdc++-v3/testsuite/27_io/ios_base/failure/error_code.cc b/libstdc++-v3/testsuite/27_io/ios_base/failure/error_code.cc
new file mode 100644
index 00000000000..a377dc047c9
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/ios_base/failure/error_code.cc
@@ -0,0 +1,48 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+#include 
+#include 
+
+void
+test01()
+{
+  std::error_code ec, def_ec;
+#if _GLIBCXX_USE_CXX11_ABI
+  // For the new ABI code() should return the constructor argument.
+  ec = std::make_error_code(std::errc::executable_format_error);
+  def_ec = std::io_errc::stream;
+#else
+  // For the old ABI code() always returns a default-constructed error_code.
+#endif
+  std::ios_base::failure e1("string literal");
+  

Re: [PATCH] match.pd: Fix up FFS -> CTZ + 1 optimization [PR97325]

2020-10-08 Thread Aldy Hernandez via Gcc-patches




On 10/8/20 2:49 PM, Jakub Jelinek wrote:
> On Thu, Oct 08, 2020 at 11:46:36AM +0200, Aldy Hernandez via Gcc-patches wrote:
>> This was a pasto by yours truly while porting the __builtin code from
>> vr_values::extract_range_basic.
>>
>> I've tested Andrew's patch, and pushed it.
>
> Note for next time, I think you should use --author when committing a patch
> on someone else's behalf.

I did not know that.  Will do.

> And no testcase was included, I'm including one below.

Thanks.

> Anyway, this PR and the other CTZ related discussions led me to discover a
> bug I've made earlier, CLZ/CTZ builtins have unsigned arguments and e.g.
> both the vr-values.cc and now gimple-range.cc code heavily relies on that,
> but __builtin_ffs has a signed operand and this optimization was incorrectly

Heh.  It looks like the clz/ctz code has plenty of subtle bugs and
inconsistencies.  I have another fix coming up.

Aldy

> making the operand signed too, so I guess it would greatly confuse VRP in
> some cases.
>
> Ok for trunk if it passes bootstrap/regtest?
>
> 2020-10-08  Jakub Jelinek
>
> PR tree-optimization/97325
> * match.pd (FFS(nonzero) -> CTZ(nonzero) + 1): Cast argument to
> corresponding unsigned type.
>
> * gcc.c-torture/execute/pr97325.c: New test.
>
> --- gcc/match.pd.jj 2020-10-01 10:40:10.057756683 +0200
> +++ gcc/match.pd 2020-10-08 14:39:27.347249559 +0200
> @@ -6196,7 +6196,8 @@ (define_operator_list COND_TERNARY
>   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>    && direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@0),
>  OPTIMIZE_FOR_SPEED))
> -  (plus (CTZ:type @0) { build_one_cst (type); })))
> +  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +   (plus (CTZ:type (convert:utype @0)) { build_one_cst (type); }))))
>  #endif
>  
>  (for ffs (BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL
> --- gcc/testsuite/gcc.c-torture/execute/pr97325.c.jj 2020-10-08 14:29:59.740440126 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr97325.c 2020-10-08 14:27:22.518708841 +0200
> @@ -0,0 +1,15 @@
> +/* PR tree-optimization/97325 */
> +
> +unsigned long long
> +foo (unsigned long long c)
> +{
> +  return c ? __builtin_ffs (-(unsigned short) c) : 0;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo (2) != 2)
> +__builtin_abort ();
> +  return 0;
> +}
>
> Jakub


Re: Ping: [PATCH][Arm] Enable MVE SIMD modes for vectorization

2020-10-08 Thread Christophe Lyon via Gcc-patches
Hi,


On Tue, 6 Oct 2020 at 15:37, Dennis Zhang via Gcc-patches
 wrote:
>
> On 9/16/20 4:00 PM, Dennis Zhang wrote:
> > Hi all,
> >
> > This patch enables SIMD modes for MVE auto-vectorization.
> > In this patch, the integer and float MVE SIMD modes are returned by
> > arm_preferred_simd_mode (TARGET_VECTORIZE_PREFERRED_SIMD_MODE hook) when
> > MVE or MVE_FLOAT is enabled.
> > Then the expanders for auto-vectorization can be used for generating MVE
> > SIMD code.
> >
> > This patch also fixes bugs in MVE vreinterpretq_*.c tests which are
> > revealed by the enabled MVE SIMD modes.
> > The tests are for checking the MVE reinterpret intrinsics.
> > There are two functions in each of the tests. The two functions contain
> > the pattern of identical code so that they are folded in icf pass.
> > Because of icf, the instruction count only checks one function which is 8.
> > However when the SIMD modes are enabled, the estimation of the code size
> > becomes smaller so that inlining is applied after icf, then the
> > instruction count becomes 16 which causes failure of the tests.
> > Because the icf is not the expected pattern to be tested but causes
> > above issues, -fno-ipa-icf is applied to the tests to avoid unstable
> > instruction count.
> >
> > This patch is separated from
> > https://gcc.gnu.org/pipermail/gcc-patches/2020-August/552104.html
> > because this part is not strongly connected to the aim of that one and
> > was causing confusion.
> >
> > Regtested and bootstrapped.
> >
> > Is it OK for trunk please?
> >
> > Thanks
> > Dennis
> >
> > gcc/ChangeLog:
> >
> > 2020-09-15  Dennis Zhang  
> >
> >   * config/arm/arm.c (arm_preferred_simd_mode): Enable MVE SIMD modes.
> >

Since toolchain builds work again after Jakub's divmod fix, I'm now
facing another build error likely caused by this patch:
In file included from
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/coretypes.h:449:0,
 from
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28:
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:
In function 'machine_mode arm_preferred_simd_mode(scalar_mode)':
./insn-modes.h:196:71: error: temporary of non-literal type
'scalar_int_mode' in a constant expression
 #define QImode (scalar_int_mode ((scalar_int_mode::from_int) E_QImode))
   ^
/tmp/2601185_2.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/arm/arm.c:28970:12:
note: in expansion of macro 'QImode'
   case QImode:

and similarly for the other cases.

Does the build work for you?

Thanks,

Christophe

> > gcc/testsuite/ChangeLog:
> >
> > 2020-09-15  Dennis Zhang  
> >
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_f16.c: Use additional
> >   option -fno-ipa-icf and change the instruction count from 8 to 16.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_f32.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_s16.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_s32.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_s64.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_s8.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_u16.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_u32.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_u64.c: Likewise.
> >   * gcc.target/arm/mve/intrinsics/vreinterpretq_u8.c: Likewise.
> >
>
> Ping: https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554100.html


Re: [PATCH] match.pd: Fix up FFS -> CTZ + 1 optimization [PR97325]

2020-10-08 Thread Richard Biener
On Thu, 8 Oct 2020, Jakub Jelinek wrote:

> On Thu, Oct 08, 2020 at 11:46:36AM +0200, Aldy Hernandez via Gcc-patches 
> wrote:
> > This was a pasto by yours truly while porting the __builtin code from
> > vr_values::extract_range_basic.
> > 
> > I've tested Andrew's patch, and pushed it.
> 
> Note for next time, I think you should use --author when committing a patch
> on someone else's behalf.
> 
> And no testcase was included, I'm including one below.
> 
> Anyway, this PR and the other CTZ related discussions led me to discover a
> bug I've made earlier, CLZ/CTZ builtins have unsigned arguments and e.g.
> both the vr-values.cc and now gimple-range.cc code heavily relies on that,
> but __builtin_ffs has a signed operand and this optimization was incorrectly
> making the operand signed too, so I guess it would greatly confuse VRP in
> some cases.
> 
> Ok for trunk if it passes bootstrap/regtest?

OK.

Richard.

> 2020-10-08  Jakub Jelinek  
> 
>   PR tree-optimization/97325
>   * match.pd (FFS(nonzero) -> CTZ(nonzero) + 1): Cast argument to
>   corresponding unsigned type.
> 
>   * gcc.c-torture/execute/pr97325.c: New test.
> 
> --- gcc/match.pd.jj   2020-10-01 10:40:10.057756683 +0200
> +++ gcc/match.pd  2020-10-08 14:39:27.347249559 +0200
> @@ -6196,7 +6196,8 @@ (define_operator_list COND_TERNARY
>   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>&& direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@0),
>OPTIMIZE_FOR_SPEED))
> -  (plus (CTZ:type @0) { build_one_cst (type); })))
> +  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +   (plus (CTZ:type (convert:utype @0)) { build_one_cst (type); }))))
>  #endif
>  
>  (for ffs (BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL
> --- gcc/testsuite/gcc.c-torture/execute/pr97325.c.jj  2020-10-08 
> 14:29:59.740440126 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr97325.c 2020-10-08 
> 14:27:22.518708841 +0200
> @@ -0,0 +1,15 @@
> +/* PR tree-optimization/97325 */
> +
> +unsigned long long
> +foo (unsigned long long c)
> +{
> +  return c ? __builtin_ffs (-(unsigned short) c) : 0;
> +}
> +
> +int
> +main ()
> +{
> +  if (foo (2) != 2)
> +__builtin_abort ();
> +  return 0;
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches
Hi Jakub,

> On Thu, Oct 08, 2020 at 11:37:24AM +, Wilco Dijkstra via Gcc-patches 
> wrote:
>> Which optimizations does it enable that aren't possible if the value is 
>> defined?
>
> See bugzilla.  Note other compilers heavily optimize on those builtins
> undefined at value zero.

You mean the PR94801, PR94793, PR95863 you mentioned before? The first doesn't
seem to be a useful optimization (would anyone ever write that?), the other 2
would benefit from clz(0) being well defined. In particular, x86 without BMI
would greatly benefit from setting CTZ_DEFINED_VALUE_AT_ZERO to 2.

So I fail to see any "heavy" optimizations here that show a benefit of keeping
the value undefined at zero.

>> > We just should make sure that we optimize code like x ? __builtin_c[lt]z 
>> > (x) : 32;
>> > etc. properly (and I believe we do).
>> 
>> I think we do, but both the external and internal documentation are not clear
>> enough that most targets actually do define a value and will optimize for it.
>> Otherwise we wouldn't have this bug now...
>
> The documentation is very clear that the builtins are undefined at zero,
> that is all that matters for users.

If we don't change the undefinedness, at least we should try to explain the
above idiom as a way to get a well-defined range that still results in a single
instruction on most targets.
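For reference, the idiom written out as complete functions. This is a sketch only: clz_or_width, ctz_or_width and kUnsignedBits are hypothetical names, and the type width is computed rather than hard-coded as 32.

```cpp
#include <cassert>
#include <climits>

// Width of unsigned int in bits (32 on the targets discussed here).
constexpr int kUnsignedBits = static_cast<int>(sizeof(unsigned) * CHAR_BIT);

// The builtins are documented as undefined for a zero argument, so the
// guard supplies the value for that case explicitly.
inline int clz_or_width(unsigned x)
{
  return x ? __builtin_clz(x) : kUnsignedBits;
}

inline int ctz_or_width(unsigned x)
{
  return x ? __builtin_ctz(x) : kUnsignedBits;
}
```

Whether the guard folds away into a single instruction depends on the target and on what the hardware instruction returns for zero.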

Cheers,
Wilco


[PATCH] match.pd: Fix up FFS -> CTZ + 1 optimization [PR97325]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 11:46:36AM +0200, Aldy Hernandez via Gcc-patches wrote:
> This was a pasto by yours truly while porting the __builtin code from
> vr_values::extract_range_basic.
> 
> I've tested Andrew's patch, and pushed it.

Note for next time, I think you should use --author when committing a patch
on someone else's behalf.

And no testcase was included, I'm including one below.

Anyway, this PR and the other CTZ related discussions led me to discover a
bug I've made earlier, CLZ/CTZ builtins have unsigned arguments and e.g.
both the vr-values.cc and now gimple-range.cc code heavily relies on that,
but __builtin_ffs has a signed operand and this optimization was incorrectly
making the operand signed too, so I guess it would greatly confuse VRP in
some cases.
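The identity behind the fold can be checked at the source level. The following is a sketch with hypothetical helper names, not the match.pd code; it only shows the signed operand being converted to the corresponding unsigned type before it reaches CTZ.

```cpp
#include <cassert>
#include <climits>

// For x != 0, ffs (x) equals ctz (x) + 1.  __builtin_ctz takes an unsigned
// int, so the signed operand of ffs is converted explicitly first.
inline int ffs_via_ctz(int x)
{
  return __builtin_ctz(static_cast<unsigned>(x)) + 1;
}

// Compares the rewritten form against __builtin_ffs for a nonzero input.
inline bool ffs_matches_ctz(int x)
{
  return x != 0 && ffs_via_ctz(x) == __builtin_ffs(x);
}
```

The conversion does not change the bit pattern, so the result is the same for negative inputs; it only makes the operand type match what CTZ expects.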

Ok for trunk if it passes bootstrap/regtest?

2020-10-08  Jakub Jelinek  

PR tree-optimization/97325
* match.pd (FFS(nonzero) -> CTZ(nonzero) + 1): Cast argument to
corresponding unsigned type.

* gcc.c-torture/execute/pr97325.c: New test.

--- gcc/match.pd.jj 2020-10-01 10:40:10.057756683 +0200
+++ gcc/match.pd 2020-10-08 14:39:27.347249559 +0200
@@ -6196,7 +6196,8 @@ (define_operator_list COND_TERNARY
  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
   && direct_internal_fn_supported_p (IFN_CTZ, TREE_TYPE (@0),
 OPTIMIZE_FOR_SPEED))
-  (plus (CTZ:type @0) { build_one_cst (type); })))
+  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+   (plus (CTZ:type (convert:utype @0)) { build_one_cst (type); }))))
 #endif
 
 (for ffs (BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL
--- gcc/testsuite/gcc.c-torture/execute/pr97325.c.jj 2020-10-08 14:29:59.740440126 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr97325.c 2020-10-08 14:27:22.518708841 +0200
@@ -0,0 +1,15 @@
+/* PR tree-optimization/97325 */
+
+unsigned long long
+foo (unsigned long long c)
+{
+  return c ? __builtin_ffs (-(unsigned short) c) : 0;
+}
+
+int
+main ()
+{
+  if (foo (2) != 2)
+__builtin_abort ();
+  return 0;
+}


Jakub



[PATCH] SLP vectorize multiple BBs at once

2020-10-08 Thread Richard Biener
This work from Martin Liska was motivated by gcc.dg/vect/bb-slp-22.c
which shows how poorly we currently BB vectorize code like

  a0 = in[0] + 23;
  a1 = in[1] + 142;
  a2 = in[2] + 2;
  a3 = in[3] + 31;

  if (x > y)
{
  b[0] = a0;
  b[1] = a1;
  b[2] = a2;
  b[3] = a3;
}
  else
{
  out[0] = a0 * (x + 1);
  out[1] = a1 * (y + 1);
  out[2] = a2 * (x + 1);
  out[3] = a3 * (y + 1);
}

namely by vectorizing the stores but not the common load (and add)
they are fed with.

Thus with the following patch we change the BB vectorizer from
operating on a single basic-block at a time to consider somewhat
larger regions (but not the whole function yet because of issues
with vector size iteration).

I took the opportunity to remove the fancy region iterations again
now that we operate on BB granularity and in the end need to visit
PHI nodes as well.

Final re-bootstrap/regtest running on x86_64-unknown-linux-gnu.

Richard.

2020-10-08  Martin Liska  
Richard Biener  

* tree-vectorizer.h (_bb_vec_info::const_iterator): Remove.
(_bb_vec_info::const_reverse_iterator): Likewise.
(_bb_vec_info::region_stmts): Likewise.
(_bb_vec_info::reverse_region_stmts): Likewise.
(_bb_vec_info::_bb_vec_info): Adjust.
(_bb_vec_info::bb): Remove.
(_bb_vec_info::region_begin): Remove.
(_bb_vec_info::region_end): Remove.
(_bb_vec_info::bbs): New vector of BBs.
(vect_slp_function): Declare.
* tree-vect-patterns.c (vect_determine_precisions): Use
regular stmt iteration.
(vect_pattern_recog): Likewise.
* tree-vect-slp.c: Include cfganal.h, tree-eh.h and tree-cfg.h.
(vect_build_slp_tree_1): Properly refuse to vectorize
volatile and throwing stmts.
(vect_build_slp_tree_2): Pass group-size down to
get_vectype_for_scalar_type.
(_bb_vec_info::_bb_vec_info): Use regular stmt iteration,
adjust for changed region specification.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vect_slp_check_for_constructors): Likewise.
(vect_slp_region): Likewise.
(vect_slp_bbs): New worker operating on a vector of BBs.
(vect_slp_bb): Wrap it.
(vect_slp_function): New function splitting the function
into multi-BB regions.
(vect_create_constant_vectors): Handle the case of inserting
after a throwing def.
(vect_schedule_slp_instance): Adjust.
* tree-vectorizer.c (vec_info::remove_stmt): Simplify again.
(vec_info::insert_seq_on_entry): Adjust.
(pass_slp_vectorize::execute): Also init PHIs.  Call
vect_slp_function.

* gcc.dg/vect/bb-slp-22.c: Adjust.
* gfortran.dg/pr68627.f: Likewise.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-22.c |   2 +-
 gcc/testsuite/gfortran.dg/pr68627.f   |   2 +-
 gcc/tree-vect-patterns.c  |  38 +++--
 gcc/tree-vect-slp.c   | 228 ++
 gcc/tree-vectorizer.c |  19 ++-
 gcc/tree-vectorizer.h |  93 +--
 6 files changed, 203 insertions(+), 179 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
index 21552300e11..92cc2a51abd 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
@@ -63,5 +63,5 @@ int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp2" } } */
 /* { dg-final { scan-tree-dump "vectorizing SLP node starting from: _\[0-9\]+ = _\[0-9\]+ \\\* a0" "slp2" { target vect_int_mult  } } } */
diff --git a/gcc/testsuite/gfortran.dg/pr68627.f b/gcc/testsuite/gfortran.dg/pr68627.f
index 7754b72ff58..2ff1f3c77c8 100644
--- a/gcc/testsuite/gfortran.dg/pr68627.f
+++ b/gcc/testsuite/gfortran.dg/pr68627.f
@@ -1,6 +1,6 @@
 ! { dg-do compile { target { { i?86-*-* x86_64-*-* } && { ! { ia32 } } } } }
 
-! { dg-options "-Ofast -mavx512f -ffixed-xmm1 -ffixed-xmm2 -ffixed-xmm3 -ffixed-xmm4 -ffixed-xmm5 -ffixed-xmm6 -ffixed-xmm7 -ffixed-xmm8 -ffixed-xmm9 -ffixed-xmm10 -ffixed-xmm11 -ffixed-xmm12 -ffixed-xmm13 -ffixed-xmm14 -ffixed-xmm15" }
+! { dg-options "-Ofast -fno-tree-slp-vectorize -mavx512f -ffixed-xmm1 -ffixed-xmm2 -ffixed-xmm3 -ffixed-xmm4 -ffixed-xmm5 -ffixed-xmm6 -ffixed-xmm7 -ffixed-xmm8 -ffixed-xmm9 -ffixed-xmm10 -ffixed-xmm11 -ffixed-xmm12 -ffixed-xmm13 -ffixed-xmm14 -ffixed-xmm15" }
 
   IMPLICIT REAL*8(A-H,O-Z)
   ALLOCATABLE DD1(:), DD2(:), WY(:,:)
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index d626c5f7362..71e4e106202 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -5123,12 +5123,14 @@ vect_determine_precisions (vec_info *vinfo)
   else
 {
   bb_vec_info bb_vinfo = as_a <bb_vec_info> (vinfo);
-  for (gimple *stmt : bb_vinfo->reverse_region_stmts 

[PATCH] tree-optimization/97330 - fix bad load sinking

2020-10-08 Thread Richard Biener
This fixes bad placement of sunk loads.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

2020-10-08  Richard Biener  

PR tree-optimization/97330
* tree-ssa-sink.c (statement_sink_location): Avoid skipping
PHIs when they dominate the insert location.

* gcc.dg/torture/pr97330-1.c: New testcase.
* gcc.dg/torture/pr97330-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/torture/pr97330-1.c | 19 +++
 gcc/testsuite/gcc.dg/torture/pr97330-2.c | 14 ++
 gcc/tree-ssa-sink.c  |  4 +++-
 3 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr97330-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr97330-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr97330-1.c b/gcc/testsuite/gcc.dg/torture/pr97330-1.c
new file mode 100644
index 00000000000..7dce5bd90d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr97330-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+
+typedef int a;
+typedef char b;
+int c;
+void d(e, f, dst, g, avail, h) int e;
+b *f, *dst;
+a g, avail;
+int h;
+{
+  b i = *f;
+  if (e)
+goto j;
+  while (avail) {
+*dst = i;
+  j:
+avail -= c;
+  }
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr97330-2.c b/gcc/testsuite/gcc.dg/torture/pr97330-2.c
new file mode 100644
index 00000000000..a0644831cfd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr97330-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+
+int a, b, d;
+char c, e;
+void f(void) {
+  char g = c;
+  if (b)
+goto h;
+  while (d) {
+e = c;
+  h:
+d -= a;
+  }
+}
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index ba8e5577e9c..207aae2818a 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -391,8 +391,10 @@ statement_sink_location (gimple *stmt, basic_block frombb,
  if (gimple_code (use_stmt) == GIMPLE_PHI)
{
  /* In case the PHI node post-dominates the current insert location
-we can disregard it.  */
+we can disregard it.  But make sure it is not dominating
+it as well as can happen in a CFG cycle.  */
  if (commondom != bb
+ && !dominated_by_p (CDI_DOMINATORS, commondom, bb)
  && dominated_by_p (CDI_POST_DOMINATORS, commondom, bb))
continue;
  bb = EDGE_PRED (bb, PHI_ARG_INDEX_FROM_USE (use_p))->src;
-- 
2.26.2


PING^2 [PATCH] x86: Add <x86gprintrin.h>

2020-10-08 Thread H.J. Lu via Gcc-patches
On Fri, Oct 2, 2020 at 5:51 AM H.J. Lu  wrote:
>
> On Wed, Sep 23, 2020 at 10:58 AM H.J. Lu  wrote:
> >
> > For sources which can't use any vector instructions,  and
> >  cannot be included for compiler intrinsics:
> >
> > $ echo "#include <x86intrin.h>" | gcc -S -O2 -mno-sse -mno-mmx -x c -
> > In file included from /usr/include/stdlib.h:1013,
> >  from 
> > /usr/lib/gcc/x86_64-redhat-linux/10/include/mm_malloc.h:27,
> >  from 
> > /usr/lib/gcc/x86_64-redhat-linux/10/include/xmmintrin.h:34,
> >  from 
> > /usr/lib/gcc/x86_64-redhat-linux/10/include/immintrin.h:29,
> >  from 
> > /usr/lib/gcc/x86_64-redhat-linux/10/include/x86intrin.h:32,
> >  from <stdin>:1:
> > /usr/include/bits/stdlib-float.h: In function ‘atof’:
> > /usr/include/bits/stdlib-float.h:26:1: error: SSE register return with SSE 
> > disabled
> >26 | {
> >   | ^
> > $
> >
> > libgcc/config/i386/shadow-stack-unwind.h has a workaround:
> >
> > /* NB: We need _get_ssp and _inc_ssp from .  But we can't
> >include  which ends up including , which
> >includes  and  unconditionally.  But we can't
> >include any libc system headers unconditionally from libgcc.  Avoid
> >including  here by defining _IMMINTRIN_H_INCLUDED.  */
> >  #define _IMMINTRIN_H_INCLUDED
> >  #include 
> >  #undef _IMMINTRIN_H_INCLUDED
> >
> > Add a standalone intrinsic header file, <x86gprintrin.h>, to provide
> > integer only intrinsics.  All integer only intrinsics are placed in
> > <x86gprintrin.h>.  <immintrin.h> and <x86intrin.h> simply include
> > <x86gprintrin.h>.
> >
> > Add the FSF copyright to <pconfigintrin.h>, <tsxldtrkintrin.h> and
> > <wbnoinvdintrin.h>.
> >
> > gcc/
> >
> > PR target/97148
> > * config.gcc (extra_headers): Add x86gprintrin.h.
> > * config/i386/adxintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> > .
> > * config/i386/bmi2intrin.h: Likewise.
> > * config/i386/bmiintrin.h: Likewise.
> > * config/i386/cetintrin.h: Likewise.
> > * config/i386/cldemoteintrin.h: Likewise.
> > * config/i386/clflushoptintrin.h: Likewise.
> > * config/i386/clwbintrin.h: Likewise.
> > * config/i386/fxsrintrin.h: Likewise.
> > * config/i386/ia32intrin.h: Likewise.
> > * config/i386/lwpintrin.h: Likewise.
> > * config/i386/lzcntintrin.h: Likewise.
> > * config/i386/movdirintrin.h: Likewise.
> > * config/i386/pkuintrin.h: Likewise.
> > * config/i386/rdseedintrin.h: Likewise.
> > * config/i386/rtmintrin.h: Likewise.
> > * config/i386/serializeintrin.h: Likewise.
> > * config/i386/tbmintrin.h: Likewise.
> > * config/i386/waitpkgintrin.h: Likewise.
> > * config/i386/xsavecintrin.h: Likewise.
> > * config/i386/xsaveintrin.h: Likewise.
> > * config/i386/xsaveoptintrin.h: Likewise.
> > * config/i386/xsavesintrin.h: Likewise.
> > * config/i386/xtestintrin.h: Likewise.
> > * config/i386/enqcmdintrin.h: Check _X86GPRINTRIN_H_INCLUDED for
> > .  Replace  with 
> > in the error message.
> > * config/i386/immintrin.h: Include  instead of
> > , , ,
> > , , ,
> > , , , ,
> > , ,
> > , , ,
> > , , ,
> > , ,  and
> > .
> > (_wbinvd): Moved to config/i386/x86gprintrin.h.
> > (_rdrand16_step): Likewise.
> > (_rdrand32_step): Likewise.
> > (_rdpid_u32): Likewise.
> > (_readfsbase_u32): Likewise.
> > (_readfsbase_u64): Likewise.
> > (_readgsbase_u32): Likewise.
> > (_readgsbase_u64): Likewise.
> > (_writefsbase_u32): Likewise.
> > (_writefsbase_u64): Likewise.
> > (_writegsbase_u32): Likewise.
> > (_writegsbase_u64): Likewise.
> > (_rdrand64_step): Likewise.
> > (_ptwrite64): Likewise.
> > (_ptwrite32): Likewise.
> > * config/i386/x86gprintrin.h: New file.
> > * config/i386/pconfigintrin.h: Add the FSF copyright.  Check
> > _X86GPRINTRIN_H_INCLUDED for .
> > * config/i386/tsxldtrkintrin.h: Likewise.
> > * config/i386/wbnoinvdintrin.h: Likewise.
> > * config/i386/x86intrin.h: Include .  Don't
> > include , , ,
> > ,  and .
> >
> > gcc/testsuite/
> >
> > * gcc.target/i386/avx-1.c (__builtin_ia32_lwpval32): New to
> > support  included in .
> > (__builtin_ia32_lwpval64): Likewise.
> > (__builtin_ia32_lwpins32): Likewise.
> > (__builtin_ia32_lwpins64): Likewise.
> > (__builtin_ia32_bextri_u32): New to support 
> > included in .
> > (__builtin_ia32_bextri_u64): Likewise.
> > * gcc.target/i386/x86gprintrin-1.c: New test.
> > * gcc.target/i386/x86gprintrin-2.c: Likewise.
> > * gcc.target/i386/x86gprintrin-3.c: Likewise.
> > * gcc.target/i386/x86gprintrin-4.c: Likewise.
> > * gcc.target/i386/x86gprintrin-4a.c: Likewise.
> > * gcc.target/i386/x86gprintrin-5.c: 

[ Preprocessor ] [ Common ] Feature: Macros for identifying the wide and narrow execution string literal encoding

2020-10-08 Thread JeanHeyd Meneide via Gcc-patches
Dear GCC,

 This patch adds macros to the general preprocessor that allow
users to understand what the execution and wide execution charsets
are, which are used for "bark" and L"meow" literals in C-family
languages.

 The goal of this is to enable individuals in capable languages
like C and C++ to determine the encoding of string literals and,
hopefully, transcode those literals to other encodings. For example,
data stored in `char[]` on an IBM machine with EBCDIC can be
(losslessly, or with a compile-time warning) transcoded to UTF-8 at
constant-expression time in C++ with this patch, without having to use
a series of complicated architecture and platform definitions to
figure out what the encoding of string literals and wide character
literals might be.

 The names are meaningful because they are tied directly to iconv,
which means there is a strong reference between the name and the code
that encodes/decodes a proper sequence. Therefore, we just present the
names.
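
A minimal sketch of how user code might consume these macros, assuming they expand to string literals as in the patch below. The fallback value "unknown" and the helper names are hypothetical and only there so the sketch still compiles where the macros are absent.

```c
#include <string.h>

/* Fall back to a placeholder when the compiler does not predefine the
   macros (the "unknown" string is an assumption of this sketch).  */
#ifdef __GNUC_EXECUTION_CHARSET_NAME
# define NARROW_CHARSET __GNUC_EXECUTION_CHARSET_NAME
#else
# define NARROW_CHARSET "unknown"
#endif

#ifdef __GNUC_WIDE_EXECUTION_CHARSET_NAME
# define WIDE_CHARSET __GNUC_WIDE_EXECUTION_CHARSET_NAME
#else
# define WIDE_CHARSET "unknown"
#endif

/* Name of the encoding used for "bark"-style literals.  */
const char *
narrow_literal_charset (void)
{
  return NARROW_CHARSET;
}

/* Name of the encoding used for L"meow"-style literals.  */
const char *
wide_literal_charset (void)
{
  return WIDE_CHARSET;
}

/* Nonzero if narrow literals are known to be UTF-8, in which case no
   transcoding step would be needed before handing them to UTF-8 APIs.  */
int
narrow_literals_are_utf8 (void)
{
  return strcmp (NARROW_CHARSET, "UTF-8") == 0;
}
```

A library could branch on narrow_literals_are_utf8 () to decide whether a transcoding step is required at all.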

 Does this sound useful?

Sincerely,
JeanHeyd

Patch notes: the strings passed to the charset creation routines are
all static and/or allocated far at the beginning of the program and
never deallocated until the end of the invocation, so it seems safe to
just store a normal pointer to it.

2020-10-08  JeanHeyd "ThePhD" Meneide  

* gcc/c-family/c-cppbuiltin.c: Add predefined macro
definitions for charsets
* gcc/doc/cpp.texi: Document new predefined macro.
* gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c (new):
  New test for macro definitions to always exist.
* libcpp/include/cpplib.h: Add functions declarations for
  retrieving charset names
* libcpp/directives.c: Add function definitions to retrieve charset
  names.
* libcpp/internal.h: Add to/from name preservations
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 74ecca8de8e..8de25786592 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -866,6 +866,13 @@ c_cpp_builtins (cpp_reader *pfile)
 
   define_language_independent_builtin_macros (pfile);
 
+  /* encoding definitions used by users and libraries  */
+  builtin_define_with_value ("__GNUC_EXECUTION_CHARSET_NAME",
+cpp_get_narrow_charset_name (pfile), 1);
+  builtin_define_with_value ("__GNUC_WIDE_EXECUTION_CHARSET_NAME",
+cpp_get_wide_charset_name (pfile), 1);
+
+
   if (c_dialect_cxx ())
   {
 int major;
diff --git a/gcc/doc/cpp.texi b/gcc/doc/cpp.texi
index 33f876ab706..55fa5739812 100644
--- a/gcc/doc/cpp.texi
+++ b/gcc/doc/cpp.texi
@@ -2451,6 +2451,13 @@ features are supported by GCC.
 @item __NO_MATH_ERRNO__
 This macro is defined if @option{-fno-math-errno} is used, or enabled
 by another option such as @option{-ffast-math} or by default.
+
+@item __GNUC_EXECUTION_CHARSET_NAME
+@itemx __GNUC_WIDE_EXECUTION_CHARSET_NAME
+These macros are defined to the names of the narrow and wide compile-time
+execution character sets in use.  They directly reflect the names passed to
+the options @option{-fexec-charset} and @option{-fwide-exec-charset},
+or the defaults documented for those options.  @xref{Invocation}.
 @end table
 
 @node System-specific Predefined Macros
diff --git a/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c b/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c
new file mode 100644
index 000..0f55e6915f4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/cpp/wide-narrow-predef-macros.c
@@ -0,0 +1,10 @@
+/*
+   { dg-do compile }
+ */
+
+#if !defined(__GNUC_EXECUTION_CHARSET_NAME)
+#error "Required implementation macro for compile-time charset name is not present"
+#endif
+#if !defined(__GNUC_WIDE_EXECUTION_CHARSET_NAME)
+#error "Required implementation macro for wide compile-time charset name is not present"
+#endif
diff --git a/libcpp/charset.c b/libcpp/charset.c
index 28b81c9c864..3e5578b1390 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -638,6 +638,9 @@ init_iconv_desc (cpp_reader *pfile, const char *to, const char *from)
   char *pair;
   size_t i;
 
+  ret.to = to;
+  ret.from = from;
+
   if (!strcasecmp (to, from))
 {
   ret.func = convert_no_conversion;
diff --git a/libcpp/directives.c b/libcpp/directives.c
index f59718708e4..ad540872581 100644
--- a/libcpp/directives.c
+++ b/libcpp/directives.c
@@ -2571,6 +2571,20 @@ cpp_set_callbacks (cpp_reader *pfile, cpp_callbacks *cb)
   pfile->cb = *cb;
 }
 
+/* The narrow character set identifier.  */
+const char *
+cpp_get_narrow_charset_name (cpp_reader *pfile)
+{
+  return pfile->narrow_cset_desc.to;
+}
+
+/* The wide character set identifier.  */
+const char *
+cpp_get_wide_charset_name (cpp_reader *pfile)
+{
+  return pfile->wide_cset_desc.to;
+}
+
 /* The dependencies structure.  (Creates one if it hasn't already been.)  */
 class mkdeps *
 cpp_get_deps (cpp_reader *pfile)
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 

Re: make sincos take type from intrinsic formal, not from result assignment

2020-10-08 Thread Richard Biener via Gcc-patches
On Wed, Oct 7, 2020 at 7:15 PM Alexandre Oliva  wrote:
>
> On Oct  6, 2020, Richard Biener  wrote:
>
> > So how about that mathfn_type helper instead of hard-wiring this logic
> > in sincos()?
>
> Like this?
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK with a minor nit, see below

> I'm a little unhappy with the duplication of the CASE_MATHFN* sequence,
> that ought to be kept in sync, and considered turning that whole
> sequence into a #define used in both places, but that would bloat the
> patch further and make it less readable, so I figured I'd propose this
> one first.  Please let me know if you agree this additional change would
> make it better.

Yeah, I guess so.
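
The "single #define used in both places" idea can be sketched outside GCC with a toy X-macro list. This is a minimal illustration, not the builtins.c code: every toy_* and TOY_* name is hypothetical, and only three functions are listed.

```c
/* Toy stand-ins for the type nodes.  */
enum toy_type { TOY_DOUBLE, TOY_FLOAT, TOY_LONG_DOUBLE, TOY_NONE };

/* The one list that both switches below are generated from.  */
#define TOY_MATHFN_LIST(CASE) \
  CASE (SIN)                  \
  CASE (COS)                  \
  CASE (EXP)

/* Declare the double, float (F) and long double (L) variants.  */
enum toy_fn
{
#define DECLARE_FN(NAME) TOY_##NAME, TOY_##NAME##F, TOY_##NAME##L,
  TOY_MATHFN_LIST (DECLARE_FN)
#undef DECLARE_FN
  TOY_END
};

/* First use of the list: map a function code to its type.  */
enum toy_type
toy_fn_type (enum toy_fn fn)
{
  switch (fn)
    {
#define CASE_TYPE(NAME)                          \
    case TOY_##NAME: return TOY_DOUBLE;          \
    case TOY_##NAME##F: return TOY_FLOAT;        \
    case TOY_##NAME##L: return TOY_LONG_DOUBLE;
      TOY_MATHFN_LIST (CASE_TYPE)
#undef CASE_TYPE
    default:
      return TOY_NONE;
    }
}

/* Second use of the list: map any variant of a function to the
   variant operating on TYPE, or TOY_END if there is none.  */
enum toy_fn
toy_fn_for_type (enum toy_fn fn, enum toy_type type)
{
  switch (fn)
    {
#define CASE_VARIANT(NAME)                                   \
    case TOY_##NAME:                                         \
    case TOY_##NAME##F:                                      \
    case TOY_##NAME##L:                                      \
      return (type == TOY_DOUBLE ? TOY_##NAME                \
              : type == TOY_FLOAT ? TOY_##NAME##F            \
              : type == TOY_LONG_DOUBLE ? TOY_##NAME##L      \
              : TOY_END);
      TOY_MATHFN_LIST (CASE_VARIANT)
#undef CASE_VARIANT
    default:
      return TOY_END;
    }
}
```

Adding a function to TOY_MATHFN_LIST updates the enum and both switches at once, which is the property the "please keep in sync" comment otherwise has to enforce by hand.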

> take type from intrinsic in sincos pass
>
> From: Alexandre Oliva 
>
> This is a first step towards enabling the sincos optimization in Ada.
>
> The issue this patch solves is that sincos takes the type to be looked
> up with mathfn_built_in from variables or temporaries passed as
> arguments to SIN and COS intrinsics.  In Ada, different float types
> may be used but, despite their representation equivalence, their
> distinctness causes the optimization to be skipped, because they are
> not the types that mathfn_built_in expects.
>
> This patch introduces a function that maps intrinsics to the type
> they're associated with, and uses that type, obtained from the
> intrinsics used in calls to be optimized, to look up the corresponding
> CEXPI intrinsic.
>
> For the sake of defensive programming, when using the type obtained
> from the intrinsic, it now checks that, if different types are found
> for the used argument, or for other calls that use it, that the types
> are interchangeable.
>
>
> for  gcc/ChangeLog
>
> * builtins.c (mathfn_built_in_type): New.
> * builtins.h (mathfn_built_in_type): Declare.
> * tree-ssa-math-opts.c (execute_cse_sincos_1): Use it to
> obtain the type expected by the intrinsic.
> ---
>  gcc/builtins.c   |  147 ++
>  gcc/builtins.h   |1
>  gcc/tree-ssa-math-opts.c |   17 -
>  3 files changed, 162 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index f91266e4..5649242 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -2160,6 +2160,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
>
>switch (fn)
>  {
> +  /* Copied to mathfn_built_in_type, please keep in sync.  */
>  CASE_MATHFN (ACOS)
>  CASE_MATHFN (ACOSH)
>  CASE_MATHFN (ASIN)
> @@ -2278,6 +2279,10 @@ mathfn_built_in_2 (tree type, combined_fn fn)
>  return END_BUILTINS;
>  }
>
> +#undef CASE_MATHFN
> +#undef CASE_MATHFN_FLOATN
> +#undef CASE_MATHFN_REENT
> +
>  /* Return mathematic function equivalent to FN but operating directly on TYPE,
> if available.  If IMPLICIT_P is true use the implicit builtin declaration,
> otherwise use the explicit declaration.  If we can't do the conversion,
> @@ -2313,6 +2318,148 @@ mathfn_built_in (tree type, enum built_in_function fn)
>return mathfn_built_in_1 (type, as_combined_fn (fn), /*implicit=*/ 1);
>  }
>
> +/* Return the type associated with a built in function, i.e., the one
> +   to be passed to mathfn_built_in to get the type-specific
> +   function.  */
> +
> +tree
> +mathfn_built_in_type (combined_fn fn)
> +{
> +#define CASE_MATHFN(MATHFN)\
> +  case BUILT_IN_##MATHFN:  \
> +return double_type_node;   \
> +  case BUILT_IN_##MATHFN##F:   \
> +return float_type_node;\
> +  case BUILT_IN_##MATHFN##L:   \
> +return long_double_type_node;
> +
> +#define CASE_MATHFN_FLOATN(MATHFN) \
> +  CASE_MATHFN(MATHFN)  \
> +  case BUILT_IN_##MATHFN##F16: \
> +return float16_type_node;  \
> +  case BUILT_IN_##MATHFN##F32: \
> +return float32_type_node;  \
> +  case BUILT_IN_##MATHFN##F64: \
> +return float64_type_node;  \
> +  case BUILT_IN_##MATHFN##F128:\
> +return float128_type_node; \
> +  case BUILT_IN_##MATHFN##F32X:\
> +return float32x_type_node; \
> +  case BUILT_IN_##MATHFN##F64X:\
> +return float64x_type_node; \
> +  case BUILT_IN_##MATHFN##F128X:   \
> +return float128x_type_node;
> +
> +/* Similar to above, but appends _R after any F/L suffix.  */
> +#define CASE_MATHFN_REENT(MATHFN) \
> +  case BUILT_IN_##MATHFN##_R:  \
> +return double_type_node;   \
> +  case BUILT_IN_##MATHFN##F_R: \
> +return float_type_node;\
> +  case BUILT_IN_##MATHFN##L_R: \
> +return long_double_type_node;
> +
> +  switch (fn)
> +{
> +  /* 

Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 11:37:24AM +, Wilco Dijkstra via Gcc-patches wrote:
> Which optimizations does it enable that aren't possible if the value is 
> defined?

See bugzilla.  Note other compilers heavily optimize on those builtins
undefined at value zero.

> > We just should make sure that we optimize code like x ? __builtin_c[lt]z 
> > (x) : 32;
> > etc. properly (and I believe we do).
> 
> I think we do, but both the external and internal documentation are not clear
> enough that most targets actually do define a value and will optimize for it.
> Otherwise we wouldn't have this bug now...

The documentation is very clear that the builtins are undefined at zero,
that is all that matters for users.

Jakub
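
For reference, the user-level idiom under discussion can be written as below. This is a minimal sketch assuming a 32-bit unsigned int; the wrapper names are hypothetical. Because the builtins are documented as undefined for a zero argument, the zero case is handled explicitly and the builtin is only reached for non-zero input.

```c
/* Count trailing zeros, returning the bit width (32) for x == 0.
   Assumes unsigned int is 32 bits wide.  */
int
ctz32_or_width (unsigned int x)
{
  return x ? __builtin_ctz (x) : 32;
}

/* Count leading zeros, returning the bit width (32) for x == 0.  */
int
clz32_or_width (unsigned int x)
{
  return x ? __builtin_clz (x) : 32;
}
```

On targets whose instruction already yields the bit width for zero, the compiler is free to fold the conditional away; whether it does so is a target and version question that this sketch does not settle.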



Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches
Hi Jakub, 

> Having it undefined allows optimizations, and has been that way for years.

Which optimizations does it enable that aren't possible if the value is defined?

> We just should make sure that we optimize code like x ? __builtin_c[lt]z (x) 
> : 32;
> etc. properly (and I believe we do).

I think we do, but both the external and internal documentation are not clear
enough that most targets actually do define a value and will optimize for it.
Otherwise we wouldn't have this bug now...

Cheers,
Wilco

[PATCH] adjust BB vectorization dump scanning

2020-10-08 Thread Richard Biener
This adjusts BB vectorization testcases to look for the number of
SLP subgraphs vectorized rather than for the number of basic blocks
we've found opportunities in because followup patches will play
with the granularity we work on, vectorizing multiple basic blocks
at a time.

Together with this, because I noticed when looking at non-obvious
mismatches, I avoid analyzing group-size 1 SLP instances which
result in pointless V1mode vectorizations.

It might be interesting to work on adding something like
dg-warning to look for -fopt-info-{optimized,missing} so
we could directly annotate (not) vectorized loops instead of
relying on fragile counts.

Bootstrapped / tested on x86_64-unknown-linux-gnu, pushed.

Richard.

2020-10-08  Richard Biener  

* tree-vectorizer.c (try_vectorize_loop_1): Do not dump
"basic block vectorized".
(pass_slp_vectorize::execute): Likewise.
* tree-vect-slp.c (vect_analyze_slp_instance): Avoid
re-analyzing split single stmts.

* g++.dg/vect/slp-pr50819.cc: Adjust.
* gcc.dg/vect/bb-slp-1.c: Adjust.
* gcc.dg/vect/bb-slp-10.c: Adjust.
* gcc.dg/vect/bb-slp-11.c: Adjust.
* gcc.dg/vect/bb-slp-13.c: Adjust.
* gcc.dg/vect/bb-slp-14.c: Adjust.
* gcc.dg/vect/bb-slp-15.c: Adjust.
* gcc.dg/vect/bb-slp-16.c: Adjust.
* gcc.dg/vect/bb-slp-17.c: Adjust.
* gcc.dg/vect/bb-slp-18.c: Adjust.
* gcc.dg/vect/bb-slp-19.c: Adjust.
* gcc.dg/vect/bb-slp-2.c: Adjust.
* gcc.dg/vect/bb-slp-20.c: Adjust.
* gcc.dg/vect/bb-slp-21.c: Adjust.
* gcc.dg/vect/bb-slp-22.c: Adjust.
* gcc.dg/vect/bb-slp-23.c: Adjust.
* gcc.dg/vect/bb-slp-24.c: Adjust.
* gcc.dg/vect/bb-slp-25.c: Adjust.
* gcc.dg/vect/bb-slp-26.c: Adjust.
* gcc.dg/vect/bb-slp-27.c: Adjust.
* gcc.dg/vect/bb-slp-28.c: Adjust.
* gcc.dg/vect/bb-slp-29.c: Adjust.
* gcc.dg/vect/bb-slp-3.c: Adjust.
* gcc.dg/vect/bb-slp-30.c: Adjust.
* gcc.dg/vect/bb-slp-31.c: Adjust.
* gcc.dg/vect/bb-slp-34.c: Adjust.
* gcc.dg/vect/bb-slp-35.c: Adjust.
* gcc.dg/vect/bb-slp-36.c: Adjust.
* gcc.dg/vect/bb-slp-38.c: Adjust.
* gcc.dg/vect/bb-slp-4.c: Adjust.
* gcc.dg/vect/bb-slp-45.c: Adjust.
* gcc.dg/vect/bb-slp-46.c: Adjust.
* gcc.dg/vect/bb-slp-48.c: Adjust.
* gcc.dg/vect/bb-slp-5.c: Adjust.
* gcc.dg/vect/bb-slp-6.c: Adjust.
* gcc.dg/vect/bb-slp-7.c: Adjust.
* gcc.dg/vect/bb-slp-8.c: Adjust.
* gcc.dg/vect/bb-slp-8a.c: Adjust.
* gcc.dg/vect/bb-slp-8b.c: Adjust.
* gcc.dg/vect/bb-slp-9.c: Adjust.
* gcc.dg/vect/bb-slp-div-2.c: Adjust.
* gcc.dg/vect/bb-slp-over-widen-1.c: Adjust.
* gcc.dg/vect/bb-slp-over-widen-2.c: Adjust.
* gcc.dg/vect/bb-slp-pattern-2.c: Adjust.
* gcc.dg/vect/bb-slp-pow-1.c: Adjust.
* gcc.dg/vect/bb-slp-pr58135.c: Adjust.
* gcc.dg/vect/bb-slp-pr65935.c: Adjust.
* gcc.dg/vect/bb-slp-pr78205.c: Adjust.
* gcc.dg/vect/bb-slp-pr81635-1.c: Adjust.
* gcc.dg/vect/bb-slp-pr81635-3.c: Adjust.
* gcc.dg/vect/bb-slp-pr95839-2.c: Adjust.
* gcc.dg/vect/bb-slp-pr95839.c: Adjust.
* gcc.dg/vect/bb-slp-pr95866.c: Adjust.
* gcc.dg/vect/bb-slp-subgroups-1.c: Adjust.
* gcc.dg/vect/bb-slp-subgroups-2.c: Adjust.
* gcc.dg/vect/bb-slp-subgroups-3.c: Adjust.
* gcc.dg/vect/fast-math-bb-slp-call-1.c: Adjust.
* gcc.dg/vect/no-tree-reassoc-bb-slp-12.c: Adjust.
* gcc.dg/vect/no-tree-sra-bb-slp-pr50730.c: Adjust.
* gfortran.dg/vect/pr62283-2.f: Adjust.
* gcc.target/i386/pr68961.c: Adjust.
* gcc.target/i386/pr84101.c: Adjust.
* gcc.dg/vect/bb-slp-pr81635-2.c: Adjust.
* gcc.dg/vect/bb-slp-pr81635-4.c: Adjust.
* gcc.dg/vect/fast-math-bb-slp-call-2.c: Adjust.
* gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a.c: Adjust.
* gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c: Adjust.
* gcc.dg/vect/bb-slp-div-1.c: Adjust.
* gcc.dg/vect/bb-slp-pr90006.c: Adjust.
* g++.dg/vect/slp-pr50413.cc: Adjust.
---
 gcc/testsuite/g++.dg/vect/slp-pr50413.cc | 3 +--
 gcc/testsuite/g++.dg/vect/slp-pr50819.cc | 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-1.c | 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-10.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-11.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-13.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-14.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-15.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-16.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-17.c| 2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-18.c| 2 +-
 

[PATCH] IPA MOD REF: add debug counter.

2020-10-08 Thread Martin Liška

I'm going to install the following tested patch.

Thanks,
Martin

gcc/ChangeLog:

* dbgcnt.def (DEBUG_COUNTER): Add ipa_mod_ref debug counter.
* tree-ssa-alias.c (modref_may_conflict): Handle the counter.
---
 gcc/dbgcnt.def   | 1 +
 gcc/tree-ssa-alias.c | 4 
 2 files changed, 5 insertions(+)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index cf8775b2b66..07946a85ecc 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -171,6 +171,7 @@ DEBUG_COUNTER (if_after_reload)
 DEBUG_COUNTER (if_conversion)
 DEBUG_COUNTER (if_conversion_tree)
 DEBUG_COUNTER (ipa_cp_bits)
+DEBUG_COUNTER (ipa_mod_ref)
 DEBUG_COUNTER (ipa_sra_params)
 DEBUG_COUNTER (ipa_sra_retvalues)
 DEBUG_COUNTER (ira_move)
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 97dc4ac8814..d85f378a8ae 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "ipa-modref.h"
 #include "attr-fnspec.h"
 #include "errors.h"
+#include "dbgcnt.h"
 
 /* Broad overview of how alias analysis on gimple works:
 
@@ -2470,6 +2471,9 @@ modref_may_conflict (const gimple *stmt,
   if (tt->every_base)
     return true;
 
+  if (!dbg_cnt (ipa_mod_ref))
+    return true;
+
   base_set = ao_ref_base_alias_set (ref);
 
   ref_set = ao_ref_alias_set (ref);

--
2.28.0
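
The effect of the guard added above can be modelled with a toy counter. This is a rough sketch under stated assumptions, not GCC's dbgcnt implementation: the struct, the single upper limit, and the function names are all hypothetical. The point is only that once the counter is exhausted, the query falls back to the conservative answer ("may conflict"), which is what lets a developer bisect a miscompile by varying the limit.

```c
/* Toy debug counter: permits the first LIMIT events, then refuses.  */
struct toy_counter
{
  unsigned int count;  /* events seen so far */
  unsigned int limit;  /* events allowed to take the optimized path */
};

/* Return nonzero while the counter still permits the optimization.  */
int
toy_dbg_cnt (struct toy_counter *c)
{
  return c->count++ < c->limit;
}

/* Model of a guarded alias query: PRECISE_ANSWER is what the precise
   analysis would say (0 = no conflict).  Once the counter runs out,
   always answer conservatively with 1 (may conflict).  */
int
toy_may_conflict (struct toy_counter *c, int precise_answer)
{
  if (!toy_dbg_cnt (c))
    return 1;
  return precise_answer;
}
```

With a limit of 2, the first two queries use the precise answer and every later one is conservative.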



Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches

Btw, for PowerPC the result range is 0..32:

https://www.ibm.com/support/knowledgecenter/ssw_aix_72/assembler/idalangref_cntlzw_instrs.html

Wilco


Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 11:22:34AM +, Wilco Dijkstra wrote:
> >> I think a better way forward would be to make the builtin_clz/ctz more 
> >> defined.
> >> Having undefined values is a source of unnecessary bugs given practically 
> >> all
> >> modern targets return the number of bits for the zero input - it is 
> >> relatively
> >> easy to ensure this on the few targets that don't.
> >
> > Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> > (the register is unchanged).  And -1 is also quite commonly used value,
> > e.g. powerpc, gcn, xtensa.
> 
> So wouldn't it be easy to initialize the register before you do the bsr to get
> the same result as with BMI? I don't think an extra mov can affect performance
> in actual code (and GCC could still optimize the zero case if the input range
> doesn't include zero).
> 
> -1 is more complex, if these targets don't want to add extra instructions to 
> fix
> it up, we could define the zero result either -1 or #bits depending on the 
> target
> (still better than completely undefined).

Having it undefined allows optimizations, and has been that way for years.
We just should make sure that we optimize code like x ? __builtin_c[lt]z (x) : 
32;
etc. properly (and I believe we do).

Jakub



Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches
Hi Jakub,

>> I think a better way forward would be to make the builtin_clz/ctz more 
>> defined.
>> Having undefined values is a source of unnecessary bugs given practically all
>> modern targets return the number of bits for the zero input - it is 
>> relatively
>> easy to ensure this on the few targets that don't.
>
> Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> (the register is unchanged).  And -1 is also quite commonly used value,
> e.g. powerpc, gcn, xtensa.

So wouldn't it be easy to initialize the register before you do the bsr to get
the same result as with BMI? I don't think an extra mov can affect performance
in actual code (and GCC could still optimize the zero case if the input range
doesn't include zero).

-1 is more complex, if these targets don't want to add extra instructions to fix
it up, we could define the zero result either -1 or #bits depending on the 
target
(still better than completely undefined).

Cheers,
Wilco


Re: [PATCH] MIPS/libphobos: Fix switchcontext.S assembly for MIPS I ISA

2020-10-08 Thread Iain Buclaw via Gcc-patches
Excerpts from Maciej W. Rozycki's message of October 7, 2020 9:45 pm:
> Correct MIPS I assembly build errors in switchcontext.S:
> 
> .../libphobos/libdruntime/config/mips/switchcontext.S: Assembler messages:
> .../libphobos/libdruntime/config/mips/switchcontext.S:50: Error: opcode not 
> supported on this processor: mips1 (mips1) `sdc1 
> $f20,(0*8-((6*8+4+(-6*8+4&7($sp)'
> 
> etc., due to the use of the MIPS II LDC1 and SDC1 hardware instructions 
> for FP register load and store operations.  Instead use the L.D and S.D 
> generic assembly instructions, which are strict aliases for the LDC1 and 
> SDC1 instructions respectively and produce identical machine code where 
> the assembly for the MIPS II or a higher ISA has been requested, however 
> they become assembly macros and expand to compatible sequences of LWC1 
> and SWC1 hardware instructions where the assembly for the MIPS I ISA is 
> in effect.
> 
>   libphobos/
>   * libdruntime/config/mips/switchcontext.S [__mips_hard_float]: 
>   Use L.D and S.D generic assembly instructions rather than LDC1 
>   and SDC1 MIPS II hardware instructions.
> ---
> Hi,
> 
>  Noticed in a build of a MIPS I toolchain.  I have no way to run MIPS 
> regression-testing right now, however in `libopcodes' the L.D and S.D 
> instructions are strict aliases valid for the MIPS II and higher ISAs, and 
> just to double-check that I have built MIPS32r2 GCC with and without the 
> change applied and verified with `objdump' that the respective target 
> objects produced are identical.
> 
>  OK to apply to trunk, and -- as a fatal compilation error -- to backport 
> to active release branches?
> 

Fine with me, thanks.

Iain.



Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 11:04:01AM +, Wilco Dijkstra wrote:
> > Perhaps another way out of this would be document and enforce that
> > __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
> > calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2, and then
> > we would need to make sure that e.g. in simplify_count_trailing_zeroes
> > we emit a .CTZ or __builtin_ctz call depending on whether it is undefined
> > there or not; or give .C[TL]Z an additional argument (0/1) which would tell
> > if it is defined at zero or not.
> 
> I think a better way forward would be to make the builtin_clz/ctz more 
> defined.
> Having undefined values is a source of unnecessary bugs given practically all
> modern targets return the number of bits for the zero input - it is relatively
> easy to ensure this on the few targets that don't.

Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
(the register is unchanged).  And -1 is also quite commonly used value,
e.g. powerpc, gcn, xtensa.

Jakub



[PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches
Hi Jakub,

> Perhaps another way out of this would be document and enforce that
> __builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
> calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2, and then
> we would need to make sure that e.g. in simplify_count_trailing_zeroes
> we emit a .CTZ or __builtin_ctz call depending on whether it is undefined
> there or not; or give .C[TL]Z an additional argument (0/1) which would tell
> if it is defined at zero or not.

I think a better way forward would be to make the builtin_clz/ctz more defined.
Having undefined values is a source of unnecessary bugs given practically all
modern targets return the number of bits for the zero input - it is relatively
easy to ensure this on the few targets that don't.

Cheers,
Wilco

[PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Wilco Dijkstra via Gcc-patches
Hi,

> I am quoting my analysis from the PR.  Could an aarch64 expert 
> pontificate here?
>
> This test is checking the final assembly for a specific sequence.  I 
> don't speak aarch64 assembly, but the IL is different coming out of evrp.

The code currently generated is incorrect - you really need to preserve the AND.

The issue seems to be that evrp doesn't take the CLZ_DEFINED_VALUE_AT_ZERO
macros into account. When it is 2, the value of CLZ/CTZ is defined even in the
mid-end and you need to include it in the range (so if the CLZ/CTZ value is the
number of bits you should use ranges 0..32 and 0..64).

Cheers,
Wilco
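
Why the AND has to stay can be seen with a small model. The sketch below uses the well-known de Bruijn multiplication idiom for counting trailing zeros; that this is the kind of source the test exercises is an assumption here, and the function names are hypothetical. The table lookup returns 0 for a zero input, while a CTZ defined as 32 at zero returns 32, so the two agree for every input only if the result is masked with 31.

```c
#include <stdint.h>

/* Standard de Bruijn lookup table for the constant 0x077CB531.  */
static const unsigned char debruijn_table[32] = {
  0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
  31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
};

/* Table-based count of trailing zeros.  For x == 0 the index is 0,
   so the result is 0.  */
int
table_ctz32 (uint32_t x)
{
  return debruijn_table[((x & -x) * UINT32_C (0x077CB531)) >> 27];
}

/* Model of a CTZ whose value at zero is defined as the bit width.  */
int
ctz32_defined_at_zero (uint32_t x)
{
  return x ? __builtin_ctz (x) : 32;
}

/* The replacement that matches the table for all inputs: 32 & 31 is 0,
   and values 0..31 are unchanged by the mask.  */
int
masked_ctz32 (uint32_t x)
{
  return ctz32_defined_at_zero (x) & 31;
}
```

If a range analysis assumes the CTZ result is at most 31, the mask looks redundant and gets dropped, and the zero input then yields 32 instead of 0.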


Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 12:22:11PM +0200, Jakub Jelinek via Gcc-patches wrote:
> We have several enhancement reports in bugzilla from Gabriel Ravier on this
> topic I think.

See e.g. PR94801, PR94793, PR95863 on the topic.

Jakub



[PING][PATCH v2] combine: Don't turn (mult (extend x) 2^n) into extract [PR96998]

2020-10-08 Thread Alex Coplan via Gcc-patches
Ping. The kernel is still broken on AArch64.

On 30/09/2020 11:39, Alex Coplan via Gcc-patches wrote:
> Currently, make_extraction() identifies where we can emit an ASHIFT of
> an extend in place of an extraction, but fails to make the corresponding
> canonicalization/simplification when presented with a MULT by a power of
> two. Such a representation is canonical when representing a left-shifted
> address inside a MEM.
> 
> This patch remedies this situation: after the patch, make_extraction()
> now also identifies RTXs such as:
> 
> (mult:DI (subreg:DI (reg:SI r)) (const_int 2^n))
> 
> and rewrites this as:
> 
> (mult:DI (sign_extend:DI (reg:SI r)) (const_int 2^n))
> 
> instead of using a sign_extract.
> 
> (This patch also fixes up a comment in expand_compound_operation() which
> appears to have suffered from bitrot.)
> 
> This fixes PR96998: an ICE on AArch64 due to an unrecognised
> sign_extract insn which was exposed by
> r11-2903-g6b3034eaba83935d9f6dfb20d2efbdb34b5b00bf. That change
> introduced a canonicalisation in LRA to rewrite mult to shift in address
> reloads.
> 
> Prior to this patch, the flow was as follows. We start with the
> following insn going into combine:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (reg:DI 98 [ g ])
> (const_int 4 [0x4]))
> (reg/f:DI 96)) [3 *i_5+0 S4 A32])
> (asm_operands:SI ("") ("=Q") 0 []
>  []
>  [] test.c:11)) "test.c":11:5 -1
>  (expr_list:REG_DEAD (reg:DI 98 [ g ])
> (nil)))
> 
> Then combine turns this into a sign_extract:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (sign_extract:DI (mult:DI (subreg:DI 
> (reg/v:SI 92 [ g ]) 0)
> (const_int 4 [0x4]))
> (const_int 34 [0x22])
> (const_int 0 [0]))
> (reg/f:DI 96)) [3 *i_5+0 S4 A32])
> (asm_operands:SI ("") ("=Q") 0 []
>  []
>  [] test.c:11)) "test.c":11:5 -1
>  (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
> (nil)))
> 
> Then LRA reloads the address and (prior to the LRA change) we get:
> 
> (insn 32 8 9 3 (set (reg:DI 0 x0 [100])
> (plus:DI (sign_extract:DI (mult:DI (reg:DI 0 x0 [orig:92 g ] [92])
> (const_int 4 [0x4]))
> (const_int 34 [0x22])
> (const_int 0 [0]))
> (reg/f:DI 19 x19 [96]))) "test.c":11:5 283 {*add_extvdi_multp2}
>  (nil))
> (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
> (asm_operands:SI ("") ("=Q") 0 []
>  []
>  [] test.c:11)) "test.c":11:5 -1
>  (nil))
> 
> Now observe that insn 32 here is not canonical: firstly, we should be
> using an ASHIFT by 2 instead of a MULT by 4, since we're outside of a
> MEM. Indeed, the LRA change remedies this, and support for such insns in
> the AArch64 backend was dropped in
> r11-3033-g2f8ae301f6a125f50b0a758047fcddae7b68daa8.
> 
> Now the reason we ICE after the LRA change here is that AArch64 has
> never supported the ASHIFT variant of this sign_extract insn. Inspecting
> the unrecognised reloaded insn confirms this:
> 
> (gdb) p debug(insn)
> (insn 33 8 34 3 (set (reg:DI 100)
> (sign_extract:DI (ashift:DI (subreg:DI (reg/v:SI 92 [ g ]) 0)
> (const_int 2 [0x2]))
> (const_int 34 [0x22])
> (const_int 0 [0]))) "test.c":11:5 -1
>  (nil))
> 
> The thesis of this patch is that combine should _never_ be producing
> such an insn. Clearly this should be canonicalised as an extend
> operation instead (as combine already does in make_extraction() for the
> ASHIFT form). After this change to combine, we get:
> 
> (insn 9 8 10 3 (set (mem:SI (plus:DI (mult:DI (sign_extend:DI (reg/v:SI 92 [ 
> g ]))
> (const_int 4 [0x4]))
> (reg/f:DI 96)) [3 *i_5+0 S4 A32])
> (asm_operands:SI ("") ("=Q") 0 []
>  []
>  [] test.c:11)) "test.c":11:5 -1
>  (expr_list:REG_DEAD (reg/v:SI 92 [ g ])
> (nil)))
> 
> coming out of combine, and LRA can happily reload the address:
> 
> (insn 32 8 9 3 (set (reg:DI 0 x0 [100])
> (plus:DI (ashift:DI (sign_extend:DI (reg/v:SI 0 x0 [orig:92 g ] [92]))
> (const_int 2 [0x2]))
> (reg/f:DI 19 x19 [96]))) "test.c":11:5 245 {*add_extendsi_shft_di}
>  (nil))
> (insn 9 32 10 3 (set (mem:SI (reg:DI 0 x0 [100]) [3 *i_5+0 S4 A32])
> (asm_operands:SI ("") ("=Q") 0 []
>  []
>  [] test.c:11)) "test.c":11:5 -1
>  (nil))
> 
> and all is well, with nice simple and canonical RTL being used
> throughout.
> 
> Testing:
>  * Bootstrap and regtest on aarch64-linux-gnu, arm-linux-gnueabihf, and
>x86-linux-gnu in progress.
> 
> OK for trunk (with AArch64 changes discussed here [0] as a follow-on
> patch) provided it passes testing?
> 
> Thanks,
> Alex
> 
> [0] : https://gcc.gnu.org/pipermail/gcc-patches/2020-September/554257.html
> 
> ---
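
The equivalence the patch relies on can be checked with plain integer arithmetic. This is a minimal sketch, not combine's code: the helper names are hypothetical, and the narrowing conversions to signed types rely on the wrap-around behaviour that GCC documents for them. It models taking the low 34 bits of (reg * 4) as a signed field, as the sign_extract form does, and compares that with sign-extending the low 32 bits first and then scaling.

```c
#include <stdint.h>

/* Sign-extract the low WIDTH bits of V (0 < WIDTH < 64).  */
int64_t
sign_extract_low (uint64_t v, unsigned int width)
{
  uint64_t mask = (UINT64_C (1) << width) - 1;
  uint64_t sign = UINT64_C (1) << (width - 1);
  uint64_t field = v & mask;
  /* Conversion to int64_t wraps modulo 2^64 under GCC.  */
  return (int64_t) ((field ^ sign) - sign);
}

/* Model of (sign_extract (mult (subreg:DI reg) 4) 34 0): the upper 32
   bits of REG64 are whatever the wider register happens to hold.  */
int64_t
scaled_via_extract (uint64_t reg64)
{
  return sign_extract_low (reg64 * 4, 34);
}

/* Model of (mult (sign_extend:DI (reg:SI)) 4).  */
int64_t
scaled_via_extend (uint64_t reg64)
{
  int32_t low = (int32_t) (uint32_t) reg64;
  return (int64_t) low * 4;
}
```

Both forms depend only on the low 32 bits of the register, which is why the extend form is a valid, simpler canonicalization.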

Re: [PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Jakub Jelinek via Gcc-patches
On Thu, Oct 08, 2020 at 11:58:21AM +0200, Aldy Hernandez via Gcc-patches wrote:
> I am quoting my analysis from the PR.  Could an aarch64 expert pontificate
> here?
> 
> This test is checking the final assembly for a specific sequence.  I don't
> speak aarch64 assembly, but the IL is different coming out of evrp.
> 
> The first culprit is this difference in the mergephi1 dump:
> 
>_9 = .CTZ (x_6(D));
> -  _10 = _9 & 31;
> +  _10 = _9;
> 
> These are unsigned ints, so assuming they are 32 bits on aarch64,
> __builtin_ctz is always less than 32.  This is because a CTZ of 0 is
> undefined according to the GCC manual:
> 
> [[
> Built-in Function: int __builtin_ctz (unsigned int x)
> 
> Returns the number of trailing 0-bits in x, starting at the least
> significant bit position. If x is 0, the result is undefined.

In user __builtin_ctz yes, but it isn't that easy.

Aarch64 defines CTZ_DEFINED_VALUE_AT_ZERO to 2 and sets value to bitsize
of the mode (i.e. 32 in this case).

And documentation says:
 -- Macro: CLZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
 -- Macro: CTZ_DEFINED_VALUE_AT_ZERO (MODE, VALUE)
 A C expression that indicates whether the architecture defines a
 value for 'clz' or 'ctz' with a zero operand.  A result of '0'
 indicates the value is undefined.  If the value is defined for only
 the RTL expression, the macro should evaluate to '1'; if the value
 applies also to the corresponding optab entry (which is normally
 the case if it expands directly into the corresponding RTL), then
 the macro should evaluate to '2'.  In the cases where the value is
 defined, VALUE should be set to this value.

At least for *_DEFINED_VALUE_AT_ZERO (*) == 2, I'm afraid VRP can't
assume it is undefined at zero, but needs to use a range that covers
both the values for non-zero operand and the VALUE from
CTZ_DEFINED_VALUE_AT_ZERO, so [0, 32] range in this case, because
earlier GIMPLE passes could have already optimized away e.g. a != 0 check.
See simplify_count_trailing_zeroes in forwprop.

Perhaps another way out of this would be to document and enforce that
__builtin_c[lt]z{,l,ll} etc calls are undefined at zero, but C[TL]Z ifn
calls are defined there based on *_DEFINED_VALUE_AT_ZERO (*) == 2, and then
we would need to make sure that e.g. in simplify_count_trailing_zeroes
we emit a .CTZ or __builtin_ctz call depending on whether it is undefined
there or not; or give .C[TL]Z an additional argument (0/1) which would tell
if it is defined at zero or not.

We have several enhancement reports in bugzilla from Gabriel Ravier on this
topic I think.

Jakub



Re: [PATCH] arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code

2020-10-08 Thread Richard Earnshaw via Gcc-patches
On 08/10/2020 10:07, Christophe Lyon via Gcc-patches wrote:
> On Tue, 6 Oct 2020 at 18:02, Richard Earnshaw
>  wrote:
>>
>> On 29/09/2020 20:50, Christophe Lyon via Gcc-patches wrote:
>>> When mi_delta is > 255 and -mpure-code is used, we cannot load delta
>>> from code memory (like we do without -mpure-code).
>>>
>>> This patch builds the value of mi_delta into r3 with a series of
>>> movs/adds/lsls.
>>>
>>> We also do some cleanup by not emitting the function address and delta
>>> via .word directives at the end of the thunk since we don't use them
>>> with -mpure-code.
>>>
>>> No need for new testcases, this bug was already identified by
>>> eg. pr46287-3.C
>>>
>>> 2020-09-29  Christophe Lyon  
>>>
>>>   gcc/
>>>   * config/arm/arm.c (arm_thumb1_mi_thunk): Build mi_delta in r3 and
>>>   do not emit function address and delta when -mpure-code is used.
>>
> Hi Richard,
> 
> Thanks for your comments.
> 
>> There are some optimizations you can make to this code.
>>
>> Firstly, for values between 256 and 510 (inclusive), it would be better
>> to just expand a mov of 255 followed by an add.
> I now see the splitter for the "Pe" constraint which I hadn't noticed
> before, so I can write something similar indeed.
> 
> However, I'm not quite sure I understand the benefit in the split
> when -mpure-code is NOT used.
> Consider:
> int f3_1 (void) { return 510; }
> int f3_2 (void) { return 511; }
> Compile with -O2 -mcpu=cortex-m0:
> f3_1:
> movs r0, #255
> lsls r0, r0, #1
> bx  lr
> f3_2:
> ldr r0, .L4
> bx  lr
> 
> The splitter makes the code bigger, does it "compensate" for this by
> not having to load the constant?
> Actually the constant uses 4 more bytes, which should be taken into
> account when comparing code size,

Yes, the size of the literal pool entry needs to be taken into account.
 It might happen that the entry could be shared with another use of that
literal, but in general that's rare.

> so f3_1 uses 6 bytes, and f3_2 uses 8, so as you say below three
> thumb1 instructions would be equivalent in size compared to loading
> from the literal pool. Should the 256-510 range be extended?

It's a bit borderline at three instructions when literal pools are not
expensive to use, but in thumb1 literal pools tend to be quite small due
to the limited pc offsets we can use.  I think on balance we probably
want to use the instruction sequence unless optimizing for size.
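
As a rough sketch of the size arithmetic (not GCC code; the helper names are
made up for illustration), assuming 2-byte thumb1 instructions, a 4-byte
literal pool entry that is not shared, and ignoring the common bx lr:

```c
#include <assert.h>

/* Sizes in bytes, as assumed in the discussion above.  */
enum { THUMB1_INSN_BYTES = 2, LITERAL_POOL_ENTRY_BYTES = 4 };

/* Bytes needed to build a constant with N_INSNS thumb1 instructions.  */
static int
inline_sequence_bytes (int n_insns)
{
  assert (n_insns > 0);
  return n_insns * THUMB1_INSN_BYTES;
}

/* Bytes needed for an ldr plus its unshared literal pool entry.  */
static int
literal_pool_bytes (void)
{
  return THUMB1_INSN_BYTES + LITERAL_POOL_ENTRY_BYTES;
}

/* Nonzero if the inline sequence is no larger than the literal load.  */
static int
inline_no_larger_p (int n_insns)
{
  return inline_sequence_bytes (n_insns) <= literal_pool_bytes ();
}
```

With these numbers three instructions break even with the literal load and
a fourth makes the inline sequence larger.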

> 
> 
>> This is also true for
>> the literal pools alternative as well, so should be handled before all
>> this.
> I am not sure what you mean: with -mpure-code, the above sample is compiled as:
> f3_1:
> movs r0, #255
> lsls r0, r0, #1
> bx  lr
> f3_2:
> movs r0, #1
> lsls r0, r0, #8
> adds r0, r0, #255
> bx  lr
> 
> so the "return 510" case is already handled as without -mpure-code.

I was thinking specifically of the thunk sequence where you seem to be
emitting instructions directly rather than generating RTL.  The examples
you show here are not thunks.


> 
>>  I also suspect (but haven't checked) that the base adjustment will
>> most commonly be a multiple of the machine word size (ie 4).  If that is
>> the case then you could generate n/4 and then shift it left by 2 for an
>> even greater range of literals.
> I can see there is provision for this in the !TARGET_THUMB1_ONLY case,
> I'll update my patch.
> 
>>  More generally, any sequence of up to
>> three thumb1 instructions will be no larger, and probably as fast as the
>> existing literal pool fall back.
>>
>> Secondly, if the value is, for example, 65536 (0x10000), your code will
>> emit a mov followed by two shift-by-8 instructions; the two shifts could
>> be merged into a single shift-by-16.
> 
> Right, I'll try to make use of thumb_shiftable_const.
> 
>> Finally, I'd really like to see some executable tests for this, if at
>> all possible.
> I mentioned pr46287-3.C, but that's not the only existing testcase
> that showed the problem. There are also:
> g++.dg/opt/thunk1.C
> g++.dg/ipa/pr46984.C
> g++.dg/torture/pr46287.C
> g++.dg/torture/pr45699.C
> 
> Do you want me to copy one of these into the arm subdir and add
> -mpure-code in dg-options?

On reflection, probably not - that just makes things more complicated
with all the dg-options mess (I'm worried about interactions with other
sets of options on the command line and the fall-out from that).  If
someone cares about pure-code they should be doing full testsuite runs
with it enabled and that should be sufficient.

R.

> 
> Thanks,
> 
> Christophe
> 
>> R.
>>
>>>
>>> k#   (use "git pull" to merge the remote branch into yours)
>>> ---
>>>  gcc/config/arm/arm.c | 91 
>>> +---
>>>  1 file changed, 66 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index ceeb91f..62abeb5 100644
>>> --- 

[PATCH] PR target/97312: Tweak gcc.target/aarch64/pr90838.c

2020-10-08 Thread Aldy Hernandez via Gcc-patches
I am quoting my analysis from the PR.  Could an aarch64 expert 
pontificate here?


This test is checking the final assembly for a specific sequence.  I 
don't speak aarch64 assembly, but the IL is different coming out of evrp.


The first culprit is this difference in the mergephi1 dump:

   _9 = .CTZ (x_6(D));
-  _10 = _9 & 31;
+  _10 = _9;

These are unsigned ints, so assuming they are 32 bits on aarch64, 
__builtin_ctz is always less than 32.  This is because a CTZ of 0 is 
undefined according to the GCC manual:


[[
Built-in Function: int __builtin_ctz (unsigned int x)

Returns the number of trailing 0-bits in x, starting at the least 
significant bit position. If x is 0, the result is undefined.

]]

So a bitwise AND of anything less than 32 with 0x1f (31) is a no-op.

Here are the full IL differences:

--- legacy-evrp/pr90838.c.038t.mergephi1 2020-10-07 08:44:12.152358885 -0400
+++ ranger/pr90838.c.038t.mergephi1 2020-10-07 08:39:12.339296502 -0400
@@ -1,41 +1,41 @@

 ;; Function ctz1 (ctz1, funcdef_no=0, decl_uid=3587, cgraph_uid=1, symbol_order=0)


 ctz1 (unsigned int x)
 {
   static const char table[32] = "\x00\x01\x1c\x02\x1d\x0e\x18\x03\x1e\x16\x14\x0f\x19\x11\x04\b\x1f\x1b\r\x17\x15\x13\x10\x07\x1a\f\x12\x06\v\x05\n\t";

   unsigned int _1;
   unsigned int _2;
   unsigned int _3;
   unsigned int _4;
   char _5;
   int _9;
   int _10;

:
   _1 = -x_6(D);
   _2 = _1 & x_6(D);
   _3 = _2 * 125613361;
   _4 = _3 >> 27;
   _9 = .CTZ (x_6(D));
-  _10 = _9 & 31;
+  _10 = _9;
   _5 = (char) _10;
   return _10;

 }



 ;; Function ctz2 (ctz2, funcdef_no=1, decl_uid=3591, cgraph_uid=2, symbol_order=1)


 ctz2 (unsigned int x)
 {
   static short int table[64] = {32, 0, 1, 12, 2, 6, 0, 13, 3, 0, 7, 0, 0, 0, 0, 14, 10, 4, 0, 0, 8, 0, 0, 25, 0, 0, 0, 0, 0, 21, 27, 15, 31, 11, 5, 0, 0, 0, 0, 0, 9, 0, 0, 24, 0, 0, 20, 26, 30, 0, 0, 0, 0, 23, 0, 19, 29, 0, 22, 18, 28, 17, 16, 0};
   unsigned int _1;
   unsigned int _2;
   unsigned int _3;
   short int _4;
   int _8;

:
   _1 = -x_5(D);
@@ -87,27 +87,27 @@


 ;; Function ctz4 (ctz4, funcdef_no=3, decl_uid=3601, cgraph_uid=4, symbol_order=5)


 ctz4 (long unsigned int x)
 {
   long unsigned int lsb;
   long unsigned int _1;
   long long unsigned int _2;
   long long unsigned int _3;
   char _4;
   int _9;
   int _10;

:
   _1 = -x_5(D);
   lsb_6 = _1 & x_5(D);
   _2 = lsb_6 * 283881067100198605;
   _3 = _2 >> 58;
   _9 = .CTZ (x_5(D));
-  _10 = _9 & 63;
+  _10 = _9;
   _4 = (char) _10;
   return _10;

 }

The difference in assembly matches.  We have two fewer ANDs in the final
output:


$ diff -u legacy.s ranger.s
--- legacy.s 2020-10-07 09:06:13.420446783 -0400
+++ ranger.s 2020-10-07 09:06:42.646646949 -0400
@@ -8,7 +8,6 @@
 ctz1:
        rbit    w0, w0
        clz     w0, w0
-       and     w0, w0, 31
        ret
        .size   ctz1, .-ctz1
        .align  2
@@ -36,7 +35,6 @@
 ctz4:
        rbit    x0, x0
        clz     x0, x0
-       and     w0, w0, 63
        ret
        .size   ctz4, .-ctz4

If my analysis is correct, we could just remove the line checking for 
"and", or perhaps check that we don't have any and's.


OK for trunk?
Aldy

gcc/testsuite/ChangeLog:

PR target/97312
* gcc.target/aarch64/pr90838.c: Remove scan for AND.

diff --git a/gcc/testsuite/gcc.target/aarch64/pr90838.c b/gcc/testsuite/gcc.target/aarch64/pr90838.c
index e1e19ac6a61..76cd5e18d2e 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr90838.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr90838.c
@@ -60,5 +60,4 @@ int ctz4 (unsigned long x)
 }

 /* { dg-final { scan-assembler-times "clz\t" 4 } } */
-/* { dg-final { scan-assembler-times "and\t" 2 } } */
 /* { dg-final { scan-assembler-not "cmp\t.*0" } } */



[PUSHED] Fix PR97325

2020-10-08 Thread Aldy Hernandez via Gcc-patches
This was a pasto by yours truly while porting the __builtin code from 
vr_values::extract_range_basic.


I've tested Andrew's patch, and pushed it.

Aldy

gcc/ChangeLog:

PR tree-optimization/97325
* gimple-range.cc (gimple_ranger::range_of_builtin_call): Handle
negative numbers in __builtin_ffs and __builtin_popcount.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 75c03d6610b..2461bb78394 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -620,6 +620,8 @@ gimple_ranger::range_of_builtin_call (irange &r, gcall *call)
   // If some high bits are known to be zero, decrease the maximum.
   if (!r.undefined_p ())
{
+ if (TYPE_SIGN (r.type ()) == SIGNED)
+   range_cast (r, unsigned_type_for (r.type ()));
  wide_int max = r.upper_bound ();
  maxi = wi::floor_log2 (max) + 1;
}
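
For readers following along, a minimal sketch of why the sign matters here.
The hunk derives the maximum from floor_log2 of the range's upper bound; for
a signed range containing negative numbers that bound says nothing about
the high bits.  The helpers below are hypothetical, not part of
gimple-range.cc, and assume a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdint.h>

/* floor (log2 (x)) for x > 0.  */
static int
floor_log2_u32 (uint32_t x)
{
  assert (x != 0);
  return 31 - __builtin_clz (x);
}

/* Upper bound for popcount of any value in [0, ub]: no value can have
   more set bits than the largest one has significant bits.  */
static int
popcount_max_unsigned (uint32_t ub)
{
  return ub ? floor_log2_u32 (ub) + 1 : 0;
}

/* Bound for a signed range [lb, ub].  Negative values have the top bit
   set when viewed as unsigned, so the bound is the full precision.  */
static int
popcount_max_signed (int32_t lb, int32_t ub)
{
  assert (lb <= ub);
  if (lb < 0)
    return 32;
  return popcount_max_unsigned ((uint32_t) ub);
}
```

For the range [-1, 5] the signed upper bound alone would suggest at most 3
set bits, while popcount of -1 is 32.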



[PATCH] IBM Z: Change vector copysign to use bitwise operations

2020-10-08 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on s390x-redhat-linux.  OK for master?

The vector copysign pattern incorrectly assumes that vector
if_then_else operates on bits, not on elements.  This can theoretically
mislead the optimizers.  Fix by changing it to use bitwise operations,
like commit 2930bb321794 ("PR94613: Fix vec_sel builtin for IBM Z") did
for vec_sel builtin.
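
A scalar sketch of the bitwise form, for illustration only: it takes the
magnitude bits from one operand and the sign bit from the other, which is
what the vector pattern does per element.  It assumes IEEE 754 binary32 for
float; copysign_bitwise is a made-up name, not something from the patch.

```c
#include <stdint.h>
#include <string.h>

/* copysign via bitwise operations: magnitude of X, sign of Y.  */
static float
copysign_bitwise (float x, float y)
{
  const uint32_t sign_mask = UINT32_C (0x80000000);
  uint32_t xb, yb, rb;
  float r;

  /* Bitcast the floats to integers without aliasing problems.  */
  memcpy (&xb, &x, sizeof xb);
  memcpy (&yb, &y, sizeof yb);
  rb = (xb & ~sign_mask) | (yb & sign_mask);
  memcpy (&r, &rb, sizeof r);
  return r;
}
```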

gcc/ChangeLog:

2020-10-07  Ilya Leoshkevich  

* config/s390/s390-protos.h (s390_build_signbit_mask): New
function.
* config/s390/s390.c (s390_tointvec): New function.
(s390_contiguous_bitmask_vector_p): Bitcast the argument to
an integral mode.
(s390_expand_vec_init): Do not call
s390_contiguous_bitmask_vector_p with a scalar argument.
(s390_build_signbit_mask): New function.
* config/s390/vector.md (copysign<mode>3): Use bitwise
operations.
---
 gcc/config/s390/s390-protos.h |  1 +
 gcc/config/s390/s390.c| 92 ---
 gcc/config/s390/vector.md | 31 
 3 files changed, 95 insertions(+), 29 deletions(-)

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index 6f1bc07db17..029f7289fac 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -121,6 +121,7 @@ extern void s390_expand_vec_compare_cc (rtx, enum rtx_code, rtx, rtx, bool);
 extern enum rtx_code s390_reverse_condition (machine_mode, enum rtx_code);
 extern void s390_expand_vcond (rtx, rtx, rtx, enum rtx_code, rtx, rtx);
 extern void s390_expand_vec_init (rtx, rtx);
+extern rtx s390_build_signbit_mask (machine_mode);
 extern rtx s390_return_addr_rtx (int, rtx);
 extern rtx s390_back_chain_rtx (void);
 extern rtx_insn *s390_emit_call (rtx, rtx, rtx, rtx);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 93894307d62..554c1adf40a 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -2450,6 +2450,54 @@ s390_contiguous_bitmask_p (unsigned HOST_WIDE_INT in, bool wrap_p,
   return b;
 }
 
+/* Return the associated integral mode of VEC_MODE.  Must be in sync with
+   tointvec mode_attr.  */
+static machine_mode
+s390_tointvec (machine_mode vec_mode)
+{
+  switch (vec_mode)
+{
+case V1QImode:
+  return V1QImode;
+case V2QImode:
+  return V2QImode;
+case V4QImode:
+  return V4QImode;
+case V8QImode:
+  return V8QImode;
+case V16QImode:
+  return V16QImode;
+case V1HImode:
+  return V1HImode;
+case V2HImode:
+  return V2HImode;
+case V4HImode:
+  return V4HImode;
+case V8HImode:
+  return V8HImode;
+case V1SImode:
+case V1SFmode:
+  return V1SImode;
+case V2SImode:
+case V2SFmode:
+  return V2SImode;
+case V4SImode:
+case V4SFmode:
+  return V4SImode;
+case V1DImode:
+case V1DFmode:
+  return V1DImode;
+case V2DImode:
+case V2DFmode:
+  return V2DImode;
+case V1TImode:
+case V1TFmode:
+  return V1TImode;
+default:
+  gcc_unreachable ();
+}
+}
+
 /* Return true if OP contains the same contiguous bitfield in *all*
its elements.  START and END can be used to obtain the start and
end position of the bitfield.
@@ -2467,6 +2515,9 @@ s390_contiguous_bitmask_vector_p (rtx op, int *start, int *end)
   rtx elt;
   bool b;
 
+  /* Handle floats by bitcasting them to ints.  */
+  op = gen_lowpart (s390_tointvec (GET_MODE (op)), op);
+
   gcc_assert (!!start == !!end);
   if (!const_vec_duplicate_p (op, &elt)
   || !CONST_INT_P (elt))
@@ -6863,15 +6914,16 @@ s390_expand_vec_init (rtx target, rtx vals)
 }
 
   /* Use vector gen mask or vector gen byte mask if possible.  */
-  if (all_same && all_const_int
-  && (XVECEXP (vals, 0, 0) == const0_rtx
- || s390_contiguous_bitmask_vector_p (XVECEXP (vals, 0, 0),
-  NULL, NULL)
- || s390_bytemask_vector_p (XVECEXP (vals, 0, 0), NULL)))
+  if (all_same && all_const_int)
 {
-  emit_insn (gen_rtx_SET (target,
-                         gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0))));
-  return;
+  rtx vec = gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0));
+  if (XVECEXP (vals, 0, 0) == const0_rtx
+ || s390_contiguous_bitmask_vector_p (vec, NULL, NULL)
+ || s390_bytemask_vector_p (vec, NULL))
+   {
+ emit_insn (gen_rtx_SET (target, vec));
+ return;
+   }
 }
 
   /* Use vector replicate instructions.  vlrep/vrepi/vrep  */
@@ -6949,6 +7001,30 @@ s390_expand_vec_init (rtx target, rtx vals)
 }
 }
 
+/* Emit a vector constant that contains 1s in each element's sign bit position
+   and 0s in other positions.  MODE is the desired constant's mode.  */
+extern rtx
+s390_build_signbit_mask (machine_mode mode)
+{
+  /* Generate the integral element mask value.  */
+  machine_mode inner_mode = GET_MODE_INNER (mode);
+  int inner_bitsize = GET_MODE_BITSIZE (inner_mode);
+  wide_int 

[PUSHED] Fix PR97315 (part 2 of 2)

2020-10-08 Thread Aldy Hernandez via Gcc-patches

This fixes the second testcase in the PR.

I've tested Andrew's patch and pushed it.

Aldy

gcc/ChangeLog:

PR tree-optimization/97315
* range-op.cc (value_range_with_overflow): Change any
non-overflow calculation in which both bounds are
overflow/underflow to be undefined.

gcc/testsuite/ChangeLog:

* gcc.dg/pr97315-2.c: New test.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 87c6d82f2ac..22bc23c1bbf 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -287,6 +287,15 @@ value_range_with_overflow (irange &r, tree type,
 }
   else
 {
+  // If both bounds either underflowed or overflowed, then the result
+  // is undefined.
+  if ((min_ovf == wi::OVF_OVERFLOW && max_ovf == wi::OVF_OVERFLOW)
+ || (min_ovf == wi::OVF_UNDERFLOW && max_ovf == wi::OVF_UNDERFLOW))
+   {
+ r.set_undefined ();
+ return;
+   }
+
   // If overflow does not wrap, saturate to [MIN, MAX].
   wide_int new_lb, new_ub;
   if (min_ovf == wi::OVF_UNDERFLOW)
diff --git a/gcc/testsuite/gcc.dg/pr97315-2.c b/gcc/testsuite/gcc.dg/pr97315-2.c
new file mode 100644
index 000..5dd1b6a3fc7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97315-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void c(int);
+
+int a;
+void b()
+{
+  if (a >= 2147483647)
+c(a + 1);
+}
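
The condition the patch adds can be sketched outside GCC as follows.  This
is only an illustration for plain int using GCC's __builtin_add_overflow;
the enum and helper names are invented here and are not the wi:: API.

```c
#include <limits.h>

enum ovf_kind { OVF_NONE, OVF_UNDERFLOW, OVF_OVERFLOW };

/* Classify LHS + RHS for int.  */
static enum ovf_kind
add_ovf (int lhs, int rhs, int *res)
{
  if (!__builtin_add_overflow (lhs, rhs, res))
    return OVF_NONE;
  /* Signed addition can only overflow upward when RHS is positive.  */
  return rhs > 0 ? OVF_OVERFLOW : OVF_UNDERFLOW;
}

/* Return 1 if [lb, ub] + addend has no defined value at all, i.e. both
   bounds overflow in the same direction.  */
static int
range_plus_undefined_p (int lb, int ub, int addend)
{
  int tmp;
  enum ovf_kind min_ovf = add_ovf (lb, addend, &tmp);
  enum ovf_kind max_ovf = add_ovf (ub, addend, &tmp);

  return (min_ovf == OVF_OVERFLOW && max_ovf == OVF_OVERFLOW)
         || (min_ovf == OVF_UNDERFLOW && max_ovf == OVF_UNDERFLOW);
}
```

In the new test, a >= 2147483647 leaves a with the single value INT_MAX, so
both bounds of a + 1 overflow and there is nothing to saturate to.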



Re: [PATCH] arm: Fix ICE on glibc compilation after my DIVMOD optimization [PR97322]

2020-10-08 Thread Ramana Radhakrishnan via Gcc-patches
On Thu, Oct 8, 2020 at 10:22 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> The arm target hook for divmod wasn't prepared to handle constants passed to
> the function.
>
> Fixed thusly, bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk?
>
> 2020-10-08  Jakub Jelinek  
>
> PR target/97322
> * config/arm/arm.c (arm_expand_divmod_libfunc): Pass mode instead of
> GET_MODE (op0) or GET_MODE (op1) to emit_library_call_value.
>
> * gcc.dg/pr97322.c: New test.

Ok.

Ramana

>
> --- gcc/config/arm/arm.c.jj 2020-10-07 10:47:46.892985596 +0200
> +++ gcc/config/arm/arm.c 2020-10-07 20:19:25.524367665 +0200
> @@ -33275,9 +33275,7 @@ arm_expand_divmod_libfunc (rtx libfunc,
>  = smallest_int_mode_for_size (2 * GET_MODE_BITSIZE (mode));
>
>rtx libval = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
> -   libval_mode,
> -   op0, GET_MODE (op0),
> -   op1, GET_MODE (op1));
> +   libval_mode, op0, mode, op1, mode);
>
>rtx quotient = simplify_gen_subreg (mode, libval, libval_mode, 0);
>rtx remainder = simplify_gen_subreg (mode, libval, libval_mode,
> --- gcc/testsuite/gcc.dg/pr97322.c.jj   2020-10-07 20:19:54.071961807 +0200
> +++ gcc/testsuite/gcc.dg/pr97322.c  2020-10-07 20:19:16.897490309 +0200
> @@ -0,0 +1,17 @@
> +/* PR target/97322 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +void
> +foo (unsigned long long x, unsigned long long *y)
> +{
> +  y[0] = x / 10;
> +  y[1] = x % 10;
> +}
> +
> +void
> +bar (unsigned int x, unsigned int *y)
> +{
> +  y[0] = x / 10;
> +  y[1] = x % 10;
> +}
>
> Jakub
>


[PUSHED] Fix PR97315 (part 1 of 2)

2020-10-08 Thread Aldy Hernandez via Gcc-patches
This fixes the first testcase in the PR by removing the trap and instead 
annotating the listing.


I've tested Andrew's patch and pushed it.

Aldy

Fix PR97315 (part 1 of 2)

gcc/ChangeLog:

PR tree-optimization/97315
* gimple-ssa-evrp.c (hybrid_folder::choose_value): Removes the
trap and instead annotates the listing.

gcc/testsuite/ChangeLog:

* gcc.dg/pr97315-1.c: New test.

diff --git a/gcc/gimple-ssa-evrp.c b/gcc/gimple-ssa-evrp.c
index 363e2ab6816..7688e4aa4bd 100644
--- a/gcc/gimple-ssa-evrp.c
+++ b/gcc/gimple-ssa-evrp.c
@@ -258,37 +258,43 @@ hybrid_folder::value_of_stmt (gimple *stmt, tree op)
 tree
 hybrid_folder::choose_value (tree evrp_val, tree ranger_val)
 {
-  if (!ranger_val)
-{
-  // If neither returned a value, return NULL_TREE.
-  if (!evrp_val)
-   return NULL_TREE;
+  // If both found the same value, just return it.
+  if (evrp_val && ranger_val && !compare_values (evrp_val, ranger_val))
+return evrp_val;
+
+  // If neither returned a value, return NULL_TREE.
+  if (!ranger_val && !evrp_val)
+return NULL_TREE;

-  // Otherwise EVRP found something.
-  if (dump_file)
+  // Otherwise there is a discrepancy to flag.
+  if (dump_file)
+{
+  if (evrp_val && ranger_val)
+   fprintf (dump_file, "EVRP:hybrid: Disagreement\n");
+  if (evrp_val)
{
  fprintf (dump_file, "EVRP:hybrid: EVRP found singleton ");
  print_generic_expr (dump_file, evrp_val);
  fprintf (dump_file, "\n");
}
-  return evrp_val;
+  if (ranger_val)
+   {
+ fprintf (dump_file, "EVRP:hybrid: RVRP found singleton ");
+ print_generic_expr (dump_file, ranger_val);
+ fprintf (dump_file, "\n");
+   }
 }

-  // Otherwise ranger found a value, if they match we're good.
-  if (evrp_val && !compare_values (evrp_val, ranger_val))
+  // If one value was found, return it.
+  if (!evrp_val)
+return ranger_val;
+  if (!ranger_val)
 return evrp_val;

-  // We should never get different singletons.
-  gcc_checking_assert (!evrp_val);
-
-  // Now ranger has found a value, but EVRP did not.
-  if (dump_file)
-{
-  fprintf (dump_file, "EVRP:hybrid: RVRP found singleton ");
-  print_generic_expr (dump_file, ranger_val);
-  fprintf (dump_file, "\n");
-}
-  return ranger_val;
+  // If values are different, return the first calculated value.
+  if ((param_evrp_mode & EVRP_MODE_RVRP_FIRST) == EVRP_MODE_RVRP_FIRST)
+return ranger_val;
+  return evrp_val;
 }

 /* Main entry point for the early vrp pass which is a simplified non-iterative
diff --git a/gcc/testsuite/gcc.dg/pr97315-1.c b/gcc/testsuite/gcc.dg/pr97315-1.c
new file mode 100644
index 000..250e0e9ecbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97315-1.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+typedef struct tree_node *tree;
+enum tree_code { RECORD_TYPE, QUAL_UNION_TYPE };
+enum tree_code_class {};
+struct tree_base {
+  tree_code code : 16;
+};
+struct tree_node {
+  tree_base base;
+};
+extern tree_code_class tree_code_type[];
+void tree_check_failed() __attribute__((__noreturn__));
+tree tree_check3(tree __t, tree_code __c1, tree_code __c3) {
+  if (__t->base.code != __c1 && __t->base.code != __c3)
+tree_check_failed();
+  return __t;
+}
+tree add_type_duplicate_type;
+void add_type_duplicate() {
+  if (tree_code_type[add_type_duplicate_type->base.code])
+if (add_type_duplicate_type->base.code == RECORD_TYPE)
+  for (;
+   tree_check3(add_type_duplicate_type, RECORD_TYPE, QUAL_UNION_TYPE);)
+tree_check3(add_type_duplicate_type, RECORD_TYPE, QUAL_UNION_TYPE);
+}
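
The selection logic in the new choose_value can be sketched in isolation as
below.  This is a simplified model, not the GCC code: struct maybe_val
stands in for a tree that may be NULL_TREE, ranger_first stands in for the
EVRP_MODE_RVRP_FIRST check, and the dump strings are shortened.

```c
#include <stdio.h>

/* Hypothetical stand-in for a possibly-missing singleton value.  */
struct maybe_val
{
  int found;   /* Nonzero if the engine produced a value.  */
  long value;  /* Only meaningful when FOUND is nonzero.  */
};

/* Pick between the results of two engines.  DUMP may be NULL.  */
static struct maybe_val
choose_value (struct maybe_val evrp, struct maybe_val ranger,
              int ranger_first, FILE *dump)
{
  /* Same value from both: just return it.  */
  if (evrp.found && ranger.found && evrp.value == ranger.value)
    return evrp;

  /* Neither found anything.  */
  if (!evrp.found && !ranger.found)
    return evrp;

  /* Otherwise note the discrepancy instead of asserting.  */
  if (dump)
    {
      if (evrp.found && ranger.found)
        fprintf (dump, "hybrid: Disagreement\n");
      if (evrp.found)
        fprintf (dump, "hybrid: EVRP found %ld\n", evrp.value);
      if (ranger.found)
        fprintf (dump, "hybrid: RVRP found %ld\n", ranger.value);
    }

  /* If only one value was found, return it.  */
  if (!evrp.found)
    return ranger;
  if (!ranger.found)
    return evrp;

  /* Different values: prefer the engine that ran first.  */
  return ranger_first ? ranger : evrp;
}
```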



[PATCH] arm: Fix ICE on glibc compilation after my DIVMOD optimization [PR97322]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
Hi!

The arm target hook for divmod wasn't prepared to handle constants passed to
the function.

Fixed thusly, bootstrapped/regtested on armv7hl-linux-gnueabi, ok for trunk?

2020-10-08  Jakub Jelinek  

PR target/97322
* config/arm/arm.c (arm_expand_divmod_libfunc): Pass mode instead of
GET_MODE (op0) or GET_MODE (op1) to emit_library_call_value.

* gcc.dg/pr97322.c: New test.

--- gcc/config/arm/arm.c.jj 2020-10-07 10:47:46.892985596 +0200
+++ gcc/config/arm/arm.c 2020-10-07 20:19:25.524367665 +0200
@@ -33275,9 +33275,7 @@ arm_expand_divmod_libfunc (rtx libfunc,
 = smallest_int_mode_for_size (2 * GET_MODE_BITSIZE (mode));
 
   rtx libval = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
-   libval_mode,
-   op0, GET_MODE (op0),
-   op1, GET_MODE (op1));
+   libval_mode, op0, mode, op1, mode);
 
   rtx quotient = simplify_gen_subreg (mode, libval, libval_mode, 0);
   rtx remainder = simplify_gen_subreg (mode, libval, libval_mode,
--- gcc/testsuite/gcc.dg/pr97322.c.jj   2020-10-07 20:19:54.071961807 +0200
+++ gcc/testsuite/gcc.dg/pr97322.c  2020-10-07 20:19:16.897490309 +0200
@@ -0,0 +1,17 @@
+/* PR target/97322 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (unsigned long long x, unsigned long long *y)
+{
+  y[0] = x / 10;
+  y[1] = x % 10;
+}
+
+void
+bar (unsigned int x, unsigned int *y)
+{
+  y[0] = x / 10;
+  y[1] = x % 10;
+}
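
As background for the hunk above: the libcall returns one value in a mode
twice as wide as MODE, and quotient and remainder are then taken out of it
as subregs.  A rough C model of that convention, with invented helper names
and assuming the quotient sits in the low half (little-endian layout):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a divmod library call that returns both
   results packed into one double-width value.  */
static uint64_t
udivmod32 (uint32_t num, uint32_t den)
{
  assert (den != 0);
  return ((uint64_t) (num % den) << 32) | (num / den);
}

/* Low half: the quotient.  */
static uint32_t
quotient_of (uint64_t libval)
{
  return (uint32_t) libval;
}

/* High half: the remainder.  */
static uint32_t
remainder_of (uint64_t libval)
{
  return (uint32_t) (libval >> 32);
}
```

Here the operand width is fixed by the function signature.  In the hook it
has to come from MODE, since a constant operand such as the 10 in the test
carries no mode of its own.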

Jakub



[committed] openmp: Set cfun->calls_alloca when needed in OpenMP outlined regions [PR97294]

2020-10-08 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase FAILs, because we don't mark the child OpenMP function
as cfun->calls_alloca when it does call alloca.  When optimizing, during DCE we
reset those flags and recompute them again, but with -O0 DCE is not performed.

Fixed by calling notice_special_calls when moving insns to the child function.

cfun->calls_alloca is normally set during gimplification and most of the
alloca calls omp-low.c does go through the gimplifier, but one spot didn't
and built the gcall directly, so that one needs to set calls_alloca too.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-10-08  Jakub Jelinek  

PR sanitizer/97294
* tree-cfg.c (move_block_to_fn): Call notice_special_calls on
call stmts being moved into dest_cfun.
* omp-low.c (lower_rec_input_clauses): Set cfun->calls_alloca when
adding __builtin_alloca_with_align call without gimplification.

* gcc.dg/asan/pr97294.c: New test.

--- gcc/tree-cfg.c.jj   2020-09-02 12:13:22.617071099 +0200
+++ gcc/tree-cfg.c  2020-10-07 17:36:13.138847818 +0200
@@ -7220,6 +7220,8 @@ move_block_to_fn (struct function *dest_
   free_stmt_operands (cfun, stmt);
   push_cfun (dest_cfun);
   update_stmt (stmt);
+  if (is_gimple_call (stmt))
+   notice_special_calls (as_a <gcall *> (stmt));
   pop_cfun ();
 }
 
--- gcc/omp-low.c.jj2020-09-29 13:35:18.230153537 +0200
+++ gcc/omp-low.c   2020-10-07 16:27:31.582642216 +0200
@@ -5112,6 +5112,7 @@ lower_rec_input_clauses (tree clauses, g
  atmp = builtin_decl_explicit (BUILT_IN_ALLOCA_WITH_ALIGN);
  stmt = gimple_build_call (atmp, 2, x,
size_int (DECL_ALIGN (var)));
+ cfun->calls_alloca = 1;
  tmp = create_tmp_var_raw (ptr_type_node);
  gimple_add_tmp_var (tmp);
  gimple_call_set_lhs (stmt, tmp);
--- gcc/testsuite/gcc.dg/asan/pr97294.c.jj  2020-10-07 19:52:24.945460248 +0200
+++ gcc/testsuite/gcc.dg/asan/pr97294.c 2020-10-07 19:56:37.662856063 +0200
@@ -0,0 +1,41 @@
+/* PR sanitizer/97294 */
+/* { dg-do compile { target fopenmp } } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
+/* { dg-options "-O0 -fsanitize=address -fopenmp" } */
+/* { dg-final { scan-assembler "foo._omp_fn.\[0-9]\[1-9]*:.*call\[ \t]\*__*asan_allocas_unpoison.*\.size\[ \t]\*foo._omp_fn.\[0-9]\[1-9]*," { target x86_64-*-linux* i?86-*-linux* } } } */
+
+__attribute__((noipa)) void
+foo (int *p, int n)
+{
+  int i;
+  #pragma omp parallel for num_threads(2) reduction(+:p[:n])
+  for (i = 0; i < 10; i++)
+{
+  p[0]++;
+  p[n - 1] += 2;
+}
+}
+
+__attribute__((noipa)) void
+bar (void)
+{
+  unsigned char buf[1024];
+  int i;
+  asm volatile ("" : : "r" (&buf[0]) : "memory");
+  for (i = 0; i < 1024; i++)
+buf[i] = i;
+  asm volatile ("" : : "r" (&buf[0]) : "memory");
+}
+
+int
+main ()
+{
+  int p[50], i;
+  for (i = 0; i < 50; i++)
+p[i] = 0;
+  foo (p, 50);
+  bar ();
+  if (p[0] != 10 || p[49] != 20)
+__builtin_abort ();
+  return 0;
+}

Jakub



Re: [PATCH] arm: Fix multiple inheritance thunks for thumb-1 with -mpure-code

2020-10-08 Thread Christophe Lyon via Gcc-patches
On Tue, 6 Oct 2020 at 18:02, Richard Earnshaw
 wrote:
>
> On 29/09/2020 20:50, Christophe Lyon via Gcc-patches wrote:
> > When mi_delta is > 255 and -mpure-code is used, we cannot load delta
> > from code memory (like we do without -mpure-code).
> >
> > This patch builds the value of mi_delta into r3 with a series of
> > movs/adds/lsls.
> >
> > We also do some cleanup by not emitting the function address and delta
> > via .word directives at the end of the thunk since we don't use them
> > with -mpure-code.
> >
> > No need for new testcases, this bug was already identified by
> > eg. pr46287-3.C
> >
> > 2020-09-29  Christophe Lyon  
> >
> >   gcc/
> >   * config/arm/arm.c (arm_thumb1_mi_thunk): Build mi_delta in r3 and
> >   do not emit function address and delta when -mpure-code is used.
>
Hi Richard,

Thanks for your comments.

> There are some optimizations you can make to this code.
>
> Firstly, for values between 256 and 510 (inclusive), it would be better
> to just expand a mov of 255 followed by an add.
I now see the splitter for the "Pe" constraint which I hadn't noticed
before, so I can write something similar indeed.

However, I'm not quite sure I understand the benefit in the split
when -mpure-code is NOT used.
Consider:
int f3_1 (void) { return 510; }
int f3_2 (void) { return 511; }
Compile with -O2 -mcpu=cortex-m0:
f3_1:
movs r0, #255
lsls r0, r0, #1
bx  lr
f3_2:
ldr r0, .L4
bx  lr

The splitter makes the code bigger, does it "compensate" for this by
not having to load the constant?
Actually the constant uses 4 more bytes, which should be taken into
account when comparing code size,
so f3_1 uses 6 bytes, and f3_2 uses 8, so as you say below three
thumb1 instructions would be equivalent in size compared to loading
from the literal pool. Should the 256-510 range be extended?


> This is also true for
> the literal pools alternative as well, so should be handled before all
> this.
I am not sure what you mean: with -mpure-code, the above sample is compiled as:
f3_1:
movs r0, #255
lsls r0, r0, #1
bx  lr
f3_2:
movs r0, #1
lsls r0, r0, #8
adds r0, r0, #255
bx  lr

so the "return 510" case is already handled as without -mpure-code.

>  I also suspect (but haven't checked) that the base adjustment will
> most commonly be a multiple of the machine word size (ie 4).  If that is
> the case then you could generate n/4 and then shift it left by 2 for an
> even greater range of literals.
I can see there is provision for this in the !TARGET_THUMB1_ONLY case,
I'll update my patch.

>  More generally, any sequence of up to
> three thumb1 instructions will be no larger, and probably as fast as the
> existing literal pool fall back.
>
> Secondly, if the value is, for example, 65536 (0x10000), your code will
> emit a mov followed by two shift-by-8 instructions; the two shifts could
> be merged into a single shift-by-16.

Right, I'll try to make use of thumb_shiftable_const.
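
To illustrate the shift merging, here is a standalone sketch.  It is not
the patch and does not use thumb_shiftable_const; build_const_in_r3 is a
made-up helper.  It walks the constant byte by byte, accumulates the shift
amount while bytes are zero, and simulates r3 so the result can be checked.
It leaves out the 256-510 and multiple-of-4 shortcuts discussed above.

```c
#include <stdint.h>
#include <stdio.h>

/* Print a movs/lsls/adds sequence that builds VALUE in r3, merging
   consecutive shifts.  Returns the value the sequence computes and
   stores the instruction count in *N_INSNS if it is non-NULL.  */
static uint32_t
build_const_in_r3 (FILE *out, uint32_t value, int *n_insns)
{
  uint32_t r3 = 0;        /* Simulated register contents.  */
  int started = 0;        /* Nonzero once the initial movs is out.  */
  int pending_shift = 0;  /* Shift amount not yet emitted.  */
  int count = 0;

  for (int i = 3; i >= 0; i--)
    {
      unsigned byte = (value >> (8 * i)) & 0xff;

      if (byte)
        {
          if (started)
            {
              fprintf (out, "\tlsls\tr3, #%d\n", pending_shift);
              fprintf (out, "\tadds\tr3, #%u\n", byte);
              r3 = (r3 << pending_shift) + byte;
              count += 2;
            }
          else
            {
              fprintf (out, "\tmovs\tr3, #%u\n", byte);
              r3 = byte;
              count += 1;
              started = 1;
            }
          pending_shift = 0;
        }
      /* Every lower byte still to come needs another 8 bits of shift.  */
      if (started && i > 0)
        pending_shift += 8;
    }

  if (pending_shift)
    {
      fprintf (out, "\tlsls\tr3, #%d\n", pending_shift);
      r3 <<= pending_shift;
      count += 1;
    }
  if (!started)
    {
      fprintf (out, "\tmovs\tr3, #0\n");
      count += 1;
    }
  if (n_insns)
    *n_insns = count;
  return r3;
}
```

With this scheme 65536 takes a movs of 1 followed by a single lsls by 16.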

> Finally, I'd really like to see some executable tests for this, if at
> all possible.
I mentioned pr46287-3.C, but that's not the only existing testcase
that showed the problem. There are also:
g++.dg/opt/thunk1.C
g++.dg/ipa/pr46984.C
g++.dg/torture/pr46287.C
g++.dg/torture/pr45699.C

Do you want me to copy one of these into the arm subdir and add
-mpure-code in dg-options?

Thanks,

Christophe

> R.
>
> >
> > k#   (use "git pull" to merge the remote branch into yours)
> > ---
> >  gcc/config/arm/arm.c | 91 
> > +---
> >  1 file changed, 66 insertions(+), 25 deletions(-)
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index ceeb91f..62abeb5 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -28342,9 +28342,43 @@ arm_thumb1_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
> >  {
> >if (mi_delta > 255)
> >   {
> > -   fputs ("\tldr\tr3, ", file);
> > -   assemble_name (file, label);
> > -   fputs ("+4\n", file);
> > +   /* With -mpure-code, we cannot load delta from the constant
> > +  pool: we build it explicitly.  */
> > +   if (target_pure_code)
> > + {
> > +   bool mov_done_p = false;
> > +   int i;
> > +
> > +   /* Emit upper 3 bytes if needed.  */
> > +   for (i = 0; i < 3; i++)
> > + {
> > +   int byte = (mi_delta >> (8 * (3 - i))) & 0xff;
> > +
> > +   if (byte)
> > + {
> > +   if (mov_done_p)
> > + asm_fprintf (file, "\tadds\tr3, #%d\n", byte);
> > +   else
> > + asm_fprintf (file, "\tmovs\tr3, #%d\n", byte);
> > +   mov_done_p = true;
> > + }
> > +
> > +   if (mov_done_p)
> > + asm_fprintf (file, "\tlsls\tr3, #8\n");
> > +  

[PATCH] aix: remove libgomp and libatomic archives before creating FAT archives

2020-10-08 Thread CHIGOT, CLEMENT via Gcc-patches
libgomp and libatomic might be in use during the build. Thus, ar commands
will fail to add new objects to them as the files are considered busy.
Deleting them and recreating them with both architectures fixes that.

libatomic/Changelog:
2020-08-17 Clement Chigot 
 * config/t-aix: Delete libatomic before creating the FAT library.

libgomp/Changelog:
2020-08-17 Clement Chigot 
 * config/t-aix: Delete libgomp before creating the FAT library.


Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France



0001-aix-remove-libgomp-and-libatomic-archives-before-cre.patch
Description: 0001-aix-remove-libgomp-and-libatomic-archives-before-cre.patch


[PATCH] aix: add FAT library support for libstdc++fs.a

2020-10-08 Thread CHIGOT, CLEMENT via Gcc-patches
Knowing whether --enable-libstdcxx-filesystem-ts has been passed to configure in
the AIX config Makefile would require a lot of changes in the libstdc++ Makefiles.
Thus, it's easier to ignore errors when retrieving objects and to check that they
exist before adding them.

libstdc++/Changelog:
2020-09-22 Clement Chigot 
 * config/os/aix/t-aix: Add complementary mode object file to libstdc++fs.a



Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France



0001-aix-add-FAT-library-support-for-libstdc-fs.a.patch
Description: 0001-aix-add-FAT-library-support-for-libstdc-fs.a.patch


[committed][libgomp, nvptx] Report launch dimensions in GOMP_OFFLOAD_run

2020-10-08 Thread Tom de Vries
Hi,

Using this patch, when using GOMP_DEBUG=1 and launching a kernel in
GOMP_OFFLOAD_run (used by the omp implementation), we see the kernel launch
dimensions:
...
  GOMP_OFFLOAD_run: kernel main$_omp_fn$0: \
launch [(teams: 1), 1, 1] [(lanes: 32), (threads: 1), 1]
...
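
For reference, the line above is what the format string in the patch
expands to.  A standalone sketch using snprintf rather than
GOMP_PLUGIN_debug (format_launch_dims is a made-up helper, and the caller
name is passed in explicitly instead of using __FUNCTION__):

```c
#include <stdio.h>

/* Format the launch line into BUF; returns snprintf's result.  */
static int
format_launch_dims (char *buf, size_t size, const char *caller,
                    const char *fn_name, unsigned teams, unsigned threads)
{
  return snprintf (buf, size,
                   "  %s: kernel %s: launch"
                   " [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
                   caller, fn_name, teams, threads);
}
```

Only teams and threads vary; the lane count of 32 and the remaining
dimensions are literal text in the format, matching the constants passed
to cuLaunchKernel.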

Built on x86_64-linux with nvptx accelerator, tested libgomp.

Committed to trunk.

Thanks,
- Tom

[libgomp, nvptx] Report launch dimensions in GOMP_OFFLOAD_run

libgomp/ChangeLog:

2020-10-08  Tom de Vries  

PR libgomp/81802
* plugin/plugin-nvptx.c (GOMP_OFFLOAD_run): Report launch
dimensions.

---
 libgomp/plugin/plugin-nvptx.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index a63dd1a99fb..11d4ceeae62 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -1891,7 +1891,11 @@ nvptx_stacks_free (void *p, int num)
 void
 GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 {
-  CUfunction function = ((struct targ_fn_descriptor *) tgt_fn)->fn;
+  struct targ_fn_descriptor *tgt_fn_desc
+    = (struct targ_fn_descriptor *) tgt_fn;
+  CUfunction function = tgt_fn_desc->fn;
+  const struct targ_fn_launch *launch = tgt_fn_desc->launch;
+  const char *fn_name = launch->fn;
   CUresult r;
   struct ptx_device *ptx_dev = ptx_devices[ord];
   const char *maybe_abort_msg = "(perhaps abort was called)";
@@ -1926,6 +1930,9 @@ GOMP_OFFLOAD_run (int ord, void *tgt_fn, void *tgt_vars, void **args)
 CU_LAUNCH_PARAM_BUFFER_SIZE, _args_size,
 CU_LAUNCH_PARAM_END
   };
+  GOMP_PLUGIN_debug (0, "  %s: kernel %s: launch"
+" [(teams: %u), 1, 1] [(lanes: 32), (threads: %u), 1]\n",
+__FUNCTION__, fn_name, teams, threads);
   r = CUDA_CALL_NOCHECK (cuLaunchKernel, function, teams, 1, 1,
 32, threads, 1, 0, NULL, NULL, config);
   if (r != CUDA_SUCCESS)


Re: [r11-3641 Regression] FAIL: gcc.dg/torture/pta-ptrarith-1.c -Os scan-tree-dump alias "ESCAPED = {[^\n}]* i f [^\n}]*}" on Linux/x86_64 (-m32 -march=cascadelake)

2020-10-08 Thread Richard Biener
On Thu, 8 Oct 2020, Jan Hubicka wrote:

> Hi,
> this is the fix I am testing (it fixes the testcase)

LGTM

> gcc/ChangeLog:
> 
> 2020-10-08  Jan Hubicka  
> 
>   * ipa-modref.c (get_access): Fix handling of offsets.
>   * tree-ssa-alias.c (modref_may_conflict): Watch for overflows.
> 
> diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
> index a5fa33a35de..5868aa97484 100644
> --- a/gcc/ipa-modref.c
> +++ b/gcc/ipa-modref.c
> @@ -318,8 +318,7 @@ get_access (ao_ref *ref)
> 0, -1, false};
>if (TREE_CODE (base) == MEM_REF || TREE_CODE (base) == TARGET_MEM_REF)
>  {
> -  tree offset = TREE_CODE (base) == MEM_REF
> - ? TREE_OPERAND (base, 1) : NULL_TREE;
> +  tree memref = base;
>base = TREE_OPERAND (base, 0);
>if (TREE_CODE (base) == SSA_NAME
> && SSA_NAME_IS_DEFAULT_DEF (base)
> @@ -336,8 +335,14 @@ get_access (ao_ref *ref)
>   }
> a.parm_index++;
>   }
> -   a.parm_offset_known
> - = offset && wi::to_poly_offset (offset).to_shwi (&a.parm_offset);
> +   if (TREE_CODE (memref) == MEM_REF)
> + {
> +   a.parm_offset_known
> +  = wi::to_poly_wide (TREE_OPERAND
> +  (memref, 1)).to_shwi (&a.parm_offset);
> + }
> +   else
> + a.parm_offset_known = false;
>   }
>else
>   a.parm_index = -1;
> diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
> index 97dc4ac8814..d885f7157c5 100644
> --- a/gcc/tree-ssa-alias.c
> +++ b/gcc/tree-ssa-alias.c
> @@ -2542,16 +2542,22 @@ modref_may_conflict (const gimple *stmt,
> else
>   {
> ao_ref ref2;
> -
> -   ao_ref_init_from_ptr_and_range
> -  (&ref2, arg, true,
> -   access_node->offset
> -   + (access_node->parm_offset
> -  << LOG2_BITS_PER_UNIT), access_node->size,
> -   access_node->max_size);
> -   ref2.ref_alias_set = ref_set;
> -   ref2.base_alias_set = base_set;
> -   if (refs_may_alias_p_1 (&ref2, ref, tbaa_p))
> +   poly_offset_int off = (poly_offset_int)access_node->offset
> + + ((poly_offset_int)access_node->parm_offset
> +<< LOG2_BITS_PER_UNIT);
> +   poly_int64 off2;
> +   if (off.to_shwi (&off2))
> + {
> +   ao_ref_init_from_ptr_and_range
> +  (&ref2, arg, true, off2,
> +   access_node->size,
> +   access_node->max_size);
> +   ref2.ref_alias_set = ref_set;
> +   ref2.base_alias_set = base_set;
> +   if (refs_may_alias_p_1 (&ref2, ref, tbaa_p))
> + return true;
> + }
> +   else if (ptr_deref_may_alias_ref_p_1 (arg, ref))
>   return true;
>   }
> num_tests++;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [r11-3641 Regression] FAIL: gcc.dg/torture/pta-ptrarith-1.c -Os scan-tree-dump alias "ESCAPED = {[^\n}]* i f [^\n}]*}" on Linux/x86_64 (-m32 -march=cascadelake)

2020-10-08 Thread Jan Hubicka
Hi,
this is the fix I am testing (it fixes the testcase)

gcc/ChangeLog:

2020-10-08  Jan Hubicka  

* ipa-modref.c (get_access): Fix handling of offsets.
* tree-ssa-alias.c (modref_may_conflict): Watch for overflows.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index a5fa33a35de..5868aa97484 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -318,8 +318,7 @@ get_access (ao_ref *ref)
  0, -1, false};
   if (TREE_CODE (base) == MEM_REF || TREE_CODE (base) == TARGET_MEM_REF)
 {
-  tree offset = TREE_CODE (base) == MEM_REF
-   ? TREE_OPERAND (base, 1) : NULL_TREE;
+  tree memref = base;
   base = TREE_OPERAND (base, 0);
   if (TREE_CODE (base) == SSA_NAME
  && SSA_NAME_IS_DEFAULT_DEF (base)
@@ -336,8 +335,14 @@ get_access (ao_ref *ref)
}
  a.parm_index++;
}
- a.parm_offset_known
-   = offset && wi::to_poly_offset (offset).to_shwi (&a.parm_offset);
+ if (TREE_CODE (memref) == MEM_REF)
+   {
+ a.parm_offset_known
+   = wi::to_poly_wide (TREE_OPERAND
+ (memref, 1)).to_shwi (&a.parm_offset);
+   }
+ else
+   a.parm_offset_known = false;
}
   else
a.parm_index = -1;
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 97dc4ac8814..d885f7157c5 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -2542,16 +2542,22 @@ modref_may_conflict (const gimple *stmt,
  else
{
  ao_ref ref2;
-
- ao_ref_init_from_ptr_and_range
-(&ref2, arg, true,
- access_node->offset
- + (access_node->parm_offset
-<< LOG2_BITS_PER_UNIT), access_node->size,
- access_node->max_size);
- ref2.ref_alias_set = ref_set;
- ref2.base_alias_set = base_set;
- if (refs_may_alias_p_1 (&ref2, ref, tbaa_p))
+ poly_offset_int off = (poly_offset_int)access_node->offset
+   + ((poly_offset_int)access_node->parm_offset
+  << LOG2_BITS_PER_UNIT);
+ poly_int64 off2;
+ if (off.to_shwi (&off2))
+   {
+ ao_ref_init_from_ptr_and_range
+(&ref2, arg, true, off2,
+ access_node->size,
+ access_node->max_size);
+ ref2.ref_alias_set = ref_set;
+ ref2.base_alias_set = base_set;
+ if (refs_may_alias_p_1 (&ref2, ref, tbaa_p))
+   return true;
+   }
+ else if (ptr_deref_may_alias_ref_p_1 (arg, ref))
return true;
}
  num_tests++;
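
The overflow handling in modref_may_conflict can be sketched outside of GCC
roughly as follows.  This is only an approximation of the idea: it uses plain
int64_t and the GCC/Clang __builtin_*_overflow builtins in place of
poly_offset_int and to_shwi, and the helper name bit_offset_fits as well as
the value 3 for LOG2_BITS_PER_UNIT (8-bit bytes) are assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 8-bit storage units.  */
#define LOG2_BITS_PER_UNIT 3

/* Hypothetical helper: compute OFFSET_BITS + PARM_OFFSET_BYTES * 8 and
   store it in *RESULT.  Returns false if the value does not fit in a
   signed 64-bit integer, in which case the caller should fall back to a
   conservative answer instead of using a wrapped offset.  */
static bool
bit_offset_fits (int64_t offset_bits, int64_t parm_offset_bytes,
                 int64_t *result)
{
  int64_t scaled;

  /* Scale bytes to bits by multiplying, since shifting a negative value
     left is undefined in C.  */
  if (__builtin_mul_overflow (parm_offset_bytes,
                              (int64_t) 1 << LOG2_BITS_PER_UNIT, &scaled))
    return false;
  if (__builtin_add_overflow (offset_bits, scaled, result))
    return false;
  return true;
}
```

In the patch, the fitting case corresponds to building ref2 from the computed
offset, while the non-fitting case corresponds to the fallback call to
ptr_deref_may_alias_ref_p_1.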