[PATCH] c++: ICE with bogus late return type [PR99803]

2021-04-14 Thread Marek Polacek via Gcc-patches
Here we ICE when compiling this code in C++20, because we're trying to
slam a 'typename' after the ->.  The cp_parser_template_id call just
before the spot I'm changing parsed A::template A<int> as a BASELINK
that contains a constructor, but make_typename_type crashes on that.

My fix is the same as for c++/88325: add an is_overloaded_fn check.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
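
The changed condition can be modeled with a toy sketch. This is not the parser's code: `NameKind` and `classify_simple_type` are hypothetical names invented for illustration, and the strings only label which branch would be taken.

```cpp
#include <string>

// Hypothetical stand-ins for what name lookup handed back to the parser.
enum class NameKind { TypeDecl, OverloadedFn, OtherDependent };

// Toy model of the patched branch in cp_parser_simple_type_specifier.
// typename_p means "treat the name as if 'typename' had been written".
std::string classify_simple_type(NameKind kind, bool typename_p) {
    if (kind == NameKind::TypeDecl)
        return "type";
    // The fix: never build a typename type out of a set of functions
    // (for example a BASELINK holding a constructor).
    if (typename_p && kind != NameKind::OverloadedFn)
        return "typename type";
    return "error: does not name a type";
}
```

With the extra check, a function set falls through to the ordinary "does not name a type" diagnostic instead of reaching make_typename_type.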

gcc/cp/ChangeLog:

PR c++/99803
* parser.c (cp_parser_simple_type_specifier): Don't call
cp_parser_make_typename_type for is_overloaded_fn.

gcc/testsuite/ChangeLog:

PR c++/99803
* g++.dg/cpp2a/typename19.C: New test.
---
 gcc/cp/parser.c | 2 +-
 gcc/testsuite/g++.dg/cpp2a/typename19.C | 5 +
 2 files changed, 6 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/typename19.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3a107206318..3c506d891c9 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -18903,7 +18903,7 @@ cp_parser_simple_type_specifier (cp_parser* parser,
  if (TREE_CODE (type) != TYPE_DECL)
{
  /* ...unless we pretend we have seen 'typename'.  */
- if (typename_p)
+ if (typename_p && !is_overloaded_fn (type))
type = cp_parser_make_typename_type (parser, type,
 token->location);
  else
diff --git a/gcc/testsuite/g++.dg/cpp2a/typename19.C b/gcc/testsuite/g++.dg/cpp2a/typename19.C
new file mode 100644
index 00000000000..bd7e5110e00
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/typename19.C
@@ -0,0 +1,5 @@
+// PR c++/99803
+// { dg-do compile { target c++20 } }
+
+struct A { template<typename T> A(T); };
+auto A(unsigned) -> A::template A<int>; // { dg-error "not name a type" }

base-commit: a87d3f964df31d4fbceb822c6d293e85c117d992
-- 
2.30.2



[PATCH v4 2/2] x86: Add general_regs_only function attribute

2021-04-14 Thread H.J. Lu via Gcc-patches
commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.

1. Add general_regs_only function attribute to inform the compiler that
functions use only general purpose registers.  When making inlining
decisions on such functions, non-GPR compiler options are excluded.
2. Add general_regs_only attribute to x86 intrinsics which use only
general purpose registers.
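
For context, here is a minimal sketch of a function restricted to general purpose registers. It uses the existing target("general-regs-only") attribute from the commit cited above, not the general_regs_only attribute this patch proposes. The `GPR_ONLY` macro, its compiler/target guard, and `count_set_bits` are assumptions made for illustration.

```cpp
// Assumption: target("general-regs-only") is only applied for GCC 11 or
// later on x86; elsewhere the macro expands to nothing.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11 \
    && (defined(__x86_64__) || defined(__i386__))
#define GPR_ONLY __attribute__((target("general-regs-only")))
#else
#define GPR_ONLY
#endif

// Hypothetical helper: pure integer code, so it needs no vector or x87
// registers.
GPR_ONLY unsigned count_set_bits(unsigned x) {
    unsigned n = 0;
    while (x != 0) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}
```

The patch is about letting code like this call integer intrinsics, which the target attribute alone does not allow.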

gcc/

PR target/99744
* config/i386/i386-options.c (ix86_attribute_table): Add
general_regs_only.
* config/i386/i386.c (ix86_can_inline_p): Exclude non-integer
target options if callee has general_regs_only attribute.
* config/i386/adxintrin.h: Add general_regs_only attribute to
intrinsics which use only general purpose registers.
* config/i386/bmiintrin.h: Likewise.
* config/i386/bmi2intrin.h: Likewise.
* config/i386/cetintrin.h: Likewise.
* config/i386/cldemoteintrin.h: Likewise.
* config/i386/clflushoptintrin.h: Likewise.
* config/i386/clwbintrin.h: Likewise.
* config/i386/clzerointrin.h: Likewise.
* config/i386/enqcmdintrin.h: Likewise.
* config/i386/fxsrintrin.h: Likewise.
* config/i386/hresetintrin.h: Likewise.
* config/i386/ia32intrin.h: Likewise.
* config/i386/lwpintrin.h: Likewise.
* config/i386/lzcntintrin.h: Likewise.
* config/i386/movdirintrin.h: Likewise.
* config/i386/mwaitxintrin.h: Likewise.
* config/i386/pconfigintrin.h: Likewise.
* config/i386/pkuintrin.h: Likewise.
* config/i386/popcntintrin.h: Likewise.
* config/i386/rdseedintrin.h: Likewise.
* config/i386/rtmintrin.h: Likewise.
* config/i386/serializeintrin.h: Likewise.
* config/i386/sgxintrin.h: Likewise.
* config/i386/tbmintrin.h: Likewise.
* config/i386/tsxldtrkintrin.h: Likewise.
* config/i386/uintrintrin.h: Likewise.
* config/i386/waitpkgintrin.h: Likewise.
* config/i386/wbnoinvdintrin.h: Likewise.
* config/i386/x86gprintrin.h: Likewise.
* config/i386/xsavecintrin.h: Likewise.
* config/i386/xsaveintrin.h: Likewise.
* config/i386/xsaveoptintrin.h: Likewise.
* config/i386/xsavesintrin.h: Likewise.
* config/i386/xtestintrin.h: Likewise.
* doc/extend.texi: Document general_regs_only function attribute.

gcc/testsuite/

PR target/99744
* gcc.target/i386/pr99744-3.c: New test.
* gcc.target/i386/pr99744-4.c: Likewise.
---
 gcc/config/i386/adxintrin.h   |  18 +-
 gcc/config/i386/bmi2intrin.h  |  24 +-
 gcc/config/i386/bmiintrin.h   |  92 --
 gcc/config/i386/cetintrin.h   |  33 +-
 gcc/config/i386/cldemoteintrin.h  |   3 +-
 gcc/config/i386/clflushoptintrin.h|   3 +-
 gcc/config/i386/clwbintrin.h  |   3 +-
 gcc/config/i386/clzerointrin.h|   4 +-
 gcc/config/i386/enqcmdintrin.h|   6 +-
 gcc/config/i386/fxsrintrin.h  |  12 +-
 gcc/config/i386/hresetintrin.h|   3 +-
 gcc/config/i386/i386-options.c|   2 +
 gcc/config/i386/i386.c|  29 +-
 gcc/config/i386/ia32intrin.h  |  82 +++--
 gcc/config/i386/lwpintrin.h   |  24 +-
 gcc/config/i386/lzcntintrin.h |  20 +-
 gcc/config/i386/movdirintrin.h|   9 +-
 gcc/config/i386/mwaitxintrin.h|   8 +-
 gcc/config/i386/pconfigintrin.h   |   3 +-
 gcc/config/i386/pkuintrin.h   |   6 +-
 gcc/config/i386/popcntintrin.h|   8 +-
 gcc/config/i386/rdseedintrin.h|   9 +-
 gcc/config/i386/rtmintrin.h   |   9 +-
 gcc/config/i386/serializeintrin.h |   8 +-
 gcc/config/i386/sgxintrin.h   |   9 +-
 gcc/config/i386/tbmintrin.h   |  80 +++--
 gcc/config/i386/tsxldtrkintrin.h  |   6 +-
 gcc/config/i386/uintrintrin.h |  12 +-
 gcc/config/i386/waitpkgintrin.h   |   9 +-
 gcc/config/i386/wbnoinvdintrin.h  |   3 +-
 gcc/config/i386/x86gprintrin.h|  45 ++-
 gcc/config/i386/xsavecintrin.h|   6 +-
 gcc/config/i386/xsaveintrin.h |  18 +-
 gcc/config/i386/xsaveoptintrin.h  |   6 +-
 gcc/config/i386/xsavesintrin.h|  12 +-
 gcc/config/i386/xtestintrin.h |   3 +-
 gcc/doc/extend.texi   |   5 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 352 ++
 39 files changed, 818 insertions(+), 179 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 

[PATCH v4 0/2] x86: Add general_regs_only function attribute

2021-04-14 Thread H.J. Lu via Gcc-patches
I realized that

commit 87c753ac241f25d222d46ba1ac66ceba89d6a200
Author: H.J. Lu 
Date:   Fri Aug 21 09:42:49 2020 -0700

x86: Add target("general-regs-only") function attribute

is incomplete since it is impossible to call integer intrinsics from
a function with general-regs-only target attribute.  We need to add a
general_regs_only function attribute to go with it to mark functions
which use only general purpose registers.  When making inlining
decisions on such functions, x86 backend can exclude non-GPR compiler
options.  The general_regs_only attribute should be added to all x86
intrinsics which use only general purpose registers.

H.J. Lu (2):
  x86: Move OPTION_MASK_* to i386-common.h
  x86: Add general_regs_only function attribute

 gcc/common/config/i386/i386-common.c  | 297 --
 gcc/common/config/i386/i386-common.h  | 315 +++
 gcc/config/i386/adxintrin.h   |  18 +-
 gcc/config/i386/bmi2intrin.h  |  24 +-
 gcc/config/i386/bmiintrin.h   |  92 --
 gcc/config/i386/cetintrin.h   |  33 +-
 gcc/config/i386/cldemoteintrin.h  |   3 +-
 gcc/config/i386/clflushoptintrin.h|   3 +-
 gcc/config/i386/clwbintrin.h  |   3 +-
 gcc/config/i386/clzerointrin.h|   4 +-
 gcc/config/i386/enqcmdintrin.h|   6 +-
 gcc/config/i386/fxsrintrin.h  |  12 +-
 gcc/config/i386/hresetintrin.h|   3 +-
 gcc/config/i386/i386-options.c|   2 +
 gcc/config/i386/i386.c|  29 +-
 gcc/config/i386/i386.h|   1 +
 gcc/config/i386/ia32intrin.h  |  82 +++--
 gcc/config/i386/lwpintrin.h   |  24 +-
 gcc/config/i386/lzcntintrin.h |  20 +-
 gcc/config/i386/movdirintrin.h|   9 +-
 gcc/config/i386/mwaitxintrin.h|   8 +-
 gcc/config/i386/pconfigintrin.h   |   3 +-
 gcc/config/i386/pkuintrin.h   |   6 +-
 gcc/config/i386/popcntintrin.h|   8 +-
 gcc/config/i386/rdseedintrin.h|   9 +-
 gcc/config/i386/rtmintrin.h   |   9 +-
 gcc/config/i386/serializeintrin.h |   8 +-
 gcc/config/i386/sgxintrin.h   |   9 +-
 gcc/config/i386/tbmintrin.h   |  80 +++--
 gcc/config/i386/tsxldtrkintrin.h  |   6 +-
 gcc/config/i386/uintrintrin.h |  12 +-
 gcc/config/i386/waitpkgintrin.h   |   9 +-
 gcc/config/i386/wbnoinvdintrin.h  |   3 +-
 gcc/config/i386/x86gprintrin.h|  45 ++-
 gcc/config/i386/xsavecintrin.h|   6 +-
 gcc/config/i386/xsaveintrin.h |  18 +-
 gcc/config/i386/xsaveoptintrin.h  |   6 +-
 gcc/config/i386/xsavesintrin.h|  12 +-
 gcc/config/i386/xtestintrin.h |   3 +-
 gcc/doc/extend.texi   |   5 +
 gcc/testsuite/gcc.target/i386/pr99744-3.c |  13 +
 gcc/testsuite/gcc.target/i386/pr99744-4.c | 352 ++
 42 files changed, 1134 insertions(+), 476 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-common.h
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr99744-4.c

-- 
2.30.2



[PATCH v4 1/2] x86: Move OPTION_MASK_* to i386-common.h

2021-04-14 Thread H.J. Lu via Gcc-patches
Move OPTION_MASK_* to i386-common.h so that they can be used in x86
backend.
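
The moved *_SET macros chain together: each one ORs an ISA's own bit with the set of the ISA it implies. A small sketch of that pattern follows. The bit positions and the `isa_enabled` helper are hypothetical; the real OPTION_MASK_ISA_* values come from GCC's option machinery.

```cpp
#include <cstdint>

// Hypothetical bit positions, chosen only for this sketch.
constexpr std::uint64_t ISA_SSE  = 1u << 0;
constexpr std::uint64_t ISA_SSE2 = 1u << 1;
constexpr std::uint64_t ISA_SSE3 = 1u << 2;

// Each *_SET mask is the ISA's own bit plus everything it implies.
constexpr std::uint64_t ISA_SSE_SET  = ISA_SSE;
constexpr std::uint64_t ISA_SSE2_SET = ISA_SSE2 | ISA_SSE_SET;
constexpr std::uint64_t ISA_SSE3_SET = ISA_SSE3 | ISA_SSE2_SET;

// True when every bit of `wanted` is already enabled in `flags`.
constexpr bool isa_enabled(std::uint64_t flags, std::uint64_t wanted) {
    return (flags & wanted) == wanted;
}
```

Enabling the SSE3 set therefore also enables SSE2 and SSE, which is the property the backend needs when it classifies options.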

* common/config/i386/i386-common.c (OPTION_MASK_*): Move to ...
* common/config/i386/i386-common.h: Here.  New file.
* config/i386/i386.h: Include common/config/i386/i386-common.h.
---
 gcc/common/config/i386/i386-common.c | 297 -
 gcc/common/config/i386/i386-common.h | 315 +++
 gcc/config/i386/i386.h   |   1 +
 3 files changed, 316 insertions(+), 297 deletions(-)
 create mode 100644 gcc/common/config/i386/i386-common.h

diff --git a/gcc/common/config/i386/i386-common.c b/gcc/common/config/i386/i386-common.c
index 1e6c1590ac4..37ff47bd676 100644
--- a/gcc/common/config/i386/i386-common.c
+++ b/gcc/common/config/i386/i386-common.c
@@ -29,303 +29,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "opts.h"
 #include "flags.h"
 
-/* Define a set of ISAs which are available when a given ISA is
-   enabled.  MMX and SSE ISAs are handled separately.  */
-
-#define OPTION_MASK_ISA_MMX_SET OPTION_MASK_ISA_MMX
-#define OPTION_MASK_ISA_3DNOW_SET \
-  (OPTION_MASK_ISA_3DNOW | OPTION_MASK_ISA_MMX_SET)
-#define OPTION_MASK_ISA_3DNOW_A_SET \
-  (OPTION_MASK_ISA_3DNOW_A | OPTION_MASK_ISA_3DNOW_SET)
-
-#define OPTION_MASK_ISA_SSE_SET OPTION_MASK_ISA_SSE
-#define OPTION_MASK_ISA_SSE2_SET \
-  (OPTION_MASK_ISA_SSE2 | OPTION_MASK_ISA_SSE_SET)
-#define OPTION_MASK_ISA_SSE3_SET \
-  (OPTION_MASK_ISA_SSE3 | OPTION_MASK_ISA_SSE2_SET)
-#define OPTION_MASK_ISA_SSSE3_SET \
-  (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_SSE3_SET)
-#define OPTION_MASK_ISA_SSE4_1_SET \
-  (OPTION_MASK_ISA_SSE4_1 | OPTION_MASK_ISA_SSSE3_SET)
-#define OPTION_MASK_ISA_SSE4_2_SET \
-  (OPTION_MASK_ISA_SSE4_2 | OPTION_MASK_ISA_SSE4_1_SET)
-#define OPTION_MASK_ISA_AVX_SET \
-  (OPTION_MASK_ISA_AVX | OPTION_MASK_ISA_SSE4_2_SET \
-   | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_FMA_SET \
-  (OPTION_MASK_ISA_FMA | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_AVX2_SET \
-  (OPTION_MASK_ISA_AVX2 | OPTION_MASK_ISA_AVX_SET)
-#define OPTION_MASK_ISA_FXSR_SET OPTION_MASK_ISA_FXSR
-#define OPTION_MASK_ISA_XSAVE_SET OPTION_MASK_ISA_XSAVE
-#define OPTION_MASK_ISA_XSAVEOPT_SET \
-  (OPTION_MASK_ISA_XSAVEOPT | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_AVX512F_SET \
-  (OPTION_MASK_ISA_AVX512F | OPTION_MASK_ISA_AVX2_SET)
-#define OPTION_MASK_ISA_AVX512CD_SET \
-  (OPTION_MASK_ISA_AVX512CD | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512PF_SET \
-  (OPTION_MASK_ISA_AVX512PF | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512ER_SET \
-  (OPTION_MASK_ISA_AVX512ER | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512DQ_SET \
-  (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BW_SET \
-  (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VL_SET \
-  (OPTION_MASK_ISA_AVX512VL | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512IFMA_SET \
-  (OPTION_MASK_ISA_AVX512IFMA | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VBMI_SET \
-  (OPTION_MASK_ISA_AVX512VBMI | OPTION_MASK_ISA_AVX512BW_SET)
-#define OPTION_MASK_ISA2_AVX5124FMAPS_SET OPTION_MASK_ISA2_AVX5124FMAPS
-#define OPTION_MASK_ISA2_AVX5124VNNIW_SET OPTION_MASK_ISA2_AVX5124VNNIW
-#define OPTION_MASK_ISA_AVX512VBMI2_SET \
-  (OPTION_MASK_ISA_AVX512VBMI2 | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512VNNI_SET \
-  (OPTION_MASK_ISA_AVX512VNNI | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVXVNNI_SET OPTION_MASK_ISA2_AVXVNNI
-#define OPTION_MASK_ISA_AVX512VPOPCNTDQ_SET \
-  (OPTION_MASK_ISA_AVX512VPOPCNTDQ | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA_AVX512BITALG_SET \
-  (OPTION_MASK_ISA_AVX512BITALG | OPTION_MASK_ISA_AVX512F_SET)
-#define OPTION_MASK_ISA2_AVX512BF16_SET OPTION_MASK_ISA2_AVX512BF16
-#define OPTION_MASK_ISA_RTM_SET OPTION_MASK_ISA_RTM
-#define OPTION_MASK_ISA_PRFCHW_SET OPTION_MASK_ISA_PRFCHW
-#define OPTION_MASK_ISA_RDSEED_SET OPTION_MASK_ISA_RDSEED
-#define OPTION_MASK_ISA_ADX_SET OPTION_MASK_ISA_ADX
-#define OPTION_MASK_ISA_PREFETCHWT1_SET OPTION_MASK_ISA_PREFETCHWT1
-#define OPTION_MASK_ISA_CLFLUSHOPT_SET OPTION_MASK_ISA_CLFLUSHOPT
-#define OPTION_MASK_ISA_XSAVES_SET \
-  (OPTION_MASK_ISA_XSAVES | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_XSAVEC_SET \
-  (OPTION_MASK_ISA_XSAVEC | OPTION_MASK_ISA_XSAVE_SET)
-#define OPTION_MASK_ISA_CLWB_SET OPTION_MASK_ISA_CLWB
-#define OPTION_MASK_ISA2_AVX512VP2INTERSECT_SET OPTION_MASK_ISA2_AVX512VP2INTERSECT
-#define OPTION_MASK_ISA2_AMX_TILE_SET OPTION_MASK_ISA2_AMX_TILE
-#define OPTION_MASK_ISA2_AMX_INT8_SET OPTION_MASK_ISA2_AMX_INT8
-#define OPTION_MASK_ISA2_AMX_BF16_SET OPTION_MASK_ISA2_AMX_BF16
-
-/* SSE4 includes both SSE4.1 and SSE4.2. -msse4 should be the same
-   as -msse4.2.  */
-#define OPTION_MASK_ISA_SSE4_SET OPTION_MASK_ISA_SSE4_2_SET
-
-#define OPTION_MASK_ISA_SSE4A_SET \
-  

[PATCH] precompute_tls_p target hook in calls.c for AIX TLS (PR 94177)

2021-04-14 Thread David Edelsohn via Gcc-patches
AIX uses a compiler-managed TOC for global data, including TLS symbols.
The GCC TOC implementation manages the TOC entries through the
constant pool.

TLS symbols sometimes require a function call to obtain the TLS base
pointer.  The arguments to the TLS call can conflict with arguments to
a normal function call if the TLS symbol is an argument in the normal call.
GCC specifically checks for this situation and precomputes the TLS
arguments, but the mechanism to check for this requirement utilizes
legitimate_constant_p().  The results that legitimate_constant_p() must
return for correct TOC behavior and for correct TLS argument behavior
conflict.

I have tried multiple approaches, such as wrapping the symbol in UNSPEC and
tweaking the legitimate_constant_p definition.  The current AIX TOC
implementation is too tied to force_const_mem() and the constant pool.
The calls.c test is tied to both CONST and TLS.  I would appreciate
not being told that this is abusing the definition of CONST in GCC or
that I should re-write the TOC implementation.

This patch adds a new target hook precompute_tls_p() to decide if an
argument should be precomputed regardless of the result from
legitimate_constant_p().
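
The combined test in precompute_register_parameters can be sketched as a plain boolean model. `ArgInfo` and `must_precompute` are hypothetical names; each field stands for the result of one predicate named in the patch.

```cpp
// Hypothetical model of one call argument; the fields mirror the three
// predicates tested in precompute_register_parameters.
struct ArgInfo {
    bool is_constant;          // CONSTANT_P (value)
    bool legitimate_constant;  // targetm.legitimate_constant_p (mode, value)
    bool precompute_tls;       // targetm.precompute_tls_p (mode, value)
};

// Returns true when the argument would be forced into a pseudo register
// before the call sequence is emitted.
constexpr bool must_precompute(const ArgInfo& a) {
    return a.is_constant && (!a.legitimate_constant || a.precompute_tls);
}
```

The case that matters for AIX is a constant that is legitimate (so it can live in the TOC) and yet has to be precomputed because it refers to TLS.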

If you want to consider this a hack for AIX, fine.

Bootstrapped on powerpc-ibm-aix7.2.3.0.

Thanks, David

* gcc/calls.c (precompute_register_parameters): Additionally test
targetm.precompute_tls_p to pre-compute argument.
* gcc/config/rs6000/aix.h (TARGET_PRECOMPUTE_TLS_P): Define.
* gcc/config/rs6000/rs6000.c (rs6000_aix_precompute_tls_p): New.
* gcc/target.def (precompute_tls_p): New.

diff --git a/gcc/calls.c b/gcc/calls.c
index ff606204772..883d08ba5f2 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -1002,7 +1002,8 @@ precompute_register_parameters (int num_actuals, struct arg_data *args,
/* If the value is a non-legitimate constant, force it into a
   pseudo now.  TLS symbols sometimes need a call to resolve.  */
if (CONSTANT_P (args[i].value)
-   && !targetm.legitimate_constant_p (args[i].mode, args[i].value))
+   && (!targetm.legitimate_constant_p (args[i].mode, args[i].value)
+   || targetm.precompute_tls_p (args[i].mode, args[i].value)))
  args[i].value = force_reg (args[i].mode, args[i].value);

/* If we're going to have to load the value by parts, pull the

diff --git a/gcc/config/rs6000/aix.h b/gcc/config/rs6000/aix.h
index 7fccb31307b..b116e1a36bb 100644
--- a/gcc/config/rs6000/aix.h
+++ b/gcc/config/rs6000/aix.h
@@ -279,3 +279,4 @@
 /* Use standard DWARF numbering for DWARF debugging information.  */
 #define RS6000_USE_DWARF_NUMBERING

+#define TARGET_PRECOMPUTE_TLS_P rs6000_aix_precompute_tls_p
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 48b8efd732b..e2010035ee8 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -9608,7 +9608,8 @@ rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
   && SYMBOL_REF_TLS_MODEL (XEXP (XEXP (x, 0), 0)) != 0)
 return true;

-  /* Do not place an ELF TLS symbol in the constant pool.  */
+  /* Allow AIX TOC TLS symbols in the constant pool,
+ but not ELF TLS symbols.  */
   return TARGET_ELF && tls_referenced_p (x);
 }

@@ -25370,6 +25371,18 @@ rs6000_legitimate_constant_p (machine_mode mode, rtx x)
   return true;
 }

+/* Implement TARGET_PRECOMPUTE_TLS_P.
+
+   On the AIX, TLS symbols are in the TOC, which is maintained in the
+   constant pool.  AIX TOC TLS symbols need to be pre-computed, but
+   must be considered legitimate constants.  */
+
+static bool
+rs6000_aix_precompute_tls_p (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
+{
+  return tls_referenced_p (x);
+}
+
 ^L
 /* Return TRUE iff the sequence ending in LAST sets the static chain.  */

diff --git a/gcc/target.def b/gcc/target.def
index d7b94bd8e5d..0ebfb58fa6f 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2715,6 +2715,18 @@ The default definition returns true.",
  bool, (machine_mode mode, rtx x),
  hook_bool_mode_rtx_true)

+/* True if X is a TLS operand whose value should be pre-computed.  */
+DEFHOOK
+(precompute_tls_p,
+ "This hook returns true if @var{x} is a TLS operand on the target\n\
+machine that should be pre-computed when used as the argument in a call.\n\
+You can assume that @var{x} satisfies @code{CONSTANT_P}, so you need not \n\
+check this.\n\
+\n\
+The default definition returns false.",
+ bool, (machine_mode mode, rtx x),
+ hook_bool_mode_rtx_false)
+
 /* True if the constant X cannot be placed in the constant pool.  */
 DEFHOOK
 (cannot_force_const_mem,


Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Michael Meissner via Gcc-patches
On Wed, Apr 14, 2021 at 02:15:47PM -0500, Segher Boessenkool wrote:
> On Wed, Apr 14, 2021 at 03:09:13PM -0400, Michael Meissner wrote:
> > On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> > > > * config/rs6000/rs6000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> > > 
> > > As said in the other mail, don't do the macro; just write its expansion
> > > in the single place it is used.
> > 
> > Note, in the first patch it is only used 1 time, but in the second patch it is
> > used 5 times (4 times in mode iterators in rs6000.md, 1 other use in rs6000.c).
> > But I will eliminate it, and replicate it in each of the 6 places it is used.
> 
> The alternative is to come up with a much better name :-/

I dunno, given what the macro is used for (i.e. whether we have the IEEE
128-bit minimum, maximum, and floating point compare mask),
FLOAT128_MIN_MAX_FPMASK_P meets the definition.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Michael Meissner via Gcc-patches
On Wed, Apr 14, 2021 at 02:38:47PM -0500, Segher Boessenkool wrote:
> On Fri, Apr 09, 2021 at 10:43:58AM -0400, Michael Meissner wrote:
> > (Fv mode attribute): Add KFmode and TFmode.
> > (movcc_fpmask): Replace
> > movcc_p9.  Add IEEE 128-bit fp support.
> > (movcc_invert_fpmask): Replace
> > movcc_invert_p9.  Add IEEE 128-bit fp
> > support.
> > (fpmask): Add IEEE 128-bit fp support.  Enable generator to
> > build the RTL.
> > (xxsel): Add IEEE 128-bit fp support.  Enable generator to
> > build the RTL.
> 
> > @@ -608,8 +621,13 @@ (define_mode_attr Ff   [(SF "f") (DF "d") (DI "d")])
> >  ; SF/DF constraint for arithmetic on VSX registers using instructions added in
> >  ; ISA 2.06 (power7).  This includes instructions that normally target DF mode,
> >  ; but are used on SFmode, since internally SFmode values are kept in the DFmode
> > -; format.
> > -(define_mode_attr Fv   [(SF "wa") (DF "wa") (DI "wa")])
> > +; format.  Also include IEEE 128-bit instructions which are restricted to the
> > +; Altivec registers.
> > +(define_mode_attr Fv   [(SF "wa")
> > +(DF "wa")
> > +(DI "wa")
> > +(KF "v")
> > +(TF "v")])
> 
> Eww.  Please just split the patterns.  Fv should just go away, it is
> always "wa" currently.  Removing that cascades to more cleanups, which
> is why I haven't done it yet, it takes time.

The problem is you have a combinatorial explosion.  Right now, there are two
patterns, one for the normal move, and one for the inverted move.  Without
doing a cascaded combination, you would need some 32 patterns to cover all of
the possibilities.

Or you give up on having a conditional move that compares one type and moves a
second:

_Float128 a, b;
double c, d, r;

r = (a == b) ? c : d;

As I recall, when I put the original logic in, there were a few places in
real code that used this mixed comparison between SF/DF modes.
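
A portable sketch of the same shape, with the comparison done in one floating-point type and the selected values in another. Here float and double stand in for the _Float128/double pairing so that it builds on any target; the function name is hypothetical.

```cpp
// Compare in one floating-point type, select in another.  float/double
// stand in for the _Float128/double mix so this builds on any target.
double select_on_float_compare(float a, float b, double c, double d) {
    return (a == b) ? c : d;
}
```

Each (compare mode, move mode) pair is what a separate pattern would have to cover if the patterns were split.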

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Richard Sandiford via Gcc-patches
Richard Sandiford writes:
> Jakub Jelinek writes:
>> On Wed, Apr 14, 2021 at 05:31:23PM +0100, Richard Sandiford wrote:
>>> > +(define_split
>>> > +  [(set (match_operand:GPI 0 "register_operand")
>>> > + (LOGICAL:GPI
>>> > +   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>>> > +(match_operand:QI 2 "aarch64_shift_imm_<mode>"))
>>> > +(match_operand:GPI 4 "const_int_operand"))
>>> > +   (zero_extend:GPI (match_operand 3 "register_operand"))))]
>>> > +  "can_create_pseudo_p ()
>>> > +   && REG_P (operands[1])
>>> > +   && REG_P (operands[3])
>>> > +   && REGNO (operands[1]) == REGNO (operands[3])
>>> > +   && ((unsigned HOST_WIDE_INT)
>>> > +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>>> > +<< INTVAL (operands[2]), <MODE>mode)
>>> > +   == UINTVAL (operands[4]))"
>>> 
>>> IMO this would be easier to understand as:
>>> 
>>>    && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>>>                            << INTVAL (operands[2]), <MODE>mode)
>>>        == INTVAL (operands[4]))
>>> 
>>> (At first I thought the cast and UINTVAL were trying to escape the
>>> sign-extension canonicalisation.)
>>
>> It is ok to write it that way, you're right, I wrote it with
>> UINTVAL etc. because I initially didn't use trunc_int_for_mode
>> but that is wrong for SImode if the mask is shifted into bit 31.
>>
>>> I'm not sure about this one though.  The REGNO checks mean that this is
>>> effectively for hard registers only.  I thought one of the reasons for
>>> make_more_copies was to avoid combining hard registers like this, so I'm
>>> not sure we should have a pattern that specifically targets them.
>>> 
>>> Segher, have I misunderstood?
>>
>> Yes, this one works only with the hard regs, the problem is that when
>> the hard regs are there, combiner doesn't try anything else, so without
>> such splitter it punts on that.
>> If I add yet another testcase which doesn't have hard registers, like:
>> unsigned
>> or_shift2 (void)
>> {
>>   unsigned char i = 0;
>>   asm volatile ("" : "+r" (i));
>>   return i | (i << 11);
>> }
>> then my patch doesn't handle that case, and the only splitter that would
>> help would need to deal with:
>> (set (reg/i:SI 0 x0)
>>     (ior:SI (and:SI (ashift:SI (subreg:SI (reg:QI 97 [ i ]) 0)
>>                 (const_int 11 [0xb]))
>>             (const_int 522240 [0x7f800]))
>>         (zero_extend:SI (reg:QI 97 [ i ]))))
>> I have added another combine splitter for this below.  But as you can
>> see, what combiner simplification comes with isn't really consistent
>> and orthogonal, different operations in there look quite differently :(.
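
The mask comparison discussed above can be checked with ordinary integer arithmetic. The sketch below imitates, for SImode only, what truncating to the mode and sign-extending does to the shifted QImode mask. Both helper names are hypothetical and the shift is assumed to be below 32.

```cpp
#include <cstdint>

// Sign-extend the low 32 bits of v, the way a CONST_INT for a 32-bit mode
// is stored in a wider host integer.
constexpr std::int64_t sign_extend_32(std::int64_t v) {
    return (v & 0x80000000) ? v - 0x100000000 : v;
}

// Model of GET_MODE_MASK (QImode) << shift, truncated to SImode.
// Assumes shift < 32.
constexpr std::int64_t shifted_qi_mask_in_si(unsigned shift) {
    return sign_extend_32((std::int64_t{0xff} << shift) & 0xffffffff);
}
```

For a shift of 11 this gives 522240 (0x7f800), the constant in the RTL dump. For a shift of 24 the mask reaches bit 31 and the stored value is negative, so comparing against the unsigned shifted mask without truncation would not match.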
>
> Hmm, OK.  Still, the above looks reasonable on first principles.
>
>>> These two look good to me apart from the cast nit.  The last one feels
>>> like it's more general than just sign_extends though.  I guess it would
>>> work for any duplicated operation that can be performed in a single
>>> instruction.
>>
>> True, but only very small portion of them can actually make it through,
>> it needs something that combine has been able to propagate into another
>> instruction.  So if we know about other insns that would look the same
>> and would actually be ever matched, we can e.g. define an operator predicate
>> for it, but until we have testcases for that, not sure it is worth it.
>>
>> Here is an updated patch that handles also the zero extends without hard
>> registers and doesn't have the UHWI casts (but untested for now except
>> for the testcase):
>>
>> 2021-04-14  Jakub Jelinek  
>>
>>  PR target/100056
>>  * config/aarch64/aarch64.md (*_3):
>>  Add combine splitters for *_ashl3 with
>>  ZERO_EXTEND, SIGN_EXTEND or AND.
>>
>>  * gcc.target/aarch64/pr100056.c: New test.
>>
>> --- gcc/config/aarch64/aarch64.md.jj 2021-04-13 20:41:45.030040848 +0200
>> +++ gcc/config/aarch64/aarch64.md2021-04-14 19:07:41.641623978 +0200
>> @@ -4431,6 +4431,75 @@ (define_insn "*_>[(set_attr "type" "logic_shift_imm")]
>>  )
>>  
>> +(define_split
>> +  [(set (match_operand:GPI 0 "register_operand")
>> +(LOGICAL:GPI
>> +  (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>> +   (match_operand:QI 2 "aarch64_shift_imm_<mode>"))
>> +   (match_operand:GPI 4 "const_int_operand"))
>> +  (zero_extend:GPI (match_operand 3 "register_operand"))))]
>> +  "can_create_pseudo_p ()
>> +   && REG_P (operands[1])
>> +   && REG_P (operands[3])
>> +   && REGNO (operands[1]) == REGNO (operands[3])
>> +   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>> +   << INTVAL (operands[2]), <MODE>mode)
>> +   == INTVAL (operands[4]))"
>> +  [(set (match_dup 4) (zero_extend:GPI (match_dup 3)))
>> +   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
>> +   (match_dup 4)))]
>> +  "operands[4] = gen_reg_rtx (<MODE>mode);"
>> +)
>> +
>> +(define_split
>> +  [(set 

Re: [PATCH] c: Don't drop vector attributes that affect type identity [PR98852]

2021-04-14 Thread Jeff Law via Gcc-patches



On 4/14/2021 9:32 AM, Richard Sandiford via Gcc-patches wrote:

<arm_neon.h> types are distinct from GNU vector types in at least
their mangling.  However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs.  This caused problems like the ones
reported in PR95726.

The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors.  However, this in turn caused PR98852, where c_common_type
would unconditionally drop the attributes on the source types.
This meant that:

 <arm_neon.h> vector + <arm_neon.h> vector

had a GNU vector type rather than an <arm_neon.h> vector type.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96377#c2 for
Jakub's analysis of the history of this c_common_type code.
TBH I'm not sure which case the build_type_attribute_variant
code is handling, but I think we should at least avoid dropping
attributes that affect type identity.

I've tried to audit the C and target-specific attributes to look
for other types that might be affected by this, but I couldn't
see any.  We are only dealing with:

   gcc_assert (code1 == VECTOR_TYPE || code1 == COMPLEX_TYPE
  || code1 == FIXED_POINT_TYPE || code1 == REAL_TYPE
  || code1 == INTEGER_TYPE);

which excludes most affects_type_identity attributes.  The closest
was s390_vector_bool, but the handler for that attribute changes
the type node and drops the attribute itself (*no_add_attrs = true).

I put the main list handling into a separate function
(remove_attributes_matching) because a later patch will need it
for something else.
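
As a rough model of the list handling described here, the sketch below filters an attribute list down to the entries that affect type identity. `Attr`, `identity_attributes` and the attribute names are hypothetical. The real code works on GCC's TREE_LIST attribute chains rather than a std::vector.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a type attribute.
struct Attr {
    std::string name;
    bool affects_type_identity;
};

// Keep only the attributes that affect type identity, preserving order.
std::vector<Attr> identity_attributes(const std::vector<Attr>& attrs) {
    std::vector<Attr> kept;
    for (const Attr& a : attrs)
        if (a.affects_type_identity)
            kept.push_back(a);
    return kept;
}
```

The point of the fix is that the common type keeps this filtered subset instead of losing every attribute.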

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi and
x86_64-linux-gnu.  OK for trunk?  The bug also occurs on GCC 10 branch,
but we'll need a slightly different fix there.

Thanks,
Richard


gcc/
PR c/98852
* attribs.h (affects_type_identity_attributes): Declare.
* attribs.c (remove_attributes_matching): New function.
(affects_type_identity_attributes): Likewise.

gcc/c/
PR c/98852
* c-typeck.c (c_common_type): Do not drop attributes that
affect type identity.

gcc/testsuite/
PR c/98852
* gcc.target/aarch64/advsimd-intrinsics/pr98852.c: New test.


OK


Jeff


Re: [PATCH] c++: Tweak merging of vector attributes that affect type identity [PR98852]

2021-04-14 Thread Jeff Law via Gcc-patches



On 4/14/2021 9:36 AM, Richard Sandiford via Gcc-patches wrote:

<arm_neon.h> types are distinct from GNU vector types in at least
their mangling.  However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs.  This caused problems like the ones
reported in PR95726.

The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors.  However, this in turn caused PR98852, where cp_common_type
would merge the type attributes from the two source types and attach
the result to the common type.  For example:

unsigned vector with no attribute + signed vector with attribute X

would get converted to:

unsigned vector with attribute X

That isn't what we want in this case, since X describes the mangling
of the original type.  But even if we dropped the mangling from X and
worked it out from context, we would still have a situation in which
the common type was provably distinct from both of the source types:
it would take its <arm_neon.h>-ness from one side and its signedness
from the other.  I guess there are other cases where the common type
doesn't match either side, but I'm not sure it's the obvious behaviour
here.  It's also different from GCC 10.1 and earlier, where the unsigned
vector “won” in its original form.

This patch instead merges only the attributes that don't affect type
identity.  For now I've restricted it to vector types, since we're so
close to GCC 11, but it might make sense to use this elsewhere.
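
One way to picture the merge rule is the sketch below: start from one type's attributes and take from the other side only the attributes that do not affect type identity. `TypeAttr` and `merge_non_identity` are hypothetical, and this is a simplified model rather than the patch's implementation.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model of a type attribute.
struct TypeAttr {
    std::string name;
    bool affects_type_identity;
};

// Start from `base` unchanged, then add attributes from `other` only when
// they do not affect type identity and are not already present.
std::vector<TypeAttr> merge_non_identity(std::vector<TypeAttr> base,
                                         const std::vector<TypeAttr>& other) {
    for (const TypeAttr& a : other) {
        if (a.affects_type_identity)
            continue;
        bool present = std::any_of(
            base.begin(), base.end(),
            [&](const TypeAttr& b) { return b.name == a.name; });
        if (!present)
            base.push_back(a);
    }
    return base;
}
```

In the example from the message, the unsigned vector with no attribute would not pick up attribute X from the signed vector.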

I've tried to audit the C and target-specific attributes to look for
other types that might be affected by this, but I couldn't see any.
The closest was s390_vector_bool, but the handler for that attribute
changes the type node and drops the attribute itself
(*no_add_attrs = true).

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi and
x86_64-linux-gnu.  OK for trunk?  The bug also occurs on GCC 10 branch,
but we'll need a slightly different fix there.

Richard


gcc/
PR c++/98852
* attribs.h (restrict_type_identity_attributes_to): Declare.
* attribs.c (restrict_type_identity_attributes_to): New function.

gcc/cp/
PR c++/98852
* typeck.c (merge_type_attributes_from): New function.
(cp_common_type): Use it for vector types.


OK

jeff



Re: [PATCH] c: Don't drop vector attributes that affect type identity [PR98852]

2021-04-14 Thread Joseph Myers
On Wed, 14 Apr 2021, Richard Sandiford via Gcc-patches wrote:

> gcc/
>   PR c/98852
>   * attribs.h (affects_type_identity_attributes): Declare.
>   * attribs.c (remove_attributes_matching): New function.
>   (affects_type_identity_attributes): Likewise.
> 
> gcc/c/
>   PR c/98852
>   * c-typeck.c (c_common_type): Do not drop attributes that
>   affect type identity.
> 
> gcc/testsuite/
>   PR c/98852
>   * gcc.target/aarch64/advsimd-intrinsics/pr98852.c: New test.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Bernhard Reutner-Fischer via Gcc-patches
On 14 April 2021 21:01:15 CEST, Segher Boessenkool wrote:

>> > > --- /dev/null
>> > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
>> > > @@ -0,0 +1,93 @@
>> > > +/* { dg-do compile } */
>> > > +/* { dg-require-effective-target ppc_float128_hw } */
>> > > +/* { dg-require-effective-target power10_ok } */
>> > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>> > > +/* { dg-final { scan-assembler {\mxscmpeq[dq]p\M} } } */
>> > > +/* { dg-final { scan-assembler {\mxxpermdi\M} } } */
>> > > +/* { dg-final { scan-assembler {\mxxsel\M}} } */
>> > > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
>> > > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M} } } */
>> > > +/* { dg-final { scan-assembler-not {\mfsel\M} } } */
>> 
>> I'd have expected scan-assembler-times fwiw.
>
>For what?  scan-assembler-not *is* scan-assembler-times, in effect (but
>simpler of course, and it does work with capturing parens).

I meant -times for the occurrences of scan-assembler, not the -not, in case 
that wasn't clear.

>Having too strict checks for generated code means no end to having to
>update many testcases when we have very small changes in the compiler.
>It's a balancing act.  But maybe some -times would be good here, dunno.
>
>> > > +__float128
>> > > +eq_f128_d (__float128 a, __float128 b, double x, double y)
>> > > +{
>> > > +  return (x != y) ? a : b;
>> > > +}
>> 
>> I would think the above should be == since it's named eq_ and
>> the body would be redundant to ne_f128_d below as is.
>
>Good spot :-)

Well -times would maybe have caught exactly this I suppose.

I know the exact count can be cumbersome to maintain, but in this very specific 
case which checks exactly the desired instruction it may be appropriate.
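
To make the difference between a presence check and an exact-count check concrete, here is a small sketch.  The helpers appears and appears_exactly are hypothetical C++ stand-ins, not the DejaGnu directives, and the assembler text is made up.  Whether an exact count would really have caught the typo depends on the generated code, which the thread does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of PATTERN in TEXT.
std::size_t count_matches(const std::string &text, const std::string &pattern) {
  const std::regex re(pattern);
  const auto first = std::sregex_iterator(text.begin(), text.end(), re);
  return static_cast<std::size_t>(std::distance(first, std::sregex_iterator()));
}

// Presence check: passes for one match or for many.
bool appears(const std::string &text, const std::string &pattern) {
  return count_matches(text, pattern) > 0;
}

// Exact-count check: passes only when the number of matches is N.
bool appears_exactly(const std::string &text, const std::string &pattern,
                     std::size_t n) {
  return count_matches(text, pattern) == n;
}

// Usage on made-up assembler output containing two compares and two selects.
void demo_counts() {
  const std::string text =
      "xscmpeqqp 0,34,35\nxxsel 34,36,37,0\n"
      "xscmpeqqp 0,34,35\nxxsel 34,37,36,0\n";
  assert(appears(text, "\\bxscmpeq[dq]p\\b"));
  assert(!appears_exactly(text, "\\bxscmpeq[dq]p\\b", 1));
  assert(appears_exactly(text, "\\bxscmpeq[dq]p\\b", 2));
  assert(!appears(text, "\\bfsel\\b"));
}
```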

Just saying, prompted by the typo..
thanks,


Re: [PATCH] re PR tree-optimization/93210 (Sub-optimal code optimization on struct/combound constexpr (gcc vs. clang))

2021-04-14 Thread Jeff Law via Gcc-patches



On 4/14/2021 11:13 AM, Stefan Schulze Frielinghaus via Gcc-patches wrote:

Regarding test gcc.dg/pr93210.c, on different targets GIMPLE code may
slightly differ which is why the scan-tree-dump-times directive may
fail.  For example, for a RETURN_EXPR on x86_64 we have

   return 0x11100f0e0d0c0a090807060504030201;

whereas on IBM Z the first operand is a RESULT_DECL like

= 0x102030405060708090a0c0d0e0f1011;
   return ;

gcc/testsuite/ChangeLog:

* gcc.dg/pr93210.c: Adapt regex in order to also support a
RESULT_DECL as an operand for a RETURN_EXPR.

Ok for mainline?


OK

jeff



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 02:47:59PM -0500, Segher Boessenkool wrote:
> On Wed, Apr 14, 2021 at 09:45:35PM +0200, Jakub Jelinek wrote:
> > On Wed, Apr 14, 2021 at 02:42:54PM -0500, Segher Boessenkool wrote:
> > > > provably doesn't (that is from the splitter I wrote for the non-hard 
> > > > regs),
> > > > nor
> > > >   [(set (match_operand:GPI 0 "register_operand")
> > > > (LOGICAL:GPI
> > > >   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> > > >(match_operand:QI 2 
> > > > "aarch64_shift_imm_"))
> > > >(match_operand:GPI 3 "const_int_operand"))
> > > >   (zero_extend:GPI (subreg (match_dup 1) 0]
> > > > works (and it is unclear how I'd find out the mode of the subreg even 
> > > > if it
> > > > worked).
> > > 
> > > Just
> > >   (subreg:QI (match_dup 1) 0)
> > > should work?
> > 
> > That doesn't work either.
> 
> Why not?  What goes wrong with that?

It just doesn't match and therefore doesn't split it.

Jakub



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 09:45:35PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 14, 2021 at 02:42:54PM -0500, Segher Boessenkool wrote:
> > > provably doesn't (that is from the splitter I wrote for the non-hard 
> > > regs),
> > > nor
> > >   [(set (match_operand:GPI 0 "register_operand")
> > > (LOGICAL:GPI
> > >   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> > >(match_operand:QI 2 
> > > "aarch64_shift_imm_"))
> > >(match_operand:GPI 3 "const_int_operand"))
> > >   (zero_extend:GPI (subreg (match_dup 1) 0]
> > > works (and it is unclear how I'd find out the mode of the subreg even if 
> > > it
> > > worked).
> > 
> > Just
> >   (subreg:QI (match_dup 1) 0)
> > should work?
> 
> That doesn't work either.

Why not?  What goes wrong with that?


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 09:25:46PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 14, 2021 at 01:47:04PM -0500, Segher Boessenkool wrote:
> > > and I must say I don't know if make_more_copies was meant to
> > > split insn 2 into (set (reg:QI pseudo) (reg:QI 0 x0)) and
> > > (set (reg/v:SI 96) (zero_extend:SI (reg:QI pseudo)))
> > > or not.
> > 
> > It makes
> > 
> > (set (reg:QI new) (reg:QI x0))
> > (set (reg:SI 96) (zero_extend:SI (reg:QI new)))
> > 
> > The point is it keeps exactly the same form, but no hard regs anymore.
> 
> It doesn't, as make_more_copies does:
>   rtx dest = SET_DEST (set);
>   if (!(REG_P (dest) && !HARD_REGISTER_P (dest)))
>   continue;
>
>   rtx src = SET_SRC (set);
>   if (!(REG_P (src) && HARD_REGISTER_P (src)))
> continue;
> but in this case the hard reg is wrapped into the zero_extend already
> and so it will continue;

Ah, I see.  That could/should be improved then.  But, GCC 12 :-)


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 02:42:54PM -0500, Segher Boessenkool wrote:
> > provably doesn't (that is from the splitter I wrote for the non-hard regs),
> > nor
> >   [(set (match_operand:GPI 0 "register_operand")
> > (LOGICAL:GPI
> >   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> >(match_operand:QI 2 
> > "aarch64_shift_imm_"))
> >(match_operand:GPI 3 "const_int_operand"))
> >   (zero_extend:GPI (subreg (match_dup 1) 0]
> > works (and it is unclear how I'd find out the mode of the subreg even if it
> > worked).
> 
> Just
>   (subreg:QI (match_dup 1) 0)
> should work?

That doesn't work either.

Jakub



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 09:20:35PM +0200, Jakub Jelinek wrote:
> The question is what to write in the splitter pattern so that
> they would match.
>   [(set (match_operand:GPI 0 "register_operand")
> (LOGICAL:GPI
>   (and:GPI (ashift:GPI (match_operator:GPI 4 "subreg_lowpart_operator"
>  [(match_operand 1 "register_operand")])
>(match_operand:QI 2 
> "aarch64_shift_imm_"))
>(match_operand:GPI 3 "const_int_operand"))
>   (zero_extend:GPI (match_dup 1]
> provably doesn't (that is from the splitter I wrote for the non-hard regs),
> nor
>   [(set (match_operand:GPI 0 "register_operand")
> (LOGICAL:GPI
>   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>(match_operand:QI 2 
> "aarch64_shift_imm_"))
>(match_operand:GPI 3 "const_int_operand"))
>   (zero_extend:GPI (subreg (match_dup 1) 0]
> works (and it is unclear how I'd find out the mode of the subreg even if it
> worked).

What seems to work and handles both the pseudos (in which case there
is a (subreg:GPI (reg:whatever xyz) 0) and (reg:whatever xyz)
in the operands) and the hard registers (in which case there is
(reg:GPI xyz) and (reg:whatever xyz) in the operands) is:

(define_split
  [(set (match_operand:GPI 0 "register_operand")
(LOGICAL:GPI
  (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
   (match_operand:QI 2 "aarch64_shift_imm_"))
   (match_operand:GPI 3 "const_int_operand"))
  (zero_extend:GPI (match_operand 4 "register_operand"]
  "can_create_pseudo_p ()
   && ((paradoxical_subreg_p (operands[1])
&& rtx_equal_p (SUBREG_REG (operands[1]), operands[4]))
   || (REG_P (operands[1])
   && REG_P (operands[4])
   && REGNO (operands[1]) == REGNO (operands[4])))
   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[4]))
   << INTVAL (operands[2]), mode)
   == INTVAL (operands[3]))"
  [(set (match_dup 5) (zero_extend:GPI (match_dup 4)))
   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 5) (match_dup 2))
   (match_dup 5)))]
  "operands[5] = gen_reg_rtx (mode);"
)

While it is one pattern, it needs different handling for the two cases.
Is that acceptable?
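
For readers following the mask condition, here is a sketch of the arithmetic it relies on, assuming an 8-bit narrow mode and a 32-bit wide mode (both choices, and all function names, are illustrative only).  When the AND constant equals the narrow mode's mask shifted left by the shift count, shifting a wide value with arbitrary upper bits and then masking gives the same result as zero-extending the narrow value and shifting it.

```cpp
#include <cassert>
#include <cstdint>

// Mask of the 8-bit mode, shifted left as in the split condition.
std::uint32_t shifted_mask(unsigned shift) {
  return std::uint32_t{0xff} << shift;
}

// Before the split: shift a wide value whose upper 24 bits are arbitrary
// (the paradoxical subreg case), then AND with the shifted mask.
std::uint32_t before_split(std::uint32_t wide, unsigned shift) {
  return (wide << shift) & shifted_mask(shift);
}

// After the split: zero-extend the narrow value first, then shift.
std::uint32_t after_split(std::uint8_t narrow, unsigned shift) {
  return std::uint32_t{narrow} << shift;
}

// Exhaustively compare both forms for every 8-bit value and shift count.
bool split_preserves_value() {
  for (unsigned shift = 0; shift < 32; ++shift)
    for (std::uint32_t low = 0; low <= 0xff; ++low) {
      const std::uint32_t wide = 0xABCDEF00u | low;  // junk in the upper bits
      if (before_split(wide, shift)
          != after_split(static_cast<std::uint8_t>(low), shift))
        return false;
    }
  return true;
}
```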

Jakub



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 09:20:33PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 14, 2021 at 01:47:04PM -0500, Segher Boessenkool wrote:
> > A subreg:QI of the match_dup should match fine.  You can use a subreg
> > wherever GCC tries to match a reg.
> > 
> > > match_dup means insn-recog.c calls rtx_equal_p and that returns false if 
> > > the
> > > mode is not the same.
> > 
> > Yes, but they are the same :-)
> 
> In the end sure.
> > 
> > (reg:SI whatever)  and  (subreg:QI (reg:SI whatever) 0)
> 
> The question is what to write in the splitter pattern so that
> they would match.
>   [(set (match_operand:GPI 0 "register_operand")
> (LOGICAL:GPI
>   (and:GPI (ashift:GPI (match_operator:GPI 4 "subreg_lowpart_operator"
>  [(match_operand 1 "register_operand")])
>(match_operand:QI 2 
> "aarch64_shift_imm_"))
>(match_operand:GPI 3 "const_int_operand"))
>   (zero_extend:GPI (match_dup 1]
> provably doesn't (that is from the splitter I wrote for the non-hard regs),
> nor
>   [(set (match_operand:GPI 0 "register_operand")
> (LOGICAL:GPI
>   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>(match_operand:QI 2 
> "aarch64_shift_imm_"))
>(match_operand:GPI 3 "const_int_operand"))
>   (zero_extend:GPI (subreg (match_dup 1) 0]
> works (and it is unclear how I'd find out the mode of the subreg even if it
> worked).

Just
  (subreg:QI (match_dup 1) 0)
should work?


Segher


Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Segher Boessenkool
On Fri, Apr 09, 2021 at 10:43:58AM -0400, Michael Meissner wrote:
> (Fv mode attribute): Add KFmode and TFmode.
> (movcc_fpmask): Replace
> movcc_p9.  Add IEEE 128-bit fp support.
> (movcc_invert_fpmask): Replace
> movcc_invert_p9.  Add IEEE 128-bit fp
> support.
> (fpmask): Add IEEE 128-bit fp support.  Enable generator to
> build the RTL.
> (xxsel): Add IEEE 128-bit fp support.  Enable generator to
> build the RTL.

> @@ -608,8 +621,13 @@ (define_mode_attr Ff [(SF "f") (DF "d") (DI 
> "d")])
>  ; SF/DF constraint for arithmetic on VSX registers using instructions added 
> in
>  ; ISA 2.06 (power7).  This includes instructions that normally target DF 
> mode,
>  ; but are used on SFmode, since internally SFmode values are kept in the 
> DFmode
> -; format.
> -(define_mode_attr Fv [(SF "wa") (DF "wa") (DI "wa")])
> +; format.  Also include IEEE 128-bit instructions which are restricted to the
> +; Altivec registers.
> +(define_mode_attr Fv [(SF "wa")
> +  (DF "wa")
> +  (DI "wa")
> +  (KF "v")
> +  (TF "v")])

Eww.  Please just split the patterns.  Fv should just go away, it is
always "wa" currently.  Removing that cascades to more cleanups, which
is why I haven't done it yet, it takes time.

Almost all places that use Fv have no use at all for KF.

>  (define_expand "movcc"
> -   [(set (match_operand:SFDF 0 "gpc_reg_operand")
> -  (if_then_else:SFDF (match_operand 1 "comparison_operator")
> - (match_operand:SFDF 2 "gpc_reg_operand")
> - (match_operand:SFDF 3 "gpc_reg_operand")))]
> +   [(set (match_operand:FPMASK 0 "gpc_reg_operand")
> +  (if_then_else:FPMASK (match_operand 1 "comparison_operator")
> +   (match_operand:FPMASK 2 "gpc_reg_operand")
> +   (match_operand:FPMASK 3 "gpc_reg_operand")))]

So you really want SFDFQF or such?  That is much more generic than
"FPMASK", which doesn't explain what it means at all, either.

But, you can keep the patterns separate as well.

> - [(set_attr "length" "8")
> + ;; length is 12 in case we need to add XXPERMDI
> + [(set_attr "length" "12")

Which is only for QP.  So really, just keep the patterns split.

> +  return (FLOAT128_IEEE_P (mode)
> +   ? "xscmp%V1qp %0,%2,%3"
> +   : "xscmp%V1dp %x0,%x2,%x3");

Different output as well.

> -(define_insn "*xxsel"
> -  [(set (match_operand:SFDF 0 "vsx_register_operand" "=")
> - (if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" 
> "wa")
> -(match_operand:V2DI 2 "zero_constant" ""))
> -(match_operand:SFDF 3 "vsx_register_operand" "")
> -(match_operand:SFDF 4 "vsx_register_operand" 
> "")))]
> +(define_insn "xxsel"
> +  [(set (match_operand:FPMASK 0 "vsx_register_operand" "=wa")
> + (if_then_else:FPMASK
> +  (ne (match_operand:V2DI 1 "vsx_register_operand" "wa")
> +  (match_operand:V2DI 2 "zero_constant" ""))
> +  (match_operand:FPMASK 3 "vsx_register_operand" "wa")
> +  (match_operand:FPMASK 4 "vsx_register_operand" "wa")))]
>"TARGET_P9_MINMAX"
>"xxsel %x0,%x4,%x3,%x1"
>[(set_attr "type" "vecmove")])

Please keep that a "*"; it should be generated via "movcc".

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> @@ -0,0 +1,93 @@

> +/* { dg-final { scan-assembler {\mxxsel\M}} } */

> +/* { dg-final { scan-assembler-not {\mfsel\M} } } */

It is somewhat problematic to require xxsel and disallow fsel (for one
thing, the compiler could always generate xxsel instead of any fsel).
But it will probably keep working fine, the routines here are very
short.

> +__float128
> +eq_f128_d (__float128 a, __float128 b, double x, double y)
> +{
> +  return (x != y) ? a : b;
> +}

So "==" here.
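
A sketch of what the corrected pair might look like.  The type quad is a long double stand-in for __float128 so that the example builds on any target, and the _sketch names are hypothetical rather than the names in the test.

```cpp
#include <cassert>

// Stand-in for __float128 (assumption, for portability of the sketch).
using quad = long double;

// Selects A when X equals Y, matching the eq_ prefix.
quad eq_sketch(quad a, quad b, double x, double y) {
  return (x == y) ? a : b;
}

// Selects A when X differs from Y; this is what the quoted eq_f128_d body
// computes as written.
quad ne_sketch(quad a, quad b, double x, double y) {
  return (x != y) ? a : b;
}

// Usage: the two functions pick opposite operands for the same inputs.
void demo_eq_ne() {
  assert(eq_sketch(1.0L, 2.0L, 3.0, 3.0) == 1.0L);
  assert(ne_sketch(1.0L, 2.0L, 3.0, 3.0) == 2.0L);
}
```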


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 01:47:04PM -0500, Segher Boessenkool wrote:
> > and I must say I don't know if make_more_copies was meant to
> > split insn 2 into (set (reg:QI pseudo) (reg:QI 0 x0)) and
> > (set (reg/v:SI 96) (zero_extend:SI (reg:QI pseudo)))
> > or not.
> 
> It makes
> 
> (set (reg:QI new) (reg:QI x0))
> (set (reg:SI 96) (zero_extend:SI (reg:QI new)))
> 
> The point is it keeps exactly the same form, but no hard regs anymore.

It doesn't, as make_more_copies does:
  rtx dest = SET_DEST (set);
  if (!(REG_P (dest) && !HARD_REGISTER_P (dest)))
  continue;
   
  rtx src = SET_SRC (set);
  if (!(REG_P (src) && HARD_REGISTER_P (src)))
continue;
but in this case the hard reg is wrapped into the zero_extend already
and so it will continue;
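
The effect of those two tests can be sketched on a toy model (the types below are hypothetical and are not GCC's rtx): a plain copy from a hard register passes the filter, while a source that wraps the hard register in a zero_extend does not.

```cpp
#include <cassert>

// Toy model of the two source shapes discussed; not GCC's representation.
enum class Code { Reg, ZeroExtend };

struct Expr {
  Code code;
  bool hard_register;  // meaningful only when code == Code::Reg
};

struct SetInsn {
  Expr dest;
  Expr src;
};

// Mirrors the two quoted tests: the destination must be a pseudo register
// and the source must be a bare hard register.
bool is_copy_from_hard_reg(const SetInsn &set) {
  const bool dest_is_pseudo =
      set.dest.code == Code::Reg && !set.dest.hard_register;
  const bool src_is_hard_reg =
      set.src.code == Code::Reg && set.src.hard_register;
  return dest_is_pseudo && src_is_hard_reg;
}

// Usage: only the plain copy would be considered for an extra copy.
void demo_filter() {
  const Expr pseudo{Code::Reg, false};
  const Expr hard{Code::Reg, true};
  const Expr extended{Code::ZeroExtend, false};  // zero_extend of a hard reg
  assert(is_copy_from_hard_reg(SetInsn{pseudo, hard}));
  assert(!is_copy_from_hard_reg(SetInsn{pseudo, extended}));
}
```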

Jakub



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 01:47:04PM -0500, Segher Boessenkool wrote:
> A subreg:QI of the match_dup should match fine.  You can use a subreg
> wherever GCC tries to match a reg.
> 
> > match_dup means insn-recog.c calls rtx_equal_p and that returns false if the
> > mode is not the same.
> 
> Yes, but they are the same :-)

In the end sure.
> 
> (reg:SI whatever)  and  (subreg:QI (reg:SI whatever) 0)

The question is what to write in the splitter pattern so that
they would match.
  [(set (match_operand:GPI 0 "register_operand")
(LOGICAL:GPI
  (and:GPI (ashift:GPI (match_operator:GPI 4 "subreg_lowpart_operator"
 [(match_operand 1 "register_operand")])
   (match_operand:QI 2 "aarch64_shift_imm_"))
   (match_operand:GPI 3 "const_int_operand"))
  (zero_extend:GPI (match_dup 1]
provably doesn't (that is from the splitter I wrote for the non-hard regs),
nor
  [(set (match_operand:GPI 0 "register_operand")
(LOGICAL:GPI
  (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
   (match_operand:QI 2 "aarch64_shift_imm_"))
   (match_operand:GPI 3 "const_int_operand"))
  (zero_extend:GPI (subreg (match_dup 1) 0]
works (and it is unclear how I'd find out the mode of the subreg even if it
worked).

Jakub



[PATCH] c++: Fix up C++23 [] <...> requires primary -> type {} parsing [PR99850]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
Hi!

The requires-clause parsing has code that suggests wrapping
non-primary expressions in (): if it e.g. parses a primary expression
and sees that it is followed by ++, --, ., ( or ->, among other things, it
tries to reparse it as an assignment expression or similar and, if that
works, suggests wrapping it in parens.
When it is requires-clause that is after  etc. it already
has an exception from that as ( can occur in valid C++20 expression there
- starting the parameters of the lambda.
In C++23 another case can occur, as the parameters with the ()s can be
omitted, requires C can be followed immediately by -> which starts a
trailing return type.  Even in that case, we don't want to parse that
as C->...
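
The decision described above can be sketched as a small classifier.  This is a simplified toy, not the parser's code: the enums and the function name are hypothetical, only a few tokens are modelled, and treating every other token as acceptable is an assumption made for the sketch.

```cpp
#include <cassert>

// Hypothetical, reduced token and dialect sets for illustration only.
enum class Token { PlusPlus, MinusMinus, Dot, Deref, Other };
enum class Dialect { Cxx20 = 20, Cxx23 = 23 };
enum class Outcome { Ok, MaybePostfix };

// Decide what to do with the token that follows a primary constraint.
// LAMBDA_P is true when the requires-clause precedes a lambda-declarator.
Outcome classify_after_primary(Token next, bool lambda_p, Dialect dialect) {
  switch (next) {
    case Token::PlusPlus:
    case Token::MinusMinus:
    case Token::Dot:
      return Outcome::MaybePostfix;  // unenclosed postfix operator
    case Token::Deref:
      // In C++23 the -> may start the lambda's trailing return type.
      if (lambda_p && dialect >= Dialect::Cxx23)
        return Outcome::Ok;
      return Outcome::MaybePostfix;
    case Token::Other:
      return Outcome::Ok;  // simplification for this sketch
  }
  return Outcome::Ok;
}

// Usage: -> is accepted only for a lambda in C++23 or later.
void demo_classify() {
  assert(classify_after_primary(Token::Deref, true, Dialect::Cxx23)
         == Outcome::Ok);
  assert(classify_after_primary(Token::Deref, true, Dialect::Cxx20)
         == Outcome::MaybePostfix);
  assert(classify_after_primary(Token::Deref, false, Dialect::Cxx23)
         == Outcome::MaybePostfix);
}
```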

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux (with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b ), ok for trunk?

2021-04-14  Jakub Jelinek  

PR c++/99850
* parser.c (cp_parser_constraint_requires_parens) :
If lambda_p, return pce_ok for C++23 or later instead of
pce_maybe_postfix.

* g++.dg/cpp23/lambda-specifiers2.C: New test.

--- gcc/cp/parser.c.jj  2021-04-14 10:48:41.318103715 +0200
+++ gcc/cp/parser.c 2021-04-14 14:52:03.220235527 +0200
@@ -28526,7 +28526,19 @@ cp_parser_constraint_requires_parens (cp
   case CPP_PLUS_PLUS:
   case CPP_MINUS_MINUS:
   case CPP_DOT:
+   /* Unenclosed postfix operator.  */
+   return pce_maybe_postfix;
+
   case CPP_DEREF:
+   /* A primary constraint that precedes the lambda-declarator of a
+  lambda expression is followed by trailing return type.
+
+ [] requires C -> void {}
+
+  Don't try to re-parse this as a postfix expression in
+  C++23 and later.  In C++20 ( needs to come in between.  */
+   if (lambda_p && cxx_dialect >= cxx23)
+ return pce_ok;
/* Unenclosed postfix operator.  */
return pce_maybe_postfix;
}
--- gcc/testsuite/g++.dg/cpp23/lambda-specifiers2.C.jj  2021-04-14 15:01:41.728714721 +0200
+++ gcc/testsuite/g++.dg/cpp23/lambda-specifiers2.C 2021-04-14 15:01:32.959813534 +0200
@@ -0,0 +1,7 @@
+// PR c++/99850
+// P1102R2 - Down with ()!
+// { dg-do compile { target c++23 } }
+
+auto l = [] requires true -> void {};
+template  concept C = true;
+auto m = [] requires (C && ...) -> void {};

Jakub



Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 03:09:13PM -0400, Michael Meissner wrote:
> On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> > >   * config/rs6000/rs6000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> > 
> > As said in the other mail, don't do the macro; just write its expansion
> > in the single place it is used.
> 
> Note, in the first patch it is only used 1 time, but in the second patch it is
> used 5 times (4 times in mode iterators in rs6000.md, 1 other use in 
> rs6000.c).
> But I will eliminate it, and replicate it in each of the 6 places it is used.

The alternative is to come up with a much better name :-/


Segher


[PATCH] c++: Fix up handling of structured bindings in extract_locals_r [PR99833]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs in tsubst_decomp_names because the assumption
that the structured binding artificial var is followed in DECL_CHAIN by
the corresponding structured binding vars is violated.
I've tracked it to extract_locals* which is done for the constexpr
IF_STMT.  extract_locals_r when it sees a DECL_EXPR adds that decl
into a hash set so that such decls aren't returned from extract_locals*,
but in the case of a structured binding that just means the artificial var
and not the vars corresponding to structured binding identifiers.
The following patch fixes it by pushing not just the artificial var
for structured bindings but also the other vars.
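
A toy model of that idea, with the declaration chain represented as a vector.  Decl, kNoBase and add_internal_decls are hypothetical names for this sketch only; the point is that recording the unnamed base variable also records the named binding variables that directly follow it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

const std::size_t kNoBase = static_cast<std::size_t>(-1);

// Toy declaration (hypothetical; not GCC's tree nodes).
struct Decl {
  std::string name;       // empty for the artificial base variable
  bool is_decomposition;  // part of a structured binding
  std::size_t base;       // index of the base variable, or kNoBase
};

// Record the declaration at START as internal.  If it is the unnamed base of
// a structured binding, also record the named binding variables after it.
void add_internal_decls(const std::vector<Decl> &chain, std::size_t start,
                        std::set<std::size_t> &internal) {
  internal.insert(start);
  const Decl &decl = chain[start];
  if (!decl.is_decomposition || !decl.name.empty())
    return;
  for (std::size_t i = start + 1;
       i < chain.size() && chain[i].is_decomposition && !chain[i].name.empty();
       ++i) {
    assert(chain[i].base == start);
    internal.insert(i);
  }
}

// Usage: for "auto [a, b]" followed by an unrelated variable, the base and
// both bindings are recorded, the unrelated variable is not.
void demo_add_internal() {
  const std::vector<Decl> chain = {{"", true, kNoBase},
                                   {"a", true, 0},
                                   {"b", true, 0},
                                   {"other", false, kNoBase}};
  std::set<std::size_t> internal;
  add_internal_decls(chain, 0, internal);
  assert((internal == std::set<std::size_t>{0, 1, 2}));
}
```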

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-04-14  Jakub Jelinek  

PR c++/99833
* pt.c (extract_locals_r): When handling DECL_EXPR of a structured
binding, add to data.internal also all corresponding structured
binding decls.

* g++.dg/cpp1z/pr99833.C: New test.
* g++.dg/cpp2a/pr99833.C: New test.

--- gcc/cp/pt.c.jj  2021-04-14 10:48:41.322103670 +0200
+++ gcc/cp/pt.c 2021-04-14 12:52:53.116896754 +0200
@@ -12811,7 +12811,27 @@ extract_locals_r (tree *tp, int */*walk_
 tp = _NAME (*tp);
 
   if (TREE_CODE (*tp) == DECL_EXPR)
-data.internal.add (DECL_EXPR_DECL (*tp));
+{
+  tree decl = DECL_EXPR_DECL (*tp);
+  data.internal.add (decl);
+  if (VAR_P (decl)
+ && DECL_DECOMPOSITION_P (decl)
+ && TREE_TYPE (decl) != error_mark_node)
+   {
+ gcc_assert (DECL_NAME (decl) == NULL_TREE);
+ for (tree decl2 = DECL_CHAIN (decl);
+  decl2
+  && VAR_P (decl2)
+  && DECL_DECOMPOSITION_P (decl2)
+  && DECL_NAME (decl2)
+  && TREE_TYPE (decl2) != error_mark_node;
+  decl2 = DECL_CHAIN (decl2))
+   {
+ gcc_assert (DECL_DECOMP_BASE (decl2) == decl);
+ data.internal.add (decl2);
+   }
+   }
+}
   else if (TREE_CODE (*tp) == LAMBDA_EXPR)
 {
   /* Since we defer implicit capture, look in the parms and body.  */
--- gcc/testsuite/g++.dg/cpp1z/pr99833.C.jj 2021-04-14 13:03:14.654879632 +0200
+++ gcc/testsuite/g++.dg/cpp1z/pr99833.C 2021-04-14 13:03:39.599598004 +0200
@@ -0,0 +1,11 @@
+// PR c++/99833
+// { dg-do compile { target c++17 } }
+
+struct S { int a, b; };
+template 
+void
+foo ()
+{
+  [](auto d) { if constexpr (auto [a, b]{d}; sizeof (a) > 0) a++; } (S{});
+}
+template void foo ();
--- gcc/testsuite/g++.dg/cpp2a/pr99833.C.jj 2021-04-14 13:04:08.975266383 +0200
+++ gcc/testsuite/g++.dg/cpp2a/pr99833.C 2021-04-14 13:04:23.191105881 +0200
@@ -0,0 +1,18 @@
+// PR c++/99833
+// { dg-do compile { target c++20 } }
+
+#include 
+
+auto f(auto&& x)
+{
+  [&](auto...) {
+auto y = std::tuple{ "what's happening here?", x };
+if constexpr (auto [_, z] = y; requires { z; })
+  return;
+  }();
+}
+
+int main()
+{
+  f(42);
+}

Jakub



Re: [PATCH 1/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Michael Meissner via Gcc-patches
On Tue, Apr 13, 2021 at 05:19:12PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Apr 09, 2021 at 10:42:50AM -0400, Michael Meissner wrote:
> > Since then the patch seems to have gone into a limbo state.
> 
> Patches I cannot immediately handle take time, and if they aren't
> pinged, they can fall off the map.  So a) ping your patches, once a week
> for example; and b) write patches that are simpler to review (do not
> cost many hours each).
> 
> > gcc/
> > 2021-04-09  Michael Meissner  
> > 
> > * config/rs6000/rs6000.c (rs6000_emit_minmax): Add support for ISA
> > 3.1 IEEE 128-bit floating point xsmaxcqp and xsmincqp instructions.
> > * config/rs6000/rs6000.h (FLOAT128_MIN_MAX_FPMASK_P): New macro.
> 
> As said in the other mail, don't do the macro; just write its expansion
> in the single place it is used.

Note, in the first patch it is only used 1 time, but in the second patch it is
used 5 times (4 times in mode iterators in rs6000.md, 1 other use in rs6000.c).
But I will eliminate it, and replicate it in each of the 6 places it is used.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] Better const_vector printing

2021-04-14 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On April 14, 2021 5:10:26 PM GMT+02:00, Richard Sandiford via Gcc-patches wrote:
>>Looking at PR99929 showed that we weren't dumping enough information
>>about variable-length CONST_VECTORs.  Something like:
>>
>>  (const_vector:VNx4SI [(const_int 1) (const_int 0)])
>>
>>could be either:
>>
>>(a) 1, 0, 1, 0, repeating
>>(b) 1 followed by all zeros
>>
>>This patch adds more information to the dumps.  There are four cases:
>>
>>(a) above:
>>
>>(const_vector:VNx4SI repeat [
>>  (const_int 1)
>>  (const_int 0)
>>])
>>
>>(b) above:
>>
>>(const_vector:VNx4SI [
>>  (const_int 1)
>>  repeat [
>>(const_int 0)
>>  ]
>>])
>>
>>a single stepped sequence:
>>
>>(const_vector:VNx4SI [
>>  (const_int 0)
>>  stepped [
>>(const_int 1)
>>(const_int 2)
>>  ]
>>])
>>
>>interleaved stepped sequences:
>>
>>(const_vector:VNx4SI [
>>  (const_int 0)
>>  (const_int 40)
>>  stepped (interleave 2) [
>>(const_int 1)
>>(const_int 41)
>>(const_int 2)
>>(const_int 42)
>>  ]
>>])
>>
>>There are probably better syntaxes, but hopefully this is at least
>>an improvement on the status quo.
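
To see how the four dumps quoted above map to element sequences, here is a sketch that expands an encoded vector.  The struct and function names are hypothetical, and the expansion rule (interleaved patterns with one, two or three encoded elements each, where three means a linear series) is my reading of the encoding rather than code from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a variable-length constant vector encoding.
struct EncodedVector {
  std::vector<long> encoded;      // npatterns * nelts_per_pattern values
  std::size_t npatterns;          // number of interleaved patterns
  std::size_t nelts_per_pattern;  // 1 = repeat, 2 = prefix + repeat, 3 = stepped
};

// Value of element I, extrapolating past the encoded elements.
long element(const EncodedVector &v, std::size_t i) {
  if (i < v.encoded.size())
    return v.encoded[i];
  const std::size_t pattern = i % v.npatterns;
  const std::size_t index = i / v.npatterns;
  // Last encoded element of this pattern.
  const std::size_t last = (v.nelts_per_pattern - 1) * v.npatterns + pattern;
  if (v.nelts_per_pattern < 3)
    return v.encoded[last];
  const long step = v.encoded[last] - v.encoded[last - v.npatterns];
  return v.encoded[last] + static_cast<long>(index - 2) * step;
}

std::vector<long> expand(const EncodedVector &v, std::size_t n) {
  std::vector<long> result;
  for (std::size_t i = 0; i < n; ++i)
    result.push_back(element(v, i));
  return result;
}

// Usage: the same encoded elements { 1, 0 } expand differently in cases
// (a) and (b), followed by the two stepped cases.
void demo_expand() {
  assert((expand(EncodedVector{{1, 0}, 2, 1}, 6)
          == std::vector<long>{1, 0, 1, 0, 1, 0}));
  assert((expand(EncodedVector{{1, 0}, 1, 2}, 6)
          == std::vector<long>{1, 0, 0, 0, 0, 0}));
  assert((expand(EncodedVector{{0, 1, 2}, 1, 3}, 6)
          == std::vector<long>{0, 1, 2, 3, 4, 5}));
  assert((expand(EncodedVector{{0, 40, 1, 41, 2, 42}, 2, 3}, 8)
          == std::vector<long>{0, 40, 1, 41, 2, 42, 3, 43}));
}
```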
>>
>>Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi
>>and x86_64-linux-gnu.  OK to install now, or should it wait
>>until GCC 12?  (It only affects SVE in practice.)
>
> Ok now (it should be harmless, no?) 

Yeah, I hope so :-)

Thanks,
Richard


Re: [PATCH 2/2] Add IEEE 128-bit min/max support on PowerPC

2021-04-14 Thread Segher Boessenkool
On Fri, Apr 09, 2021 at 09:20:32PM +0200, Bernhard Reutner-Fischer wrote:
> On Fri, 09 Apr 2021 11:54:59 -0500
> will schmidt via Gcc-patches  wrote:
> > > +  enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (old_cmp));
> 
> I think you can drop the enum keyword.

You can in C++, but not in C.  It is fine to have it in C++ as well.
I think it is nicer in this case to lose the keyword, but it is hardly
harmful :-)
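
A minimal illustration of the two spellings, using a hypothetical enum in place of rtx_code.  Both declarations are valid C++; only the first would also be valid C.

```cpp
#include <cassert>

// Hypothetical stand-in for rtx_code.
enum sketch_code { SK_EQ, SK_NE };

sketch_code reverse_sketch(sketch_code code) {
  return code == SK_EQ ? SK_NE : SK_EQ;
}

bool both_spellings_agree() {
  enum sketch_code with_keyword = reverse_sketch(SK_EQ);  // valid C and C++
  sketch_code without_keyword = reverse_sketch(SK_EQ);    // valid C++ only
  return with_keyword == without_keyword;
}
```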

> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c
> > > @@ -0,0 +1,93 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-require-effective-target ppc_float128_hw } */
> > > +/* { dg-require-effective-target power10_ok } */
> > > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> > > +/* { dg-final { scan-assembler {\mxscmpeq[dq]p\M} } } */
> > > +/* { dg-final { scan-assembler {\mxxpermdi\M} } } */
> > > +/* { dg-final { scan-assembler {\mxxsel\M}} } */
> > > +/* { dg-final { scan-assembler-not {\mxscmpu[dq]p\M}  } } */
> > > +/* { dg-final { scan-assembler-not {\mfcmp[uo]\M} } } */
> > > +/* { dg-final { scan-assembler-not {\mfsel\M} } } */
> 
> I'd have expected scan-assembler-times fwiw.

For what?  scan-assembler-not *is* scan-assembler-times, in effect (but
simpler of course, and it does work with capturing parens).

Having too strict checks for generated code means no end to having to
update many testcases when we have very small changes in the compiler.
It's a balancing act.  But maybe some -times would be good here, dunno.

> > > +__float128
> > > +eq_f128_d (__float128 a, __float128 b, double x, double y)
> > > +{
> > > +  return (x != y) ? a : b;
> > > +}
> 
> I would think the above should be == since it's named eq_ and
> the body would be redundant to ne_f128_d below as is.

Good spot :-)


Segher


[pushed] c++: premature overload resolution redux [PR100078]

2021-04-14 Thread Jason Merrill via Gcc-patches
My patch for PR93085 didn't consider that a default template argument can
also make a template dependent.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/100078
PR c++/93085
* pt.c (uses_outer_template_parms): Also look at default
template argument.

gcc/testsuite/ChangeLog:

PR c++/100078
* g++.dg/template/dependent-tmpl2.C: New test.
---
 gcc/cp/pt.c |  5 +
 gcc/testsuite/g++.dg/template/dependent-tmpl2.C | 10 ++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/dependent-tmpl2.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f488a5a8c12..0f119a55272 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10856,6 +10856,7 @@ uses_outer_template_parms (tree decl)
   for (int i = TREE_VEC_LENGTH (parms) - 1; i >= 0; --i)
{
  tree parm = TREE_VALUE (TREE_VEC_ELT (parms, i));
+ tree defarg = TREE_PURPOSE (TREE_VEC_ELT (parms, i));
  if (TREE_CODE (parm) == PARM_DECL
  && for_each_template_parm (TREE_TYPE (parm),
 template_parm_outer_level,
@@ -10864,6 +10865,10 @@ uses_outer_template_parms (tree decl)
  if (TREE_CODE (parm) == TEMPLATE_DECL
  && uses_outer_template_parms (parm))
return true;
+ if (defarg
+ && for_each_template_parm (defarg, template_parm_outer_level,
+, NULL, /*nondeduced*/true))
+   return true;
}
 }
   tree ci = get_constraints (decl);
diff --git a/gcc/testsuite/g++.dg/template/dependent-tmpl2.C 
b/gcc/testsuite/g++.dg/template/dependent-tmpl2.C
new file mode 100644
index 000..040ddb47ee6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/dependent-tmpl2.C
@@ -0,0 +1,10 @@
+// PR c++/100078
+// { dg-do compile { target c++11 } }
+
+template  struct enable_if;
+template  struct HashMapBucket {
+  template 
+  static typename enable_if::type selectStructure() {
+selectStructure();
+  }
+};

base-commit: 9b53edc796d284b6adec7f2996772dbddf4c341e
-- 
2.27.0



Re: [PATCH] Better const_vector printing

2021-04-14 Thread Richard Biener via Gcc-patches
On April 14, 2021 5:10:26 PM GMT+02:00, Richard Sandiford via Gcc-patches wrote:
>Looking at PR99929 showed that we weren't dumping enough information
>about variable-length CONST_VECTORs.  Something like:
>
>  (const_vector:VNx4SI [(const_int 1) (const_int 0)])
>
>could be either:
>
>(a) 1, 0, 1, 0, repeating
>(b) 1 followed by all zeros
>
>This patch adds more information to the dumps.  There are four cases:
>
>(a) above:
>
>(const_vector:VNx4SI repeat [
>  (const_int 1)
>  (const_int 0)
>])
>
>(b) above:
>
>(const_vector:VNx4SI [
>  (const_int 1)
>  repeat [
>(const_int 0)
>  ]
>])
>
>a single stepped sequence:
>
>(const_vector:VNx4SI [
>  (const_int 0)
>  stepped [
>(const_int 1)
>(const_int 2)
>  ]
>])
>
>interleaved stepped sequences:
>
>(const_vector:VNx4SI [
>  (const_int 0)
>  (const_int 40)
>  stepped (interleave 2) [
>(const_int 1)
>(const_int 41)
>(const_int 2)
>(const_int 42)
>  ]
>])
>
>There are probably better syntaxes, but hopefully this is at least
>an improvement on the status quo.
>
>Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi
>and x86_64-linux-gnu.  OK to install now, or should it wait
>until GCC 12?  (It only affects SVE in practice.)

Ok now (it should be harmless, no?) 

Thanks, 
Richard. 

>Richard
>
>
>gcc/
>   * print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Print
>   more information about variable-length CONST_VECTORs.
>---
> gcc/print-rtl.c | 32 +++-
> 1 file changed, 31 insertions(+), 1 deletion(-)
>
>diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
>index c7982bce507..081fc50fab8 100644
>--- a/gcc/print-rtl.c
>+++ b/gcc/print-rtl.c
>@@ -370,6 +370,10 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
>   print_rtx_head, m_indent * 2, "");
>   m_sawclose = 0;
> }
>+  if (GET_CODE (in_rtx) == CONST_VECTOR
>+  && !GET_MODE_NUNITS (GET_MODE (in_rtx)).is_constant ()
>+  && CONST_VECTOR_DUPLICATE_P (in_rtx))
>+fprintf (m_outfile, " repeat");
>   fputs (" [", m_outfile);
>   if (XVEC (in_rtx, idx) != NULL)
> {
>@@ -377,12 +381,32 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
>   if (XVECLEN (in_rtx, idx))
>   m_sawclose = 1;
> 
>+  int barrier = XVECLEN (in_rtx, idx);
>+  if (GET_CODE (in_rtx) == CONST_VECTOR
>+&& !GET_MODE_NUNITS (GET_MODE (in_rtx)).is_constant ())
>+  barrier = CONST_VECTOR_NPATTERNS (in_rtx);
>+
>   for (int j = 0; j < XVECLEN (in_rtx, idx); j++)
>   {
> int j1;
> 
>+if (j == barrier)
>+  {
>+fprintf (m_outfile, "\n%s%*s",
>+ print_rtx_head, m_indent * 2, "");
>+if (!CONST_VECTOR_STEPPED_P (in_rtx))
>+  fprintf (m_outfile, "repeat [");
>+else if (CONST_VECTOR_NPATTERNS (in_rtx) == 1)
>+  fprintf (m_outfile, "stepped [");
>+else
>+  fprintf (m_outfile, "stepped (interleave %d) [",
>+   CONST_VECTOR_NPATTERNS (in_rtx));
>+m_indent += 2;
>+  }
>+
> print_rtx (XVECEXP (in_rtx, idx, j));
>-for (j1 = j + 1; j1 < XVECLEN (in_rtx, idx); j1++)
>+int limit = MIN (barrier, XVECLEN (in_rtx, idx));
>+for (j1 = j + 1; j1 < limit; j1++)
>   if (XVECEXP (in_rtx, idx, j) != XVECEXP (in_rtx, idx, j1))
> break;
> 
>@@ -393,6 +417,12 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
>   }
>   }
> 
>+  if (barrier < XVECLEN (in_rtx, idx))
>+  {
>+m_indent -= 2;
>+fprintf (m_outfile, "\n%s%*s]", print_rtx_head, m_indent * 2, "");
>+  }
>+
>   m_indent -= 2;
> }
>   if (m_sawclose)



Re: [PATCH] Check for matching CONST_VECTOR encodings [PR99929]

2021-04-14 Thread Richard Biener via Gcc-patches
On April 14, 2021 5:13:23 PM GMT+02:00, Richard Sandiford via Gcc-patches 
 wrote:
>PR99929 is one of those “how did we get away with this for so long”
>bugs: the equality routines weren't checking whether two
>variable-length
>CONST_VECTORs had the same encoding.  This meant that:
>
>   { 1, 0, 0, 0, 0, 0, ... }
>
>would appear to be equal to:
>
>   { 1, 0, 1, 0, 1, 0, ... }
>
>since both are represented using the elements { 1, 0 }.
>
>Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi
>and x86_64-linux-gnu.  OK to install?  I'd like to backport
>as far as GCC 8, even though the testcase itself requires
>GCC 10 or later.

OK. 

Richard. 
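
The encoding mismatch described in the quoted message can be illustrated with a toy model.  This is a minimal sketch in plain C, not GCC code: toy_vector, toy_element and toy_same_encoding are hypothetical names, and the model only assumes that a variable-length vector is described by a number of interleaved patterns plus a number of encoded elements per pattern, with the last encoded element of each pattern repeating.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an encoded variable-length vector (hypothetical, not
   GCC's data structure).  ELTS holds NPATTERNS * NELTS_PER_PATTERN
   values, interleaved pattern by pattern.  */
struct toy_vector {
    int npatterns;
    int nelts_per_pattern;
    const int *elts;
};

/* Return element I of the full vector.  Once a pattern runs out of
   encoded elements, its last encoded element repeats.  */
static int toy_element(const struct toy_vector *v, int i)
{
    int pattern = i % v->npatterns;
    int index = i / v->npatterns;
    if (index >= v->nelts_per_pattern)
        index = v->nelts_per_pattern - 1;
    return v->elts[index * v->npatterns + pattern];
}

/* Two vectors can only be equal if they are encoded the same way;
   comparing the encoded elements alone is not enough.  */
static bool toy_same_encoding(const struct toy_vector *a,
                              const struct toy_vector *b)
{
    return a->npatterns == b->npatterns
           && a->nelts_per_pattern == b->nelts_per_pattern;
}

/* Both vectors store the same encoded elements { 1, 0 }.  */
static const int toy_elts[2] = { 1, 0 };
/* One pattern, two encoded elements: { 1, 0, 0, 0, ... }.  */
static const struct toy_vector toy_a = { 1, 2, toy_elts };
/* Two patterns, one encoded element each: { 1, 0, 1, 0, ... }.  */
static const struct toy_vector toy_b = { 2, 1, toy_elts };
```

In this model the two vectors share their encoded elements but differ from element 2 onwards, which is the situation an equality routine has to reject by comparing the encodings first.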

>Richard
>
>
>gcc/
>   PR rtl-optimization/99929
>   * rtl.h (same_vector_encodings_p): New function.
>   * cse.c (exp_equiv_p): Check that CONST_VECTORs have the same
>encoding.
>   * cselib.c (rtx_equal_for_cselib_1): Likewise.
>   * jump.c (rtx_renumbered_equal_p): Likewise.
>   * lra-constraints.c (operands_match_p): Likewise.
>   * reload.c (operands_match_p): Likewise.
>   * rtl.c (rtx_equal_p_cb, rtx_equal_p): Likewise.
>
>gcc/testsuite/
>   * gcc.target/aarch64/sve/pr99929_1.c: New file.
>   * gcc.target/aarch64/sve/pr99929_2.c: Likewise.
>---
> gcc/cse.c   |  5 +
> gcc/cselib.c|  5 +
> gcc/jump.c  |  5 +
> gcc/lra-constraints.c   |  5 +
> gcc/reload.c|  5 +
> gcc/rtl.c   | 10 ++
> gcc/rtl.h   | 17 +
> .../gcc.target/aarch64/sve/pr99929_1.c  | 16 
> .../gcc.target/aarch64/sve/pr99929_2.c  |  5 +
> 9 files changed, 73 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr99929_1.c
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr99929_2.c
>
>diff --git a/gcc/cse.c b/gcc/cse.c
>index 37c6959abea..df191d5aa3f 100644
>--- a/gcc/cse.c
>+++ b/gcc/cse.c
>@@ -2637,6 +2637,11 @@ exp_equiv_p (const_rtx x, const_rtx y, int
>validate, bool for_gcse)
> CASE_CONST_UNIQUE:
>   return x == y;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case LABEL_REF:
>   return label_ref_label (x) == label_ref_label (y);
> 
>diff --git a/gcc/cselib.c b/gcc/cselib.c
>index 2d34a914c6b..779874eeb2d 100644
>--- a/gcc/cselib.c
>+++ b/gcc/cselib.c
>@@ -1048,6 +1048,11 @@ rtx_equal_for_cselib_1 (rtx x, rtx y,
>machine_mode memmode, int depth)
> case DEBUG_EXPR:
>   return 0;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case DEBUG_IMPLICIT_PTR:
>   return DEBUG_IMPLICIT_PTR_DECL (x)
>== DEBUG_IMPLICIT_PTR_DECL (y);
>diff --git a/gcc/jump.c b/gcc/jump.c
>index 561dbb70d15..67b5c3374a6 100644
>--- a/gcc/jump.c
>+++ b/gcc/jump.c
>@@ -1777,6 +1777,11 @@ rtx_renumbered_equal_p (const_rtx x, const_rtx
>y)
> CASE_CONST_UNIQUE:
>   return 0;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case LABEL_REF:
> /* We can't assume nonlocal labels have their following insns yet.  */
>   if (LABEL_REF_NONLOCAL_P (x) || LABEL_REF_NONLOCAL_P (y))
>diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
>index 62bcfc31772..1560f652da6 100644
>--- a/gcc/lra-constraints.c
>+++ b/gcc/lra-constraints.c
>@@ -834,6 +834,11 @@ operands_match_p (rtx x, rtx y, int y_hard_regno)
> CASE_CONST_UNIQUE:
>   return false;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case LABEL_REF:
>   return label_ref_label (x) == label_ref_label (y);
> case SYMBOL_REF:
>diff --git a/gcc/reload.c b/gcc/reload.c
>index 461fd0272eb..e18e27c2405 100644
>--- a/gcc/reload.c
>+++ b/gcc/reload.c
>@@ -2310,6 +2310,11 @@ operands_match_p (rtx x, rtx y)
> CASE_CONST_UNIQUE:
>   return 0;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case LABEL_REF:
>   return label_ref_label (x) == label_ref_label (y);
> case SYMBOL_REF:
>diff --git a/gcc/rtl.c b/gcc/rtl.c
>index 1aa794c82ca..e4ae1683069 100644
>--- a/gcc/rtl.c
>+++ b/gcc/rtl.c
>@@ -466,6 +466,11 @@ rtx_equal_p_cb (const_rtx x, const_rtx y,
>rtx_equal_p_callback_function cb)
> CASE_CONST_UNIQUE:
>   return 0;
> 
>+case CONST_VECTOR:
>+  if (!same_vector_encodings_p (x, y))
>+  return false;
>+  break;
>+
> case DEBUG_IMPLICIT_PTR:
>   return DEBUG_IMPLICIT_PTR_DECL (x)
>== DEBUG_IMPLICIT_PTR_DECL (y);
>@@ -608,6 +613,11 @@ rtx_equal_p (const_rtx x, const_rtx y)
> CASE_CONST_UNIQUE:
>   return 0;
> 
>+case CONST_VECTOR:

Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 08:01:11PM +0200, Jakub Jelinek wrote:
> On Wed, Apr 14, 2021 at 12:46:43PM -0500, Segher Boessenkool wrote:
> > The REGNO checks work fine for pseudos as well.  But, why does it do
> > this at all, instead of using match_dup?  That should be clearer.
> 
> Because with the hard regs it has different modes, so match_dup
> wouldn't work.

A subreg:QI of the match_dup should match fine.  You can use a subreg
wherever GCC tries to match a reg.

> match_dup means insn-recog.c calls rtx_equal_p and that returns false if the
> mode is not the same.

Yes, but they are the same :-)

(reg:SI whatever)  and  (subreg:QI (reg:SI whatever) 0)

> Before combine the 3 insns are:
> (insn 2 4 3 2 (set (reg/v:SI 96 [ i ])
> (zero_extend:SI (reg:QI 0 x0 [ i ]))) "pr100056.c":10:1 114 
> {*zero_extendqisi2_aarch64}
>  (expr_list:REG_DEAD (reg:QI 0 x0 [ i ])
> (nil)))

> and I must say I don't know if make_more_copies was meant to
> split insn 2 into (set (reg:QI pseudo) (reg:QI 0 x0)) and
> (set (reg/v:SI 96) (zero_extend:SI (reg:QI pseudo)))
> or not.

It makes

(set (reg:QI new) (reg:QI x0))
(set (reg:SI 96) (zero_extend:SI (reg:QI new)))

The point is it keeps exactly the same form, but no hard regs anymore.


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 06:55:56PM +0100, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > The REGNO checks work fine for pseudos as well.  But, why does it do
> > this at all, instead of using match_dup?  That should be clearer.
> 
> The register is appearing in two different modes: GPI for operand 1
> and something smaller than GPI for operand 3 (otherwise the extension
> would be ill-formed).  That's what makes it specific to hard registers.

You write your patterns with subregs for that?  That works fine for a
mode change on hard regs as well, afaik.

> > The point of make_more_copies is that the hard registers from function
> > arguments are not pushed down by combine into actual instructions.  This
> > can be done by RA if it thinks that is a good idea, and not done if it
> > thinks it is a bad idea.  Having combine usurp part of the register
> > allocator's role is not a good idea.
> >
> > There are other reasons hard regs can still end up in RTL insns in
> > earlier RTL passes of course, but the other changes that went together
> > with make_more_copies stop combine from doing that a lot (the function
> > itself makes sure every hard reg is copied to a new pseudo, because
> > combining that trivial move (from that new pseudo to the pseudo it was
> > copying it to already!) can still be beneficial for other reasons, all
> > strange and pretty unhappy, but important on many targets).
> 
> What would your recommendation be for this pattern?  Is matching
> hard registers a bad idea, or should we go with it?

I would just use a subreg, and match_dup.  But maybe I am missing
something?


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 12:46:43PM -0500, Segher Boessenkool wrote:
> The REGNO checks work fine for pseudos as well.  But, why does it do
> this at all, instead of using match_dup?  That should be clearer.

Because with the hard regs it has different modes, so match_dup
wouldn't work.

We are talking here about:
Trying 7, 2 -> 8:
7: r98:SI=x0:SI<<0xb&0x7f800
  REG_DEAD x0:QI
2: r96:SI=zero_extend(x0:QI)
8: r97:SI=r98:SI|r96:SI
  REG_DEAD r98:SI
  REG_DEAD r96:SI
Failed to match this instruction:
(set (reg:SI 97)
(ior:SI (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
(const_int 11 [0xb]))
(const_int 522240 [0x7f800]))
(zero_extend:SI (reg:QI 0 x0 [ i ]
Failed to match this instruction:
(set (reg:SI 97)
(ior:SI (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
(const_int 11 [0xb]))
(const_int 522240 [0x7f800]))
(and:SI (reg:SI 0 x0)
(const_int 255 [0xff]
Splitting with gen_split_28 (aarch64.md:4434)
Successfully matched this instruction:
(set (reg:SI 99)
(zero_extend:SI (reg:QI 0 x0 [ i ])))
Successfully matched this instruction:
(set (reg:SI 97)
(ior:SI (ashift:SI (reg:SI 99)
(const_int 11 [0xb]))
(reg:SI 99)))

match_dup means insn-recog.c calls rtx_equal_p and that returns false if the
mode is not the same.

Before combine the 3 insns are:
(insn 2 4 3 2 (set (reg/v:SI 96 [ i ])
(zero_extend:SI (reg:QI 0 x0 [ i ]))) "pr100056.c":10:1 114 
{*zero_extendqisi2_aarch64}
 (expr_list:REG_DEAD (reg:QI 0 x0 [ i ])
(nil)))
(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 3 8 2 (set (reg:SI 98)
(ashift:SI (reg/v:SI 96 [ i ])
(const_int 11 [0xb]))) "pr100056.c":11:17 691 
{*aarch64_ashl_sisd_or_int_si3}
 (nil))
(insn 8 7 13 2 (set (reg:SI 97)
(ior:SI (reg:SI 98)
(reg/v:SI 96 [ i ]))) "pr100056.c":11:12 488 {iorsi3}
 (expr_list:REG_DEAD (reg:SI 98)
(expr_list:REG_DEAD (reg/v:SI 96 [ i ])
(nil
and I must say I don't know if make_more_copies was meant to
split insn 2 into (set (reg:QI pseudo) (reg:QI 0 x0)) and
(set (reg/v:SI 96) (zero_extend:SI (reg:QI pseudo)))
or not.

> The point of make_more_copies is that the hard registers from function
> arguments are not pushed down by combine into actual instructions.  This
> can be done by RA if it thinks that is a good idea, and not done if it
> thinks it is a bad idea.  Having combine usurp part of the register
> allocator's role is not a good idea.
> 
> There are other reasons hard regs can still end up in RTL insns in
> earlier RTL passes of course, but the other changes that went together
> with make_more_copies stop combine from doing that a lot (the function
> itself makes sure every hard reg is copied to a new pseudo, because
> combining that trivial move (from that new pseudo to the pseudo it was
> copying it to already!) can still be beneficial for other reasons, all
> strange and pretty unhappy, but important on many targets).

Jakub



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Richard Sandiford via Gcc-patches
Segher Boessenkool  writes:
> On Wed, Apr 14, 2021 at 05:31:23PM +0100, Richard Sandiford wrote:
>> Jakub Jelinek  writes:
>> > +(define_split
>> > +  [(set (match_operand:GPI 0 "register_operand")
>> > +  (LOGICAL:GPI
>> > +(and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>> > + (match_operand:QI 2 "aarch64_shift_imm_"))
>> > + (match_operand:GPI 4 "const_int_operand"))
>> > +(zero_extend:GPI (match_operand 3 "register_operand"]
>> > +  "can_create_pseudo_p ()
>> > +   && REG_P (operands[1])
>> > +   && REG_P (operands[3])
>> > +   && REGNO (operands[1]) == REGNO (operands[3])
>> > +   && ((unsigned HOST_WIDE_INT)
>> > +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>> > + << INTVAL (operands[2]), mode)
>> > +   == UINTVAL (operands[4]))"
>> 
>> IMO this would be easier to understand as:
>> 
>>&& (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>> << INTVAL (operands[2]), mode)
>>== INTVAL (operands[4]))
>> 
>> (At first I thought the cast and UINTVAL were trying to escape the
>> sign-extension canonicalisation.)
>
> Yeah.  Or just that with UINTVAL, the implicit conversion is fine (but
> maybe that warns?)  But INTVAL is simpler for sure.
>
>> I'm not sure about this one though.  The REGNO checks mean that this is
>> effectively for hard registers only.  I thought one of the reasons for
>> make_more_copies was to avoid combining hard registers like this, so I'm
>> not sure we should have a pattern that specifically targets them.
>> 
>> Segher, have I misunderstood?
>
> The REGNO checks work fine for pseudos as well.  But, why does it do
> this at all, instead of using match_dup?  That should be clearer.

The register is appearing in two different modes: GPI for operand 1
and something smaller than GPI for operand 3 (otherwise the extension
would be ill-formed).  That's what makes it specific to hard registers.

> The point of make_more_copies is that the hard registers from function
> arguments are not pushed down by combine into actual instructions.  This
> can be done by RA if it thinks that is a good idea, and not done if it
> thinks it is a bad idea.  Having combine usurp part of the register
> allocator's role is not a good idea.
>
> There are other reasons hard regs can still end up in RTL insns in
> earlier RTL passes of course, but the other changes that went together
> with make_more_copies stop combine from doing that a lot (the function
> itself makes sure every hard reg is copied to a new pseudo, because
> combining that trivial move (from that new pseudo to the pseudo it was
> copying it to already!) can still be beneficial for other reasons, all
> strange and pretty unhappy, but important on many targets).

What would your recommendation be for this pattern?  Is matching
hard registers a bad idea, or should we go with it?

Thanks,
Richard


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
On Wed, Apr 14, 2021 at 06:42:33PM +0100, Richard Sandiford wrote:
> Otherwise this looks good apart from the open question about whether
> we should be doing complex combinations involving hard regs.  Let's see
> what Segher says about that side.

It works fine with pseudos as well, but then you need to use the same
pseudo twice in here.  Before RA you cannot know this will be the same
hard reg (and it is a bad idea in general to force that).

Disallowing forwarding hard regs (from function args) in combine helps
quite a bit on average, but there are cases like this one where the
balance swings the other way :-(


Segher


Re: [committed] add test for PR 86058

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 10:49:42AM -0600, Martin Sebor via Gcc-patches wrote:
> Apparently the IL GCC emits on some targets (arm and aarch64 with
> mabi=ilp32, and powerpc64 to name the three where the failures have
> been pointed out) isn't handled by the uninit pass and so it doesn't
> issue the expected warning.  That might be a new (as in previously
> unknown) limitation in the warning or one I don't remember coming
> across.
> 
> I don't see excess warnings with my arm-eabi cross-compiler.  What
> are they in your environment?
> 
> I have limited the test to just x86_64 for now and repurposed pr100073
> where the same failure was reported on powerpc64 to track the missing
> warning on these targets.

+   The test fails on a number of non-x86_64 targets due to pr100073.
+   { dg-do compile { target x86_64-*-* } }

This change is incorrect.
Either you mean x86_64 -m64 code only, then it should be
{ i?86-*-* x86_64-*-* } && lp64
or you mean x86_64 -m64/-mx32, then it should be
{ i?86-*-* x86_64-*-* } && { ! ia32 }
or you mean x86_64 -m64/-mx32/-m32, then it should be
{ i?86-*-* x86_64-*-* }
E.g. on Solaris the target triplet is i?86-*-* but it also supports -m64;
on the other hand, the x86_64-*-* triplet covers all supported multilibs
(so both -m64 and -m32 and sometimes -mx32), but will not cover
i686-*-* etc. even when it is the same thing as x86_64-*-* with -m32.
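
The three readings above map onto dg-do selectors roughly as follows.  This is a sketch of the alternatives rather than a tested change; a real test would keep exactly one of the directives.

```c
/* Alternative 1: x86_64 -m64 only.  */
/* { dg-do compile { target { { i?86-*-* x86_64-*-* } && lp64 } } } */

/* Alternative 2: x86_64 -m64 and -mx32.  */
/* { dg-do compile { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } */

/* Alternative 3: x86_64 -m64, -mx32 and -m32.  */
/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
```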

Jakub



[pushed] c++: non-static member, array bound, sizeof [PR93314]

2021-04-14 Thread Jason Merrill via Gcc-patches
N2253 allowed referring to non-static data members without an object in
unevaluated operands like that of sizeof, but in a constant-expression
context like an array bound or template argument within such an unevaluated
operand we do actually need a value, so that permission cannot apply.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/93314
* semantics.c (finish_id_expression_1): Clear cp_unevaluated_operand
for a non-static data member in a constant-expression.

gcc/testsuite/ChangeLog:

PR c++/93314
* g++.dg/parse/uneval1.C: New test.
---
 gcc/cp/semantics.c   | 10 ++
 gcc/testsuite/g++.dg/parse/uneval1.C | 14 ++
 2 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/parse/uneval1.C

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 125772238d3..4520181d4e5 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4093,6 +4093,12 @@ finish_id_expression_1 (tree id_expression,
 
  cp_warn_deprecated_use_scopes (scope);
 
+ /* In a constant-expression context, turn off cp_unevaluated_operand
+so finish_non_static_data_member will complain (93314).  */
+ auto eval = make_temp_override (cp_unevaluated_operand);
+ if (integral_constant_expression_p && TREE_CODE (decl) == FIELD_DECL)
+   cp_unevaluated_operand = 0;
+
  if (TYPE_P (scope))
decl = finish_qualified_id_expr (scope,
 decl,
@@ -4106,6 +4112,10 @@ finish_id_expression_1 (tree id_expression,
}
   else if (TREE_CODE (decl) == FIELD_DECL)
{
+ auto eval = make_temp_override (cp_unevaluated_operand);
+ if (integral_constant_expression_p)
+   cp_unevaluated_operand = 0;
+
  /* Since SCOPE is NULL here, this is an unqualified name.
 Access checking has been performed during name lookup
 already.  Turn off checking to avoid duplicate errors.  */
diff --git a/gcc/testsuite/g++.dg/parse/uneval1.C 
b/gcc/testsuite/g++.dg/parse/uneval1.C
new file mode 100644
index 000..dfc1bb4e0c3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/uneval1.C
@@ -0,0 +1,14 @@
+// PR c++/93314
+
+struct S {
+  int m;
+  static int f() {
+return sizeof(char[m]);// { dg-error "S::m" }
+  }
+};
+
+int main()
+{
+  return S().f()
++ sizeof(char[S::m]);  // { dg-error "S::m" }
+}

base-commit: f99f64f69db49ce6343d79a39eab28dcc6b91865
-- 
2.27.0



Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Segher Boessenkool
Hi!

On Wed, Apr 14, 2021 at 05:31:23PM +0100, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > +(define_split
> > +  [(set (match_operand:GPI 0 "register_operand")
> > +   (LOGICAL:GPI
> > + (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> > +  (match_operand:QI 2 "aarch64_shift_imm_"))
> > +  (match_operand:GPI 4 "const_int_operand"))
> > + (zero_extend:GPI (match_operand 3 "register_operand"]
> > +  "can_create_pseudo_p ()
> > +   && REG_P (operands[1])
> > +   && REG_P (operands[3])
> > +   && REGNO (operands[1]) == REGNO (operands[3])
> > +   && ((unsigned HOST_WIDE_INT)
> > +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
> > +  << INTVAL (operands[2]), mode)
> > +   == UINTVAL (operands[4]))"
> 
> IMO this would be easier to understand as:
> 
>&& (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>  << INTVAL (operands[2]), mode)
>== INTVAL (operands[4]))
> 
> (At first I thought the cast and UINTVAL were trying to escape the
> sign-extension canonicalisation.)

Yeah.  Or just that with UINTVAL, the implicit conversion is fine (but
maybe that warns?)  But INTVAL is simpler for sure.

> I'm not sure about this one though.  The REGNO checks mean that this is
> effectively for hard registers only.  I thought one of the reasons for
> make_more_copies was to avoid combining hard registers like this, so I'm
> not sure we should have a pattern that specifically targets them.
> 
> Segher, have I misunderstood?

The REGNO checks work fine for pseudos as well.  But, why does it do
this at all, instead of using match_dup?  That should be clearer.

The point of make_more_copies is that the hard registers from function
arguments are not pushed down by combine into actual instructions.  This
can be done by RA if it thinks that is a good idea, and not done if it
thinks it is a bad idea.  Having combine usurp part of the register
allocator's role is not a good idea.

There are other reasons hard regs can still end up in RTL insns in
earlier RTL passes of course, but the other changes that went together
with make_more_copies stop combine from doing that a lot (the function
itself makes sure every hard reg is copied to a new pseudo, because
combining that trivial move (from that new pseudo to the pseudo it was
copying it to already!) can still be beneficial for other reasons, all
strange and pretty unhappy, but important on many targets).


Segher


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> On Wed, Apr 14, 2021 at 05:31:23PM +0100, Richard Sandiford wrote:
>> > +(define_split
>> > +  [(set (match_operand:GPI 0 "register_operand")
>> > +  (LOGICAL:GPI
>> > +(and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
>> > + (match_operand:QI 2 "aarch64_shift_imm_"))
>> > + (match_operand:GPI 4 "const_int_operand"))
>> > +(zero_extend:GPI (match_operand 3 "register_operand"]
>> > +  "can_create_pseudo_p ()
>> > +   && REG_P (operands[1])
>> > +   && REG_P (operands[3])
>> > +   && REGNO (operands[1]) == REGNO (operands[3])
>> > +   && ((unsigned HOST_WIDE_INT)
>> > +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>> > + << INTVAL (operands[2]), mode)
>> > +   == UINTVAL (operands[4]))"
>> 
>> IMO this would be easier to understand as:
>> 
>>&& (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>> << INTVAL (operands[2]), mode)
>>== INTVAL (operands[4]))
>> 
>> (At first I thought the cast and UINTVAL were trying to escape the
>> sign-extension canonicalisation.)
>
> It is ok to write it that way, you're right, I wrote it with
> UINTVAL etc. because I initially didn't use trunc_int_for_mode
> but that is wrong for SImode if the mask is shifted into bit 31.
>
>> I'm not sure about this one though.  The REGNO checks mean that this is
>> effectively for hard registers only.  I thought one of the reasons for
>> make_more_copies was to avoid combining hard registers like this, so I'm
>> not sure we should have a pattern that specifically targets them.
>> 
>> Segher, have I misunderstood?
>
> Yes, this one works only with the hard regs, the problem is that when
> the hard regs are there, combiner doesn't try anything else, so without
> such splitter it punts on that.
> If I add yet another testcase which doesn't have hard registers, like:
> unsigned
> or_shift2 (void)
> {
>   unsigned char i = 0;
>   asm volatile ("" : "+r" (i));
>   return i | (i << 11);
> }
> then my patch doesn't handle that case, and the only splitter that would
> help would need to deal with:
> (set (reg/i:SI 0 x0)
> (ior:SI (and:SI (ashift:SI (subreg:SI (reg:QI 97 [ i ]) 0)
> (const_int 11 [0xb]))
> (const_int 522240 [0x7f800]))
> (zero_extend:SI (reg:QI 97 [ i ]
> I have added another combine splitter for this below.  But as you can
> see, what the combiner's simplification comes up with isn't really
> consistent and orthogonal; different operations in there look quite
> different :(.

Hmm, OK.  Still, the above looks reasonable on first principles.

>> These two look good to me apart from the cast nit.  The last one feels
>> like it's more general than just sign_extends though.  I guess it would
>> work for any duplicated operation that can be performed in a single
>> instruction.
>
> True, but only a very small portion of them can actually make it through;
> it needs something that combine has been able to propagate into another
> instruction.  So if we know about other insns that would look the same
> and would actually ever be matched, we can e.g. define an operator predicate
> for it, but until we have testcases for that, I am not sure it is worth it.
>
> Here is an updated patch that handles also the zero extends without hard
> registers and doesn't have the UHWI casts (but untested for now except
> for the testcase):
>
> 2021-04-14  Jakub Jelinek  
>
>   PR target/100056
>   * config/aarch64/aarch64.md (*_3):
>   Add combine splitters for *_ashl3 with
>   ZERO_EXTEND, SIGN_EXTEND or AND.
>
>   * gcc.target/aarch64/pr100056.c: New test.
>
> --- gcc/config/aarch64/aarch64.md.jj  2021-04-13 20:41:45.030040848 +0200
> +++ gcc/config/aarch64/aarch64.md 2021-04-14 19:07:41.641623978 +0200
> @@ -4431,6 +4431,75 @@ (define_insn "*_[(set_attr "type" "logic_shift_imm")]
>  )
>  
> +(define_split
> +  [(set (match_operand:GPI 0 "register_operand")
> + (LOGICAL:GPI
> +   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> +(match_operand:QI 2 "aarch64_shift_imm_"))
> +(match_operand:GPI 4 "const_int_operand"))
> +   (zero_extend:GPI (match_operand 3 "register_operand"]
> +  "can_create_pseudo_p ()
> +   && REG_P (operands[1])
> +   && REG_P (operands[3])
> +   && REGNO (operands[1]) == REGNO (operands[3])
> +   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
> +<< INTVAL (operands[2]), mode)
> +   == INTVAL (operands[4]))"
> +  [(set (match_dup 4) (zero_extend:GPI (match_dup 3)))
> +   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
> +(match_dup 4)))]
> +  "operands[4] = gen_reg_rtx (mode);"
> +)
> +
> +(define_split
> +  [(set (match_operand:GPI 0 "register_operand")
> + (LOGICAL:GPI
> +   (and:GPI (ashift:GPI (match_operator:GPI 4 

[committed] [PR100066] Check paradoxical subreg when splitting hard reg live range

2021-04-14 Thread Vladimir Makarov via Gcc-patches

The following patch fixes

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100066

The patch was successfully bootstrapped and tested on x86-64, aarch64, 
and ppc64.


commit f99f64f69db49ce6343d79a39eab28dcc6b91865
Author: Vladimir N. Makarov 
Date:   Wed Apr 14 13:21:40 2021 -0400

[PR100066] Check paradoxical subreg when splitting hard reg live range

When splitting the live range of a hard reg, LRA actually splits the
multi-register containing the hard reg.  So we need to check the biggest
used mode of the hard reg for a paradoxical subreg when the natural and
the biggest modes are ordered.
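
The mode choice described above can be sketched with a toy model.  This is not LRA code: toy_precision, toy_ordered_p, toy_paradoxical_p and toy_use_natural_mode are hypothetical names, and the model assumes a precision of the form c0 + c1 * N bits, where N is a non-negative runtime value, as a stand-in for a polynomial mode precision.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy precision: c0 + c1 * N bits, with N >= 0 unknown at compile time
   (hypothetical stand-in for a polynomial precision).  */
struct toy_precision {
    unsigned c0;
    unsigned c1;
};

static struct toy_precision toy_fixed(unsigned bits)
{
    struct toy_precision p = { bits, 0 };
    return p;
}

static struct toy_precision toy_scalable(unsigned bits)
{
    struct toy_precision p = { 0, bits };
    return p;
}

/* A <= B for every possible N.  */
static bool toy_known_le(struct toy_precision a, struct toy_precision b)
{
    return a.c0 <= b.c0 && a.c1 <= b.c1;
}

/* The two precisions can be compared regardless of N.  */
static bool toy_ordered_p(struct toy_precision a, struct toy_precision b)
{
    return toy_known_le(a, b) || toy_known_le(b, a);
}

/* OUTER is strictly wider than INNER (only meaningful when ordered).  */
static bool toy_paradoxical_p(struct toy_precision outer,
                              struct toy_precision inner)
{
    return toy_known_le(inner, outer) && !toy_known_le(outer, inner);
}

/* Fall back to the register's natural mode when there is no biggest
   mode, when the modes cannot be compared, or when the biggest mode is
   wider than a single register.  */
static bool toy_use_natural_mode(bool has_biggest_mode,
                                 struct toy_precision biggest,
                                 struct toy_precision natural)
{
    return !has_biggest_mode
           || !toy_ordered_p(biggest, natural)
           || toy_paradoxical_p(biggest, natural);
}
```

In this model a 128-bit biggest access against a 64-bit register falls back to the natural mode, while a 32-bit biggest access keeps the narrower mode.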

gcc/ChangeLog:

PR rtl-optimization/100066
* lra-constraints.c (split_reg): Check paradoxical_subreg_p for
ordered modes when choosing splitting mode for hard reg.

gcc/testsuite/ChangeLog:

PR rtl-optimization/100066
* gcc.target/i386/pr100066.c: New.

diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 62bcfc31772..9425f2d7e73 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -5797,10 +5797,14 @@ split_reg (bool before_p, int original_regno, rtx_insn *insn,
   mode = lra_reg_info[hard_regno].biggest_mode;
   machine_mode reg_rtx_mode = GET_MODE (regno_reg_rtx[hard_regno]);
   /* A reg can have a biggest_mode of VOIDmode if it was only ever seen as
-	 part of a multi-word register.  In that case, just use the reg_rtx.
-	 Otherwise, limit the size to that of the biggest access in the
-	 function.  */
-  if (mode == VOIDmode)
+	 part of a multi-word register.  In that case, just use the reg_rtx
+	 mode.  Do the same also if the biggest mode was larger than a register
+	 or we can not compare the modes.  Otherwise, limit the size to that of
+	 the biggest access in the function.  */
+  if (mode == VOIDmode
+	  || !ordered_p (GET_MODE_PRECISION (mode),
+			 GET_MODE_PRECISION (reg_rtx_mode))
+	  || paradoxical_subreg_p (mode, reg_rtx_mode))
 	{
 	  original_reg = regno_reg_rtx[hard_regno];
 	  mode = reg_rtx_mode;
diff --git a/gcc/testsuite/gcc.target/i386/pr100066.c b/gcc/testsuite/gcc.target/i386/pr100066.c
new file mode 100644
index 000..a795864e172
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100066.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { int128 } } } */
+/* { dg-options "-O1 -w" } */
+int pm;
+
+void
+w3 (int, int, int);
+
+void
+e6 (__int128 rt, long int mo)
+{
+  mo += rt / 0;
+  w3 (pm / mo, pm, 0);
+}


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 14, 2021 at 05:31:23PM +0100, Richard Sandiford wrote:
> > +(define_split
> > +  [(set (match_operand:GPI 0 "register_operand")
> > +   (LOGICAL:GPI
> > + (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> > +  (match_operand:QI 2 "aarch64_shift_imm_"))
> > +  (match_operand:GPI 4 "const_int_operand"))
> > + (zero_extend:GPI (match_operand 3 "register_operand"]
> > +  "can_create_pseudo_p ()
> > +   && REG_P (operands[1])
> > +   && REG_P (operands[3])
> > +   && REGNO (operands[1]) == REGNO (operands[3])
> > +   && ((unsigned HOST_WIDE_INT)
> > +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
> > +  << INTVAL (operands[2]), mode)
> > +   == UINTVAL (operands[4]))"
> 
> IMO this would be easier to understand as:
> 
>&& (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
>  << INTVAL (operands[2]), mode)
>== INTVAL (operands[4]))
> 
> (At first I thought the cast and UINTVAL were trying to escape the
> sign-extension canonicalisation.)

It is OK to write it that way, you're right.  I wrote it with
UINTVAL etc. because I initially didn't use trunc_int_for_mode,
but that is wrong for SImode if the mask is shifted into bit 31.
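
The arithmetic behind that condition can be sketched in plain C.  This is a minimal illustration, not the GCC implementation: toy_trunc_si is a hypothetical stand-in for trunc_int_for_mode with a 32-bit mode, and the sketch assumes that the AND constant is held in sign-extended form for its mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for trunc_int_for_mode with a 32-bit mode:
   keep the low 32 bits and sign-extend from bit 31.  */
static int64_t toy_trunc_si(uint64_t value)
{
    uint64_t low = value & UINT64_C(0xffffffff);
    if (low & UINT64_C(0x80000000))
        return (int64_t)low - (INT64_C(1) << 32);
    return (int64_t)low;
}

/* Mask of a NARROW_BITS-wide value shifted left by SHIFT
   (NARROW_BITS is assumed to be smaller than 64).  */
static uint64_t toy_shifted_mask(unsigned narrow_bits, unsigned shift)
{
    return ((UINT64_C(1) << narrow_bits) - 1) << shift;
}

/* The splitter condition: the AND constant, stored sign-extended,
   must equal the truncated shifted mask.  */
static int toy_mask_matches(unsigned narrow_bits, unsigned shift,
                            int64_t and_constant)
{
    return toy_trunc_si(toy_shifted_mask(narrow_bits, shift))
           == and_constant;
}
```

With an 8-bit source and a shift of 11 the mask is 0x7f800 (522240), matching the constant in the combine dump.  With a shift of 24 the mask reaches bit 31, so the sign-extended constant is negative and only the truncated comparison matches it.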

> I'm not sure about this one though.  The REGNO checks mean that this is
> effectively for hard registers only.  I thought one of the reasons for
> make_more_copies was to avoid combining hard registers like this, so I'm
> not sure we should have a pattern that specifically targets them.
> 
> Segher, have I misunderstood?

Yes, this one works only with the hard regs, the problem is that when
the hard regs are there, combiner doesn't try anything else, so without
such splitter it punts on that.
If I add yet another testcase which doesn't have hard registers, like:
unsigned
or_shift2 (void)
{
  unsigned char i = 0;
  asm volatile ("" : "+r" (i));
  return i | (i << 11);
}
then my patch doesn't handle that case, and the only splitter that would
help would need to deal with:
(set (reg/i:SI 0 x0)
(ior:SI (and:SI (ashift:SI (subreg:SI (reg:QI 97 [ i ]) 0)
(const_int 11 [0xb]))
(const_int 522240 [0x7f800]))
(zero_extend:SI (reg:QI 97 [ i ]
I have added another combine splitter for this below.  But as you can
see, what the combiner's simplification comes up with isn't really
consistent and orthogonal; different operations in there look quite
different :(.

> These two look good to me apart from the cast nit.  The last one feels
> like it's more general than just sign_extends though.  I guess it would
> work for any duplicated operation that can be performed in a single
> instruction.

True, but only a very small portion of them can actually make it through;
it needs something that combine has been able to propagate into another
instruction.  So if we know about other insns that would look the same
and would actually ever be matched, we can e.g. define an operator predicate
for it, but until we have testcases for that, I am not sure it is worth it.

Here is an updated patch that handles also the zero extends without hard
registers and doesn't have the UHWI casts (but untested for now except
for the testcase):

2021-04-14  Jakub Jelinek  

PR target/100056
* config/aarch64/aarch64.md (*_3):
Add combine splitters for *_ashl3 with
ZERO_EXTEND, SIGN_EXTEND or AND.

* gcc.target/aarch64/pr100056.c: New test.

--- gcc/config/aarch64/aarch64.md.jj2021-04-13 20:41:45.030040848 +0200
+++ gcc/config/aarch64/aarch64.md   2021-04-14 19:07:41.641623978 +0200
@@ -4431,6 +4431,75 @@ (define_insn "*_"))
+  (match_operand:GPI 4 "const_int_operand"))
+ (zero_extend:GPI (match_operand 3 "register_operand"))))]
+  "can_create_pseudo_p ()
+   && REG_P (operands[1])
+   && REG_P (operands[3])
+   && REGNO (operands[1]) == REGNO (operands[3])
+   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
+  << INTVAL (operands[2]), <MODE>mode)
+   == INTVAL (operands[4]))"
+  [(set (match_dup 4) (zero_extend:GPI (match_dup 3)))
+   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
+  (match_dup 4)))]
+  "operands[4] = gen_reg_rtx (<MODE>mode);"
+)
+
+(define_split
+  [(set (match_operand:GPI 0 "register_operand")
+   (LOGICAL:GPI
+ (and:GPI (ashift:GPI (match_operator:GPI 4 "subreg_lowpart_operator"
+[(match_operand 1 "register_operand")])
+  (match_operand:QI 2 "aarch64_shift_imm_<mode>"))
+  (match_operand:GPI 3 "const_int_operand"))
+ (zero_extend:GPI (match_dup 1))))]
+  "can_create_pseudo_p ()
+   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[1]))
+  << INTVAL (operands[2]), <MODE>mode)
+   == INTVAL (operands[3]))"
+  [(set (match_dup 4) (zero_extend:GPI 

[PATCH] re PR tree-optimization/93210 (Sub-optimal code optimization on struct/combound constexpr (gcc vs. clang))

2021-04-14 Thread Stefan Schulze Frielinghaus via Gcc-patches
Regarding test gcc.dg/pr93210.c, on different targets GIMPLE code may
slightly differ which is why the scan-tree-dump-times directive may
fail.  For example, for a RETURN_EXPR on x86_64 we have

  return 0x11100f0e0d0c0a090807060504030201;

whereas on IBM Z the first operand is a RESULT_DECL like

  <retval> = 0x102030405060708090a0c0d0e0f1011;
  return <retval>;

gcc/testsuite/ChangeLog:

* gcc.dg/pr93210.c: Adapt regex in order to also support a
RESULT_DECL as an operand for a RETURN_EXPR.

Ok for mainline?
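
For illustration only, the adapted pattern can be tried outside DejaGnu.  The sketch below assumes that the Tcl pattern, once the \[ escapes are removed, behaves the same under the ECMAScript grammar of std::regex; counts_as_match and demo are made-up helper names, and the sample lines are the two dump forms quoted above:

```cpp
#include <cassert>
#include <regex>
#include <string>

// ECMAScript rendering of the adapted pattern (Tcl's "\[" becomes "[").
inline const std::regex& dump_pattern() {
    static const std::regex re("(?:return| =) [0-9][0-9a-fA-FxX]*;");
    return re;
}

// True if a line of the "optimized" dump contains a match for the pattern.
inline bool counts_as_match(const std::string& line) {
    return std::regex_search(line, dump_pattern());
}

inline void demo() {
    // x86_64 form: the constant is the operand of the return.
    assert(counts_as_match("  return 0x11100f0e0d0c0a090807060504030201;"));
    // IBM Z form: the constant is assigned to the RESULT_DECL first.
    assert(counts_as_match("  <retval> = 0x102030405060708090a0c0d0e0f1011;"));
    // The bare return of the RESULT_DECL is not counted a second time.
    assert(!counts_as_match("  return <retval>;"));
}
```

Since the plain "return <retval>;" line does not match, each function should still contribute one match on either kind of target.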

---
 gcc/testsuite/gcc.dg/pr93210.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/pr93210.c b/gcc/testsuite/gcc.dg/pr93210.c
index ec4194b6b49..134d32bc505 100644
--- a/gcc/testsuite/gcc.dg/pr93210.c
+++ b/gcc/testsuite/gcc.dg/pr93210.c
@@ -1,7 +1,7 @@
 /* PR tree-optimization/93210 */
 /* { dg-do run } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump-times "return \[0-9]\[0-9a-fA-FxX]*;" 31 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "(?:return| =) \[0-9]\[0-9a-fA-FxX]*;" 31 "optimized" } } */
 
 #ifdef __SIZEOF_INT128__
 typedef unsigned __int128 L;
-- 
2.23.0



Re: [committed] add test for PR 86058

2021-04-14 Thread Martin Sebor via Gcc-patches

On 4/14/21 2:11 AM, Christophe Lyon wrote:

On Tue, 13 Apr 2021 at 21:50, Martin Sebor via Gcc-patches
 wrote:


The issue has been fixed so r11-8161 just adds the test case:
https://gcc.gnu.org/g:8084ab15a3e300e3b2c537e56e0f3a1b00778aec



Hi,

This new test fails on arm (and aarch64 with -mabi=ilp32):
XFAIL: gcc.dg/pr86058.c pr? (test for warnings, line 13)
FAIL: gcc.dg/pr86058.c actual (test for warnings, line 13)
PASS: gcc.dg/pr86058.c (test for excess errors)

Can you check?


Apparently the IL GCC emits on some targets (arm and aarch64 with
-mabi=ilp32, and powerpc64 to name the three where the failures have
been pointed out) isn't handled by the uninit pass and so it doesn't
issue the expected warning.  That might be a new (as in previously
unknown) limitation in the warning or one I don't remember coming
across.

I don't see excess warnings with my arm-eabi cross-compiler.  What
are they in your environment?

I have limited the test to just x86_64 for now and repurposed pr100073
where the same failure was reported on powerpc64 to track the missing
warning on these targets.

Martin


Re: [PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Richard Sandiford via Gcc-patches
Jakub Jelinek  writes:
> Hi!
>
> Before combiner added 2 to 2 combinations, the following testcase functions
> have been all compiled into 2 instructions, zero/sign extensions or and
> followed by orr with lsl, e.g. for the first function
> Trying 7 -> 8:
> 7: r96:SI=r94:SI<<0xb
> 8: r95:SI=r96:SI|r94:SI
>   REG_DEAD r96:SI
>   REG_DEAD r94:SI
> Successfully matched this instruction:
> (set (reg:SI 95)
> (ior:SI (ashift:SI (reg/v:SI 94 [ i ])
> (const_int 11 [0xb]))
> (reg/v:SI 94 [ i ])))
> is the important successful try_combine and so we end up with
> and w0, w0, 255
> orr w0, w0, w0, lsl 11
> in the body.
> With 2 to 2 combination, before that can trigger, another successful
> combination:
> Trying 2 -> 7:
> 2: r94:SI=zero_extend(x0:QI)
>   REG_DEAD x0:QI
> 7: r96:SI=r94:SI<<0xb
> is replaced with:
> (set (reg/v:SI 94 [ i ])
> (zero_extend:SI (reg:QI 0 x0 [ i ])))
> and
> (set (reg:SI 96)
> (and:SI (ashift:SI (reg:SI 0 x0 [ i ])
> (const_int 11 [0xb]))
> (const_int 522240 [0x7f800])))
> and in the end results in 3 instructions in the body:
> and w1, w0, 255
> ubfiz   w0, w0, 11, 8
> orr w0, w0, w1
> The following combine splitters help undo that when combiner tries to
> combine 3 instructions - the zero/sign extend or and, the other insn
> from the 2 to 2 combination ([us]bfiz) and the logical op, the CPUs
> don't have an insn to do everything in one op, but we can split it
> back into the zero/sign extend or and followed by logical with lsl.
>
> Bootstrapped/regtested on aarch64-linux, ok for trunk?
>
> 2021-04-14  Jakub Jelinek  
>
>   PR target/100056
>   * config/aarch64/aarch64.md (*_3):
>   Add combine splitters for *_ashl3 with
>   ZERO_EXTEND, SIGN_EXTEND or AND.
>
>   * gcc.target/aarch64/pr100056.c: New test.
>
> --- gcc/config/aarch64/aarch64.md.jj  2021-04-13 12:40:57.0 +0200
> +++ gcc/config/aarch64/aarch64.md 2021-04-13 19:54:17.015764651 +0200
> @@ -4431,6 +4431,59 @@ (define_insn "*_[(set_attr "type" "logic_shift_imm")]
>  )
>  
> +(define_split
> +  [(set (match_operand:GPI 0 "register_operand")
> + (LOGICAL:GPI
> +   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> +(match_operand:QI 2 "aarch64_shift_imm_<mode>"))
> +(match_operand:GPI 4 "const_int_operand"))
> +   (zero_extend:GPI (match_operand 3 "register_operand"))))]
> +  "can_create_pseudo_p ()
> +   && REG_P (operands[1])
> +   && REG_P (operands[3])
> +   && REGNO (operands[1]) == REGNO (operands[3])
> +   && ((unsigned HOST_WIDE_INT)
> +   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
> +<< INTVAL (operands[2]), <MODE>mode)
> +   == UINTVAL (operands[4]))"

IMO this would be easier to understand as:

   && (trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
   << INTVAL (operands[2]), <MODE>mode)
   == INTVAL (operands[4]))

(At first I thought the cast and UINTVAL were trying to escape the
sign-extension canonicalisation.)

I'm not sure about this one though.  The REGNO checks mean that this is
effectively for hard registers only.  I thought one of the reasons for
make_more_copies was to avoid combining hard registers like this, so I'm
not sure we should have a pattern that specifically targets them.

Segher, have I misunderstood?

> +  [(set (match_dup 4) (zero_extend:GPI (match_dup 3)))
> +   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
> +(match_dup 4)))]
> +  "operands[4] = gen_reg_rtx (<MODE>mode);"
> +)
> +
> +(define_split
> +  [(set (match_operand:GPI 0 "register_operand")
> + (LOGICAL:GPI
> +   (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
> +(match_operand:QI 2 "aarch64_shift_imm_<mode>"))
> +(match_operand:GPI 4 "const_int_operand"))
> +   (and:GPI (match_dup 1) (match_operand:GPI 3 "const_int_operand"))))]
> +  "can_create_pseudo_p ()
> +   && pow2_or_zerop (UINTVAL (operands[3]) + 1)
> +   && ((unsigned HOST_WIDE_INT)
> +   trunc_int_for_mode (UINTVAL (operands[3])
> +<< INTVAL (operands[2]), <MODE>mode)
> +   == UINTVAL (operands[4]))"
> +  [(set (match_dup 4) (and:GPI (match_dup 1) (match_dup 3)))
> +   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
> +(match_dup 4)))]
> +  "operands[4] = gen_reg_rtx (<MODE>mode);"
> +)
> +
> +(define_split
> +  [(set (match_operand:GPI 0 "register_operand")
> + (LOGICAL:GPI
> +   (ashift:GPI (sign_extend:GPI (match_operand 1 "register_operand"))
> +   (match_operand:QI 2 "aarch64_shift_imm_<mode>"))
> +   (sign_extend:GPI (match_dup 1))))]
> +  "can_create_pseudo_p ()"
> +  [(set (match_dup 4) (sign_extend:GPI (match_dup 1)))
> +   (set (match_dup 0) 

[PATCH] c++: Tweak merging of vector attributes that affect type identity [PR98852]

2021-04-14 Thread Richard Sandiford via Gcc-patches
<arm_neon.h> types are distinct from GNU vector types in at least
their mangling.  However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs.  This caused problems like the ones
reported in PR95726.

The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors.  However, this in turn caused PR98852, where cp_common_type
would merge the type attributes from the two source types and attach
the result to the common type.  For example:

   unsigned vector with no attribute + signed vector with attribute X

would get converted to:

   unsigned vector with attribute X

That isn't what we want in this case, since X describes the mangling
of the original type.  But even if we dropped the mangling from X and
worked it out from context, we would still have a situation in which
the common type was provably distinct from both of the source types:
it would take its <arm_neon.h>-ness from one side and its signedness
from the other.  I guess there are other cases where the common type
doesn't match either side, but I'm not sure it's the obvious behaviour
here.  It's also different from GCC 10.1 and earlier, where the unsigned
vector “won” in its original form.

This patch instead merges only the attributes that don't affect type
identity.  For now I've restricted it to vector types, since we're so
close to GCC 11, but it might make sense to use this elsewhere.
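
As a rough model of the intended rule (not the GCC implementation, which works on tree attribute lists), the following sketch uses a toy Attr type with an explicit affects_identity flag.  The names Attr, AttrList and restrict_identity_attrs_to are invented for the illustration, and "X" and "Y" are placeholder attribute names:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy attribute: a name plus a flag saying whether it affects type identity.
struct Attr {
    std::string name;
    bool affects_identity;
    bool operator==(const Attr& other) const {
        return name == other.name && affects_identity == other.affects_identity;
    }
};
using AttrList = std::vector<Attr>;

// Keep every attribute that does not affect type identity.  Keep an
// identity-affecting attribute only if the same attribute is in ok_attrs.
inline AttrList restrict_identity_attrs_to(const AttrList& attrs,
                                           const AttrList& ok_attrs) {
    AttrList out;
    for (const Attr& a : attrs) {
        const bool keep =
            !a.affects_identity
            || std::find(ok_attrs.begin(), ok_attrs.end(), a) != ok_attrs.end();
        if (keep)
            out.push_back(a);
    }
    return out;
}

inline void demo() {
    // Merged attributes of "unsigned vector, no attribute" and
    // "signed vector with identity-affecting attribute X" plus a harmless Y.
    const AttrList merged = {{"X", true}, {"Y", false}};
    const AttrList unsigned_vector_attrs = {};

    // X is dropped because the chosen common type never had it; Y survives.
    const AttrList result =
        restrict_identity_attrs_to(merged, unsigned_vector_attrs);
    assert(result.size() == 1 && result[0].name == "Y");
}
```

With this rule the common type keeps the identity of the side it was derived from and only picks up attributes that cannot change which type it is.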

I've tried to audit the C and target-specific attributes to look for
other types that might be affected by this, but I couldn't see any.
The closest was s390_vector_bool, but the handler for that attribute
changes the type node and drops the attribute itself
(*no_add_attrs = true).

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi and
x86_64-linux-gnu.  OK for trunk?  The bug also occurs on GCC 10 branch,
but we'll need a slightly different fix there.

Richard


gcc/
PR c++/98852
* attribs.h (restrict_type_identity_attributes_to): Declare.
* attribs.c (restrict_type_identity_attributes_to): New function.

gcc/cp/
PR c++/98852
* typeck.c (merge_type_attributes_from): New function.
(cp_common_type): Use it for vector types.
---
 gcc/attribs.c |  23 
 gcc/attribs.h |   1 +
 gcc/cp/typeck.c   |  15 ++-
 .../advsimd-intrinsics/advsimd-intrinsics.exp |  72 
 .../aarch64/advsimd-intrinsics/pr98852.C  | 110 ++
 5 files changed, 219 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/advsimd-intrinsics/advsimd-intrinsics.exp
 create mode 100644 gcc/testsuite/g++.target/aarch64/advsimd-intrinsics/pr98852.C

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 2fb29541f3f..3ffa1b6bc81 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1420,6 +1420,29 @@ affects_type_identity_attributes (tree attrs, bool value)
   return remove_attributes_matching (attrs, predicate);
 }
 
+/* Remove attributes that affect type identity from ATTRS unless the
+   same attributes occur in OK_ATTRS.  */
+
+tree
+restrict_type_identity_attributes_to (tree attrs, tree ok_attrs)
+{
+  auto predicate = [ok_attrs](const_tree attr,
+ const attribute_spec *as) -> bool
+{
+  if (!as || !as->affects_type_identity)
+   return true;
+
+  for (tree ok_attr = lookup_attribute (as->name, ok_attrs);
+  ok_attr;
+  ok_attr = lookup_attribute (as->name, TREE_CHAIN (ok_attr)))
+   if (simple_cst_equal (TREE_VALUE (ok_attr), TREE_VALUE (attr)) == 1)
+ return true;
+
+  return false;
+};
+  return remove_attributes_matching (attrs, predicate);
+}
+
 /* Return a type like TTYPE except that its TYPE_ATTRIBUTE
is ATTRIBUTE.
 
diff --git a/gcc/attribs.h b/gcc/attribs.h
index eadb1d0fac9..df78eb152f9 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -66,6 +66,7 @@ extern bool attribute_value_equal (const_tree, const_tree);
 extern int comp_type_attributes (const_tree, const_tree);
 
 extern tree affects_type_identity_attributes (tree, bool = true);
+extern tree restrict_type_identity_attributes_to (tree, tree);
 
 /* Default versions of target-overridable functions.  */
 extern tree merge_decl_attributes (tree, tree);
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 11dee7d8753..50d0f1e6a62 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -261,6 +261,17 @@ original_type (tree t)
   return cp_build_qualified_type (t, quals);
 }
 
+/* Merge the attributes of type OTHER_TYPE into the attributes of type TYPE
+   and return a variant of TYPE with the merged attributes.  */
+
+static tree
+merge_type_attributes_from (tree type, tree other_type)
+{
+  tree attrs = targetm.merge_type_attributes (type, other_type);
+  attrs = restrict_type_identity_attributes_to (attrs, TYPE_ATTRIBUTES 

[PATCH] c: Don't drop vector attributes that affect type identity [PR98852]

2021-04-14 Thread Richard Sandiford via Gcc-patches
<arm_neon.h> types are distinct from GNU vector types in at least
their mangling.  However, there used to be nothing explicit in the
VECTOR_TYPE itself to indicate the difference: we simply treated them
as distinct TYPE_MAIN_VARIANTs.  This caused problems like the ones
reported in PR95726.

The fix for that PR was to add type attributes to the <arm_neon.h>
types, in order to maintain the distinction between them and GNU
vectors.  However, this in turn caused PR98852, where c_common_type
would unconditionally drop the attributes on the source types.
This meant that:

    <arm_neon.h> vector + <arm_neon.h> vector

had a GNU vector type rather than an <arm_neon.h> vector type.

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96377#c2 for
Jakub's analysis of the history of this c_common_type code.
TBH I'm not sure which case the build_type_attribute_variant
code is handling, but I think we should at least avoid dropping
attributes that affect type identity.

I've tried to audit the C and target-specific attributes to look
for other types that might be affected by this, but I couldn't
see any.  We are only dealing with:

  gcc_assert (code1 == VECTOR_TYPE || code1 == COMPLEX_TYPE
  || code1 == FIXED_POINT_TYPE || code1 == REAL_TYPE
  || code1 == INTEGER_TYPE);

which excludes most affects_type_identity attributes.  The closest
was s390_vector_bool, but the handler for that attribute changes
the type node and drops the attribute itself (*no_add_attrs = true).

I put the main list handling into a separate function
(remove_attributes_matching) because a later patch will need it
for something else.

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi and
x86_64-linux-gnu.  OK for trunk?  The bug also occurs on GCC 10 branch,
but we'll need a slightly different fix there.

Thanks,
Richard


gcc/
PR c/98852
* attribs.h (affects_type_identity_attributes): Declare.
* attribs.c (remove_attributes_matching): New function.
(affects_type_identity_attributes): Likewise.

gcc/c/
PR c/98852
* c-typeck.c (c_common_type): Do not drop attributes that
affect type identity.

gcc/testsuite/
PR c/98852
* gcc.target/aarch64/advsimd-intrinsics/pr98852.c: New test.
---
 gcc/attribs.c |  54 
 gcc/attribs.h |   2 +
 gcc/c/c-typeck.c  |  10 +-
 .../aarch64/advsimd-intrinsics/pr98852.c  | 129 ++
 4 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/pr98852.c

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 16c6b12d477..2fb29541f3f 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -1366,6 +1366,60 @@ comp_type_attributes (const_tree type1, const_tree type2)
   return targetm.comp_type_attributes (type1, type2);
 }
 
+/* PREDICATE acts as a function of type:
+
+ (const_tree attr, const attribute_spec *as) -> bool
+
+   where ATTR is an attribute and AS is its possibly-null specification.
+   Return a list of every attribute in attribute list ATTRS for which
+   PREDICATE is true.  Return ATTRS itself if PREDICATE returns true
+   for every attribute.  */
+
+template<typename Predicate>
+tree
+remove_attributes_matching (tree attrs, Predicate predicate)
+{
+  tree new_attrs = NULL_TREE;
+  tree *ptr = &new_attrs;
+  const_tree start = attrs;
+  for (const_tree attr = attrs; attr; attr = TREE_CHAIN (attr))
+{
+  tree name = get_attribute_name (attr);
+  const attribute_spec *as = lookup_attribute_spec (name);
+  const_tree end;
+  if (!predicate (attr, as))
+   end = attr;
+  else if (start == attrs)
+   continue;
+  else
+   end = TREE_CHAIN (attr);
+
+  for (; start != end; start = TREE_CHAIN (start))
+   {
+ *ptr = tree_cons (TREE_PURPOSE (start),
+   TREE_VALUE (start), NULL_TREE);
+ TREE_CHAIN (*ptr) = NULL_TREE;
+ ptr = &TREE_CHAIN (*ptr);
+   }
+  start = TREE_CHAIN (attr);
+}
+  gcc_assert (!start || start == attrs);
+  return start ? attrs : new_attrs;
+}
+
+/* If VALUE is true, return the subset of ATTRS that affect type identity,
+   otherwise return the subset of ATTRS that don't affect type identity.  */
+
+tree
+affects_type_identity_attributes (tree attrs, bool value)
+{
+  auto predicate = [value](const_tree, const attribute_spec *as) -> bool
+{
+  return bool (as && as->affects_type_identity) == value;
+};
+  return remove_attributes_matching (attrs, predicate);
+}
+
 /* Return a type like TTYPE except that its TYPE_ATTRIBUTE
is ATTRIBUTE.
 
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 898e73db3e4..eadb1d0fac9 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -65,6 +65,8 @@ extern bool attribute_value_equal (const_tree, const_tree);
warning to be generated).  */
 extern int comp_type_attributes (const_tree, const_tree);
 
+extern tree affects_type_identity_attributes (tree, bool = true);

[PATCH] aarch64: Handle more SVE vector constants [PR99246]

2021-04-14 Thread Richard Sandiford via Gcc-patches
PR99246 is about a case in which we failed to handle a CONST_VECTOR
with NELTS_PER_PATTERN==2, i.e. a vector with a “foreground” sequence
of N vectors followed by a repeating “background” sequence of N vectors.
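
One way to read such a constant is as a lane-wise select between two repeating vectors.  The sketch below is plain C++ rather than GCC code; expand_direct and expand_by_select are hypothetical names, lanes stands for however many elements the runtime vector happens to have, and both input sequences are assumed to have the same non-zero length N:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Direct reading: the N foreground values, then the N background values
// repeated for the rest of the vector.
inline std::vector<int> expand_direct(const std::vector<int>& fg,
                                      const std::vector<int>& bg,
                                      std::size_t lanes) {
    const std::size_t n = fg.size();  // assumed equal to bg.size(), n > 0
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; ++i)
        out[i] = i < n ? fg[i] : bg[i % n];
    return out;
}

// Same value built as a select: a predicate that is true for the first N
// lanes chooses between "fg repeated" and "bg repeated".
inline std::vector<int> expand_by_select(const std::vector<int>& fg,
                                         const std::vector<int>& bg,
                                         std::size_t lanes) {
    const std::size_t n = fg.size();
    std::vector<int> out(lanes);
    for (std::size_t i = 0; i < lanes; ++i) {
        const bool pred = i < n;          // first N lanes active
        const int if_true = fg[i % n];    // duplicated foreground pattern
        const int if_false = bg[i % n];   // duplicated background pattern
        out[i] = pred ? if_true : if_false;
    }
    return out;
}

inline void demo() {
    const std::vector<int> fg{1, 2}, bg{0, 9};
    const std::vector<int> want{1, 2, 0, 9, 0, 9};
    assert(expand_direct(fg, bg, 6) == want);
    assert(expand_by_select(fg, bg, 6) == want);
}
```

The two functions agree for any lane count, which is what makes a predicate plus two simpler constants a valid way to materialise the vector.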

At the moment, it's difficult to produce these vectors directly,
but I'm hoping that for GCC 12 we'll do more folding, which will
in turn make this easier to test and easier to optimise.  Until then,
the patch simply relies on the testcase in the PR.

Tested on aarch64-linux-gnu, pushed to trunk so far.

Richard


gcc/
PR target/99246
* config/aarch64/aarch64.c (aarch64_expand_sve_const_vector_sel):
New function.
(aarch64_expand_sve_const_vector): Use it for nelts_per_pattern==2.

gcc/testsuite/
PR target/99246
* gcc.target/aarch64/sve/acle/general/pr99246.c: New test.
---
 gcc/config/aarch64/aarch64.c  | 54 +++
 .../aarch64/sve/acle/general/pr99246.c| 17 ++
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr99246.c

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 640550419dc..04b55d9070b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5166,6 +5166,56 @@ aarch64_expand_sve_ld1rq (rtx dest, rtx src)
   return true;
 }
 
+/* SRC is an SVE CONST_VECTOR that contains N "foreground" values followed
+   by N "background" values.  Try to move it into TARGET using:
+
+  PTRUE PRED.<T>, VL<N>
+  MOV TRUE.<T>, #<foreground>
+  MOV FALSE.<T>, #<background>
+  SEL TARGET.<T>, PRED.<T>, TRUE.<T>, FALSE.<T>
+
+   The PTRUE is always a single instruction but the MOVs might need a
+   longer sequence.  If the background value is zero (as it often is),
+   the sequence can sometimes collapse to a PTRUE followed by a
+   zero-predicated move.
+
+   Return the target on success, otherwise return null.  */
+
+static rtx
+aarch64_expand_sve_const_vector_sel (rtx target, rtx src)
+{
+  gcc_assert (CONST_VECTOR_NELTS_PER_PATTERN (src) == 2);
+
+  /* Make sure that the PTRUE is valid.  */
+  machine_mode mode = GET_MODE (src);
+  machine_mode pred_mode = aarch64_sve_pred_mode (mode);
+  unsigned int npatterns = CONST_VECTOR_NPATTERNS (src);
+  if (aarch64_svpattern_for_vl (pred_mode, npatterns)
+  == AARCH64_NUM_SVPATTERNS)
+return NULL_RTX;
+
+  rtx_vector_builder pred_builder (pred_mode, npatterns, 2);
+  rtx_vector_builder true_builder (mode, npatterns, 1);
+  rtx_vector_builder false_builder (mode, npatterns, 1);
+  for (unsigned int i = 0; i < npatterns; ++i)
+{
+  true_builder.quick_push (CONST_VECTOR_ENCODED_ELT (src, i));
+  pred_builder.quick_push (CONST1_RTX (BImode));
+}
+  for (unsigned int i = 0; i < npatterns; ++i)
+{
+  false_builder.quick_push (CONST_VECTOR_ENCODED_ELT (src, i + npatterns));
+  pred_builder.quick_push (CONST0_RTX (BImode));
+}
+  expand_operand ops[4];
+  create_output_operand (&ops[0], target, mode);
+  create_input_operand (&ops[1], true_builder.build (), mode);
+  create_input_operand (&ops[2], false_builder.build (), mode);
+  create_input_operand (&ops[3], pred_builder.build (), pred_mode);
+  expand_insn (code_for_vcond_mask (mode, mode), 4, ops);
+  return target;
+}
+
 /* Return a register containing CONST_VECTOR SRC, given that SRC has an
SVE data mode and isn't a legitimate constant.  Use TARGET for the
result if convenient.
@@ -5300,6 +5350,10 @@ aarch64_expand_sve_const_vector (rtx target, rtx src)
   if (GET_MODE_NUNITS (mode).is_constant ())
 return NULL_RTX;
 
+  if (nelts_per_pattern == 2)
+if (rtx res = aarch64_expand_sve_const_vector_sel (target, src))
+  return res;
+
   /* Expand each pattern individually.  */
   gcc_assert (npatterns > 1);
   rtx_vector_builder builder;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr99246.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr99246.c
new file mode 100644
index 000..7f1079c1bd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/pr99246.c
@@ -0,0 +1,17 @@
+/* { dg-options "-Os" } */
+
+#include <arm_sve.h>
+extern char b[];
+int x;
+void f() {
+  while (x) {
+x = svaddv(
+svnot_z(svnot_z(svptrue_pat_b8(SV_VL6),
+svmov_z(svptrue_pat_b8(SV_VL1),
+svptrue_pat_b16(SV_VL3))),
+svptrue_pat_b64(SV_VL2)),
+svdup_s32(8193));
+for (int j = x; j; j++)
+  b[j] = 0;
+  }
+}


[PATCH] Check for matching CONST_VECTOR encodings [PR99929]

2021-04-14 Thread Richard Sandiford via Gcc-patches
PR99929 is one of those “how did we get away with this for so long”
bugs: the equality routines weren't checking whether two variable-length
CONST_VECTORs had the same encoding.  This meant that:

   { 1, 0, 0, 0, 0, 0, ... }

would appear to be equal to:

   { 1, 0, 1, 0, 1, 0, ... }

since both are represented using the elements { 1, 0 }.
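
A toy model makes the failure mode easier to see.  The sketch below is not the rtx representation; EncodedVector, elements_equal and vectors_equal are invented names, and it assumes that an encoding is characterised by the number of patterns and the number of elements per pattern:

```cpp
#include <cassert>
#include <vector>

// Toy model of a variable-length constant vector encoding.
struct EncodedVector {
    unsigned npatterns;          // number of interleaved patterns
    unsigned nelts_per_pattern;  // encoded elements per pattern
    std::vector<long> encoded;   // npatterns * nelts_per_pattern elements
};

// The check that was missing: do both vectors use the same encoding?
inline bool same_encoding(const EncodedVector& a, const EncodedVector& b) {
    return a.npatterns == b.npatterns
           && a.nelts_per_pattern == b.nelts_per_pattern;
}

// Old behaviour in this model: compare the encoded elements only.
inline bool elements_equal(const EncodedVector& a, const EncodedVector& b) {
    return a.encoded == b.encoded;
}

// Fixed behaviour in this model: the encodings must match as well.
inline bool vectors_equal(const EncodedVector& a, const EncodedVector& b) {
    return same_encoding(a, b) && a.encoded == b.encoded;
}

inline void demo() {
    // { 1, 0, 0, 0, ... }: one pattern, a leading 1 then a repeating 0.
    const EncodedVector one_then_zeros{1, 2, {1, 0}};
    // { 1, 0, 1, 0, ... }: two interleaved single-element patterns.
    const EncodedVector alternating{2, 1, {1, 0}};

    assert(elements_equal(one_then_zeros, alternating));   // the bug
    assert(!vectors_equal(one_then_zeros, alternating));   // the fix
}
```

Constant-length vectors are not affected in the same way, since they list every element explicitly.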

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi
and x86_64-linux-gnu.  OK to install?  I'd like to backport
as far as GCC 8, even though the testcase itself requires
GCC 10 or later.

Richard


gcc/
PR rtl-optimization/99929
* rtl.h (same_vector_encodings_p): New function.
* cse.c (exp_equiv_p): Check that CONST_VECTORs have the same encoding.
* cselib.c (rtx_equal_for_cselib_1): Likewise.
* jump.c (rtx_renumbered_equal_p): Likewise.
* lra-constraints.c (operands_match_p): Likewise.
* reload.c (operands_match_p): Likewise.
* rtl.c (rtx_equal_p_cb, rtx_equal_p): Likewise.

gcc/testsuite/
* gcc.target/aarch64/sve/pr99929_1.c: New file.
* gcc.target/aarch64/sve/pr99929_2.c: Likewise.
---
 gcc/cse.c   |  5 +
 gcc/cselib.c|  5 +
 gcc/jump.c  |  5 +
 gcc/lra-constraints.c   |  5 +
 gcc/reload.c|  5 +
 gcc/rtl.c   | 10 ++
 gcc/rtl.h   | 17 +
 .../gcc.target/aarch64/sve/pr99929_1.c  | 16 
 .../gcc.target/aarch64/sve/pr99929_2.c  |  5 +
 9 files changed, 73 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr99929_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr99929_2.c

diff --git a/gcc/cse.c b/gcc/cse.c
index 37c6959abea..df191d5aa3f 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -2637,6 +2637,11 @@ exp_equiv_p (const_rtx x, const_rtx y, int validate, bool for_gcse)
 CASE_CONST_UNIQUE:
   return x == y;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case LABEL_REF:
   return label_ref_label (x) == label_ref_label (y);
 
diff --git a/gcc/cselib.c b/gcc/cselib.c
index 2d34a914c6b..779874eeb2d 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -1048,6 +1048,11 @@ rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode, int depth)
 case DEBUG_EXPR:
   return 0;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case DEBUG_IMPLICIT_PTR:
   return DEBUG_IMPLICIT_PTR_DECL (x)
 == DEBUG_IMPLICIT_PTR_DECL (y);
diff --git a/gcc/jump.c b/gcc/jump.c
index 561dbb70d15..67b5c3374a6 100644
--- a/gcc/jump.c
+++ b/gcc/jump.c
@@ -1777,6 +1777,11 @@ rtx_renumbered_equal_p (const_rtx x, const_rtx y)
 CASE_CONST_UNIQUE:
   return 0;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case LABEL_REF:
   /* We can't assume nonlocal labels have their following insns yet.  */
   if (LABEL_REF_NONLOCAL_P (x) || LABEL_REF_NONLOCAL_P (y))
diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
index 62bcfc31772..1560f652da6 100644
--- a/gcc/lra-constraints.c
+++ b/gcc/lra-constraints.c
@@ -834,6 +834,11 @@ operands_match_p (rtx x, rtx y, int y_hard_regno)
 CASE_CONST_UNIQUE:
   return false;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case LABEL_REF:
   return label_ref_label (x) == label_ref_label (y);
 case SYMBOL_REF:
diff --git a/gcc/reload.c b/gcc/reload.c
index 461fd0272eb..e18e27c2405 100644
--- a/gcc/reload.c
+++ b/gcc/reload.c
@@ -2310,6 +2310,11 @@ operands_match_p (rtx x, rtx y)
 CASE_CONST_UNIQUE:
   return 0;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case LABEL_REF:
   return label_ref_label (x) == label_ref_label (y);
 case SYMBOL_REF:
diff --git a/gcc/rtl.c b/gcc/rtl.c
index 1aa794c82ca..e4ae1683069 100644
--- a/gcc/rtl.c
+++ b/gcc/rtl.c
@@ -466,6 +466,11 @@ rtx_equal_p_cb (const_rtx x, const_rtx y, rtx_equal_p_callback_function cb)
 CASE_CONST_UNIQUE:
   return 0;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case DEBUG_IMPLICIT_PTR:
   return DEBUG_IMPLICIT_PTR_DECL (x)
 == DEBUG_IMPLICIT_PTR_DECL (y);
@@ -608,6 +613,11 @@ rtx_equal_p (const_rtx x, const_rtx y)
 CASE_CONST_UNIQUE:
   return 0;
 
+case CONST_VECTOR:
+  if (!same_vector_encodings_p (x, y))
+   return false;
+  break;
+
 case DEBUG_IMPLICIT_PTR:
   return DEBUG_IMPLICIT_PTR_DECL (x)
 == DEBUG_IMPLICIT_PTR_DECL (y);
diff --git a/gcc/rtl.h 

[PATCH] Better const_vector printing

2021-04-14 Thread Richard Sandiford via Gcc-patches
Looking at PR99929 showed that we weren't dumping enough information
about variable-length CONST_VECTORs.  Something like:

  (const_vector:VNx4SI [(const_int 1) (const_int 0)])

could be either:

(a) 1, 0, 1, 0, repeating
(b) 1 followed by all zeros

This patch adds more information to the dumps.  There are four cases:

(a) above:

(const_vector:VNx4SI repeat [
  (const_int 1)
  (const_int 0)
])

(b) above:

(const_vector:VNx4SI [
  (const_int 1)
  repeat [
(const_int 0)
  ]
])

a single stepped sequence:

(const_vector:VNx4SI [
  (const_int 0)
  stepped [
(const_int 1)
(const_int 2)
  ]
])

interleaved stepped sequences:

(const_vector:VNx4SI [
  (const_int 0)
  (const_int 40)
  stepped (interleave 2) [
(const_int 1)
(const_int 41)
(const_int 2)
(const_int 42)
  ]
])

There are probably better syntaxes, but hopefully this is at least
an improvement on the status quo.
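
To show what the four dump forms stand for, here is a small sketch that expands such an encoding into its leading elements.  It is a simplified integer-only model, not the GCC accessor; Encoded, element and first_elements are names made up for this example, and the stepping rule is the usual one where later elements of a pattern continue with the difference between its last two encoded elements:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy encoding: elts holds npatterns * nelts_per_pattern values, stored
// interleaved (element i belongs to pattern i % npatterns).
struct Encoded {
    unsigned npatterns;
    unsigned nelts_per_pattern;
    std::vector<long> elts;
};

// Value of element i of the full (conceptually unbounded) vector.
inline long element(const Encoded& v, std::size_t i) {
    const std::size_t encoded_nelts = v.elts.size();
    if (i < encoded_nelts)
        return v.elts[i];
    const std::size_t pattern = i % v.npatterns;
    const std::size_t count = i / v.npatterns;
    // Last encoded element of this pattern.
    const std::size_t final_idx = encoded_nelts - v.npatterns + pattern;
    if (v.nelts_per_pattern <= 2)
        return v.elts[final_idx];  // "repeat": the last value is duplicated
    // "stepped": continue the series with a constant step.
    const long step = v.elts[final_idx] - v.elts[final_idx - v.npatterns];
    return v.elts[final_idx] + static_cast<long>(count - 2) * step;
}

inline std::vector<long> first_elements(const Encoded& v, std::size_t n) {
    std::vector<long> out;
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(element(v, i));
    return out;
}

inline void demo() {
    // (a) "repeat [1 0]"
    assert(first_elements({2, 1, {1, 0}}, 4) == std::vector<long>({1, 0, 1, 0}));
    // (b) "[1 repeat [0]]"
    assert(first_elements({1, 2, {1, 0}}, 4) == std::vector<long>({1, 0, 0, 0}));
    // single stepped sequence "[0 stepped [1 2]]"
    assert(first_elements({1, 3, {0, 1, 2}}, 5)
           == std::vector<long>({0, 1, 2, 3, 4}));
    // interleaved stepped sequences "[0 40 stepped (interleave 2) [1 41 2 42]]"
    assert(first_elements({2, 3, {0, 40, 1, 41, 2, 42}}, 8)
           == std::vector<long>({0, 40, 1, 41, 2, 42, 3, 43}));
}
```

The first two cases share the encoded elements 1 and 0, which is exactly the ambiguity the old dump format could not express.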

Tested on aarch64-linux-gnu, arm-linux-gnueabihf, armeb-eabi
and x86_64-linux-gnu.  OK to install now, or should it wait
until GCC 12?  (It only affects SVE in practice.)

Richard


gcc/
* print-rtl.c (rtx_writer::print_rtx_operand_codes_E_and_V): Print
more information about variable-length CONST_VECTORs.
---
 gcc/print-rtl.c | 32 +++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index c7982bce507..081fc50fab8 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -370,6 +370,10 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
   print_rtx_head, m_indent * 2, "");
   m_sawclose = 0;
 }
+  if (GET_CODE (in_rtx) == CONST_VECTOR
+  && !GET_MODE_NUNITS (GET_MODE (in_rtx)).is_constant ()
+  && CONST_VECTOR_DUPLICATE_P (in_rtx))
+fprintf (m_outfile, " repeat");
   fputs (" [", m_outfile);
   if (XVEC (in_rtx, idx) != NULL)
 {
@@ -377,12 +381,32 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
   if (XVECLEN (in_rtx, idx))
m_sawclose = 1;
 
+  int barrier = XVECLEN (in_rtx, idx);
+  if (GET_CODE (in_rtx) == CONST_VECTOR
+ && !GET_MODE_NUNITS (GET_MODE (in_rtx)).is_constant ())
+   barrier = CONST_VECTOR_NPATTERNS (in_rtx);
+
   for (int j = 0; j < XVECLEN (in_rtx, idx); j++)
{
  int j1;
 
+ if (j == barrier)
+   {
+ fprintf (m_outfile, "\n%s%*s",
+  print_rtx_head, m_indent * 2, "");
+ if (!CONST_VECTOR_STEPPED_P (in_rtx))
+   fprintf (m_outfile, "repeat [");
+ else if (CONST_VECTOR_NPATTERNS (in_rtx) == 1)
+   fprintf (m_outfile, "stepped [");
+ else
+   fprintf (m_outfile, "stepped (interleave %d) [",
+CONST_VECTOR_NPATTERNS (in_rtx));
+ m_indent += 2;
+   }
+
  print_rtx (XVECEXP (in_rtx, idx, j));
- for (j1 = j + 1; j1 < XVECLEN (in_rtx, idx); j1++)
+ int limit = MIN (barrier, XVECLEN (in_rtx, idx));
+ for (j1 = j + 1; j1 < limit; j1++)
if (XVECEXP (in_rtx, idx, j) != XVECEXP (in_rtx, idx, j1))
  break;
 
@@ -393,6 +417,12 @@ rtx_writer::print_rtx_operand_codes_E_and_V (const_rtx in_rtx, int idx)
}
}
 
+  if (barrier < XVECLEN (in_rtx, idx))
+   {
+ m_indent -= 2;
+ fprintf (m_outfile, "\n%s%*s]", print_rtx_head, m_indent * 2, "");
+   }
+
   m_indent -= 2;
 }
   if (m_sawclose)
-- 
2.17.1



[Committed] IBM Z: Fix error checking for immediate builtin operands

2021-04-14 Thread Andreas Krebbel via Gcc-patches
This fixes the error checking for two of the vector builtins which
accept irregular (e.g. non-contiguous) ranges of values.

Regression tested on s390x (--with-arch=arch13).
Applied to mainline. Needs to go into 9 and 10 branch as well.

gcc/ChangeLog:

* config/s390/s390-builtins.def (O_M5, O_M12, ...): Add new macros
for mask operand types.
(s390_vec_permi_s64, s390_vec_permi_b64, s390_vec_permi_u64)
(s390_vec_permi_dbl, s390_vpdi): Use the M5 type for the immediate
operand.
(s390_vec_msum_u128, s390_vmslg): Use the M12 type for the
immediate operand.
* config/s390/s390.c (s390_const_operand_ok): Check the new
operand types and generate a list of valid values.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/imm-range-error-1.c: New test.
* gcc.target/s390/zvector/vec_msum_u128-1.c: New test.
---
 gcc/config/s390/s390-builtins.def | 85 ---
 gcc/config/s390/s390.c| 35 ++--
 .../s390/zvector/imm-range-error-1.c  | 26 ++
 .../gcc.target/s390/zvector/vec_msum_u128-1.c | 45 ++
 4 files changed, 156 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/imm-range-error-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_msum_u128-1.c

diff --git a/gcc/config/s390/s390-builtins.def b/gcc/config/s390/s390-builtins.def
index 129d7124cba..f77ab750d22 100644
--- a/gcc/config/s390/s390-builtins.def
+++ b/gcc/config/s390/s390-builtins.def
@@ -29,6 +29,9 @@
 #undef O_U16
 #undef O_U32
 
+#undef O_M5
+#undef O_M12
+
 #undef O_S2
 #undef O_S3
 #undef O_S4
@@ -37,6 +40,7 @@
 #undef O_S12
 #undef O_S16
 #undef O_S32
+
 #undef O_ELEM
 #undef O_LIT
 
@@ -85,6 +89,16 @@
 #undef O3_U32
 #undef O4_U32
 
+#undef O1_M5
+#undef O2_M5
+#undef O3_M5
+#undef O4_M5
+
+#undef O1_M12
+#undef O2_M12
+#undef O3_M12
+#undef O4_M12
+
 #undef O1_S2
 #undef O2_S2
 #undef O3_S2
@@ -140,31 +154,34 @@
 #undef O_UIMM_P
 #undef O_SIMM_P
 
-#define O_U1   1 /* unsigned  1 bit literal */
-#define O_U2   2 /* unsigned  2 bit literal */
-#define O_U3   3 /* unsigned  3 bit literal */
-#define O_U4   4 /* unsigned  4 bit literal */
-#define O_U5   5 /* unsigned  5 bit literal */
-#define O_U8   6 /* unsigned  8 bit literal */
-#define O_U12  7 /* unsigned 16 bit literal */
-#define O_U16  8 /* unsigned 16 bit literal */
-#define O_U32  9 /* unsigned 32 bit literal */
-
-#define O_S2  10 /* signed  2 bit literal */
-#define O_S3  11 /* signed  3 bit literal */
-#define O_S4  12 /* signed  4 bit literal */
-#define O_S5  13 /* signed  5 bit literal */
-#define O_S8  14 /* signed  8 bit literal */
-#define O_S12 15 /* signed 12 bit literal */
-#define O_S16 16 /* signed 16 bit literal */
-#define O_S32 17 /* signed 32 bit literal */
-
-#define O_ELEM  18 /* Element selector requiring modulo arithmetic. */
-#define O_LIT   19 /* Operand must be a literal fitting the target type.  */
+#define O_U1 1 /* unsigned  1 bit literal */
+#define O_U2 2 /* unsigned  2 bit literal */
+#define O_U3 3 /* unsigned  3 bit literal */
+#define O_U4 4 /* unsigned  4 bit literal */
+#define O_U5 5 /* unsigned  5 bit literal */
+#define O_U8 6 /* unsigned  8 bit literal */
+#define O_U12 7 /* unsigned 16 bit literal */
+#define O_U16 8 /* unsigned 16 bit literal */
+#define O_U32 9 /* unsigned 32 bit literal */
+
+#define O_M5 10 /* matches bitmask of 5 */
+#define O_M12   11 /* matches bitmask of 12 */
+
+#define O_S2 12 /* signed  2 bit literal */
+#define O_S3 13 /* signed  3 bit literal */
+#define O_S4 14 /* signed  4 bit literal */
+#define O_S5 15 /* signed  5 bit literal */
+#define O_S8 16 /* signed  8 bit literal */
+#define O_S12   17 /* signed 12 bit literal */
+#define O_S16   18 /* signed 16 bit literal */
+#define O_S32   19 /* signed 32 bit literal */
+
+#define O_ELEM  20 /* Element selector requiring modulo arithmetic. */
+#define O_LIT   21 /* Operand must be a literal fitting the target type.  */
 
 #define O_SHIFT 5
 
-#define O_UIMM_P(X) ((X) >= O_U1 && (X) <= O_U32)
+#define O_UIMM_P(X) ((X) >= O_U1 && (X) <= O_M12)
 #define O_SIMM_P(X) ((X) >= O_S2 && (X) <= O_S32)
 #define O_IMM_P(X) ((X) == O_LIT || ((X) >= O_U1 && (X) <= O_S32))
 
@@ -213,6 +230,16 @@
 #define O3_U32 (O_U32 << (2 * O_SHIFT))
 #define O4_U32 (O_U32 << (3 * O_SHIFT))
 
+#define O1_M5 O_M5
+#define O2_M5 (O_M5 << O_SHIFT)
+#define O3_M5 (O_M5 << (2 * O_SHIFT))
+#define O4_M5 (O_M5 << (3 * O_SHIFT))
+
+#define O1_M12 O_M12
+#define O2_M12 (O_M12 << O_SHIFT)
+#define O3_M12 (O_M12 << (2 * O_SHIFT))
+#define O4_M12 (O_M12 << (3 * O_SHIFT))
+
 
 #define O1_S2 O_S2
 #define O2_S2 (O_S2 << O_SHIFT)
@@ -644,12 +671,12 @@ OB_DEF_VAR (s390_vec_perm_dbl,  s390_vperm,   
  0,
 B_DEF  (s390_vperm, vec_permv16qi,  0, 
 B_VX,   0,  

[GCC 12] [PATCH v3] Add general_regs_only function attribute

2021-04-14 Thread H.J. Lu via Gcc-patches
On Tue, Apr 13, 2021 at 8:51 AM Martin Sebor  wrote:
>
> On 4/12/21 7:03 PM, H.J. Lu wrote:
> > On Mon, Apr 12, 2021 at 4:55 PM Martin Sebor  wrote:
> >>
> >> On 4/12/21 3:53 PM, H.J. Lu via Gcc-patches wrote:
> >>> On Mon, Apr 12, 2021 at 2:21 AM Richard Biener
> >>>  wrote:
> 
>  On Sat, Apr 10, 2021 at 5:11 PM H.J. Lu via Gcc-patches
>   wrote:
> >
> > Add inline_ignore_target function attribute to inform the compiler that
> > target specific option mismatch on functions with the always_inline
> > attribute may be ignored.  On x86 targets, this attribute can be used on
> > integer functions to ignore target non-integer option mismatch.
> 
>  I'm not sure I like such attribute but please adjust 
>  default_target_can_inline_p
>  accordingly (only few targets override this hook).
> 
>  Richard.
> 
> >>>
> >>> Like this?
> >>>
> >>> Thanks.
> >>
> >> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> >> index 1ddafb3ff2c..44588566f2d 100644
> >> --- a/gcc/doc/extend.texi
> >> +++ b/gcc/doc/extend.texi
> >> @@ -3187,6 +3187,14 @@ int S::interface (int) __attribute__ ((ifunc
> >> ("_ZN1S8resolverEv")));
> >>Indirect functions cannot be weak.  Binutils version 2.20.1 or higher
> >>and GNU C Library version 2.11.1 are required to use this feature.
> >>
> >> +@item inline_ignore_target
> >> +@cindex @code{inline_ignore_target} function attribute
> >> +The @code{inline_ignore_target} attribute on functions is used to
> >> +inform the compiler that target specific option mismatch on functions
> >> +with the @code{always_inline} attribute may be ignored.  On x86 targets,
> >> +this attribute can be used on integer functions to ignore target
> >> +non-integer option mismatch.
> >> +
> >>@item interrupt
> >>@itemx interrupt_handler
> >>Many GCC back ends support attributes to indicate that a function is
> >>
> >> I'm having a hard time understanding the description above (or
> >> the attribute's name for that matter).
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744
> >
> > has a testcase.
>
> Thanks.  My overall point is that GCC users should be able to answer
> these questions from reading the documentation of the attribute in
> the manual.
>
> >
> >> The inline_ignore_target function attribute informs the compiler
> >> that "target specific option mismatch on functions with the
> >> @code{always_inline} attribute" may be ignored.
> >>
> >> What does "target specific option mismatch" mean?  Is it a mismatch
> >
> > This refers to the message from GCC:
> >
> > /usr/gcc-11.0.0-x32/lib/gcc/x86_64-pc-linux-gnu/11.0.0/include/ia32intrin.h:112:1:
> > error: inlining failed in call to ‘always_inline’ ‘__rdtsc’: target
> > specific option mismatch
> >112 | __rdtsc (void)
> >| ^~~
>
> But what exactly does the target-specific option refer to, and what
> does it fail to match?  Presumably, it refers to the option in
> the attribute on the function declaration in the PR:
>
>   __attribute__ ((target("general-regs-only")))
>
> and the inability to use the rdtsc instruction with a GPR.
>
> But unless the error either mentions the -mgeneral-regs-only option
> by name or is followed by a note that points to the option in
> the function declaration, the words "target specific option" alone
> aren't enough to understand what the error means, and the text in
> the manual doesn't help.
>
> I would suggest to improve the message and the manual.
>
> >
> >> between target-specific optimization options added to a function by
> >> attribute optimize vs other target-specific optimization options of
> >> the function callers (e.g., added to them by another instance of
> >> attribute optimize, or by #pragma GCC optimize), into which a function
> >> with the attribute may be inlined, and where the conflict between
> >> the two sets of options needs to be reconciled?  And if so, should
> >
> > It is added to support integer functions with always_inline attribute.
> > Currently x86 integer functions with always_inline attribute fail to
> > compile when caller has general-regs-only target attribute and
> > SSE is enabled by default.
>
> Thanks.  I would suggest to also explain this in the manual.
>
> >
> >> it be provided as a generic attribute for all targets?
>
> I'm still wondering if this should be a generic attribute.  Besides
> x86, I see -mgeneral-regs-only also provided by ARM and Aarch64, so
> I would expect the attribute to be useful to those targets as well,
> and to all other targets that add the option or one like it in
> the future.  I believe it's better for portability to add a generic
> attribute even if it's not universally supported, than a target-
> specific one.
>
> >
> > Different targets can have different sets of conflict target specific
> > options.
> >
> >> Also, what's "integer functions" supposed to mean?  Functions that
> >> return integers?
> >
> > An integer function can be compiled with 

Re: [Patch, fortran] 99307 - FAIL: gfortran.dg/class_assign_4.f90 execution test

2021-04-14 Thread Tobias Burnus

On 11.04.21 09:05, Paul Richard Thomas wrote:

Tobias noticed a major technical fault with the resubmission below: I
forgot to attach the patch :-(


LGTM. Plus as remarked in the first review: 'trans-expr_c' typo needs to
be fixed (ChangeLog).

Tobias



Please find it attached this time.

Paul

On Tue, 6 Apr 2021 at 18:08, Paul Richard Thomas
<paul.richard.tho...@gmail.com> wrote:

Hi Tobias,

I believe that the attached fixes the problems that you found with
gfc_find_and_cut_at_last_class_ref.

I will test:
   type1%type%array_class2 → NULL is returned  (why?)
   class1%type%array_class2 → ts = class1 but array_class2 is used later on (ups!)
   class1%...%scalar_class2 → ts = class1 but scalar_class2 is used

The ChangeLogs remain the same, apart from the date.

Regtests OK on FC33/x86_64.

Paul


On Mon, 29 Mar 2021 at 14:58, Tobias Burnus
<tob...@codesourcery.com> wrote:

Hi all,

As a preliminary remark, I want to note that the testcase class_assign_4.f90
was added for PR83118/PR96012 (fixes problems in handling class objects,
Dec 18, 2020) and got revised for PR99124 (class defined operators,
Feb 23, 2021).
Both patches were then also applied to GCC 9 and 10.

On 26.03.21 17:30, Paul Richard Thomas via Gcc-patches wrote:
> This patch comes in two versions: submit.diff with Change.Logs or
> submit2.diff with Change2.Logs.
> The first fixes the problem by changing array temporaries from class
> expressions into class temporaries. This permits the use of
> gfc_get_class_from_expr to obtain the vptr for these temporaries and all
> the good things that come with that when handling dynamic types. The second
> part of the fix is to use the array element length from the class
> descriptor, when reallocating on assignment. This is needed because the
> vptr is being set too early. I will set about trying to track down why this
> is happening and fix it after release.
>
> The second version does the same as the first but puts in place a load of
> tidying up that is permitted by the fix to class array temporaries.

> I couldn't readily see how to prepare a testcase - ideas?
> Both regtest on FC33/x86_64. The first was tested by Dominique (see the
> PR). OK for master?

Typo – underscore-'c' should be a dot-'c' – both changelog files

>   * trans-expr_c (gfc_trans_scalar_assign): Make use of pre and

I think the second longer version is nicer in general, but at least for
GCC 9/GCC10 the first version is simpler and, hence, less error prone.

As you only ask about mainline, I would prefer the second one.

However, I am not happy about gfc_find_and_cut_at_last_class_ref:

> +   of refs following. If ts is non-null the cut is at the class entity
> +   or component that is followed by an array reference, which is not
> +   an element.  */
> ...
> +
> +  if (ts)
> +    {
> +      if (e->symtree
> +          && e->symtree->n.sym->ts.type == BT_CLASS)
> +        *ts = &e->symtree->n.sym->ts;
> +      else
> +        *ts = NULL;
> +    }
> +
>    for (ref = e->ref; ref; ref = ref->next)
>      {
> +      if (ts && ref->type == REF_COMPONENT
> +          && ref->u.c.component->ts.type == BT_CLASS
> +          && ref->next && ref->next->type == REF_COMPONENT
> +          && strcmp (ref->next->u.c.component->name, "_data") == 0
> +          && ref->next->next
> +          && ref->next->next->type == REF_ARRAY
> +          && ref->next->next->u.ar.type != AR_ELEMENT)
> +        {
> +          *ts = &ref->u.c.component->ts;
> +          class_ref = ref;
> +          break;
> +        }
> +
> +      if (ts && *ts == NULL)
> +        return NULL;
> +
Namely, if there is:
   type1%array_class2 → array_class2 is used for 'ts' and later (ok)
   type1%type%array_class2 → NULL is returned  (why?)
   class1%type%array_class2 → ts = class1 but array_class2 is used later on (ups!)
   class1%...%scalar_class2 → ts = class1 but scalar_class2 is used
etc.

Thus this either needs to be cleaned up (separate 'ref' loop for
ts != NULL) – including the wording in the description which tells what
happens if 'ts' is passed as arg but the expr has rank == 0 – and
what value is assigned to 'ts'. (You can then also fix 'class.c::' to
'class.c: ' in the description above the function.)

Alternatively, you can leave the current ref handling code in place
at build_class_array_ref, which might be the simpler alternative.

Otherwise, it looks sensible to me.

Tobias

-
Mentor Graphics (Deutschland) GmbH, 

[PATCH V2][committed] d: Add TARGET_D_REGISTER_OS_TARGET_INFO

2021-04-14 Thread ibuclaw--- via Gcc-patches
> On 05/04/2021 21:43 Iain Buclaw  wrote:
> 
>  
> Hi,
> 
> This patch adds TARGET_D_REGISTER_OS_TARGET_INFO as a new D front-end
> target hook, implementing `__traits(getTargetInfo, "objectFormat")' for
> all targets that have D support files.
> 
> This trait was added earlier in the front-end as a stub, however the
> target-specific implementation was left out until now.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
> tested on x86_64-darwin for getting the libphobos port set-up.
> 
> Any issues with this, or OK to commit?
> 

After doing some testing on a wide spread of targets, this is not the most
fitting place to have the result of `__traits(getTargetInfo, "objectFormat")'
set.  So I have removed its implementation, but kept the hook in as it will be
useful for Darwin targets later.

Implementation of objectFormat will be done in a follow-up.

Updated patch, regression tested on x86_64-linux-gnu and committed to mainline.

Regards,
Iain.

---
gcc/ChangeLog:

* doc/tm.texi: Regenerate.
* doc/tm.texi.in (D language and ABI): Add @hook for
TARGET_D_REGISTER_OS_TARGET_INFO.

gcc/d/ChangeLog:

* d-target.cc (Target::_init): Call new targetdm hook to register OS
specific target info keys.
* d-target.def (d_register_os_target_info): New hook.
---
 gcc/d/d-target.cc  | 1 +
 gcc/d/d-target.def | 8 
 gcc/doc/tm.texi| 5 +
 gcc/doc/tm.texi.in | 2 ++
 4 files changed, 16 insertions(+)

diff --git a/gcc/d/d-target.cc b/gcc/d/d-target.cc
index d576b74af1c..be354d9f1f0 100644
--- a/gcc/d/d-target.cc
+++ b/gcc/d/d-target.cc
@@ -199,6 +199,7 @@ Target::_init (const Param &)
   /* Initialize target info tables, the keys required by the language are added
  last, so that the OS and CPU handlers can override.  */
   targetdm.d_register_cpu_target_info ();
+  targetdm.d_register_os_target_info ();
   d_add_target_info_handlers (d_language_target_info);
 }
 
diff --git a/gcc/d/d-target.def b/gcc/d/d-target.def
index cd0397c1577..aa6bf55e6e6 100644
--- a/gcc/d/d-target.def
+++ b/gcc/d/d-target.def
@@ -58,6 +58,14 @@ describing the requested target information.",
  void, (void),
  hook_void_void)
 
+/* getTargetInfo keys relating to the target OS.  */
+DEFHOOK
+(d_register_os_target_info,
+ "Same as @code{TARGET_D_CPU_TARGET_INFO}, but is used for keys relating to\n\
+the target operating system.",
+ void, (void),
+ hook_void_void)
+
 /* ModuleInfo section name and brackets.  */
 DEFHOOKPOD
 (d_minfo_section,
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 6201df9a67d..97c8eebcd6f 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -10817,6 +10817,11 @@ added by this hook are made available at compile time by the
 describing the requested target information.
 @end deftypefn
 
+@deftypefn {D Target Hook} void TARGET_D_REGISTER_OS_TARGET_INFO (void)
+Same as @code{TARGET_D_CPU_TARGET_INFO}, but is used for keys relating to
+the target operating system.
+@end deftypefn
+
 @deftypevr {D Target Hook} {const char *} TARGET_D_MINFO_SECTION
 Contains the name of the section in which module info references should be
 placed.  This section is expected to be bracketed by two symbols to indicate
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index bde57585b03..e2d49ee9f57 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7359,6 +7359,8 @@ floating-point support; they are not included in this mechanism.
 
 @hook TARGET_D_REGISTER_CPU_TARGET_INFO
 
+@hook TARGET_D_REGISTER_OS_TARGET_INFO
+
 @hook TARGET_D_MINFO_SECTION
 
 @hook TARGET_D_MINFO_START_NAME
-- 
2.27.0


[PATCH] Simplify {gimplify_and_,}update_call_from_tree API

2021-04-14 Thread Richard Biener
This removes update_call_from_tree in favor of
gimplify_and_update_call_from_tree, removing some code duplication
and simplifying the API use.  Some users of update_call_from_tree
have been transitioned to replace_call_with_value and the API
and its dependences have been moved to gimple-fold.h.

This shaves off another user of valid_gimple_rhs_p which is now
only used from within gimple-fold.c and thus moved and made private.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.

2021-04-14  Richard Biener  

* tree-ssa-propagate.h (valid_gimple_rhs_p): Remove.
(update_gimple_call): Likewise.
(update_call_from_tree): Likewise.
* tree-ssa-propagate.c (valid_gimple_rhs_p): Remove.
(valid_gimple_call_p): Likewise.
(move_ssa_defining_stmt_for_defs): Likewise.
(finish_update_gimple_call): Likewise.
(update_gimple_call): Likewise.
(update_call_from_tree): Likewise.
(propagate_tree_value_into_stmt): Use replace_call_with_value.
* gimple-fold.h (update_gimple_call): Declare.
* gimple-fold.c (valid_gimple_rhs_p): Move here from
tree-ssa-propagate.c.
(update_gimple_call): Likewise.
(valid_gimple_call_p): Likewise.
(finish_update_gimple_call): Likewise, and simplify.
(gimplify_and_update_call_from_tree): Implement
update_call_from_tree functionality, avoid excessive
push/pop_gimplify_context.
(gimple_fold_builtin): Use only gimplify_and_update_call_from_tree.
(gimple_fold_call): Likewise.
* gimple-ssa-sprintf.c (try_substitute_return_value): Likewise.
* tree-ssa-ccp.c (ccp_folder::fold_stmt): Likewise.
(pass_fold_builtins::execute): Likewise.
(optimize_stack_restore): Use replace_call_with_value.
* tree-cfg.c (fold_loop_internal_call): Likewise.
* tree-ssa-dce.c (maybe_optimize_arith_overflow): Use
only gimplify_and_update_call_from_tree.
* tree-ssa-strlen.c (handle_builtin_strlen): Likewise.
(handle_builtin_strchr): Likewise.
* tsan.c: Include gimple-fold.h instead of tree-ssa-propagate.h.

* config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin):
Use replace_call_with_value.
---
 gcc/config/rs6000/rs6000-call.c |   2 +-
 gcc/gimple-fold.c   | 219 ++-
 gcc/gimple-fold.h   |   1 +
 gcc/gimple-ssa-sprintf.c|   3 +-
 gcc/tree-cfg.c  |   2 +-
 gcc/tree-ssa-ccp.c  |  15 +-
 gcc/tree-ssa-dce.c  |   3 +-
 gcc/tree-ssa-propagate.c| 300 +---
 gcc/tree-ssa-propagate.h|   3 -
 gcc/tree-ssa-strlen.c   |   9 +-
 gcc/tsan.c  |   2 +-
 11 files changed, 226 insertions(+), 333 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index f5676255387..6f6dc47f0ae 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -12369,7 +12369,7 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
/* Convert result back to the lhs type.  */
res = gimple_build (&stmts, VIEW_CONVERT_EXPR, TREE_TYPE (lhs), res);
gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
-   update_call_from_tree (gsi, res);
+   replace_call_with_value (gsi, res);
return true;
   }
 /* Vector loads.  */
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 9e6683dbac9..a118ef4bbc3 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -338,6 +338,123 @@ maybe_fold_reference (tree expr, bool is_lhs)
   return NULL_TREE;
 }
 
+/* Return true if EXPR is an acceptable right-hand-side for a
+   GIMPLE assignment.  We validate the entire tree, not just
+   the root node, thus catching expressions that embed complex
+   operands that are not permitted in GIMPLE.  This function
+   is needed because the folding routines in fold-const.c
+   may return such expressions in some cases, e.g., an array
+   access with an embedded index addition.  It may make more
+   sense to have folding routines that are sensitive to the
+   constraints on GIMPLE operands, rather than abandoning any
+   any attempt to fold if the usual folding turns out to be too
+   aggressive.  */
+
+bool
+valid_gimple_rhs_p (tree expr)
+{
+  enum tree_code code = TREE_CODE (expr);
+
+  switch (TREE_CODE_CLASS (code))
+{
+case tcc_declaration:
+  if (!is_gimple_variable (expr))
+   return false;
+  break;
+
+case tcc_constant:
+  /* All constants are ok.  */
+  break;
+
+case tcc_comparison:
+  /* GENERIC allows comparisons with non-boolean types, reject
+those for GIMPLE.  Let vector-typed comparisons pass - rules
+for GENERIC and GIMPLE are the same here.  */
+  if (!(INTEGRAL_TYPE_P (TREE_TYPE (expr))
+   && (TREE_CODE (TREE_TYPE (expr)) == BOOLEAN_TYPE
+   || 

[PATCH V6 5/7] CTF/BTF testsuites

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
This commit adds a new testsuite for the CTF debug format.

2021-04-14  Indu Bhagat  
David Faust  

gcc/testsuite/

* lib/gcc-dg.exp (gcc-dg-frontend-supports-ctf): New procedure.
(gcc-dg-debug-runtest): Add -gctf support.
* gcc.dg/debug/btf/btf-1.c: New test.
* gcc.dg/debug/btf/btf-2.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-array-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-1.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-2.c: Likewise.
* gcc.dg/debug/btf/btf-bitfields-3.c: Likewise.
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/btf/btf-enum-1.c: Likewise.
* gcc.dg/debug/btf/btf-forward-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-1.c: Likewise.
* gcc.dg/debug/btf/btf-function-2.c: Likewise.
* gcc.dg/debug/btf/btf-int-1.c: Likewise.
* gcc.dg/debug/btf/btf-pointers-1.c: Likewise.
* gcc.dg/debug/btf/btf-struct-1.c: Likewise.
* gcc.dg/debug/btf/btf-typedef-1.c: Likewise.
* gcc.dg/debug/btf/btf-union-1.c: Likewise.
* gcc.dg/debug/btf/btf-variables-1.c: Likewise.
* gcc.dg/debug/btf/btf.exp: Likewise.
* gcc.dg/debug/ctf/ctf-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-anonymous-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-array-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-mode-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-attr-used-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-bitfields-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-complex-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-cvr-quals-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-enum-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-file-scope-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-float-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-forward-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-func-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-function-pointers-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-functions-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-int-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-objt-index-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-pointers-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-preamble-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-4.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-5.c: Likewise.
* gcc.dg/debug/ctf/ctf-skip-types-6.c: Likewise.
* gcc.dg/debug/ctf/ctf-str-table-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-array-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-struct-pointer-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-2.c: Likewise.
* gcc.dg/debug/ctf/ctf-typedef-struct-3.c: Likewise.
* gcc.dg/debug/ctf/ctf-union-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-1.c: Likewise.
* gcc.dg/debug/ctf/ctf-variables-2.c: Likewise.
* gcc.dg/debug/ctf/ctf.exp: Likewise.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-1.c|  6 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-2.c| 10 +++
 .../gcc.dg/debug/btf/btf-anonymous-struct-1.c | 23 ++
 .../gcc.dg/debug/btf/btf-anonymous-union-1.c  | 23 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-array-1.c  | 31 +++
 .../gcc.dg/debug/btf/btf-bitfields-1.c| 34 
 .../gcc.dg/debug/btf/btf-bitfields-2.c| 26 ++
 .../gcc.dg/debug/btf/btf-bitfields-3.c| 43 ++
 

[PATCH V6 7/7] Enable BTF generation in the BPF backend

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
This patch changes the BPF GCC backend in order to use the DWARF debug
hooks and therefore enables the user to generate BTF debugging
information with -gbtf.  Generating BTF is crucial when compiling BPF
programs, since the CO-RE (compile-once, run-everywhere) mechanism
used by the kernel BPF loader relies on it.

Note that since in eBPF it is not possible to unwind frames due to the
restrictive nature of the target architecture, we are disabling the
generation of CFA in this target.

2021-04-14  David Faust 

* config/bpf/bpf.c (bpf_expand_prologue): Do not mark insns as
frame related.
(bpf_expand_epilogue): Likewise.
* config/bpf/bpf.h (DWARF2_FRAME_INFO): Define to 0.
Do not define DBX_DEBUGGING_INFO.
---
 gcc/config/bpf/bpf.c |  4 
 gcc/config/bpf/bpf.h | 12 ++--
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/gcc/config/bpf/bpf.c b/gcc/config/bpf/bpf.c
index 126d4a2798d..e635f9edb40 100644
--- a/gcc/config/bpf/bpf.c
+++ b/gcc/config/bpf/bpf.c
@@ -349,7 +349,6 @@ bpf_expand_prologue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (mem, gen_rtx_REG (DImode, regno));
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
@@ -364,7 +363,6 @@ bpf_expand_prologue (void)
 {
   insn = emit_move_insn (stack_pointer_rtx,
 hard_frame_pointer_rtx);
-  RTX_FRAME_RELATED_P (insn) = 1;
 
   if (size > 0)
{
@@ -372,7 +370,6 @@ bpf_expand_prologue (void)
 gen_rtx_PLUS (Pmode,
   stack_pointer_rtx,
   GEN_INT (-size;
- RTX_FRAME_RELATED_P (insn) = 1;
}
 }
 }
@@ -412,7 +409,6 @@ bpf_expand_epilogue (void)
  hard_frame_pointer_rtx,
  fp_offset - 8));
  insn = emit_move_insn (gen_rtx_REG (DImode, regno), mem);
- RTX_FRAME_RELATED_P (insn) = 1;
  fp_offset -= 8;
}
}
diff --git a/gcc/config/bpf/bpf.h b/gcc/config/bpf/bpf.h
index 9e2f5260900..ef0123b56a6 100644
--- a/gcc/config/bpf/bpf.h
+++ b/gcc/config/bpf/bpf.h
@@ -235,17 +235,9 @@ enum reg_class
 
 / Debugging Info /
 
-/* We cannot support DWARF2 because of the limitations of eBPF.  */
+/* In eBPF it is not possible to unwind frames. Disable CFA.  */
 
-/* elfos.h insists in using DWARF.  Undo that here.  */
-#ifdef DWARF2_DEBUGGING_INFO
-# undef DWARF2_DEBUGGING_INFO
-#endif
-#ifdef PREFERRED_DEBUGGING_TYPE
-# undef PREFERRED_DEBUGGING_TYPE
-#endif
-
-#define DBX_DEBUGGING_INFO
+#define DWARF2_FRAME_INFO 0
 
 / Stack Layout and Calling Conventions.  */
 
-- 
2.25.0.2.g232378479e



[PATCH V6 6/7] CTF/BTF documentation

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
This commit documents the new command line options introduced by the
CTF and BTF debug formats.

2021-04-14  Indu Bhagat  

* doc/invoke.texi: Document the CTF and BTF debug info options.
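
As an illustration of the level semantics documented in the hunk below, here
is a minimal sketch of how a -gctf argument could be mapped to a level.  The
function ctf_level_from_option is hypothetical and is not part of GCC's
option machinery; restricting the accepted levels to 0..2 is an assumption
based only on the levels the documentation lists.

```c
#include <string.h>

/* Hypothetical helper, not GCC's option handling: return the CTF level
   requested by ARG, or -1 if ARG is not a recognised -gctf option.  */
static int
ctf_level_from_option (const char *arg)
{
  static const char prefix[] = "-gctf";
  size_t len = sizeof prefix - 1;

  if (strncmp (arg, prefix, len) != 0)
    return -1;

  /* Plain -gctf: the documented default level is 2.  */
  if (arg[len] == '\0')
    return 2;

  /* Assumption: accept exactly one digit in 0..2 and nothing after it.  */
  if (arg[len] >= '0' && arg[len] <= '2' && arg[len + 1] == '\0')
    return arg[len] - '0';

  return -1;
}
```

With this sketch, "-gctf" yields level 2 and "-gctf0" yields level 0, which
matches the statement that -gctf0 negates -gctf.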
---
 gcc/doc/invoke.texi | 20 
 1 file changed, 20 insertions(+)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 17551246477..da3860c8a3e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -462,6 +462,7 @@ Objective-C and Objective-C++ Dialects}.
 @item Debugging Options
 @xref{Debugging Options,,Options for Debugging Your Program}.
 @gccoptlist{-g  -g@var{level}  -gdwarf  -gdwarf-@var{version} @gol
+-gbtf -gctf  -gctf@var{level} @gol
 -ggdb  -grecord-gcc-switches  -gno-record-gcc-switches @gol
 -gstabs  -gstabs+  -gstrict-dwarf  -gno-strict-dwarf @gol
 -gas-loc-support  -gno-as-loc-support @gol
@@ -9666,6 +9667,25 @@ other DWARF-related options such as
 @option{-fno-dwarf2-cfi-asm}) retain a reference to DWARF Version 2
 in their names, but apply to all currently-supported versions of DWARF.
 
+@item -gbtf
+@opindex gbtf
+Request BTF debug information.
+
+@item -gctf
+@itemx -gctf@var{level}
+@opindex gctf
+Request CTF debug information and use level to specify how much CTF debug
+information should be produced.  If -gctf is specified without a value for
+level, the default level of CTF debug information is 2.
+
+Level 0 produces no CTF debug information at all.  Thus, -gctf0 negates -gctf.
+
+Level 1 produces CTF information for tracebacks only.  This includes callsite
+information, but does not include type information.
+
+Level 2 produces type information for entities (functions, data objects etc.)
+at file-scope or global-scope only.
+
 @item -gstabs
 @opindex gstabs
 Produce debugging information in stabs format (if that is supported),
-- 
2.25.0.2.g232378479e



[PATCH V6 0/7] Support for the CTF and BTF debug formats

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
[Changes from V5:
- Rebased to today's master.
- New preparatory patch: factorize some code in
  gcc-dg-debug-runtest into a new procedure.  This is used by the
  CTF testsuite.
- Expose get_AT_file and related data structures in dwarf2int.
- Move the call to ctf_debug_finalize into dwarf2out_finish, since
  this is needed by the BPF backend.
- CTF fixes and improvements
  + add support for location info
  + remove misplaced comment from ctfout.c
- BTF fixes and improvements
  + skip unrepresentable bitfields.
  + do not output vars referencing removed types
  + correct vlen bytes for removed bitfields
  + properly create/release preprocessing vectors
  + skip struct members with ref deleted types
  + output correct variable linkage
  + some refactoring and better comments
  + more BTF tests
- This new version of the patch series is available in
  https://github.com/oracle/gcc branch oracle/ctf-v6]

Hi people!

Last year we submitted a first patch series introducing support for
the CTF debugging format in GCC [1].  We got a lot of feedback that
prompted us to change the approach used to generate the debug info,
and this patch series is the result of that.

This series also adds support for the BTF debug format, which is needed
by the BPF backend (more on this below.)

This implementation works, but there are several points that need
discussion and agreement with the upstream community, as they impact
the way debugging options work.  We are also proposing a way to add
additional debugging formats in the future.  See below for more
details.

Finally, a patch makes the BPF GCC backend use the DWARF debug
hooks in order to make -gbtf available to it.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2019-05/msg01297.html

About CTF
=========

CTF is a debugging format designed in order to express C types in a
very compact way.  The key is compactness and simplicity.  For more
information see:

- CTF specification
  http://www.esperi.org.uk/~oranix/ctf/ctf-spec.pdf

- Compact C-Type support in the GNU toolchain (talk + slides)
  https://linuxplumbersconf.org/event/4/contributions/396/

- On type de-duplication in CTF (talk + slides)
  https://linuxplumbersconf.org/event/7/contributions/725/

About BTF
=========

BTF is a debugging format, similar to CTF, that is used in the Linux
kernel as the debugging format for BPF programs.  From the kernel
documentation:

"BTF (BPF Type Format) is the metadata format which encodes the debug
 info related to BPF program/map. The name BTF was used initially to
 describe data types. The BTF was later extended to include function
 info for defined subroutines, and line info for source/line
 information."

Supporting BTF in GCC is important because compiled BPF programs
(which GCC supports as a target) require the type information in order
to be loaded and run in diverse kernel versions.  This mechanism is
known as CO-RE (compile-once, run-everywhere) and is described in the
"Update of the BPF support in the GNU Toolchain" talk mentioned below.

The BTF is documented in the Linux kernel documentation tree:
- linux/Documentation/bpf/btf.rst

CTF in the GNU Toolchain
========================

During the last year we have been working on adding support for CTF to
several components of the GNU toolchain:

- binutils support is already upstream.  It supports linking objects
  with CTF information with full type de-duplication.

- GDB support is to be sent upstream very shortly.  It makes the
  debugger capable of using the CTF information whenever available.
  This is useful in cases where DWARF has been stripped out but CTF is
  kept.

- GCC support is being discussed and submitted in this series.

Overview of the Implementation
==============================

  dwarf2out.c

The enabled debug formats are hooked in dwarf2out_early_finish.

  dwarf2int.h

Internal interface that exports a few functions and data types
defined in dwarf2out.c.

  dwarf2ctf.c

Code that transforms the internal GCC DWARF DIEs into CTF container
structures.  This file uses the dwarf2int.h interface.

  ctfc.c
  ctfc.h

These two files implement the "CTF container", which is shared
among CTF and BTF, due to the many similarities between both
formats.

  ctfout.c

Code that emits assembler with the .ctf section data, from the CTF
container.

  btfout.c

Code that emits assembler with the .BTF section data, from the CTF
container.

From debug hooks to debug formats
=================================

Our first attempt at adding CTF to GCC used the obvious approach of
adding a new set of debug hooks as defined in gcc/debug.h.

During our first interaction with the upstream community we were told
to _not_ use debug hooks, because these are to be obsoleted at some
point.  It was suggested that we instead hook our handlers (which
processed type TREE nodes producing CTF types from them) somewhere
else.  So we did.

However at the time we were also facing the need to support 

[PATCH V6 1/7] dwarf: add a dwarf2int.h internal interface

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
This patch introduces a dwarf2int.h header, to be used by code that
needs access to the internal DIE structures and their attributes.

The following functions which were previously defined as static in
dwarf2out.c are now non-static, and extern prototypes for them have
been added to dwarf2int.h:

- get_AT
- AT_int
- get_AT_ref
- get_AT_string
- AT_class
- AT_unsigned
- get_AT_unsigned
- get_AT_flag
- add_name_attribute
- new_die_raw
- base_type_die
- lookup_decl_die
- get_AT_file

Note that this patch doesn't rename these functions, to avoid a
massive renaming in dwarf2out.c, but in the future we probably
want these functions to sport a dw_* prefix.

Also, some type definitions have been moved from dwarf2out.c to
dwarf2int.h:

- dw_attr_node
- struct dwarf_file_data

Finally, three new accessor functions have been added to dwarf2out.c
with prototypes in dwarf2int.h:

- dw_get_die_child
- dw_get_die_sib
- dw_get_die_tag
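
As an illustration of how a consumer might use the three accessors,
here is a small sketch that walks the children of a DIE.  It is a
stand-alone mock, not GCC code: struct mock_die, mock_add_child and
count_children_with_tag are invented for this note, and the accessors
are re-implemented on the mock.  It assumes the arrangement used in
dwarf2out.c, where the children form a circular list linked through
the sibling pointer and the parent's child pointer refers to the last
child.

```c
#include <assert.h>
#include <stddef.h>

/* Mock DIE for illustration; GCC's real die_struct has more fields.  */
struct mock_die
{
  int tag;			/* stand-in for enum dwarf_tag */
  struct mock_die *child;	/* last child, or NULL */
  struct mock_die *sib;		/* next sibling in the circular list */
};
typedef struct mock_die *dw_die_ref;

/* Mock versions of the accessors added by the patch.  */
static dw_die_ref dw_get_die_child (dw_die_ref die) { return die->child; }
static dw_die_ref dw_get_die_sib (dw_die_ref die) { return die->sib; }
static int dw_get_die_tag (dw_die_ref die) { return die->tag; }

/* Hypothetical helper: append CHILD to PARENT's circular child list.  */
static void
mock_add_child (dw_die_ref parent, dw_die_ref child)
{
  assert (parent != NULL && child != NULL);
  if (parent->child)
    {
      /* The new last child points back at the first child.  */
      child->sib = parent->child->sib;
      parent->child->sib = child;
    }
  else
    child->sib = child;
  parent->child = child;
}

/* Hypothetical consumer: count the children of DIE that have TAG.  */
static int
count_children_with_tag (dw_die_ref die, int tag)
{
  int n = 0;
  dw_die_ref last = dw_get_die_child (die);

  if (last == NULL)
    return 0;

  dw_die_ref c = last;
  do
    {
      /* Step first, so the walk starts at the first child.  */
      c = dw_get_die_sib (c);
      if (dw_get_die_tag (c) == tag)
	n++;
    }
  while (c != last);
  return n;
}
```

Code such as dwarf2ctf.c can iterate this way without seeing the
layout of the DIE structure itself.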

2021-04-14  Jose E. Marchesi  

* dwarf2int.h: New file.
* dwarf2out.c (get_AT): Function is no longer static.
(get_AT_string): Likewise.
(get_AT_flag): Likewise.
(get_AT_unsigned): Likewise.
(get_AT_ref): Likewise.
(new_die_raw): Likewise.
(lookup_decl_die): Likewise.
(base_type_die): Likewise.
(add_name_attribute): Likewise.
(dw_get_die_tag): New function.
(dw_get_die_child): Likewise.
(dw_get_die_sib): Likewise.
Include dwarf2int.h.
* gengtype.c: add dwarf2int.h to open_base_files.
* Makefile.in (GTFILES): Add dwarf2int.h.
---
 gcc/Makefile.in |  1 +
 gcc/dwarf2int.h | 67 +
 gcc/dwarf2out.c | 79 -
 gcc/gengtype.c  |  6 ++--
 4 files changed, 109 insertions(+), 44 deletions(-)
 create mode 100644 gcc/dwarf2int.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 8a5fb3fd99c..e464e8c65c5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2653,6 +2653,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/ipa-modref.h $(srcdir)/ipa-modref.c \
   $(srcdir)/ipa-modref-tree.h \
   $(srcdir)/signop.h \
+  $(srcdir)/dwarf2int.h \
   $(srcdir)/dwarf2out.h \
   $(srcdir)/dwarf2asm.c \
   $(srcdir)/dwarf2cfi.c \
diff --git a/gcc/dwarf2int.h b/gcc/dwarf2int.h
new file mode 100644
index 000..f49f51d957b
--- /dev/null
+++ b/gcc/dwarf2int.h
@@ -0,0 +1,67 @@
+/* Prototypes for functions manipulating DWARF2 DIEs.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* This file contains prototypes for functions defined in dwarf2out.c.  It is
+   intended to be included in source files that need some internal knowledge of
+   the GCC dwarf structures.  */
+
+#ifndef GCC_DWARF2INT_H
+#define GCC_DWARF2INT_H 1
+
+/* Each DIE attribute has a field specifying the attribute kind,
+   a link to the next attribute in the chain, and an attribute value.
+   Attributes are typically linked below the DIE they modify.  */
+
+typedef struct GTY(()) dw_attr_struct {
+  enum dwarf_attribute dw_attr;
+  dw_val_node dw_attr_val;
+}
+dw_attr_node;
+
+extern dw_attr_node *get_AT (dw_die_ref, enum dwarf_attribute);
+extern HOST_WIDE_INT AT_int (dw_attr_node *);
+extern unsigned HOST_WIDE_INT AT_unsigned (dw_attr_node *a);
+extern dw_die_ref get_AT_ref (dw_die_ref, enum dwarf_attribute);
+extern const char *get_AT_string (dw_die_ref, enum dwarf_attribute);
+extern enum dw_val_class AT_class (dw_attr_node *);
+extern unsigned HOST_WIDE_INT AT_unsigned (dw_attr_node *);
+extern unsigned get_AT_unsigned (dw_die_ref, enum dwarf_attribute);
+extern int get_AT_flag (dw_die_ref, enum dwarf_attribute);
+
+extern void add_name_attribute (dw_die_ref, const char *);
+
+extern dw_die_ref new_die_raw (enum dwarf_tag);
+extern dw_die_ref base_type_die (tree, bool);
+
+extern dw_die_ref lookup_decl_die (tree);
+
+extern dw_die_ref dw_get_die_child (dw_die_ref);
+extern dw_die_ref dw_get_die_sib (dw_die_ref);
+extern enum dwarf_tag dw_get_die_tag (dw_die_ref);
+
+/* Data about a single source file.  */
+struct GTY((for_user)) dwarf_file_data {
+  const char * filename;
+  int emitted_number;
+};
+
+extern struct dwarf_file_data *get_AT_file (dw_die_ref,
+   enum dwarf_attribute);
+
+#endif /* 

[PATCH V6 2/7] dwarf: new dwarf_debuginfo_p predicate

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
This patch introduces a dwarf_debuginfo_p predicate that abstracts and
replaces complex checks on write_symbols.
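
For reference, the predicate amounts to the check sketched below.  It
is reconstructed from the conditions this patch replaces.  The enum
and the global are stand-ins so that the sketch compiles on its own,
and check_predicate is a made-up helper; GCC's own definitions live
elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for GCC's debug_info_type; only the names that appear in
   this patch are modeled, and the numeric values are arbitrary.  */
enum debug_info_type
{
  NO_DEBUG,
  DWARF2_DEBUG,
  VMS_DEBUG,
  VMS_AND_DWARF2_DEBUG
};

/* Models the GCC global of the same name.  */
static enum debug_info_type write_symbols = NO_DEBUG;

/* Return true if DWARF debug info is being generated, either alone or
   together with VMS debug info.  */
static bool
dwarf_debuginfo_p (void)
{
  return (write_symbols == DWARF2_DEBUG
	  || write_symbols == VMS_AND_DWARF2_DEBUG);
}

/* Made-up helper: exercise the predicate for every modeled value.  */
static void
check_predicate (void)
{
  write_symbols = NO_DEBUG;
  assert (!dwarf_debuginfo_p ());
  write_symbols = DWARF2_DEBUG;
  assert (dwarf_debuginfo_p ());
  write_symbols = VMS_DEBUG;
  assert (!dwarf_debuginfo_p ());
  write_symbols = VMS_AND_DWARF2_DEBUG;
  assert (dwarf_debuginfo_p ());
}
```

Call sites then shrink from a two-way comparison on write_symbols to
a single call, as the hunks below show.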

2021-04-14  Indu Bhagat  

gcc/ChangeLog

* flags.h (dwarf_debuginfo_p): New function declaration.
* opts.c (dwarf_debuginfo_p): New function definition.
* config/c6x/c6x.c (c6x_output_file_unwind): Use dwarf_debuginfo_p.
* dwarf2cfi.c (cfi_label_required_p): Likewise.
(dwarf2out_do_frame): Likewise.
* final.c (dwarf2_debug_info_emitted_p): Likewise.
(final_scan_insn_1): Likewise.
* targhooks.c (default_debug_unwind_info): Likewise.
* toplev.c (process_options): Likewise.

gcc/c-family/ChangeLog

* c-lex.c (init_c_lex): Use dwarf_debuginfo_p.
---
 gcc/c-family/c-lex.c |  4 ++--
 gcc/config/c6x/c6x.c |  3 +--
 gcc/dwarf2cfi.c  |  9 -
 gcc/final.c  | 15 ++-
 gcc/flags.h  |  3 +++
 gcc/opts.c   |  8 
 gcc/targhooks.c  |  2 +-
 gcc/toplev.c |  6 ++
 8 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 6374b72ed2d..5174b22c303 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stor-layout.h"
 #include "c-pragma.h"
 #include "debug.h"
+#include "flags.h"
 #include "file-prefix-map.h" /* remap_macro_filename()  */
 #include "langhooks.h"
 #include "attribs.h"
@@ -87,8 +88,7 @@ init_c_lex (void)
 
   /* Set the debug callbacks if we can use them.  */
   if ((debug_info_level == DINFO_LEVEL_VERBOSE
-   && (write_symbols == DWARF2_DEBUG
-  || write_symbols == VMS_AND_DWARF2_DEBUG))
+   && dwarf_debuginfo_p ())
   || flag_dump_go_spec != NULL)
 {
   cb->define = cb_define;
diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
index f9ad1e5f6c5..a10e2f8d662 100644
--- a/gcc/config/c6x/c6x.c
+++ b/gcc/config/c6x/c6x.c
@@ -439,8 +439,7 @@ c6x_output_file_unwind (FILE * f)
 {
   if (flag_unwind_tables || flag_exceptions)
{
- if (write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG)
+ if (dwarf_debuginfo_p ())
asm_fprintf (f, "\t.cfi_sections .debug_frame, .c6xabi.exidx\n");
  else
asm_fprintf (f, "\t.cfi_sections .c6xabi.exidx\n");
diff --git a/gcc/dwarf2cfi.c b/gcc/dwarf2cfi.c
index 362ff3fdac2..c27ac1960b0 100644
--- a/gcc/dwarf2cfi.c
+++ b/gcc/dwarf2cfi.c
@@ -39,7 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "expr.h"  /* init_return_column_size */
 #include "output.h"/* asm_out_file */
 #include "debug.h" /* dwarf2out_do_frame, dwarf2out_do_cfi_asm */
-
+#include "flags.h" /* dwarf_debuginfo_p */
 
 /* ??? Poison these here until it can be done generically.  They've been
totally replaced in this file; make sure it stays that way.  */
@@ -2289,8 +2289,7 @@ cfi_label_required_p (dw_cfi_ref cfi)
 
   if (dwarf_version == 2
   && debug_info_level > DINFO_LEVEL_TERSE
-  && (write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG))
+  && dwarf_debuginfo_p ())
 {
   switch (cfi->dw_cfi_opc)
{
@@ -3557,9 +3556,9 @@ bool
 dwarf2out_do_frame (void)
 {
   /* We want to emit correct CFA location expressions or lists, so we
- have to return true if we're going to output debug info, even if
+ have to return true if we're going to generate debug info, even if
  we're not going to output frame or unwind info.  */
-  if (write_symbols == DWARF2_DEBUG || write_symbols == VMS_AND_DWARF2_DEBUG)
+  if (dwarf_debuginfo_p ())
 return true;
 
   if (saved_do_cfi_asm > 0)
diff --git a/gcc/final.c b/gcc/final.c
index daae115fef5..cae692062b4 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -1442,7 +1442,8 @@ asm_str_count (const char *templ)
 static bool
 dwarf2_debug_info_emitted_p (tree decl)
 {
-  if (write_symbols != DWARF2_DEBUG && write_symbols != VMS_AND_DWARF2_DEBUG)
+  /* When DWARF2 debug info is not generated internally.  */
+  if (!dwarf_debuginfo_p ())
 return false;
 
   if (DECL_IGNORED_P (decl))
@@ -2330,10 +2331,8 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
  break;
 
case NOTE_INSN_BLOCK_BEG:
- if (debug_info_level == DINFO_LEVEL_NORMAL
- || debug_info_level == DINFO_LEVEL_VERBOSE
- || write_symbols == DWARF2_DEBUG
- || write_symbols == VMS_AND_DWARF2_DEBUG
+ if (debug_info_level >= DINFO_LEVEL_NORMAL
+ || dwarf_debuginfo_p ()
  || write_symbols == VMS_DEBUG)
{
  int n = BLOCK_NUMBER (NOTE_BLOCK (insn));
@@ -2368,10 +2367,8 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
case NOTE_INSN_BLOCK_END:
  maybe_output_next_view (seen);
 
- if 

[PATCH V6 3/7] dejagnu: modularize gcc-dg-debug-runtest a bit

2021-04-14 Thread Jose E. Marchesi via Gcc-patches
Move some functionality into a procedure of its own. This is only so that when
the patch for CTF comes along, the gcc-dg-debug-runtest procedure looks a bit
more uniform.

gcc/testsuite/ChangeLog:

* lib/gcc-dg.exp (gcc-dg-target-supports-debug-format): New procedure.
---
 gcc/testsuite/lib/gcc-dg.exp | 23 ---
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/lib/gcc-dg.exp b/gcc/testsuite/lib/gcc-dg.exp
index e48a184f991..a2b1c6436ab 100644
--- a/gcc/testsuite/lib/gcc-dg.exp
+++ b/gcc/testsuite/lib/gcc-dg.exp
@@ -621,18 +621,27 @@ proc gcc-dg-runtest { testcases flags default-extra-flags } {
 }
 }
 
-proc gcc-dg-debug-runtest { target_compile trivial opt_opts testcases } {
+# Check if the target system supports the debug format
+proc gcc-dg-target-supports-debug-format { target_compile trivial type } {
 global srcdir subdir
 
+set comp_output [$target_compile \
+   "$srcdir/$subdir/$trivial" "trivial.S" assembly \
+   "additional_flags=$type"]
+if { ! [string match "*: target system does not support the * debug format*" \
+   $comp_output] } {
+   remove-build-file "trivial.S"
+   return 1
+}
+return 0
+}
+
+proc gcc-dg-debug-runtest { target_compile trivial opt_opts testcases } {
 if ![info exists DEBUG_TORTURE_OPTIONS] {
set DEBUG_TORTURE_OPTIONS ""
foreach type {-gdwarf-2 -gstabs -gstabs+ -gxcoff -gxcoff+} {
-   set comp_output [$target_compile \
-   "$srcdir/$subdir/$trivial" "trivial.S" assembly \
-   "additional_flags=$type"]
-   if { ! [string match "*: target system does not support the * debug 
format*" \
-   $comp_output] } {
-   remove-build-file "trivial.S"
+   if [expr [gcc-dg-target-supports-debug-format \
+ $target_compile $trivial $type]] {
foreach level {1 "" 3} {
if { ($type == "-gdwarf-2") && ($level != "") } {
lappend DEBUG_TORTURE_OPTIONS [list "${type}" 
"-g${level}"]
-- 
2.25.0.2.g232378479e



[committed] d: Remove setting of target-specific global.params flags from front-end

2021-04-14 Thread Iain Buclaw via Gcc-patches
Hi,

This patch removes the setting of all target-specific global.params
flags from the D front-end.  Now that all dependencies on these flags
have been removed, there's no need to test and set them.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
committed to mainline.

Regards
Iain.

---
gcc/d/ChangeLog:

* d-builtins.cc (d_add_builtin_version): Remove all setting of
target-specific global.params.
* typeinfo.cc (create_typeinfo): Don't add argType fields to
TypeInfo_Struct.
---
 gcc/d/d-builtins.cc | 19 ---
 gcc/d/typeinfo.cc   |  5 +
 2 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index ce098617c62..400bce0a141 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -418,25 +418,6 @@ d_eval_constant_expression (const Loc , tree cst)
 void
 d_add_builtin_version (const char* ident)
 {
-  /* For now, we need to tell the D frontend what platform is being targeted.
- This should be removed once the frontend has been fixed.  */
-  if (strcmp (ident, "linux") == 0)
-global.params.isLinux = true;
-  else if (strcmp (ident, "OSX") == 0)
-global.params.isOSX = true;
-  else if (strcmp (ident, "Windows") == 0)
-global.params.isWindows = true;
-  else if (strcmp (ident, "FreeBSD") == 0)
-global.params.isFreeBSD = true;
-  else if (strcmp (ident, "OpenBSD") == 0)
-global.params.isOpenBSD = true;
-  else if (strcmp (ident, "Solaris") == 0)
-global.params.isSolaris = true;
-  /* The is64bit field only refers to x86_64 target.  */
-  else if (strcmp (ident, "X86_64") == 0)
-global.params.is64bit = true;
-  /* No other fields are required to be set for the frontend.  */
-
   VersionCondition::addPredefinedGlobalIdent (ident);
 }
 
diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc
index f8ffcbfff25..503480b491d 100644
--- a/gcc/d/typeinfo.cc
+++ b/gcc/d/typeinfo.cc
@@ -1562,9 +1562,6 @@ create_typeinfo (Type *type, Module *mod)
case TK_STRUCT_TYPE:
  if (!tinfo_types[tk])
{
- /* Some ABIs add extra TypeInfo fields on the end.  */
- tree argtype = global.params.is64bit ? ptr_type_node : NULL_TREE;
-
  ident = Identifier::idPool ("TypeInfo_Struct");
  make_internal_typeinfo (tk, ident,
  array_type_node, array_type_node,
@@ -1572,7 +1569,7 @@ create_typeinfo (Type *type, Module *mod)
  ptr_type_node, ptr_type_node,
  d_uint_type, ptr_type_node,
  ptr_type_node, d_uint_type,
- ptr_type_node, argtype, argtype, NULL);
+ ptr_type_node, NULL);
}
  t->vtinfo = TypeInfoStructDeclaration::create (t);
  break;
-- 
2.27.0



[committed] d: Move call to set_linkage_for_decl to declare_extern_var.

2021-04-14 Thread Iain Buclaw via Gcc-patches
Hi,

This patch moves the call to the newly introduced set_linkage_for_decl
in the D front-end from d_finish_decl to declare_extern_var.

This both prevents it from being called twice for declarations that
are defined, and fixes an issue where variables defined in the
compilation get one kind of linkage (weak), and the same variables
declared via declare_extern_var get another (extern).

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, as
well as testing on port of x86_64-mingw32.  Committed to mainline.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/99914
* decl.cc (DeclVisitor::visit (StructDeclaration *)): Don't set
DECL_INSTANTIATED on static initializer declarations.
(DeclVisitor::visit (ClassDeclaration *)): Likewise.
(DeclVisitor::visit (EnumDeclaration *)): Likewise.
(d_finish_decl): Move call to set_linkage_for_decl to...
(declare_extern_var): ...here.
---
 gcc/d/decl.cc | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc
index b07068ed6ca..8948e40e902 100644
--- a/gcc/d/decl.cc
+++ b/gcc/d/decl.cc
@@ -387,7 +387,6 @@ public:
 /* Generate static initializer.  */
 d->sinit = aggregate_initializer_decl (d);
 DECL_INITIAL (d->sinit) = layout_struct_initializer (d);
-DECL_INSTANTIATED (d->sinit) = (d->isInstantiated () != NULL);
 d_finish_decl (d->sinit);
 
 /* Put out the members.  There might be static constructors in the members
@@ -500,7 +499,6 @@ public:
 
 /* Generate static initializer.  */
 DECL_INITIAL (d->sinit) = layout_class_initializer (d);
-DECL_INSTANTIATED (d->sinit) = (d->isInstantiated () != NULL);
 d_finish_decl (d->sinit);
 
 /* Put out the TypeInfo.  */
@@ -611,7 +609,6 @@ public:
/* Generate static initializer.  */
d->sinit = enum_initializer_decl (d);
DECL_INITIAL (d->sinit) = build_expr (tc->sym->defaultval, true);
-   DECL_INSTANTIATED (d->sinit) = (d->isInstantiated () != NULL);
d_finish_decl (d->sinit);
   }
 
@@ -1379,6 +1376,8 @@ declare_extern_var (tree ident, tree type)
   /* The decl has not been defined -- yet.  */
   DECL_EXTERNAL (decl) = 1;
 
+  set_linkage_for_decl (decl);
+
   return decl;
 }
 
@@ -1540,7 +1539,6 @@ d_finish_decl (tree decl)
 set_decl_tls_model (decl, decl_default_tls_model (decl));
 
   relayout_decl (decl);
-  set_linkage_for_decl (decl);
 
   if (flag_checking && DECL_INITIAL (decl))
 {
-- 
2.27.0



Re: [PATCH 2/3] d: Add TARGET_D_REGISTER_CPU_TARGET_INFO

2021-04-14 Thread ibuclaw--- via Gcc-patches


> On 05/04/2021 21:43 Iain Buclaw  wrote:
> 
>  
> Hi,
> 
> This patch adds TARGET_D_REGISTER_CPU_TARGET_INFO as a new D front-end
> target hook, implementing `__traits(getTargetInfo, "floatAbi")' for all
> targets that have D support files.
> 
> This trait was added earlier in the front-end as a stub, however the
> target-specific implementation was left out until now.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
> tested on x86_64-darwin for getting the libphobos port set-up.
> 
> Any issues with this, or OK to commit?
> 

As I've seen no objections, this has been committed to mainline.

Regards,
Iain.


Re: [PATCH 1/3] d: Add TARGET_D_HAS_STDCALL_CONVENTION

2021-04-14 Thread ibuclaw--- via Gcc-patches
> On 05/04/2021 21:43 Iain Buclaw  wrote:
> 
>  
> Hi,
> 
> This patch adds TARGET_D_HAS_STDCALL_CONVENTION as a new D front-end
> target hook.  It replaces the use of the D front-end `is64bit' parameter
> in determining whether to insert the "stdcall" function attribute.
> 
> It is also used to determine whether `extern(System)' should be the same
> as `extern(Windows)' in the implementation of Target::systemLinkage.
> 
> Both are prerequisites for being able to compile libphobos on MinGW.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and
> tested on x86_64-w64-mingw64 for getting the libphobos port set-up.
> 
> Any issues with the implementation, or OK to commit?
> 

As I've seen no objections, this has been committed to mainline.

Regards,
Iain.


[PATCH] Avoid more temporaries in IVOPTs

2021-04-14 Thread Richard Biener
This avoids use of valid_gimple_rhs_p and instead gimplifies to
such a RHS, avoiding more SSA copies being generated by IVOPTs.

Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1

2021-04-14  Richard Biener  

* tree-ssa-loop-ivopts.c (rewrite_use_nonlinear_expr): Avoid
valid_gimple_rhs_p by instead gimplifying to one.
---
 gcc/tree-ssa-loop-ivopts.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4012ae3f19d..12a8a49a307 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -7286,12 +7286,13 @@ rewrite_use_nonlinear_expr (struct ivopts_data *data,
 }
 
   comp = fold_convert (type, comp);
-  if (!valid_gimple_rhs_p (comp)
-  || (gimple_code (use->stmt) != GIMPLE_PHI
- /* We can't allow re-allocating the stmt as it might be pointed
-to still.  */
- && (get_gimple_rhs_num_ops (TREE_CODE (comp))
- >= gimple_num_ops (gsi_stmt (bsi)
+  comp = force_gimple_operand (comp, , false, NULL);
+  gimple_seq_add_seq (_list, seq);
+  if (gimple_code (use->stmt) != GIMPLE_PHI
+  /* We can't allow re-allocating the stmt as it might be pointed
+to still.  */
+  && (get_gimple_rhs_num_ops (TREE_CODE (comp))
+ >= gimple_num_ops (gsi_stmt (bsi
 {
   comp = force_gimple_operand (comp, , true, NULL);
   gimple_seq_add_seq (_list, seq);
-- 
2.26.2


[PATCH] VEC_COND_EXPR verification adjustment

2021-04-14 Thread Richard Biener
This adjusts GIMPLE verification with respect to the VEC_COND_EXPR
changes forcing a split out condition.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-04-14  Richard Biener  

* tree-cfg.c (verify_gimple_assign_ternary): Verify that
VEC_COND_EXPRs have a gimple_val condition.
* tree-ssa-propagate.c (valid_gimple_rhs_p): VEC_COND_EXPR
can no longer have a GENERIC condition.
---
 gcc/tree-cfg.c   | 2 ++
 gcc/tree-ssa-propagate.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 7e3aae5f9c2..4f63aa69ba8 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -4246,6 +4246,8 @@ verify_gimple_assign_ternary (gassign *stmt)
  debug_generic_expr (rhs1_type);
  return true;
}
+  if (!is_gimple_val (rhs1))
+   return true;
   /* Fallthrough.  */
 case COND_EXPR:
   if (!is_gimple_val (rhs1)
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index def16c036ab..17dd1efd81d 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -515,7 +515,7 @@ valid_gimple_rhs_p (tree expr)
default:
  if (get_gimple_rhs_class (code) == GIMPLE_TERNARY_RHS)
{
- if (((code == VEC_COND_EXPR || code == COND_EXPR)
+ if ((code == COND_EXPR
   ? !is_gimple_condexpr (TREE_OPERAND (expr, 0))
   : !is_gimple_val (TREE_OPERAND (expr, 0)))
  || !is_gimple_val (TREE_OPERAND (expr, 1))
-- 
2.26.2


Aarch64 patch ping^3

2021-04-14 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 07, 2021 at 03:53:26PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Mar 29, 2021 at 11:16:55AM +0200, Jakub Jelinek wrote:
> > > Looks good to me.  Richard E knows this code better than I do though,
> > > so I think he should have the final say.  He's currently on holiday
> > > but will be back next week.
> > 
> > I'd like to ping this patch.
> 
> Ping.

Ping.

> > > > 2021-03-18  Jakub Jelinek  
> > > >
> > > > PR target/91710
> > > > * config/aarch64/aarch64.c (aarch64_function_arg_alignment): 
> > > > Change
> > > > abi_break argument from bool * to unsigned *, store there the 
> > > > pre-GCC 9
> > > > alignment.
> > > > (aarch64_layout_arg, aarch64_gimplify_va_arg_expr): Adjust 
> > > > callers.
> > > > (aarch64_function_arg_regno_p): Likewise.  Only emit -Wpsabi 
> > > > note if
> > > > the old and new alignment after applying MIN/MAX to it is 
> > > > different.
> > > >
> > > > * gcc.target/aarch64/pr91710.c: New test.

Thanks

Jakub



[committed] arm: fix warning when -mcpu=neoverse-n1 is used with -mfpu=neon [PR100067]

2021-04-14 Thread Richard Earnshaw via Gcc-patches


If the compiler is configured with --with-fpu= (or invoked
with, say, -mfpu=neon), then specifying -mcpu=neoverse-n1 can lead to
an unexpected warning: cc1: warning: switch ‘-mcpu=neoverse-n1’
conflicts with ‘-march=armv8.2-a’ switch

The fix for this is to correctly remove all the feature bits relating
to simd/fp units when -mfpu is used, not just those bits that form
part of the -mfpu specification (which is a subset).
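
The effect of stripping only a subset can be seen in a small model of
the check.  This is a sketch using plain bit masks; GCC uses sbitmaps
and its own isa_bit_* enumerators, and every name below is invented
for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented feature bits for the illustration.  */
enum
{
  BIT_ARMV8_2 = 1u << 0,	/* a non-FP architecture feature */
  BIT_VFPV2   = 1u << 1,	/* can be set through -mfpu */
  BIT_NEON    = 1u << 2,	/* can be set through -mfpu */
  BIT_FP16    = 1u << 3,	/* FP/SIMD related, not part of -mfpu */
  BIT_DOTPROD = 1u << 4		/* likewise */
};

/* Bits that make up an -mfpu specification (the subset).  */
#define FPU_BITS_INTERNAL ((uint32_t) (BIT_VFPV2 | BIT_NEON))
/* Every bit that relates to the FP/SIMD units.  */
#define ALL_FP_BITS \
  ((uint32_t) (FPU_BITS_INTERNAL | BIT_FP16 | BIT_DOTPROD))

/* Return true if the CPU and architecture feature sets still differ
   once the bits in IGNORED_BITS have been masked out of the delta.  */
static bool
cpu_conflicts_with_arch (uint32_t cpu_isa, uint32_t target_isa,
			 uint32_t ignored_bits)
{
  /* The -mfpu bits must be a subset of all FP bits.  */
  assert ((FPU_BITS_INTERNAL & ~ALL_FP_BITS) == 0);

  uint32_t isa_delta = cpu_isa ^ target_isa;
  isa_delta &= ~ignored_bits;
  return isa_delta != 0;
}
```

With a CPU that has all five bits and a target whose FP bits came
from -mfpu alone, masking only FPU_BITS_INTERNAL leaves the FP16 and
DOTPROD bits in the delta and reports a conflict, while masking
ALL_FP_BITS does not.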

gcc:
    PR target/100067
    * config/arm/arm.c (arm_configure_build_target): Strip isa_all_fpbits
    from the isa_delta when -mfpu has been used.
    (arm_options_perform_arch_sanity_checks): It's the architecture that
    lacks an FPU not the processor.
---
 gcc/config/arm/arm.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8910dad8214..475fb0d827f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3230,21 +3230,22 @@ arm_configure_build_target (struct arm_build_target *target,
 	  bitmap_xor (isa_delta, cpu_isa, target->isa);
 	  /* Ignore any bits that are quirk bits.  */
 	  bitmap_and_compl (isa_delta, isa_delta, isa_quirkbits);
-	  /* Ignore (for now) any bits that might be set by -mfpu.  */
-	  bitmap_and_compl (isa_delta, isa_delta, isa_all_fpubits_internal);
-
-	  /* And if the target ISA lacks floating point, ignore any
-	 extensions that depend on that.  */
-	  if (!bitmap_bit_p (target->isa, isa_bit_vfpv2))
+	  /* If the user (or the default configuration) has specified a
+	 specific FPU, then ignore any bits that depend on the FPU
+	 configuration.  Do similarly if using the soft-float
+	 ABI.  */
+	  if (opts->x_arm_fpu_index != TARGET_FPU_auto
+	  || arm_float_abi == ARM_FLOAT_ABI_SOFT)
 	bitmap_and_compl (isa_delta, isa_delta, isa_all_fpbits);
 
 	  if (!bitmap_empty_p (isa_delta))
 	{
 	  if (warn_compatible)
 		warning (0, "switch %<-mcpu=%s%> conflicts "
-			 "with %<-march=%s%> switch",
-			 arm_selected_cpu->common.name,
-			 arm_selected_arch->common.name);
+			 "with switch %<-march=%s%>",
+			 opts->x_arm_cpu_string,
+			 opts->x_arm_arch_string);
+
 	  /* -march wins for code generation.
 		 -mcpu wins for default tuning.  */
 	  if (!arm_selected_tune)
@@ -3395,7 +3396,9 @@ arm_configure_build_target (struct arm_build_target *target,
   auto_sbitmap fpu_bits (isa_num_bits);
 
   arm_initialize_isa (fpu_bits, arm_selected_fpu->isa_bits);
-  bitmap_and_compl (target->isa, target->isa, isa_all_fpubits_internal);
+  /* Clear out ALL bits relating to the FPU/simd extensions, to avoid
+	 potentially invalid combinations later on that we can't match.  */
+  bitmap_and_compl (target->isa, target->isa, isa_all_fpbits);
   bitmap_ior (target->isa, target->isa, fpu_bits);
 }
 
@@ -3856,7 +3859,7 @@ arm_options_perform_arch_sanity_checks (void)
 	  arm_pcs_default = ARM_PCS_AAPCS_VFP;
 	  if (!bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2)
 	  && !bitmap_bit_p (arm_active_target.isa, isa_bit_mve))
-	error ("%<-mfloat-abi=hard%>: selected processor lacks an FPU");
+	error ("%<-mfloat-abi=hard%>: selected architecture lacks an FPU");
 	}
   else
 	arm_pcs_default = ARM_PCS_AAPCS;


[committed] testsuite: Fix up libgomp.fortran/alloc-1.F90 testcase [PR100071]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
Hi!

As can be seen under valgrind, the last part of the testcase didn't
properly bind the Fortran pointers to the C pointers.

Tested on x86_64-linux, committed to trunk.

2021-04-14  Jakub Jelinek  

PR testsuite/100071
* testsuite/libgomp.fortran/alloc-1.F90: Call c_f_pointer after last
cp = omp_alloc with cp, p arguments instead of cq, q and call
c_f_pointer after last cq = omp_alloc with cq, q.

--- libgomp/testsuite/libgomp.fortran/alloc-1.F90.jj	2020-07-18 00:05:57.245601544 +0200
+++ libgomp/testsuite/libgomp.fortran/alloc-1.F90	2021-04-14 10:35:19.425186364 +0200
@@ -155,12 +155,13 @@
 cp = omp_alloc (ONEoFIVE,   &
  &  omp_null_allocator)
 if (mod (transfer (cp, intptr), 32_c_intptr_t) /= 0) stop 17
-call c_f_pointer (cq, q, [ONEoFIVE  &
+call c_f_pointer (cp, p, [ONEoFIVE  &
  &/ c_sizeof (i)])
 p(1) = 5
 p(ONEoFIVE / c_sizeof (i)) = 6
 cq = omp_alloc (768_c_size_t, omp_null_allocator)
 if (mod (transfer (cq, intptr), 128_c_intptr_t) /= 0) stop 18
+call c_f_pointer (cq, q, [768 / c_sizeof (i)])
 q(1) = 7
 q(768 / c_sizeof (i)) = 8
 if (c_associated (omp_alloc (768_c_size_t, omp_null_allocator))) &


Jakub



Re: [committed] add test for PR 86058

2021-04-14 Thread Christophe Lyon via Gcc-patches
On Tue, 13 Apr 2021 at 21:50, Martin Sebor via Gcc-patches
 wrote:
>
> The issue has been fixed so r11-8161 just adds the test case:
>https://gcc.gnu.org/g:8084ab15a3e300e3b2c537e56e0f3a1b00778aec
>

Hi,

This new test fails on arm (and aarch64 with -mabi=ilp32):
XFAIL: gcc.dg/pr86058.c pr? (test for warnings, line 13)
FAIL: gcc.dg/pr86058.c actual (test for warnings, line 13)
PASS: gcc.dg/pr86058.c (test for excess errors)

Can you check?

Thanks

> Martin


[PATCH] Fix intrinsics mm_malloc.h in freestanding [PR100057]

2021-04-14 Thread unlvsur unlvsur via Gcc-patches
From b1774ab1c8aad82b7a5d975ef90c6d3f633780ee Mon Sep 17 00:00:00 2001
From: expnkx 
Date: Wed, 14 Apr 2021 03:14:28 -0400
Subject: [PATCH] Fix intrinsics mm_malloc.h in freestanding [PR100057]

C does not provide stdlib.h in freestanding mode, and freestanding C++
does not provide malloc either.  This leads to compilation failures
even with the -ffreestanding flag.

Only gmm_malloc sets errno; the other headers do not, so remove the
errno handling from gmm_malloc too.  There is no reason for freestanding
to behave differently from hosted.
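
For readers unfamiliar with the header, the technique used by
gmm_malloc.h can be sketched as follows: over-allocate, round the
address up to the requested alignment, and stash the original pointer
just below the aligned block so that the free routine can recover it.
The sketch_* names are invented here; only __builtin_malloc and
__builtin_free come from the patch.  Note that the builtins still need
malloc and free to be available at link time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return SIZE bytes aligned to ALIGN, or NULL on failure.  */
static void *
sketch_aligned_malloc (size_t size, size_t align)
{
  /* Reject alignments that are not a power of two.  */
  if (align & (align - 1))
    return NULL;
  if (size == 0)
    return NULL;

  /* Leave room below the aligned block for the stashed pointer.  */
  if (align < 2 * sizeof (void *))
    align = 2 * sizeof (void *);

  void *raw = __builtin_malloc (size + align);
  if (!raw)
    return NULL;

  /* Round up to the next multiple of ALIGN strictly above RAW.  */
  uintptr_t aligned = ((uintptr_t) raw + align) & ~((uintptr_t) align - 1);

  /* Remember the pointer that must be handed back to free.  */
  ((void **) aligned)[-1] = raw;
  return (void *) aligned;
}

/* Free a block obtained from sketch_aligned_malloc.  */
static void
sketch_aligned_free (void *aligned)
{
  if (aligned)
    __builtin_free (((void **) aligned)[-1]);
}

/* Allocate, check the alignment, touch both ends, and free.
   Return nonzero if the block was aligned as requested.  */
static int
sketch_roundtrip_ok (size_t size, size_t align)
{
  assert (size > 0);
  unsigned char *p = sketch_aligned_malloc (size, align);
  if (!p)
    return 0;
  int ok = ((uintptr_t) p & (align - 1)) == 0;
  p[0] = 1;
  p[size - 1] = 2;
  sketch_aligned_free (p);
  return ok;
}
```

The sketch returns a null pointer for a bad alignment without touching
errno, which matches the behaviour this patch gives gmm_malloc.h.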

gcc/ChangeLog
   PR/100057:
  gcc/config/i386/gmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/i386/pmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/rs6000/mm_malloc.h: use __builtin_malloc and __builtin_free instead

---
gcc/config/i386/gmm_malloc.h  | 13 -
gcc/config/i386/pmm_malloc.h  | 13 +
gcc/config/rs6000/mm_malloc.h | 13 +
3 files changed, 22 insertions(+), 17 deletions(-)
mode change 100644 => 100755 gcc/config/i386/gmm_malloc.h

diff --git a/gcc/config/i386/gmm_malloc.h b/gcc/config/i386/gmm_malloc.h
old mode 100644
new mode 100755
index 70b38ab557b..276a5f50023
--- a/gcc/config/i386/gmm_malloc.h
+++ b/gcc/config/i386/gmm_malloc.h
@@ -24,10 +24,7 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
-#if __STDC_HOSTED__
-#include 
-#endif
+#include 
 static __inline__ void *
 _mm_malloc (size_t __size, size_t __align)
@@ -38,9 +35,6 @@ _mm_malloc (size_t __size, size_t __align)
   /* Error if align is not a power of two.  */
   if (__align & (__align - 1))
 {
-#if __STDC_HOSTED__
-  errno = EINVAL;
-#endif
   return ((void *) 0);
 }
@@ -54,7 +48,7 @@ _mm_malloc (size_t __size, size_t __align)
 if (__align < 2 * sizeof (void *))
   __align = 2 * sizeof (void *);
-  __malloc_ptr = malloc (__size + __align);
+  __malloc_ptr = __builtin_malloc (__size + __align);
   if (!__malloc_ptr)
 return ((void *) 0);
@@ -72,7 +66,8 @@ static __inline__ void
_mm_free (void *__aligned_ptr)
{
   if (__aligned_ptr)
-free (((void **) __aligned_ptr)[-1]);
+__builtin_free (((void **) __aligned_ptr)[-1]);
}
+
#endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/i386/pmm_malloc.h b/gcc/config/i386/pmm_malloc.h
index 1b0bfe37852..3b97107ccfc 100644
--- a/gcc/config/i386/pmm_malloc.h
+++ b/gcc/config/i386/pmm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -39,7 +44,7 @@ _mm_malloc (size_t __size, size_t __alignment)
{
   void *__ptr;
   if (__alignment == 1)
-return malloc (__size);
+return __builtin_malloc (__size);
   if (__alignment == 2 || (sizeof (void *) == 8 && __alignment == 4))
 __alignment = sizeof (void *);
   if (posix_memalign (&__ptr, __alignment, __size) == 0)
@@ -51,7 +56,7 @@ _mm_malloc (size_t __size, size_t __alignment)
static __inline void
_mm_free (void *__ptr)
{
-  free (__ptr);
+  __builtin_free (__ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/rs6000/mm_malloc.h b/gcc/config/rs6000/mm_malloc.h
index c04348068e0..82aaab411da 100644
--- a/gcc/config/rs6000/mm_malloc.h
+++ b/gcc/config/rs6000/mm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -44,7 +49,7 @@ _mm_malloc (size_t size, size_t alignment)
   void *ptr;
   if (alignment == malloc_align && alignment == vec_align)
-return malloc (size);
+return __builtin_malloc (size);
   if (alignment < vec_align)
 alignment = vec_align;
   if (posix_memalign (&ptr, alignment, size) == 0)
@@ -56,7 +61,7 @@ _mm_malloc (size_t size, size_t alignment)
static __inline void
_mm_free (void * ptr)
{
-  free (ptr);
+  __builtin_free (ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
--
2.25.1


FW: [PATCH] Fix intrinsics mm_malloc.h in freestanding [PR100057]

2021-04-14 Thread unlvsur unlvsur via Gcc-patches
From b1774ab1c8aad82b7a5d975ef90c6d3f633780ee Mon Sep 17 00:00:00 2001
From: expnkx 
Date: Wed, 14 Apr 2021 03:14:28 -0400
Subject: [PATCH] Fix intrinsics mm_malloc.h in freestanding [PR100057]

A freestanding C implementation does not provide stdlib.h, and in freestanding
C++ cstdint does not declare malloc either. As a result, compilation fails
even with the -ffreestanding flag.

Only gmm_malloc sets errno; the other headers do not, so we remove the errno
handling from gmm_malloc too. There is no reason freestanding should behave
differently from hosted.
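
A minimal sketch of why the builtins are enough, modeled on the
over-allocate-and-align scheme visible in the gmm_malloc.h hunks of this
patch. The names demo_aligned_malloc and demo_aligned_free are hypothetical,
and the sketch assumes GCC or Clang, where __builtin_malloc and __builtin_free
can be used without including stdlib.h:

```c
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uintptr_t */

/* Hypothetical illustration only, not the header's actual code.  */
static inline void *
demo_aligned_malloc (size_t size, size_t align)
{
  void *raw;
  uintptr_t addr;
  void **aligned;

  /* Reject alignments that are not a power of two.  */
  if (align & (align - 1))
    return (void *) 0;

  /* Leave room to stash the raw pointer just below the aligned block.  */
  if (align < 2 * sizeof (void *))
    align = 2 * sizeof (void *);

  /* No declaration of malloc is needed for the builtin.  */
  raw = __builtin_malloc (size + align);
  if (!raw)
    return (void *) 0;

  /* Move forward to the next multiple of align and remember raw.  */
  addr = ((uintptr_t) raw + align) & ~(uintptr_t) (align - 1);
  aligned = (void **) addr;
  aligned[-1] = raw;
  return aligned;
}

static inline void
demo_aligned_free (void *aligned)
{
  /* Recover the pointer originally returned by __builtin_malloc.  */
  if (aligned)
    __builtin_free (((void **) aligned)[-1]);
}
```

Nothing here touches errno, which matches the behaviour the patch gives
gmm_malloc.h in both hosted and freestanding mode.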

gcc/ChangeLog
   PR/100057:
  gcc/config/i386/gmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/i386/pmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/rs6000/mm_malloc.h: use __builtin_malloc and __builtin_free instead

---
gcc/config/i386/gmm_malloc.h  | 13 -
gcc/config/i386/pmm_malloc.h  | 13 +
gcc/config/rs6000/mm_malloc.h | 13 +
3 files changed, 22 insertions(+), 17 deletions(-)
mode change 100644 => 100755 gcc/config/i386/gmm_malloc.h

diff --git a/gcc/config/i386/gmm_malloc.h b/gcc/config/i386/gmm_malloc.h
old mode 100644
new mode 100755
index 70b38ab557b..276a5f50023
--- a/gcc/config/i386/gmm_malloc.h
+++ b/gcc/config/i386/gmm_malloc.h
@@ -24,10 +24,7 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
-#if __STDC_HOSTED__
-#include 
-#endif
+#include 
 static __inline__ void *
 _mm_malloc (size_t __size, size_t __align)
@@ -38,9 +35,6 @@ _mm_malloc (size_t __size, size_t __align)
   /* Error if align is not a power of two.  */
   if (__align & (__align - 1))
 {
-#if __STDC_HOSTED__
-  errno = EINVAL;
-#endif
   return ((void *) 0);
 }
@@ -54,7 +48,7 @@ _mm_malloc (size_t __size, size_t __align)
 if (__align < 2 * sizeof (void *))
   __align = 2 * sizeof (void *);
-  __malloc_ptr = malloc (__size + __align);
+  __malloc_ptr = __builtin_malloc (__size + __align);
   if (!__malloc_ptr)
 return ((void *) 0);
@@ -72,7 +66,8 @@ static __inline__ void
_mm_free (void *__aligned_ptr)
{
   if (__aligned_ptr)
-free (((void **) __aligned_ptr)[-1]);
+__builtin_free (((void **) __aligned_ptr)[-1]);
}
+
#endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/i386/pmm_malloc.h b/gcc/config/i386/pmm_malloc.h
index 1b0bfe37852..3b97107ccfc 100644
--- a/gcc/config/i386/pmm_malloc.h
+++ b/gcc/config/i386/pmm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -39,7 +44,7 @@ _mm_malloc (size_t __size, size_t __alignment)
{
   void *__ptr;
   if (__alignment == 1)
-return malloc (__size);
+return __builtin_malloc (__size);
   if (__alignment == 2 || (sizeof (void *) == 8 && __alignment == 4))
 __alignment = sizeof (void *);
   if (posix_memalign (&__ptr, __alignment, __size) == 0)
@@ -51,7 +56,7 @@ _mm_malloc (size_t __size, size_t __alignment)
static __inline void
_mm_free (void *__ptr)
{
-  free (__ptr);
+  __builtin_free (__ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/rs6000/mm_malloc.h b/gcc/config/rs6000/mm_malloc.h
index c04348068e0..82aaab411da 100644
--- a/gcc/config/rs6000/mm_malloc.h
+++ b/gcc/config/rs6000/mm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -44,7 +49,7 @@ _mm_malloc (size_t size, size_t alignment)
   void *ptr;
   if (alignment == malloc_align && alignment == vec_align)
-return malloc (size);
+return __builtin_malloc (size);
   if (alignment < vec_align)
 alignment = vec_align;
   if (posix_memalign (&ptr, alignment, size) == 0)
@@ -56,7 +61,7 @@ _mm_malloc (size_t size, size_t alignment)
static __inline void
_mm_free (void * ptr)
{
-  free (ptr);
+  __builtin_free (ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
--
2.25.1


[no subject]

2021-04-14 Thread unlvsur unlvsur via Gcc-patches
From b1774ab1c8aad82b7a5d975ef90c6d3f633780ee Mon Sep 17 00:00:00 2001
From: expnkx 
Date: Wed, 14 Apr 2021 03:14:28 -0400
Subject: [PATCH] Fix intrinsics mm_malloc.h in freestanding [PR100057]

A freestanding C implementation does not provide stdlib.h, and in freestanding
C++ cstdint does not declare malloc either. As a result, compilation fails
even with the -ffreestanding flag.

Only gmm_malloc sets errno; the other headers do not, so we remove the errno
handling from gmm_malloc too. There is no reason freestanding should behave
differently from hosted.

gcc/ChangeLog
   PR/100057:
  gcc/config/i386/gmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/i386/pmm_malloc.h: use __builtin_malloc and __builtin_free instead
  gcc/config/rs6000/mm_malloc.h: use __builtin_malloc and __builtin_free instead

---
gcc/config/i386/gmm_malloc.h  | 13 -
gcc/config/i386/pmm_malloc.h  | 13 +
gcc/config/rs6000/mm_malloc.h | 13 +
3 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/gcc/config/i386/gmm_malloc.h b/gcc/config/i386/gmm_malloc.h
index 70b38ab557b..276a5f50023 100644
--- a/gcc/config/i386/gmm_malloc.h
+++ b/gcc/config/i386/gmm_malloc.h
@@ -24,10 +24,7 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
-#if __STDC_HOSTED__
-#include 
-#endif
+#include 
 static __inline__ void *
 _mm_malloc (size_t __size, size_t __align)
@@ -38,9 +35,6 @@ _mm_malloc (size_t __size, size_t __align)
   /* Error if align is not a power of two.  */
   if (__align & (__align - 1))
 {
-#if __STDC_HOSTED__
-  errno = EINVAL;
-#endif
   return ((void *) 0);
 }
@@ -54,7 +48,7 @@ _mm_malloc (size_t __size, size_t __align)
 if (__align < 2 * sizeof (void *))
   __align = 2 * sizeof (void *);
-  __malloc_ptr = malloc (__size + __align);
+  __malloc_ptr = __builtin_malloc (__size + __align);
   if (!__malloc_ptr)
 return ((void *) 0);
@@ -72,7 +66,8 @@ static __inline__ void
_mm_free (void *__aligned_ptr)
{
   if (__aligned_ptr)
-free (((void **) __aligned_ptr)[-1]);
+__builtin_free (((void **) __aligned_ptr)[-1]);
}
+
#endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/i386/pmm_malloc.h b/gcc/config/i386/pmm_malloc.h
index 1b0bfe37852..3b97107ccfc 100644
--- a/gcc/config/i386/pmm_malloc.h
+++ b/gcc/config/i386/pmm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -39,7 +44,7 @@ _mm_malloc (size_t __size, size_t __alignment)
{
   void *__ptr;
   if (__alignment == 1)
-return malloc (__size);
+return __builtin_malloc (__size);
   if (__alignment == 2 || (sizeof (void *) == 8 && __alignment == 4))
 __alignment = sizeof (void *);
   if (posix_memalign (&__ptr, __alignment, __size) == 0)
@@ -51,7 +56,7 @@ _mm_malloc (size_t __size, size_t __alignment)
static __inline void
_mm_free (void *__ptr)
{
-  free (__ptr);
+  __builtin_free (__ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
diff --git a/gcc/config/rs6000/mm_malloc.h b/gcc/config/rs6000/mm_malloc.h
index c04348068e0..82aaab411da 100644
--- a/gcc/config/rs6000/mm_malloc.h
+++ b/gcc/config/rs6000/mm_malloc.h
@@ -24,14 +24,19 @@
#ifndef _MM_MALLOC_H_INCLUDED
#define _MM_MALLOC_H_INCLUDED
-#include 
+#include 
 /* We can't depend on  since the prototype of posix_memalign
may not be visible.  */
#ifndef __cplusplus
extern int posix_memalign (void **, size_t, size_t);
#else
-extern "C" int posix_memalign (void **, size_t, size_t) throw ();
+extern "C" int posix_memalign (void **, size_t, size_t)
+#if __cplusplus >= 201103L
+noexcept;
+#else
+throw ();
+#endif
#endif
 static __inline void *
@@ -44,7 +49,7 @@ _mm_malloc (size_t size, size_t alignment)
   void *ptr;
   if (alignment == malloc_align && alignment == vec_align)
-return malloc (size);
+return __builtin_malloc (size);
   if (alignment < vec_align)
 alignment = vec_align;
   if (posix_memalign (&ptr, alignment, size) == 0)
@@ -56,7 +61,7 @@ _mm_malloc (size_t size, size_t alignment)
static __inline void
_mm_free (void * ptr)
{
-  free (ptr);
+  __builtin_free (ptr);
}
 #endif /* _MM_MALLOC_H_INCLUDED */
--
2.25.1





[PATCH] aarch64: Fix several *_ashl3 related regressions [PR100056]

2021-04-14 Thread Jakub Jelinek via Gcc-patches
Hi!

Before the combiner gained 2 to 2 combinations, all the functions in the
following testcase were compiled into 2 instructions: a zero/sign extension
or an and, followed by orr with lsl, e.g. for the first function
Trying 7 -> 8:
7: r96:SI=r94:SI<<0xb
8: r95:SI=r96:SI|r94:SI
  REG_DEAD r96:SI
  REG_DEAD r94:SI
Successfully matched this instruction:
(set (reg:SI 95)
(ior:SI (ashift:SI (reg/v:SI 94 [ i ])
(const_int 11 [0xb]))
(reg/v:SI 94 [ i ])))
is the important successful try_combine and so we end up with
and w0, w0, 255
orr w0, w0, w0, lsl 11
in the body.
With 2 to 2 combination, before that can trigger, another successful
combination:
Trying 2 -> 7:
2: r94:SI=zero_extend(x0:QI)
  REG_DEAD x0:QI
7: r96:SI=r94:SI<<0xb
is replaced with:
(set (reg/v:SI 94 [ i ])
(zero_extend:SI (reg:QI 0 x0 [ i ])))
and
(set (reg:SI 96)
(and:SI (ashift:SI (reg:SI 0 x0 [ i ])
(const_int 11 [0xb]))
(const_int 522240 [0x7f800])))
and in the end results in 3 instructions in the body:
and w1, w0, 255
ubfiz   w0, w0, 11, 8
orr w0, w0, w1
The following combine splitters help undo that when the combiner tries to
combine 3 instructions: the zero/sign extend or and, the other insn
from the 2 to 2 combination ([us]bfiz) and the logical op.  The CPUs
don't have an insn to do everything in one op, but we can split it
back into the zero/sign extend or and followed by the logical op with lsl.
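
The equivalence the splitters rely on can be checked with a plain C sketch.
The helper names below are made up for illustration; only the constants 255,
11 and 0x7f800 come from the dumps above. Masking the shifted value with
0xff << 11 yields the same bits as masking first and shifting afterwards, so
both instruction sequences compute the same result:

```c
#include <stdint.h>

/* Shape of the 3-instruction body: and, ubfiz, orr.  */
static uint32_t
three_insn_form (uint32_t x)
{
  uint32_t lo = x & 255u;              /* and   w1, w0, 255    */
  uint32_t hi = (x << 11) & 0x7f800u;  /* ubfiz w0, w0, 11, 8  */
  return hi | lo;                      /* orr   w0, w0, w1     */
}

/* Shape of the 2-instruction body: and, then orr with lsl.  */
static uint32_t
two_insn_form (uint32_t x)
{
  uint32_t t = x & 255u;               /* and w0, w0, 255         */
  return t | (t << 11);                /* orr w0, w0, w0, lsl 11  */
}
```

The mask test in the splitter condition is the same identity in general form:
the mode mask shifted left by the shift amount must equal the constant used
in the outer and.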

Bootstrapped/regtested on aarch64-linux, ok for trunk?

2021-04-14  Jakub Jelinek  

PR target/100056
* config/aarch64/aarch64.md (*_3):
Add combine splitters for *_ashl3 with
ZERO_EXTEND, SIGN_EXTEND or AND.

* gcc.target/aarch64/pr100056.c: New test.

--- gcc/config/aarch64/aarch64.md.jj    2021-04-13 12:40:57.0 +0200
+++ gcc/config/aarch64/aarch64.md   2021-04-13 19:54:17.015764651 +0200
@@ -4431,6 +4431,59 @@ (define_insn "*_"))
+  (match_operand:GPI 4 "const_int_operand"))
+ (zero_extend:GPI (match_operand 3 "register_operand"]
+  "can_create_pseudo_p ()
+   && REG_P (operands[1])
+   && REG_P (operands[3])
+   && REGNO (operands[1]) == REGNO (operands[3])
+   && ((unsigned HOST_WIDE_INT)
+   trunc_int_for_mode (GET_MODE_MASK (GET_MODE (operands[3]))
+  << INTVAL (operands[2]), mode)
+   == UINTVAL (operands[4]))"
+  [(set (match_dup 4) (zero_extend:GPI (match_dup 3)))
+   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
+  (match_dup 4)))]
+  "operands[4] = gen_reg_rtx (mode);"
+)
+
+(define_split
+  [(set (match_operand:GPI 0 "register_operand")
+   (LOGICAL:GPI
+ (and:GPI (ashift:GPI (match_operand:GPI 1 "register_operand")
+  (match_operand:QI 2 "aarch64_shift_imm_"))
+  (match_operand:GPI 4 "const_int_operand"))
+ (and:GPI (match_dup 1) (match_operand:GPI 3 "const_int_operand"]
+  "can_create_pseudo_p ()
+   && pow2_or_zerop (UINTVAL (operands[3]) + 1)
+   && ((unsigned HOST_WIDE_INT)
+   trunc_int_for_mode (UINTVAL (operands[3])
+  << INTVAL (operands[2]), mode)
+   == UINTVAL (operands[4]))"
+  [(set (match_dup 4) (and:GPI (match_dup 1) (match_dup 3)))
+   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
+  (match_dup 4)))]
+  "operands[4] = gen_reg_rtx (mode);"
+)
+
+(define_split
+  [(set (match_operand:GPI 0 "register_operand")
+   (LOGICAL:GPI
+ (ashift:GPI (sign_extend:GPI (match_operand 1 "register_operand"))
+ (match_operand:QI 2 "aarch64_shift_imm_"))
+ (sign_extend:GPI (match_dup 1]
+  "can_create_pseudo_p ()"
+  [(set (match_dup 4) (sign_extend:GPI (match_dup 1)))
+   (set (match_dup 0) (LOGICAL:GPI (ashift:GPI (match_dup 4) (match_dup 2))
+  (match_dup 4)))]
+  "operands[4] = gen_reg_rtx (mode);"
+)
+
 (define_insn "*_rol3"
   [(set (match_operand:GPI 0 "register_operand" "=r")
(LOGICAL:GPI (rotate:GPI
--- gcc/testsuite/gcc.target/aarch64/pr100056.c.jj  2021-04-13 14:20:53.334784184 +0200
+++ gcc/testsuite/gcc.target/aarch64/pr100056.c 2021-04-13 19:44:09.358529648 +0200
@@ -0,0 +1,50 @@
+/* PR target/100056 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-not {\t[us]bfiz\tw[0-9]+, w[0-9]+, 11} } } */
+
+int
+or_shift_u8 (unsigned char i)
+{
+  return i | (i << 11);
+}
+
+int
+or_shift_u3a (unsigned i)
+{
+  i &= 7;
+  return i | (i << 11);
+}
+
+int
+or_shift_u3b (unsigned i)
+{
+  i = (i << 29) >> 29;
+  return i | (i << 11);
+}
+
+int
+or_shift_s16 (signed short i)
+{
+  return i | (i << 11);
+}
+
+int
+or_shift_s8 (signed char i)
+{
+  return i | (i << 11);
+}
+
+int
+or_shift_s13 (int i)
+{
+  i = (i << 19) >> 19;
+  return i | (i << 11);
+}
+
+int
+or_shift_s3 

Re: [RFC] Run pass_sink_code once more after ivopts/fre

2021-04-14 Thread Richard Biener
On Wed, 14 Apr 2021, Xionghu Luo wrote:

> Hi,
> 
> On 2021/3/26 15:35, Xionghu Luo via Gcc-patches wrote:
> >> Also we already have a sinking pass on RTL which even computes
> >> a proper PRE on the reverse graph - -fgcse-sm aka store-motion.c.
> >> I'm not sure whether this deals with non-stores but the
> >> LCM machinery definitely can handle arbitrary expressions.  I wonder
> >> if it makes more sense to extend this rather than inventing a new
> >> ad-hoc sinking pass?
> >  From the literal, my pass doesn't handle or process store instructions
> > like store-motion..  Thanks, will check it.
> 
> Store motion only processes store instructions with data flow equations,
> generating 4 inputs (st_kill, st_avloc, st_antloc, st_transp) and solving
> them globally through the Lazy Code Motion API (5 DF compute calls) with 2
> outputs (st_delete_map, st_insert_map); each store place is independently
> represented in the input bitmap vectors. The output says which stores
> should be deleted and where to insert them. The current code does what you
> said, "emit copies to a new pseudo at the original insn location and use it
> in followed bb", so it is actually "store replacement" instead of "store
> move". Why not save one pseudo by moving the store instruction to the
> target edge directly?

It probably simply saves the pass from doing analysis whether the
stored value is clobbered on the sinking path, enabling more store
sinking.  For stores that might be even beneficial, for non-stores
it becomes more of a cost issue, yes.

> There are many differences between the newly added rtl-sink pass and
> store-motion pass. 
> 1. Store motion moves only store instructions, rtl-sink ignores store
> instructions. 
> 2. Store motion solves a global DF problem, while rtl-sink only processes
> the loop header in reverse order with a dependency check in the loop. Take
> the RTL below as an example:
> "#538,#235,#234,#233" will all be sunk from bb 35 to bb 37 by rtl-sink,
> but it moves #538 first, then #235, because there is a strong dependency
> here. It doesn't seem to be like the LCM framework, which could solve
> everything and do the delete-insert in one iteration. 

So my question was whether we want to do both within the LCM store
sinking framework.  The LCM dataflow is also used by RTL PRE which
handles both loads and non-loads so in principle it should be able
to handle stores and non-stores for the sinking case (PRE on the
reverse CFG).
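
As a source-level picture of what sinking a non-store means, here is a
hand-written C sketch. The function names are made up, and this is neither
output of any pass nor derived from the RTL quoted below. The value computed
in the loop header is only read on the exit path, so it can be computed there
once instead:

```c
/* Before sinking: END is recomputed on every trip through the loop
   header although only the code after the loop reads it.  */
static unsigned long
mismatch_end_before (const unsigned char *a, const unsigned char *b,
                     unsigned long base, unsigned long n)
{
  unsigned long i = 0, end;
  for (;;)
    {
      end = base + i;   /* loop header: unused inside the loop */
      if (i >= n)
        break;
      if (a[i] != b[i])
        break;
      i++;
    }
  return end;
}

/* After sinking: same result, computed once on the exit path.  */
static unsigned long
mismatch_end_after (const unsigned char *a, const unsigned char *b,
                    unsigned long base, unsigned long n)
{
  unsigned long i = 0;
  while (i < n && a[i] == b[i])
    i++;
  return base + i;
}
```

Whether such a move pays off is the cost question for non-stores mentioned
above.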

A global dataflow is more powerful than any local ad-hoc method.

Richard.

> However, there are still some common methods that could be shared, like the
> def-use check (though store-motion is per bb, rtl-sink is per loop),
> insert_store, commit_edge_insertions etc.
> 
> 
>   508: L508:
>   507: NOTE_INSN_BASIC_BLOCK 34
>12: r139:DI=r140:DI
>   REG_DEAD r140:DI
>   240: L240:
>   231: NOTE_INSN_BASIC_BLOCK 35
>   232: r142:DI=zero_extend(r139:DI#0)
>   233: r371:SI=r142:DI#0-0x1
>   234: r243:DI=zero_extend(r371:SI)
>   REG_DEAD r371:SI
>   235: r452:DI=r262:DI+r139:DI
>   538: r194:DI=r452:DI
>   236: r372:CCUNS=cmp(r142:DI#0,r254:DI#0)
>   237: pc={(geu(r372:CCUNS,0))?L246:pc}
>   REG_DEAD r372:CCUNS
>   REG_BR_PROB 59055804
>   238: NOTE_INSN_BASIC_BLOCK 36
>   239: r140:DI=r139:DI+0x1
>   241: r373:DI=r251:DI-0x1
>   242: r374:SI=zero_extend([r262:DI+r139:DI])
>   REG_DEAD r139:DI
>   243: r375:SI=zero_extend([r373:DI+r140:DI])
>   REG_DEAD r373:DI
>   244: r376:CC=cmp(r374:SI,r375:SI)
>   REG_DEAD r375:SI
>   REG_DEAD r374:SI
>   245: pc={(r376:CC==0)?L508:pc}
>   REG_DEAD r376:CC
>   REG_BR_PROB 1014686028
>   246: L246:
>   247: NOTE_INSN_BASIC_BLOCK 37
>   248: r377:SI=r142:DI#0-0x2
>   REG_DEAD r142:DI
>   249: r256:DI=zero_extend(r377:SI)
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)