Re: [PATCH PR97627]Avoid computing niters info for fake edges

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 11:31 AM Bin.Cheng  wrote:
>
> On Thu, Jan 28, 2021 at 5:08 PM Richard Biener via Gcc-patches
>  wrote:
> >
> > On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches
> >  wrote:
> > >
> > > Hi,
> > > As described in commit message, we need to avoid computing niters info 
> > > for fake
> > > edges.  This simple patch does this by two changes.
> > >
> > > Bootstrap and test on X86_64, is it ok?
> >
> > Hmm, so I think the patch is a bit complicated and avoiding niter compute
> > for fake edges would be easier when just returning false for
> > fake edges in number_of_iterations_exit_assumptions?
> I just grepped calls to get_loop_exit_edges, and thought there might
> be cases other than niters analysis that would also like to skip fake
> edges.  But I didn't check the calls one by one.

My hunch is that the usual APIs always want to ignore them, but let's
do a minimal fix that we can backport easily.

> >
> > Which pass was the problematical that had infinite loops connected to exit?
> >
> > I guess the cfgloop code should simply ignore fake exits - they mostly
> > exist to make reverse CFG walks easy.  Specifically single_exit
> > and single_likely_exit but also exit edge recording should ignore them.
> >
> > That said, the testcase seems to be fixed with just
> >
> > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> > index 7d61ef080eb..7775bc7275c 100644
> > --- a/gcc/tree-ssa-loop-niter.c
> > +++ b/gcc/tree-ssa-loop-niter.c
> > @@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class
> > loop *loop, edge exit,
> >affine_iv iv0, iv1;
> >bool safe;
> >
> > +  /* The condition at a fake exit (if it exists) does not control its
> > + execution.  */
> > +  if (exit->flags & EDGE_FAKE)
> > +return false;
> > +
> >/* Nothing to analyze if the loop is known to be infinite.  */
> >if (loop_constraint_set_p (loop, LOOP_C_INFINITE))
> >  return false;
> >
> > Your dfs_find_deadend change likely "breaks" post-dominance DFS order
> > (this is a very fragile area).
> >
> > So any objection to just simplify the patch to the above hunk?
> Considering we are in late stage3? No objection to this change.  But I
> do think dfs_find_deadend needs to be improved, if not as this patch
> does.  For a loop nest with the outermost loop as the infinite one,
> the function adds fake (exit) edges for inner loops, which is
> counter-intuitive.

Sure, but then this is independent of the PR.  As said, the fake exits
only exist to make reverse CFG walkers easier - yes, for natural
infinite loops we'd like to have "intuitive" post-dom behavior but for
example for irreducible regions there's not much to do.

Richard.

> Thanks,
> bin
> >
> > Thanks,
> > Richard.
> >
> > > Thanks,
> > > bin


Re: [PATCH] c++: Fix infinite looping with invalid operator [PR96137]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 10:57 PM, Marek Polacek wrote:

My r11-86 adjusted cp_parser_class_name to do

-  scope = parser->scope;
+  scope = parser->scope ? parser->scope : parser->context->object_type;
if (scope == error_mark_node)
  return error_mark_node;

but that caused endless looping in cp_parser_type_specifier_seq (the
while (true) loop) in this invalid test, because we never set a parser
error, therefore cp_parser_type_specifier returned error_mark_node
instead of NULL_TREE, and we never issued the "expected type-specifier"
error.

At first I thought I'd just add cp_parser_simulate_error right before
the return, but that regresses crash81.C -- we'd emit multiple errors
for "T::X".  So the next best thing seemed to revert to pre-r11-86
behavior: return early when parser->scope is bad, otherwise proceed to
get the parser error.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


gcc/cp/ChangeLog:

PR c++/96137
* parser.c (cp_parser_class_name): If parser->scope is
error_mark_node, return it, otherwise continue.

gcc/testsuite/ChangeLog:

PR c++/96137
* g++.dg/parse/error63.C: New test.
---
  gcc/cp/parser.c  | 4 +++-
  gcc/testsuite/g++.dg/parse/error63.C | 8 
  2 files changed, 11 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/parse/error63.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e196db14113..5c1d880c9fc 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -24559,7 +24559,9 @@ cp_parser_class_name (cp_parser *parser,
where we first want to look up A::a in the class of the object
expression, as per [basic.lookup.classref].  */
tree scope = parser->scope ? parser->scope : parser->context->object_type;
-  if (scope == error_mark_node)
+  /* This only checks parser->scope to avoid duplicate errors; if
+ ->object_type is erroneous, go on to give a parse error.  */
+  if (parser->scope == error_mark_node)
  return error_mark_node;
  
/* Any name names a type if we're following the `typename' keyword

diff --git a/gcc/testsuite/g++.dg/parse/error63.C 
b/gcc/testsuite/g++.dg/parse/error63.C
new file mode 100644
index 000..5472ef05a64
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/error63.C
@@ -0,0 +1,8 @@
+// PR c++/96137
+// { dg-do compile }
+
+void
+fn ()
+{
+  X.operator T(); // { dg-error ".X. was not declared in this scope|expected" }
+}

base-commit: e6bce7fe17bf32ce969abc6f77f07acd352f6977





[PATCH] c++: Fix infinite looping with invalid operator [PR96137]

2021-01-28 Thread Marek Polacek via Gcc-patches
My r11-86 adjusted cp_parser_class_name to do

-  scope = parser->scope;
+  scope = parser->scope ? parser->scope : parser->context->object_type;
   if (scope == error_mark_node)
 return error_mark_node;

but that caused endless looping in cp_parser_type_specifier_seq (the
while (true) loop) in this invalid test, because we never set a parser
error, therefore cp_parser_type_specifier returned error_mark_node
instead of NULL_TREE, and we never issued the "expected type-specifier"
error.

At first I thought I'd just add cp_parser_simulate_error right before
the return, but that regresses crash81.C -- we'd emit multiple errors
for "T::X".  So the next best thing seemed to revert to pre-r11-86
behavior: return early when parser->scope is bad, otherwise proceed to
get the parser error.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/96137
* parser.c (cp_parser_class_name): If parser->scope is
error_mark_node, return it, otherwise continue.

gcc/testsuite/ChangeLog:

PR c++/96137
* g++.dg/parse/error63.C: New test.
---
 gcc/cp/parser.c  | 4 +++-
 gcc/testsuite/g++.dg/parse/error63.C | 8 
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/parse/error63.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e196db14113..5c1d880c9fc 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -24559,7 +24559,9 @@ cp_parser_class_name (cp_parser *parser,
   where we first want to look up A::a in the class of the object
   expression, as per [basic.lookup.classref].  */
   tree scope = parser->scope ? parser->scope : parser->context->object_type;
-  if (scope == error_mark_node)
+  /* This only checks parser->scope to avoid duplicate errors; if
+ ->object_type is erroneous, go on to give a parse error.  */
+  if (parser->scope == error_mark_node)
 return error_mark_node;
 
   /* Any name names a type if we're following the `typename' keyword
diff --git a/gcc/testsuite/g++.dg/parse/error63.C 
b/gcc/testsuite/g++.dg/parse/error63.C
new file mode 100644
index 000..5472ef05a64
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/error63.C
@@ -0,0 +1,8 @@
+// PR c++/96137
+// { dg-do compile }
+
+void
+fn ()
+{
+  X.operator T(); // { dg-error ".X. was not declared in this scope|expected" }
+}

base-commit: e6bce7fe17bf32ce969abc6f77f07acd352f6977
-- 
2.29.2



[PATCH, rs6000] Optimization for PowerPC 64bit constant generation [PR94395]

2021-01-28 Thread HAO CHEN GUI via Gcc-patches

Hi,

   This patch tries to optimize PowerPC 64 bit constant generation when 
the constant can be transformed from a 32 bit or 16 bit constant by 
rotating, shifting and mask AND.


   The attachments are the patch diff file and change log file.

   Bootstrapped and tested on powerpc64le with no regressions. Is this 
okay for trunk? Any  recommendations? Thanks a lot.


PR target/94395
* config/rs6000/rs6000.c (rs6000_emit_set_32bit_const,
rs6000_rotate_long_const, rs6000_peel_long_const): New functions.
(rs6000_emit_set_long_const, num_insns_constant_gpr): Call new
functions.
* testsuite/gcc.target/powerpc/pr94395.c: New test.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f26fc13484b..bcb867ffe94 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1109,6 +1109,9 @@ static tree rs6000_handle_longcall_attribute (tree *, 
tree, tree, int, bool *);
 static tree rs6000_handle_altivec_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_handle_struct_attribute (tree *, tree, tree, int, bool *);
 static tree rs6000_builtin_vectorized_libmass (combined_fn, tree, tree);
+static HOST_WIDE_INT rs6000_rotate_long_const (unsigned HOST_WIDE_INT, int *);
+static HOST_WIDE_INT rs6000_peel_long_const (unsigned HOST_WIDE_INT, int *,
+int *);
 static void rs6000_emit_set_long_const (rtx, HOST_WIDE_INT);
 static int rs6000_memory_move_cost (machine_mode, reg_class_t, bool);
 static bool rs6000_debug_rtx_costs (rtx, machine_mode, int, int, int *, bool);
@@ -5868,12 +5871,28 @@ num_insns_constant_gpr (HOST_WIDE_INT value)
 
   else if (TARGET_POWERPC64)
 {
+  int rotate, head, tail;
+  HOST_WIDE_INT imm1, imm2;
+  unsigned HOST_WIDE_INT uc = value;
   HOST_WIDE_INT low  = ((value & 0x) ^ 0x8000) - 0x8000;
   HOST_WIDE_INT high = value >> 31;
 
   if (high == 0 || high == -1)
return 2;
 
+  /* A long constant can be transformed from both a 16 bit constant and
+a 32 bit constant. So we first test imm1 and imm2 if they're 16
+bit.  */
+  imm1 = rs6000_rotate_long_const (uc, );
+  if (SIGNED_INTEGER_16BIT_P (imm1))
+   return 2;
+  imm2 = rs6000_peel_long_const (uc, , );
+  if (SIGNED_INTEGER_16BIT_P (imm2))
+   return 2;
+  if (SIGNED_INTEGER_NBIT_P (imm1, 32)
+ || SIGNED_INTEGER_NBIT_P (imm2, 32))
+   return 3;
+
   high >>= 1;
 
   if (low == 0 || low == high)
@@ -9720,6 +9739,96 @@ rs6000_emit_set_const (rtx dest, rtx source)
   return true;
 }
 
+/* Function to load 32 a bit constant.  */
+static void
+rs6000_emit_set_32bit_const (rtx dest, HOST_WIDE_INT c)
+{
+  gcc_assert (SIGNED_INTEGER_NBIT_P (c, 32));
+
+  rtx temp = can_create_pseudo_p () ? gen_reg_rtx (DImode) : dest;
+
+  if (SIGNED_INTEGER_16BIT_P (c))
+emit_insn (gen_rtx_SET (dest, GEN_INT (c)));
+  else
+{
+  emit_insn (gen_rtx_SET (copy_rtx (temp),
+GEN_INT (c & ~(HOST_WIDE_INT) 0x)));
+  emit_insn (gen_rtx_SET (dest,
+ gen_rtx_IOR (DImode, copy_rtx (temp),
+  GEN_INT (c & 0x;
+}
+}
+
+/* Helper function of rs6000_emit_set_long_const to left rotate a long
+   constant. It returns the result immediately when it finds a 32 bit
+   constant. It at most rotates for 31 bits.
+   For instant, the constant 0x1234 can be transformed to
+   a 32 bit constant 0x4123 by left rotating 12 bits.  */
+static HOST_WIDE_INT
+rs6000_rotate_long_const (unsigned HOST_WIDE_INT c, int *rot)
+{
+  int bitsize = GET_MODE_BITSIZE (DImode);
+  bool found = false;
+  unsigned HOST_WIDE_INT imm = c;
+  unsigned HOST_WIDE_INT m = imm >> (bitsize - 1);
+  int rotate = 0;
+
+  while (rotate < 31 && !found)
+{
+  imm = imm << 1 | m;
+  if (clz_hwi (imm) > 32 || clz_hwi (~imm) > 32)
+   found = true;
+  rotate++;
+  m = imm >> (bitsize - 1);
+}
+
+  *rot = rotate;
+  return imm;
+}
+
+/* Helper function of rs6000_emit_set_long_const to reutrn a constant by
+   removing consecutive 0s and 1s at the head and tail then setting all high
+   bits.
+   For instance, 0x00fff2345000 can be transformed to 0xfff2345 by
+   peeling the head and tail,  then to 0x234 by setting all
+   high bits.
+
+   0x0 0 0 0 0 0 f f f 2 3 4 5 0 0 0
+ |_|  |___|
+  head_zero  tail_zero
+
+   0xf f f f f f f f f 2 3 4 5 f f f
+ |___||___|
+   head_one  tail_one
+
+   The HEAD_PAD is set to the number of consecutive 0s at the head. If there
+   is no consecutive 0s, HEAD_PAD is set to 0.
+   If the tail has consecutive 0s, TAIL_PAD is a positive number. Otherwise,
+   TAIL_PAD is set to a negative number.  */
+static HOST_WIDE_INT
+rs6000_peel_long_const (unsigned HOST_WIDE_INT c, 

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Michael Meissner via Gcc-patches
I rebusmitted the patch after verifying it still builds and works with the
current branch as:

https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564486.html

| Subject: [PATCH] Add conversions between _Float128 and Decimal.
| Message-ID: <20210129024208.ga25...@ibm-toto.the-meissners.org>

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] Add conversions between _Float128 and Decimal.

2021-01-28 Thread Michael Meissner via Gcc-patches
[PATCH] Add conversions between _Float128 and Decimal.

This patch implements conversions between _Float128 and the 3 Decimal
floating types.  It does by extendending the dfp-bit conversions to add a
new binary floating point type (KF), and doing the conversions in the same
mannor as the other binary/decimal conversions.

In particular for conversions from _Float128 to Decimal, it uses a sprintf
variant to convert _Float128 to strings, and a type specific function that
converts the string output to the appropriate Decimal type

For conversions from one of the Decimal types to _Float128, it uses a decimal
function to convert to string (i.e. __decimalToString), and then uses a
variant of strtold to convert to _Float128.

If the user is linked against GLIBC 2.32 or newer, then the sprintf and strtold
variant functions can use the features directly in GLIBC 2.32 to do this
conversion.

If you have an older GLIBC and want to convert _Float128 to one of the Decimal
types, it will convert the _Float128 to __ibm128 and then convert that to
Decimal.

Similarly if you have one of the Decimal types, and want to convert to
_Float128, it will first convert the Decimal type to __ibm128, and then convert
__ibm128 to _Float128.

These functions will primarily be used if/when the default PowerPC long double
type is changed to IEEE 128-bit, but they could also be used if the user
explicitly converts _Float128 to/from a Decimal type.

One test case relating to Decimal fails if I build a compiler where the default
is IEEE 128-bit:

*   c-c++-common/dfp/convert-bfp-11.c

I have patches for this test, and they have been submitted separately.

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but over all it works
well.  This patch is required to be able to build a toolchain where the default
long double is IEEE 128-bit.  Can I check this patch into the master branch for
GCC 11?

I have also built compilers with this patch on a big endian power8 system that
has both 32-bit and 64-bit support.  There were no regressions in running these
tests on the system.

Can I check this patch into the master branch?

libgcc/
2021-01-28  Michael Meissner  

* config/rs6000/_dd_to_kf.c: New file.
* config/rs6000/_kf_to_dd.c: New file.
* config/rs6000/_kf_to_sd.c: New file.
* config/rs6000/_kf_to_td.c: New file.
* config/rs6000/_sd_to_kf.c: New file.
* config/rs6000/_sprintfkf.c: New file.
* config/rs6000/_sprintfkf.h: New file.
* config/rs6000/_strtokf.h: New file.
* config/rs6000/_strtokf.c: New file.
* config/rs6000/_td_to_kf.c: New file.
* config/rs6000/quad-float128.h: Add new declarations.
* config/rs6000/t-float128 (fp128_dec_funcs): New macro.
(fp128_decstr_funcs): New macro.
(ibm128_dec_funcs): New macro.
(fp128_ppc_funcs): Add the new conversions.
(fp128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(fp128_decstr_objs): Force __float128 <-> string conversions to be
compiled with -mabi=ibmlongdouble.
(ibm128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(FP128_CFLAGS_DECIMAL): New macro.
(IBM128_CFLAGS_DECIMAL): New macro.
* dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
* dfp-bit.h (BFP_KIND): Add new binary floating point kind for
IEEE 128-bit floating point.
(DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
(BFP_SPRINTF): New macro.
---
 libgcc/config/rs6000/_dd_to_kf.c | 37 ++
 libgcc/config/rs6000/_kf_to_dd.c | 37 ++
 libgcc/config/rs6000/_kf_to_sd.c | 37 ++
 libgcc/config/rs6000/_kf_to_td.c | 37 ++
 libgcc/config/rs6000/_sd_to_kf.c | 37 ++
 libgcc/config/rs6000/_sprintfkf.c| 57 
 libgcc/config/rs6000/_sprintfkf.h| 28 ++
 libgcc/config/rs6000/_strtokf.c  | 56 +++
 libgcc/config/rs6000/_strtokf.h  | 27 +
 libgcc/config/rs6000/_td_to_kf.c | 37 ++
 libgcc/config/rs6000/quad-float128.h |  8 
 libgcc/config/rs6000/t-float128  | 37 +-
 libgcc/dfp-bit.c | 12 +-
 libgcc/dfp-bit.h | 

Re: Default to DWARF5

2021-01-28 Thread Paul A. Clarke via Gcc-patches
The subject commit, 3804e937b0e252a7e42632fe6d9f898f1851a49c, causes a
failure in the test suite for the IBM Advance Toolchain.  The test in
question uses "perf probe" to set a tracepoint at "main" in a newly built
(with GCC 11) binary of "/bin/ld".  With the patch applied, the command
enters an infinte loop, calling libdw1 functions but making no progress.

The infinite loop can be found in the Linux kernel
tools/perf/utils/probe-finder.c:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/probe-finder.c?h=v5.11-rc5#n1190

Reverting this patch permits the command to succeed.

PC


Go driver patch committed: Always act as though -g was passed

2021-01-28 Thread Ian Lance Taylor via Gcc-patches
The go1 compiler always turns on debugging, to support Go stack traces
and functions like runtime.Callers.  With the recent switch to turn on
DWARF 5 by default, this caused failures with some versions of gas,
such as 2.35.1, because the assembly code would assume DWARF 5 but the
driver would not pass --gdwarf-5 to gas.  gas would then give an
error: "file number less than one".

This change avoids that problem by having the gccgo driver spec add a
-g option to the command line if no other -g option is present.  The
newly added -g option is passed to the assembler as --gdwarf-5.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

* gospec.c (lang_specific_driver): Add -g if no debugging options
were passed.
a7dbc8f83a4c146f5ab851ad7796c3ddbfe17b51
diff --git a/gcc/go/gospec.c b/gcc/go/gospec.c
index aaf64e73949..cf8d0f2b60e 100644
--- a/gcc/go/gospec.c
+++ b/gcc/go/gospec.c
@@ -127,6 +127,9 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   /* The first input file with an extension of .go.  */
   const char *first_go_file = NULL;  
 
+  /* Whether we saw any -g option.  */
+  bool saw_opt_g = false;
+
   argc = *in_decoded_options_count;
   decoded_options = *in_decoded_options;
   added_libraries = *in_added_libraries;
@@ -208,6 +211,18 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
  saw_opt_o = true;
  break;
 
+   case OPT_g:
+   case OPT_gdwarf:
+   case OPT_gdwarf_:
+   case OPT_ggdb:
+   case OPT_gstabs:
+   case OPT_gstabs_:
+   case OPT_gvms:
+   case OPT_gxcoff:
+   case OPT_gxcoff_:
+ saw_opt_g = true;
+ break;
+
case OPT_static:
  static_link = 1;
  break;
@@ -271,6 +286,15 @@ lang_specific_driver (struct cl_decoded_option 
**in_decoded_options,
   j++;
 }
 
+  /* The go1 compiler is going to enable debug info by default.  If we
+ don't see any -g options, force -g, so that we invoke the
+ assembler with the right debug option.  */
+  if (!saw_opt_g)
+{
+  generate_option (OPT_g, "1", 0, CL_DRIVER, _decoded_options[j]);
+  j++;
+}
+
   /* NOTE: We start at 1 now, not 0.  */
   while (i < argc)
 {


[PATCH] adjust "partly out of bounds" warning (PR 98503)

2021-01-28 Thread Martin Sebor via Gcc-patches

The GCC 11 -Warray-bounds enhancement to diagnose accesses whose
leading offset is in bounds but whose trailing offset is not has
been causing some confusion.  When the warning is issued for
an access to an in-bounds member via a pointer to a struct that's
larger than the pointed-to object, phrasing this strictly invalid
access in terms of array subscripts can be misleading, especially
when the source code doesn't involve any arrays or indexing.

Since the problem boils down to an aliasing violation much more
so than an actual out-of-bounds access, the attached patch adjusts
the code to issue a -Wstrict-aliasing warning in these cases instead
of -Warray-bounds.  In addition, since the aliasing assumptions in
GCC can be disabled by -fno-strict-aliasing, the patch also makes
these instances of the warning conditional on -fstrict-aliasing
being in effect.

Martin
PR middle-end/98503 -Warray-bounds when -Wstrict-aliasing would be more appropriate

gcc/ChangeLog:

	PR middle-end/98503
	* gimple-array-bounds.cc (array_bounds_checker::check_mem_ref):
	Issue -Wstrict-aliasing for a subset of violations.
	(array_bounds_checker::check_array_bounds):  Set new member.
	* gimple-array-bounds.h (array_bounds_checker::cref_of_mref): New
	data member.

gcc/testsuite/ChangeLog:

	PR middle-end/98503
	* g++.dg/warn/Warray-bounds-10.C: Adjust text of expected warnings.
	* g++.dg/warn/Warray-bounds-11.C: Same.
	* g++.dg/warn/Warray-bounds-12.C: Same.
	* g++.dg/warn/Warray-bounds-13.C: Same.
	* gcc.dg/Warray-bounds-63.c: Avoid -Wstrict-aliasing.  Adjust text
	of expected warnings.
	* gcc.dg/Warray-bounds-66.c: Adjust text of expected warnings.
	* gcc.dg/Wstrict-aliasing-2.c: New test.
	* gcc.dg/Wstrict-aliasing-3.c: New test.

diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 2576556f76b..f6b2af0d681 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -670,6 +670,8 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	axssize = wi::to_offset (access_size);
 
   const bool uboob = !lboob && offrange[0] + axssize > ubound;
+  /* Set to OFFRANGE converted to index range.  */
+  offset_int idxrange[2] = { offrange[0], offrange[1] };
   if (lboob || uboob)
 {
   /* Treat a reference to a non-array object as one to an array
@@ -681,43 +683,84 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	 to compute the index to print in the diagnostic; arrays
 	 in MEM_REF don't mean anything.  A type with no size like
 	 void is as good as having a size of 1.  */
-  tree type = TREE_TYPE (ref);
-  while (TREE_CODE (type) == ARRAY_TYPE)
-	type = TREE_TYPE (type);
+  tree type = strip_array_types (TREE_TYPE (ref));
   if (tree size = TYPE_SIZE_UNIT (type))
 	{
-	  offrange[0] = offrange[0] / wi::to_offset (size);
-	  offrange[1] = offrange[1] / wi::to_offset (size);
+	  idxrange[0] = offrange[0] / wi::to_offset (size);
+	  idxrange[1] = offrange[1] / wi::to_offset (size);
 	}
 }
 
   if (lboob)
 {
-  if (offrange[0] == offrange[1])
+  if (idxrange[0] == idxrange[1])
 	warned = warning_at (location, OPT_Warray_bounds,
 			 "array subscript %wi is outside array bounds "
 			 "of %qT",
-			 offrange[0].to_shwi (), reftype);
+			 idxrange[0].to_shwi (), reftype);
   else
 	warned = warning_at (location, OPT_Warray_bounds,
 			 "array subscript [%wi, %wi] is outside "
 			 "array bounds of %qT",
-			 offrange[0].to_shwi (),
-			 offrange[1].to_shwi (), reftype);
+			 idxrange[0].to_shwi (),
+			 idxrange[1].to_shwi (), reftype);
 }
   else if (uboob && !ignore_off_by_one)
 {
+  bool done = false;
+
   tree backtype = reftype;
   if (alloc_stmt)
 	/* If the memory was dynamically allocated refer to it as if
 	   it were an untyped array of bytes.  */
 	backtype = build_array_type_nelts (unsigned_char_type_node,
 	   arrbounds[1].to_uhwi ());
+  else if (cref_of_mref)
+	{
+	  /* For a COMPONENT_REF (struct S, MEM_REF (T, ...), fld) see
+	 if the offset of fld's last byte is in bounds of struct S,
+	 and if so, issue -Wstrict-aliasing instead.  */
+	  tree fld = TREE_OPERAND (cref_of_mref, 1);
+	  offset_int fldoff = offrange[0] + wi::to_offset (byte_position (fld));
+	  if (tree fldsiz = DECL_SIZE_UNIT (fld))
+	if (tree_fits_uhwi_p (fldsiz))
+	  fldoff += wi::to_offset (fldsiz);
+
+	  if (fldoff <= arrbounds[1])
+	{
+	  if (flag_strict_aliasing)
+		{
+		  /* Only warn when -fstrict-aliasing is enabled (as per
+		 the manual).  */
+		  if (offrange[0] == 0)
+		warned = warning_at (location, OPT_Wstrict_aliasing,
+	 "access to %qT by an lvalue of %qT "
+	 "violates strict-aliasing rules",
+	 TREE_TYPE (reftype), axstype);
+		  else
+		warned = warning_at (location, OPT_Wstrict_aliasing,
+	 "access to %qT by %<%T[%wi]%> "
+	 "violates strict-aliasing rules",
+	 TREE_TYPE (reftype), axstype,
+

Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Marek Polacek via Gcc-patches
On Thu, Jan 28, 2021 at 11:38:34PM +0100, Eric Botcazou wrote:
> > Aware.  I don't have access to, e.g., a sparc box.  But the test I've added
> > uses -mstrict-align where possible to check that the issue is resolved.
> 
> There are relatively fast SPARC64/Linux (gcc202) and SPARC/Solaris machines 
> (gcc210 and gcc211) in the Compile Farm:
>   https://cfarm.tetaneutral.net/machines/list/

Ah, great.  I'll give gcc202 a go.  Thanks,

Marek



Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Eric Botcazou
> Aware.  I don't have access to, e.g., a sparc box.  But the test I've added
> uses -mstrict-align where possible to check that the issue is resolved.

There are relatively fast SPARC64/Linux (gcc202) and SPARC/Solaris machines 
(gcc210 and gcc211) in the Compile Farm:
  https://cfarm.tetaneutral.net/machines/list/

-- 
Eric Botcazou




Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Marek Polacek via Gcc-patches
On Thu, Jan 28, 2021 at 11:02:36PM +0100, Eric Botcazou wrote:
> > Bootstrapped/regtested on
> > * x86_64-pc-linux-gnu
> > * powerpc64le-unknown-linux-gnu
> > * aarch64-linux-gnu
> > ok for trunk?
> 
> None of them is strict alignment though, isn't it?

Aware.  I don't have access to, e.g., a sparc box.  But the test I've added
uses -mstrict-align where possible to check that the issue is resolved.

--
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Eric Botcazou
> Bootstrapped/regtested on
> * x86_64-pc-linux-gnu
> * powerpc64le-unknown-linux-gnu
> * aarch64-linux-gnu
> ok for trunk?

None of them is strict alignment though, isn't it?

-- 
Eric Botcazou




Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 01:58:26PM -0600, Peter Bergner wrote:
> On 1/28/21 1:47 PM, Segher Boessenkool wrote:
> > On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote:
> >> The second patch I want you to review is:
> > 
> > "This patch replaces the following three patches:"
> > 
> > Please send a patch that modifies *current* code, and that is *tested*
> > with that.  With a good explanation, and a commit message for *that*
> > patch (not for other, older patches).
> 
> Isn't the email Mike linked to which was submitted yesterday exactly this?

It says it is not.  That is where I stopped reading.

> Admittedly, it doesn't have a "commit message header", but it does seem to
> be against current code and he did say he tested it and there is some
> explanation given, just not in a commit message form.  Is it just a nice
> commit message you're looking for now?

Just like for any other patch, I need to know what it *is*, why this is
an improvement, and all that.  The usual.  And not in ten pages but in a
few sentences.

After such an introduction you can go into detail where that is
warranted, for example necessary to understand the patch, but you do not
explain the history of the world in a commit message.

I do not have the bandwidth to spend hours on this again today trying to
figure out what this patch is.


Segher


Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 3:36 PM, Marek Polacek wrote:

On Thu, Jan 28, 2021 at 03:18:55PM -0500, Jason Merrill via Gcc-patches wrote:

On 1/28/21 10:34 AM, Marek Polacek wrote:

A year ago I submitted this patch:

~~
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.

When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:

char -> c -> const char -> const char -> d -> const d

Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set.  If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.

So we can make check_base_type stricter.  I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
amount of types with and without the patch, while bootstrapping.
~~

However, the patch broke a few tests on STRICT_ALIGNMENT platforms and
had to be reverted.  This is another try.  The original patch is kept
unchanged, but I added the finalize_type_size hunk that ought to fix the
STRICT_ALIGNMENT issues.

The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the
main variant of a type, but doesn't clear it on any of the variants.
Then we end up with types which share the same TYPE_MAIN_VARIANT, but
their TYPE_CANONICAL differs and then the usual "canonical types differ
for identical types" follows.

I've created alignas19.C to exercise this scenario.  What happens is:
- when parsing the class S we create a type S in xref_tag,
- we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
- we parse the member function fn and build_memfn_type creates a copy
of S to add const; this variant has T_U_A set,
- we finish_struct S which calls layout_class_type -> finish_record_type
-> finalize_size_type where we reset T_U_A in S (but const S keeps it),
- finish_non_static_data_member for arr calls maybe_dummy_object with
type = S,
- maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p
to check if S and TREE_TYPE (current_class_ref), which is const S,
are the same,
- same_type_ignoring_top_level_qualifiers_p creates cv-unqualified
versions of the passed types.  Previously we'd use our main variant
S when stripping "const S" of const, but since the T_U_A flags don't
match (check_base_type), we create a new variant S'.  Then we crash in
comptypes because S and S' have the same TYPE_MAIN_VARIANT but
different TYPE_CANONICALs.

With my patch we'll clear T_U_A for S's variants too, and then instead
of S' we'll just use S.  Does this seem sane?

Bootstrapped/regtested on
* x86_64-pc-linux-gnu
* powerpc64le-unknown-linux-gnu
* aarch64-linux-gnu
ok for trunk?

gcc/ChangeLog:

PR c++/94775
* stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
the main variant, maybe reset it in its variants too.
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.

gcc/testsuite/ChangeLog:

PR c++/94775
* g++.dg/cpp0x/alignas19.C: New test.
* g++.dg/warn/Warray-bounds15.C: New test.
---
   gcc/stor-layout.c   | 16 +++--
   gcc/testsuite/g++.dg/cpp0x/alignas19.C  |  8 +
   gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 +
   gcc/tree.c  |  4 ++-
   4 files changed, 64 insertions(+), 4 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C
   create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 7d6b1b5ce52..784f131ebb8 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -1926,6 +1926,7 @@ finalize_type_size (tree type)
However, where strict alignment is not required, avoid
over-aligning structures, since most compilers do not do this
alignment.  */
+  bool tua_cleared_p = false;
 if (TYPE_MODE (type) != BLKmode
 && TYPE_MODE (type) != VOIDmode
 && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type)))
@@ -1937,7 +1938,9 @@ finalize_type_size (tree type)
 if (mode_align >= TYPE_ALIGN (type))
{
  SET_TYPE_ALIGN (type, mode_align);
- TYPE_USER_ALIGN (type) = 0;
+ /* Remember that we're about to reset this flag.  */
+ tua_cleared_p = TYPE_USER_ALIGN (type);
+ TYPE_USER_ALIGN (type) = false;
}
   }
@@ -1991,14 +1994,21 @@ finalize_type_size (tree type)
 /* 

Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-01-28 Thread Martin Sebor via Gcc-patches

On 1/28/21 2:22 AM, Jakub Jelinek wrote:

On Thu, Jan 28, 2021 at 09:31:46AM +0100, Richard Biener via Gcc-patches wrote:

+  if (TREE_CODE (bound) == PLUS_EXPR
+  && integer_all_onesp (TREE_OPERAND (bound, 1)))
+{
+  bound = TREE_OPERAND (bound, 0);
+  if (TREE_CODE (bound) == NOP_EXPR)
+   bound = TREE_OPERAND (bound, 0);
+}

so it either does or does not strip a -1 but has no indication on
whether it did that?  That looks fragile and broken.


Yeah.  Plus again STRIP_NOPs in there should be used.  But certainly it
needs to remember whether there was + -1 or not and compensate for it.


The free-lang-data parts are OK.


But is fld the right spot to do it?
If it is only the FE that emits the warnings, shouldn't it be stripped
already before gimplification so that it isn't actually corrupted?

I mean in c_parse_final_cleanups or c_common_parse_file depending on
if it is just C or C++ too?


I've been asking for a good place to do it since December and
free_lang_data was the only suggestion.  I even mentioned it
to you this Tuesday.  c_parse_final_cleanups looks like a better
place but I don't understand why you've waited until now to point
me to it.



Plus, if the expressions in the attribute don't contain SAVE_EXPRs, why
there isn't unshare_expr called on them (and if they do, I don't see how
it would be guaranteed, can't one e.g. do
int bar (void);
void foo (int n, int a[n + 1][bar () + 2], int b[sizeof (a[0]) + 32])
{
}
?) the unsharing variant I've pasted into the PR.


I'm not sure what you're asking here or if you're just repeating
the same question we went over in November when I posted the first
patch to unshare the expressions.  In that thread, after I explained
that the expressions in the attribute aren't evaluated, you ultimately
agreed it wasn't necessary:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559953.html

The front end wraps each VLA bound that needs to be evaluated in
a SAVE_EXPR (and, as Joseph points out in the same thread, if
there are more of them, they're wrapped in a COMPOUND_EXPR).  But
the attribute doesn't use these SAVE_EXPRs -- instead, it uses
the expressions before they're wrapped.  This wasn't a choice I
made deliberately.  I just didn't know about the COMPOUND_EXPR
wrapped bounds.

Martin


Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-01-28 Thread Martin Sebor via Gcc-patches

On 1/28/21 1:31 AM, Richard Biener wrote:

On Thu, Jan 28, 2021 at 12:08 AM Martin Sebor via Gcc-patches
 wrote:


Attached is another attempt to fix the problem caused by allowing
front-end trees representing nontrivial VLA bound expressions to
stay in attribute access attached to functions.  Since removing
these trees seems to be everyone's preference this patch does that
by extending the free_lang_data pass to look for and zero out these
trees.

Because free_lang_data only frees anything when LTO is enabled and
we want these trees cleared regardless to keep them from getting
clobbered during gimplification, this change also modifies the pass
to do the clearing even when the pass is otherwise inactive.


   if (TREE_CODE (bound) == NOP_EXPR)
+bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == CONVERT_EXPR)
+{
+  tree op0 = TREE_OPERAND (bound, 0);
+  tree bndtyp = TREE_TYPE (bound);
+  tree op0typ = TREE_TYPE (op0);
+  if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
+   bound = op0;
+}
+
+  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
+bound = TREE_OPERAND (bound, 0);

all of the above can be just

STRIP_NOPS (bound);

which also handles nesting of the above in any order.


No, it can't be just STRIP_NOPS.

The goal is to strip the meaningless (to the user) cast to sizetype
from the array type.  For example:

  void f (int n, int[n]);
  void f (int n, int[n + 1]);

I want the type in the warning to reflect the source:

  warning: argument 2 of type ‘int[n + 1]’ declared with mismatched 
bound ‘n + 1’ [-Wvla-parameter]


and not:

  warning: ... ‘int[(sizetype)(n + 1)]’ ...



+  if (TREE_CODE (bound) == PLUS_EXPR
+  && integer_all_onesp (TREE_OPERAND (bound, 1)))
+{
+  bound = TREE_OPERAND (bound, 0);
+  if (TREE_CODE (bound) == NOP_EXPR)
+   bound = TREE_OPERAND (bound, 0);
+}

so it either does or does not strip a -1 but has no indication on
whether it did that?  That looks fragile and broken.


Indication to what?  The caller?  The function is only used to recover
a meaningful VLA bound for warnings and its callers aren't interested
in whether the -1 was stripped or not.



Anyway, the split out of this function seems unrelated to the
original problem and should be submitted separately.


It was a remnant of the previous patch where it was used to format
the string representation of the VLA bounds and called from three
sites.  I've removed the function from this revision (and restored
the one site in the pretty printer that open-codes the same thing).



+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+   {
+ tree *pvbnd = _VALUE (vblist);
+ if (!*pvbnd || DECL_P (*pvbnd))
+   continue;

so this doesn't let constant bounds prevail but only symbolical ones?  Not
that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd)


There must be some confusion here.  There are no constant VLA bounds.
The essential purpose of this patch is to remove expressions from
the attributes, such as PLUS_EXPR, that denote nontrivial VLA bounds.
The test above retains decls that might refer to function parameters
or global variables so that they can be mentioned in middle end
warnings.

Attached is yet another revision of this fix that moves the call
to attr_access:free_lang_data() to c_parse_final_cleanups() as
Jakub suggested.

Martin
PR middle-end/97172 - ICE: tree code 'ssa_name' is not supported in LTO streams

gcc/ChangeLog:

	PR middle-end/97172
	* attribs.c (attr_access::free_lang_data): Define new function.
	* attribs.h (attr_access::free_lang_data): Declare new function.

gcc/c/ChangeLog:

	PR middle-end/97172
	* c-decl.c (free_attr_access_data): New function.
	(c_parse_final_cleanups): Call free_attr_access_data.

gcc/testsuite/ChangeLog:

	PR middle-end/97172
	* gcc.dg/pr97172.c: New test.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 94991fbbeab..81322d40f1d 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -2238,6 +2238,38 @@ attr_access::vla_bounds (unsigned *nunspec) const
   return list_length (size);
 }
 
+/* Reset front end-specific attribute access data from ATTRS.
+   Called from the free_lang_data pass.  */
+
+/* static */ void
+attr_access::free_lang_data (tree attrs)
+{
+  for (tree acs = attrs; (acs = lookup_attribute ("access", acs));
+   acs = TREE_CHAIN (acs))
+{
+  tree vblist = TREE_VALUE (acs);
+  vblist = TREE_CHAIN (vblist);
+  if (!vblist)
+	continue;
+
+  vblist = TREE_VALUE (vblist);
+  if (!vblist)
+	continue;
+
+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+	{
+	  tree *pvbnd = _VALUE (vblist);
+	  if (!*pvbnd || DECL_P (*pvbnd))
+	continue;
+
+	  /* VLA bounds that are expressions as opposed to DECLs are
+	 only used in the front end.  Reset them to keep front end
+	 trees leaking into the middle end (see pr97172) and to
+	 free up memory.  */
+	  *pvbnd = 

Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Marek Polacek via Gcc-patches
On Thu, Jan 28, 2021 at 03:18:55PM -0500, Jason Merrill via Gcc-patches wrote:
> On 1/28/21 10:34 AM, Marek Polacek wrote:
> > A year ago I submitted this patch:
> > 
> > ~~
> > Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
> > gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
> > build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.
> > 
> > When we strip_typedefs the element of the array "const d", we see it's
> > a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
> > char, but we need to add the const qualifier, so we call
> > cp_build_qualified_type -> build_qualified_type
> > where get_qualified_type checks to see if we already have such a type
> > by walking the variants list, which in this case is:
> > 
> >char -> c -> const char -> const char -> d -> const d
> > 
> > Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
> > we choose the first const char, which has TYPE_USER_ALIGN set.  If the
> > element type of an array has TYPE_USER_ALIGN, the array type gets it too.
> > 
> > So we can make check_base_type stricter.  I was afraid that it might make
> > us reuse types less often, but measuring showed that we build the same
> > amount of types with and without the patch, while bootstrapping.
> > ~~
> > 
> > However, the patch broke a few tests on STRICT_ALIGNMENT platforms and
> > had to be reverted.  This is another try.  The original patch is kept
> > unchanged, but I added the finalize_type_size hunk that ought to fix the
> > STRICT_ALIGNMENT issues.
> > 
> > The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the
> > main variant of a type, but doesn't clear it on any of the variants.
> > Then we end up with types which share the same TYPE_MAIN_VARIANT, but
> > their TYPE_CANONICAL differs and then the usual "canonical types differ
> > for identical types" follows.
> > 
> > I've created alignas19.C to exercise this scenario.  What happens is:
> > - when parsing the class S we create a type S in xref_tag,
> > - we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
> > - we parse the member function fn and build_memfn_type creates a copy
> >of S to add const; this variant has T_U_A set,
> > - we finish_struct S which calls layout_class_type -> finish_record_type
> >-> finalize_size_type where we reset T_U_A in S (but const S keeps it),
> > - finish_non_static_data_member for arr calls maybe_dummy_object with
> >type = S,
> > - maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p
> >to check if S and TREE_TYPE (current_class_ref), which is const S,
> >are the same,
> > - same_type_ignoring_top_level_qualifiers_p creates cv-unqualified
> >versions of the passed types.  Previously we'd use our main variant
> >S when stripping "const S" of const, but since the T_U_A flags don't
> >match (check_base_type), we create a new variant S'.  Then we crash in
> >comptypes because S and S' have the same TYPE_MAIN_VARIANT but
> >different TYPE_CANONICALs.
> > 
> > With my patch we'll clear T_U_A for S's variants too, and then instead
> > of S' we'll just use S.  Does this seem sane?
> > 
> > Bootstrapped/regtested on
> > * x86_64-pc-linux-gnu
> > * powerpc64le-unknown-linux-gnu
> > * aarch64-linux-gnu
> > ok for trunk?
> > 
> > gcc/ChangeLog:
> > 
> > PR c++/94775
> > * stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
> > the main variant, maybe reset it in its variants too.
> > * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
> > (check_aligned_type): Check if TYPE_USER_ALIGN match.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/94775
> > * g++.dg/cpp0x/alignas19.C: New test.
> > * g++.dg/warn/Warray-bounds15.C: New test.
> > ---
> >   gcc/stor-layout.c   | 16 +++--
> >   gcc/testsuite/g++.dg/cpp0x/alignas19.C  |  8 +
> >   gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 +
> >   gcc/tree.c  |  4 ++-
> >   4 files changed, 64 insertions(+), 4 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C
> >   create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C
> > 
> > diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> > index 7d6b1b5ce52..784f131ebb8 100644
> > --- a/gcc/stor-layout.c
> > +++ b/gcc/stor-layout.c
> > @@ -1926,6 +1926,7 @@ finalize_type_size (tree type)
> >However, where strict alignment is not required, avoid
> >over-aligning structures, since most compilers do not do this
> >alignment.  */
> > +  bool tua_cleared_p = false;
> > if (TYPE_MODE (type) != BLKmode
> > && TYPE_MODE (type) != VOIDmode
> > && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type)))
> > @@ -1937,7 +1938,9 @@ finalize_type_size (tree type)
> > if (mode_align >= TYPE_ALIGN (type))
> > {
> >   SET_TYPE_ALIGN 

Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 10:34 AM, Marek Polacek wrote:

A year ago I submitted this patch:

~~
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.

When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:

   char -> c -> const char -> const char -> d -> const d

Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set.  If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.

So we can make check_base_type stricter.  I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
amount of types with and without the patch, while bootstrapping.
~~

However, the patch broke a few tests on STRICT_ALIGNMENT platforms and
had to be reverted.  This is another try.  The original patch is kept
unchanged, but I added the finalize_type_size hunk that ought to fix the
STRICT_ALIGNMENT issues.

The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the
main variant of a type, but doesn't clear it on any of the variants.
Then we end up with types which share the same TYPE_MAIN_VARIANT, but
their TYPE_CANONICAL differs and then the usual "canonical types differ
for identical types" follows.

I've created alignas19.C to exercise this scenario.  What happens is:
- when parsing the class S we create a type S in xref_tag,
- we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
- we parse the member function fn and build_memfn_type creates a copy
   of S to add const; this variant has T_U_A set,
- we finish_struct S which calls layout_class_type -> finish_record_type
   -> finalize_size_type where we reset T_U_A in S (but const S keeps it),
- finish_non_static_data_member for arr calls maybe_dummy_object with
   type = S,
- maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p
   to check if S and TREE_TYPE (current_class_ref), which is const S,
   are the same,
- same_type_ignoring_top_level_qualifiers_p creates cv-unqualified
   versions of the passed types.  Previously we'd use our main variant
   S when stripping "const S" of const, but since the T_U_A flags don't
   match (check_base_type), we create a new variant S'.  Then we crash in
   comptypes because S and S' have the same TYPE_MAIN_VARIANT but
   different TYPE_CANONICALs.

With my patch we'll clear T_U_A for S's variants too, and then instead
of S' we'll just use S.  Does this seem sane?

Bootstrapped/regtested on
* x86_64-pc-linux-gnu
* powerpc64le-unknown-linux-gnu
* aarch64-linux-gnu
ok for trunk?

gcc/ChangeLog:

PR c++/94775
* stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
the main variant, maybe reset it in its variants too.
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.

gcc/testsuite/ChangeLog:

PR c++/94775
* g++.dg/cpp0x/alignas19.C: New test.
* g++.dg/warn/Warray-bounds15.C: New test.
---
  gcc/stor-layout.c   | 16 +++--
  gcc/testsuite/g++.dg/cpp0x/alignas19.C  |  8 +
  gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 +
  gcc/tree.c  |  4 ++-
  4 files changed, 64 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C
  create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 7d6b1b5ce52..784f131ebb8 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -1926,6 +1926,7 @@ finalize_type_size (tree type)
   However, where strict alignment is not required, avoid
   over-aligning structures, since most compilers do not do this
   alignment.  */
+  bool tua_cleared_p = false;
if (TYPE_MODE (type) != BLKmode
&& TYPE_MODE (type) != VOIDmode
&& (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type)))
@@ -1937,7 +1938,9 @@ finalize_type_size (tree type)
if (mode_align >= TYPE_ALIGN (type))
{
  SET_TYPE_ALIGN (type, mode_align);
- TYPE_USER_ALIGN (type) = 0;
+ /* Remember that we're about to reset this flag.  */
+ tua_cleared_p = TYPE_USER_ALIGN (type);
+ TYPE_USER_ALIGN (type) = false;
}
  }
  
@@ -1991,14 +1994,21 @@ finalize_type_size (tree type)
  
/* Copy it into all variants.  */

for (variant = TYPE_MAIN_VARIANT (type);
-  variant != 0;
+  variant != NULL_TREE;
   

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Peter Bergner via Gcc-patches
On 1/28/21 1:47 PM, Segher Boessenkool wrote:
> On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote:
>> The second patch I want you to review is:
> 
> "This patch replaces the following three patches:"
> 
> Please send a patch that modifies *current* code, and that is *tested*
> with that.  With a good explanation, and a commit message for *that*
> patch (not for other, older patches).

Isn't the email Mike linked to which was submitted yesterday exactly this?
Admittedly, it doesn't have a "commit message header", but it does seem to
be against current code and he did say he tested it and there is some
explanation given, just not in a commit message form.  Is it just a nice
commit message you're looking for now?


Peter





Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 02:30:56PM -0500, Michael Meissner wrote:
> On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote:
> > On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote:
> > > > The whole thread is at 
> > > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/
> > > >  .
> > > > 
> > > > I approved *that* version of the patch.
> > > 
> > > Yes you approved the built-in renaming patch and I checked in the patch 
> > > that
> > > you approved (rather than the modifications that I submitted).
> > > 
> > > However, this is about the second patch (that provides conversions between
> > > _Float128 and Decimal), which as far as I can tell was not approved.  
> > > That is
> > > what this particular question is about.
> > 
> > I see no second patch?
> > 
> > > So, I rewrote the whole patch so that it will work with older GLIBC's.
> > 
> > Did you check in what I approved or not?  It is a very simple question,
> > it's just about facts, with a simple yes/no answer.
> 
> Yes I checked in that patch earlier this morning (the original patch that you
> approved).
> 
> The second patch I want you to review is:

"This patch replaces the following three patches:"

Please send a patch that modifies *current* code, and that is *tested*
with that.  With a good explanation, and a commit message for *that*
patch (not for other, older patches).


Segher


Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Michael Meissner via Gcc-patches
On Thu, Jan 28, 2021 at 12:59:18PM -0600, Segher Boessenkool wrote:
> On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote:
> > > The whole thread is at 
> > > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/
> > >  .
> > > 
> > > I approved *that* version of the patch.
> > 
> > Yes you approved the built-in renaming patch and I checked in the patch that
> > you approved (rather than the modifications that I submitted).
> > 
> > However, this is about the second patch (that provides conversions between
> > _Float128 and Decimal), which as far as I can tell was not approved.  That 
> > is
> > what this particular question is about.
> 
> I see no second patch?
> 
> > So, I rewrote the whole patch so that it will work with older GLIBC's.
> 
> Did you check in what I approved or not?  It is a very simple question,
> it's just about facts, with a simple yes/no answer.

Yes I checked in that patch earlier this morning (the original patch that you
approved).

The second patch I want you to review is:
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564407.html

| Date: Wed, 27 Jan 2021 16:19:52 -0500
| Subject: [PATCH, revised, #2] PowerPC: Add float128/Decimal conversions.
| Message-ID: <20210127211952.ga28...@ibm-toto.the-meissners.org>

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] c++, v2: Fix -Weffc++ in templates [PR98841]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 10:24 AM, Jakub Jelinek wrote:

On Thu, Jan 28, 2021 at 10:04:12AM -0500, Jason Merrill wrote:

We emit a bogus warning on the following testcase, suggesting that the
operator should return *this even when it does that already.
The problem is that normally cp_build_indirect_ref_1 ensures that *this
is folded as current_class_ref, but in templates (if return type is
non-dependent, otherwise check_return_expr doesn't check it) it didn't
go through cp_build_indirect_ref_1, but just built another INDIRECT_REF.


That seems like a bug in build_x_indirect_ref.


So do you want this instead (if it passes bootstrap/regtest)?


OK.


2021-01-28  Jakub Jelinek  

PR c++/98841
* typeck.c (build_x_indirect_ref): For *this, return current_class_ref.

* g++.dg/warn/effc5.C: New test.

--- gcc/cp/typeck.c.jj  2021-01-27 11:48:49.715890458 +0100
+++ gcc/cp/typeck.c 2021-01-28 16:17:18.712755173 +0100
@@ -3326,7 +3326,15 @@ build_x_indirect_ref (location_t loc, tr
  {
/* Retain the type if we know the operand is a pointer.  */
if (TREE_TYPE (expr) && INDIRECT_TYPE_P (TREE_TYPE (expr)))
-   return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr);
+   {
+ if (expr == current_class_ptr
+ || (TREE_CODE (expr) == NOP_EXPR
+ && TREE_OPERAND (expr, 0) == current_class_ptr
+ && (same_type_ignoring_top_level_qualifiers_p
+   (TREE_TYPE (expr), TREE_TYPE (current_class_ptr)
+   return current_class_ref;
+ return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr);
+   }
if (type_dependent_expression_p (expr))
return build_min_nt_loc (loc, INDIRECT_REF, expr);
expr = build_non_dependent_expr (expr);
--- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-28 16:15:05.820256255 
+0100
+++ gcc/testsuite/g++.dg/warn/effc5.C   2021-01-28 16:15:05.820256255 +0100
@@ -0,0 +1,17 @@
+// PR c++/98841
+// { dg-do compile }
+// { dg-options "-Weffc++" }
+
+struct S {
+  template 
+  S& operator=(const T&) { return *this; } // { dg-bogus "should return a 
reference to" }
+  S& operator=(const S&) { return *this; }
+};
+
+void
+foo ()
+{
+  S s, t;
+  s = 1;
+  s = t;
+}


Jakub





Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Segher Boessenkool
On Thu, Jan 28, 2021 at 01:10:39PM -0500, Michael Meissner wrote:
> > The whole thread is at 
> > https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/
> >  .
> > 
> > I approved *that* version of the patch.
> 
> Yes you approved the built-in renaming patch and I checked in the patch that
> you approved (rather than the modifications that I submitted).
> 
> However, this is about the second patch (that provides conversions between
> _Float128 and Decimal), which as far as I can tell was not approved.  That is
> what this particular question is about.

I see no second patch?

> So, I rewrote the whole patch so that it will work with older GLIBC's.

Did you check in what I approved or not?  It is a very simple question,
it's just about facts, with a simple yes/no answer.


Segher


[committed] libstdc++: Fix copyright dates for simd headers and tests

2021-01-28 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

* include/experimental/bits/numeric_traits.h: Update copyright
dates.
* include/experimental/bits/simd.h: Likewise.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_converter.h: Likewise.
* include/experimental/bits/simd_detail.h: Likewise.
* include/experimental/bits/simd_fixed_size.h: Likewise.
* include/experimental/bits/simd_math.h: Likewise.
* include/experimental/bits/simd_neon.h: Likewise.
* include/experimental/bits/simd_ppc.h: Likewise.
* include/experimental/bits/simd_scalar.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.
* include/experimental/bits/simd_x86_conversions.h: Likewise.
* include/experimental/simd: Likewise.
* testsuite/experimental/simd/*: Likewise.

Committed to trunk.

commit a054608c9c409245575e3dfe61b9a36e1bf7ffcf
Author: Jonathan Wakely 
Date:   Thu Jan 28 18:13:03 2021

libstdc++: Fix copyright dates for simd headers and tests

libstdc++-v3/ChangeLog:

* include/experimental/bits/numeric_traits.h: Update copyright
dates.
* include/experimental/bits/simd.h: Likewise.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_converter.h: Likewise.
* include/experimental/bits/simd_detail.h: Likewise.
* include/experimental/bits/simd_fixed_size.h: Likewise.
* include/experimental/bits/simd_math.h: Likewise.
* include/experimental/bits/simd_neon.h: Likewise.
* include/experimental/bits/simd_ppc.h: Likewise.
* include/experimental/bits/simd_scalar.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.
* include/experimental/bits/simd_x86_conversions.h: Likewise.
* include/experimental/simd: Likewise.
* testsuite/experimental/simd/*: Likewise.

diff --git a/libstdc++-v3/include/experimental/bits/numeric_traits.h 
b/libstdc++-v3/include/experimental/bits/numeric_traits.h
index 1b60874b788..0a1c711a559 100644
--- a/libstdc++-v3/include/experimental/bits/numeric_traits.h
+++ b/libstdc++-v3/include/experimental/bits/numeric_traits.h
@@ -1,6 +1,6 @@
 // Definition of numeric_limits replacement traits P1841R1 -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/include/experimental/bits/simd.h 
b/libstdc++-v3/include/experimental/bits/simd.h
index 00eec50d64f..084849e2dc9 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1,6 +1,6 @@
 // Definition of the public simd interfaces -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h 
b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index f2c99faa4ee..efca65fa6e3 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -1,6 +1,6 @@
 // Simd Abi specific implementations -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/include/experimental/bits/simd_converter.h 
b/libstdc++-v3/include/experimental/bits/simd_converter.h
index dc4598743f9..9c8bf382df9 100644
--- a/libstdc++-v3/include/experimental/bits/simd_converter.h
+++ b/libstdc++-v3/include/experimental/bits/simd_converter.h
@@ -1,6 +1,6 @@
 // Generic simd conversions -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // software; you can redistribute it and/or modify it under the
diff --git a/libstdc++-v3/include/experimental/bits/simd_detail.h 
b/libstdc++-v3/include/experimental/bits/simd_detail.h
index a49a9d88b7f..1e75812d098 100644
--- a/libstdc++-v3/include/experimental/bits/simd_detail.h
+++ b/libstdc++-v3/include/experimental/bits/simd_detail.h
@@ -1,6 +1,6 @@
 // Internal macros for the simd implementation -*- C++ -*-
 
-// Copyright (C) 2020 Free Software Foundation, Inc.
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
 // 

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-28 Thread Michael Meissner via Gcc-patches
On Wed, Jan 27, 2021 at 09:11:49PM -0600, Segher Boessenkool wrote:
> On Tue, Jan 26, 2021 at 06:43:06PM -0500, Michael Meissner wrote:
> > I posted this patch on January 14th, 2021:
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html
> > 
> > | Date: Thu, 14 Jan 2021 12:09:36 -0500
> > | Subject: [PATCH] PowerPC: Add float128/Decimal conversions.
> > | Message-ID: <20210114170936.ga3...@ibm-toto.the-meissners.org>
> > 
> > You had a question about what changed, and I replied:
> > 
> > | In your last message, you said that it was unacceptable that the 
> > conversion
> > | fails if the user uses an old GLIBC.  So I rewrote the code using weak
> > | references.  If the user has at least GLIBC 2.32, it will use the IEEE 
> > 128-bit
> > | string support in the library.
> > |
> > | If an older GLIBC is used, I then use the IBM 128-bit format as an 
> > intermediate
> > | value.  Obviously there are cases where IEEE 128-bit can hold values that 
> > IBM
> > | 128-bit can't (mostly due to the increased exponent range in IEEE 
> > 128-bit), but
> > | it at least does the conversion for the numbers in the common range.
> > |
> > | In doing this transformation, I needed to do minor edits to the main 
> > decimal
> > | to/from binary conversion functions to allow the KF functions to be 
> > declared.
> > | Previously, I used preprocessor magic to rename the functions.
> > 
> > This is the second most important patch in the IEEE 128-bit work.  What do I
> > need to do to be able to commit the patch?
> 
> The whole thread is at 
> https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/
>  .
> 
> I approved *that* version of the patch.

Yes you approved the built-in renaming patch and I checked in the patch that
you approved (rather than the modifications that I submitted).

However, this is about the second patch (that provides conversions between
_Float128 and Decimal), which as far as I can tell was not approved.  That is
what this particular question is about.

You did not like the previous _Float128 <-> Decimal conversion patch because it
was tied into having a minimum GLIBC version.

So, I rewrote the whole patch so that it will work with older GLIBC's.

If you have a newer GLIBC (detected at runtime via a weak reference), it uses
the support functions in the new GLIBC to do the conversion.  I had a cut+paste
error in the original patch that I resubmitted, and I resubmitted with the
changed line.

If you have an older GLIBC and want to convert _Float128 to one of the Decimal
types, it will convert the _Float128 to __ibm128 and then convert that to
Decimal.

Similarly if you have one of the Decimal types, and want to convert to
_Float128, it will first convert the Decimal type to __ibm128, and then convert
__ibm128 to _Float128.

So what do I do?  Can I check in this patch or do I need to do further
modifications.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified

2021-01-28 Thread Segher Boessenkool
Hi!

On Thu, Jan 28, 2021 at 02:40:25AM -0600, Xionghu Luo wrote:
> Move common functions to header file for cleanup.
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-01-27  Xionghu Luo  
> 
>   * gcc.target/powerpc/pr79251.p8.c: Move definition to ...
>   * gcc.target/powerpc/pr79251.h: ...this.
>   * gcc.target/powerpc/pr79251.p9.c: Likewise.
>   * gcc.target/powerpc/pr79251-run.c: Rename to...
>   * gcc.target/powerpc/pr79251-run.p8.c: ...this.
>   * gcc.target/powerpc/pr79251-run.p9.c: New test.

> --- a/gcc/testsuite/gcc.target/powerpc/pr79251.p8.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr79251.p8.c
> @@ -6,8 +6,6 @@
>  #include 
>  #include "pr79251.h"
>  
> -TEST_VEC_INSERT_ALL (test)

The changelog does not mention this (unless that is what "move
definition" means -- please improve the changelog a bit).

But the patch is fine as far as I can see.  Okay for trunk.  Thanks!


Segher


Re: arm: Adjust cost of vector of constant zero

2021-01-28 Thread Christophe Lyon via Gcc-patches
On Wed, 27 Jan 2021 at 15:03, Kyrylo Tkachov  wrote:
>
>
>
> > -Original Message-
> > From: Christophe Lyon 
> > Sent: 27 January 2021 13:56
> > To: Kyrylo Tkachov 
> > Cc: Kyrylo Tkachov via Gcc-patches 
> > Subject: Re: arm: Adjust cost of vector of constant zero
> >
> > On Wed, 27 Jan 2021 at 14:44, Kyrylo Tkachov 
> > wrote:
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: Christophe Lyon 
> > > > Sent: 27 January 2021 13:12
> > > > To: Kyrylo Tkachov 
> > > > Cc: Kyrylo Tkachov via Gcc-patches 
> > > > Subject: Re: arm: Adjust cost of vector of constant zero
> > > >
> > > > On Wed, 27 Jan 2021 at 10:15, Kyrylo Tkachov
> > 
> > > > wrote:
> > > > >
> > > > > Hi Christophe,
> > > > >
> > > > > > -Original Message-
> > > > > > From: Gcc-patches  On Behalf
> > Of
> > > > > > Christophe Lyon via Gcc-patches
> > > > > > Sent: 26 January 2021 18:03
> > > > > > To: gcc Patches 
> > > > > > Subject: arm: Adjust cost of vector of constant zero
> > > > > >
> > > > > > Neon vector comparisons have a dedicated version when comparing
> > with
> > > > > > constant zero: it means its cost is free.
> > > > > >
> > > > > > Adjust the cost in arm_rtx_costs_internal accordingly, for Neon 
> > > > > > only,
> > > > > > since MVE does not support this.
> > > > >
> > > > > I guess the other way to do this would be in the comparison code
> > handling
> > > > in this function where we could check for a const_vector of zeroes and a
> > > > Neon mode and avoid recursing into the operands.
> > > > > That would avoid the extra switch statement in your patch.
> > > > > WDYT?
> > > >
> > > > Do you mean like so:
> > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > > index 4a5f265..542c15e 100644
> > > > --- a/gcc/config/arm/arm.c
> > > > +++ b/gcc/config/arm/arm.c
> > > > @@ -11316,6 +11316,28 @@ arm_rtx_costs_internal (rtx x, enum
> > rtx_code
> > > > code, enum rtx_code outer_code,
> > > > *cost = 0;
> > > > return true;
> > > >   }
> > > > +  /* Neon has special instructions when comparing with 0 (vceq, 
> > > > vcge,
> > > > vcgt,
> > > > + vcle and vclt). */
> > > > +  else if (TARGET_NEON
> > > > +&& TARGET_HARD_FLOAT
> > > > +&& (VALID_NEON_DREG_MODE (mode) ||
> > VALID_NEON_QREG_MODE
> > > > (mode))
> > > > +&& (XEXP (x, 1) == CONST0_RTX (mode)))
> > > > + {
> > > > +   switch (code)
> > > > + {
> > > > + case EQ:
> > > > + case GE:
> > > > + case GT:
> > > > + case LE:
> > > > + case LT:
> > > > +   *cost = 0;
> > > > +   return true;
> > > > +
> > > > + default:
> > > > +   break;
> > > > + }
> > > > + }
> > > > +
> > > >return false;
> > > >
> > > > I'm not sure I can remove the switch, since the other comparisons are
> > > > not supported by Neon anyway.
> > > >
> > >
> > > No, I mean where:
> > > case EQ:
> > > case NE:
> > > case LT:
> > > case LE:
> > > case GT:
> > > case GE:
> > > case LTU:
> > > case LEU:
> > > case GEU:
> > > case GTU:
> > > case ORDERED:
> > > case UNORDERED:
> > > case UNEQ:
> > > case UNLE:
> > > case UNLT:
> > > case UNGE:
> > > case UNGT:
> > > case LTGT:
> > >   if (outer_code == SET)
> > > {
> > >   /* Is it a store-flag operation?  */
> > >   if (REG_P (XEXP (x, 0)) && REGNO (XEXP (x, 0)) == CC_REGNUM
> > >
> > > you reorder the codes that are relevant to NEON, handle them for a vector
> > zero argument (and the right target checks), and fall through to the rest if
> > not.
> > >
> >
> > OK, I didn't find reordering this appealing :-)
> >
> > Like so, then?
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 4a5f265..88e398d 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -11211,11 +11211,23 @@ arm_rtx_costs_internal (rtx x, enum rtx_code
> > code, enum rtx_code outer_code,
> >return true;
> >
> >  case EQ:
> > -case NE:
> > -case LT:
> > -case LE:
> > -case GT:
> >  case GE:
> > +case GT:
> > +case LE:
> > +case LT:
> > +  /* Neon has special instructions when comparing with 0 (vceq, vcge,
> > vcgt,
> > + vcle and vclt). */
> > +  if (TARGET_NEON
> > +   && TARGET_HARD_FLOAT
> > +   && (VALID_NEON_DREG_MODE (mode) || VALID_NEON_QREG_MODE
> > (mode))
> > +   && (XEXP (x, 1) == CONST0_RTX (mode)))
> > + {
> > +   *cost = 0;
> > +   return true;
> > + }
> > +
> > +  /* Fall through.  */
> > +case NE:
> >  case LTU:
> >  case LEU:
> >  case GEU:
> >
>
> I find it much cleaner, but I guess it's subjective
> Ok if it passes bootstrap and testing.
> Thanks,
> Kyrill
>

Pushed after successful bootstrap as 31a0ab9213f780d2fa1da6e4879df214c0f247f9
(r11-6961)

Thanks,

Christophe

> >
> > Thanks,
> >
> > Christophe
> >
> > > Kyrill
> > >
> > > > Thanks,
> > > >
> > > > Christophe
> > > >
> > > >
> > > > > Thanks,
> > > > > Kyrill
> > > 

[PATCH] use memcpy instead of strncpy for dyn_string insertion

2021-01-28 Thread Martin Sebor via Gcc-patches

Calling strncpy in libiberty's dyn_string_insert() with the last
argument equal to the length of the second triggers the warning:

/src/gcc/master/libiberty/dyn-string.c:280:3: warning: ‘strncpy’ output 
truncated before terminating nul copying as many bytes from a string as 
its length [-Wstringop-truncation]

  280 |   strncpy (dest->s + pos, src, length);
  |   ^~~~
/src/gcc/master/libiberty/dyn-string.c:272:16: note: length computed here
  272 |   int length = strlen (src);
  |^~~~

The attached patch avoids the warning by calling memcpy instead.

I'll go ahead and commit the patch as obvious if there are no
concerns/objections.

Martin
libiberty/ChangeLog:

	* dyn-string.c (dyn_string_insert_cstr): Use memcpy instead of strncpy
	to avoid warnings.

diff --git a/libiberty/dyn-string.c b/libiberty/dyn-string.c
index ea711182ca5..8d2456b86c8 100644
--- a/libiberty/dyn-string.c
+++ b/libiberty/dyn-string.c
@@ -277,7 +277,7 @@ dyn_string_insert_cstr (dyn_string_t dest, int pos, const char *src)
   for (i = dest->length; i >= pos; --i)
 dest->s[i + length] = dest->s[i];
   /* Splice in the new stuff.  */
-  strncpy (dest->s + pos, src, length);
+  memcpy (dest->s + pos, src, length);
   /* Compute the new length.  */
   dest->length += length;
   return 1;


[COMMITTED] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-28 Thread Michael Meissner via Gcc-patches
After testing the patch I submitted on November 17th that was approved to make
sure it still works, I commited the patch to the master branch.  Sorry about
the intermediate rewrites.

[PATCH] Map long double built-ins correctly with IEEE 128-bit long double.

The PowerPC has two different 128-bit long double types, one that uses a pair
of doubles to get more mantissa range, and the other using the IEEE 128-bit
754R binary floating point format.  The pair of doubles has been used as the
traditional format, and we are in the process of moving to allow an
implementation to switch to using IEEE 128-bit floating point.  The GLIBC and
LIBSTDC++ libraries have been modified to have functions using the two
different formats in their libraries with different names.

This patch goes through all of the built-in functions that either take long
double arguments or return long double, and changes the name from the
traditional name to the IEEE 128-bit name.  The minimum GLIBC version to
support IEEE 128-bit floating point is 2.32.

The names changed are:

*   l is usually mapped to __ieee128;
*   printf is mapped to __printfieee128; (and)
*   scanf is mapped to __isoc99_scanfieee128.

A few functions have different mappings:

*   dreml   => __remainderieee128;
*   gammal  => __lgammaieee128;
*   gammal_r=> __lgammaieee128_r;
*   lgammal_r   => __lgammaieee128_r;
*   nexttoward  => __nexttoward_to_ieee128;
*   nexttowardf => __nexttowardf_to_ieee128;
*   nexttowardl => __nexttowardl_to_ieee128;
*   pow10l  => __exp10ieee128;
*   scalbl  => __scalbieee128;
*   significandl=> __significandieee128; (and)
*   sincosl => __sincosieee128.

gcc/
2021-01-28  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_mangle_decl_assembler_name): Add
support for mapping built-in function names for long double
built-in functions if long double is IEEE 128-bit.

gcc/testsuite/
2021-01-28  Michael Meissner  

* gcc.target/powerpc/float128-longdouble-math.c: New test.
* gcc.target/powerpc/float128-longdouble-stdio.c: New test.
* gcc.target/powerpc/float128-math.c: Adjust test for new name
being generated.  Add support for running test on power10.  Add
support for running if long double defaults to 64-bits.
---
 gcc/config/rs6000/rs6000.c| 135 --
 .../powerpc/float128-longdouble-math.c| 442 ++
 .../powerpc/float128-longdouble-stdio.c   |  36 ++
 .../gcc.target/powerpc/float128-math.c|  16 +-
 4 files changed, 589 insertions(+), 40 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-math.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-longdouble-stdio.c

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ccfbe7ba9f1..fbaff289a40 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -27338,57 +27338,128 @@ rs6000_globalize_decl_name (FILE * stream, tree decl)
library before you can switch the real*16 type at compile time.
 
We use the TARGET_MANGLE_DECL_ASSEMBLER_NAME hook to change this name.  We
-   only do this if the default is that long double is IBM extended double, and
-   the user asked for IEEE 128-bit.  */
+   only do this transformation if the __float128 type is enabled.  This
+   prevents us from doing the transformation on older 32-bit ports that might
+   have enabled using IEEE 128-bit floating point as the default long double
+   type.  */
 
 static tree
 rs6000_mangle_decl_assembler_name (tree decl, tree id)
 {
-  if (!TARGET_IEEEQUAD_DEFAULT && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
+  if (TARGET_FLOAT128_TYPE && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
   && TREE_CODE (decl) == FUNCTION_DECL
-  && DECL_IS_UNDECLARED_BUILTIN (decl))
+  && DECL_IS_UNDECLARED_BUILTIN (decl)
+  && DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
 {
   size_t len = IDENTIFIER_LENGTH (id);
   const char *name = IDENTIFIER_POINTER (id);
+  char *newname = NULL;
 
-  if (name[len - 1] == 'l')
+  /* See if it is one of the built-in functions with an unusual name.  */
+  switch (DECL_FUNCTION_CODE (decl))
{
- bool uses_ieee128_p = false;
- tree type = TREE_TYPE (decl);
- machine_mode ret_mode = TYPE_MODE (type);
+   case BUILT_IN_DREML:
+ newname = xstrdup ("__remainderieee128");
+ break;
 
- /* See if the function returns a IEEE 128-bit floating point type or
-complex type.  */
- if (ret_mode == TFmode || ret_mode == TCmode)
-   uses_ieee128_p = true;
- else
-   {
- function_args_iterator args_iter;
- tree arg;
+   case BUILT_IN_GAMMAL:
+ newname = xstrdup ("__lgammaieee128");
+ break;
+
+   case 

[PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2021-01-28 Thread Marek Polacek via Gcc-patches
A year ago I submitted this patch:

~~
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.

When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:

  char -> c -> const char -> const char -> d -> const d

Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set.  If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.

So we can make check_base_type stricter.  I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
amount of types with and without the patch, while bootstrapping.
~~

However, the patch broke a few tests on STRICT_ALIGNMENT platforms and
had to be reverted.  This is another try.  The original patch is kept
unchanged, but I added the finalize_type_size hunk that ought to fix the
STRICT_ALIGNMENT issues.

The problem is that finalize_type_size can clear TYPE_USER_ALIGN on the
main variant of a type, but doesn't clear it on any of the variants.
Then we end up with types which share the same TYPE_MAIN_VARIANT, but
their TYPE_CANONICAL differs and then the usual "canonical types differ
for identical types" follows.

I've created alignas19.C to exercise this scenario.  What happens is:
- when parsing the class S we create a type S in xref_tag,
- we see alignas(8) so common_handle_aligned_attribute sets T_U_A in S,
- we parse the member function fn and build_memfn_type creates a copy
  of S to add const; this variant has T_U_A set,
- we finish_struct S which calls layout_class_type -> finish_record_type
  -> finalize_size_type where we reset T_U_A in S (but const S keeps it),
- finish_non_static_data_member for arr calls maybe_dummy_object with
  type = S,
- maybe_dummy_object calls same_type_ignoring_top_level_qualifiers_p
  to check if S and TREE_TYPE (current_class_ref), which is const S,
  are the same,
- same_type_ignoring_top_level_qualifiers_p creates cv-unqualified
  versions of the passed types.  Previously we'd use our main variant
  S when stripping "const S" of const, but since the T_U_A flags don't
  match (check_base_type), we create a new variant S'.  Then we crash in
  comptypes because S and S' have the same TYPE_MAIN_VARIANT but
  different TYPE_CANONICALs.

With my patch we'll clear T_U_A for S's variants too, and then instead
of S' we'll just use S.  Does this seem sane?

Bootstrapped/regtested on
* x86_64-pc-linux-gnu
* powerpc64le-unknown-linux-gnu
* aarch64-linux-gnu
ok for trunk?

gcc/ChangeLog:

PR c++/94775
* stor-layout.c (finalize_type_size): If we reset TYPE_USER_ALIGN in
the main variant, maybe reset it in its variants too.
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.

gcc/testsuite/ChangeLog:

PR c++/94775
* g++.dg/cpp0x/alignas19.C: New test.
* g++.dg/warn/Warray-bounds15.C: New test.
---
 gcc/stor-layout.c   | 16 +++--
 gcc/testsuite/g++.dg/cpp0x/alignas19.C  |  8 +
 gcc/testsuite/g++.dg/warn/Warray-bounds15.C | 40 +
 gcc/tree.c  |  4 ++-
 4 files changed, 64 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alignas19.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds15.C

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 7d6b1b5ce52..784f131ebb8 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -1926,6 +1926,7 @@ finalize_type_size (tree type)
  However, where strict alignment is not required, avoid
  over-aligning structures, since most compilers do not do this
  alignment.  */
+  bool tua_cleared_p = false;
   if (TYPE_MODE (type) != BLKmode
   && TYPE_MODE (type) != VOIDmode
   && (STRICT_ALIGNMENT || !AGGREGATE_TYPE_P (type)))
@@ -1937,7 +1938,9 @@ finalize_type_size (tree type)
   if (mode_align >= TYPE_ALIGN (type))
{
  SET_TYPE_ALIGN (type, mode_align);
- TYPE_USER_ALIGN (type) = 0;
+ /* Remember that we're about to reset this flag.  */
+ tua_cleared_p = TYPE_USER_ALIGN (type);
+ TYPE_USER_ALIGN (type) = false;
}
 }
 
@@ -1991,14 +1994,21 @@ finalize_type_size (tree type)
 
   /* Copy it into all variants.  */
   for (variant = TYPE_MAIN_VARIANT (type);
-  variant != 0;
+  variant != NULL_TREE;
   variant = TYPE_NEXT_VARIANT (variant))
{
  TYPE_SIZE 

[PATCH] c++, v2: Fix -Weffc++ in templates [PR98841]

2021-01-28 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 28, 2021 at 10:04:12AM -0500, Jason Merrill wrote:
> > We emit a bogus warning on the following testcase, suggesting that the
> > operator should return *this even when it does that already.
> > The problem is that normally cp_build_indirect_ref_1 ensures that *this
> > is folded as current_class_ref, but in templates (if return type is
> > non-dependent, otherwise check_return_expr doesn't check it) it didn't
> > go through cp_build_indirect_ref_1, but just built another INDIRECT_REF.
> 
> That seems like a bug in build_x_indirect_ref.

So do you want this instead (if it passes bootstrap/regtest)?

2021-01-28  Jakub Jelinek  

PR c++/98841
* typeck.c (build_x_indirect_ref): For *this, return current_class_ref.

* g++.dg/warn/effc5.C: New test.

--- gcc/cp/typeck.c.jj  2021-01-27 11:48:49.715890458 +0100
+++ gcc/cp/typeck.c 2021-01-28 16:17:18.712755173 +0100
@@ -3326,7 +3326,15 @@ build_x_indirect_ref (location_t loc, tr
 {
   /* Retain the type if we know the operand is a pointer.  */
   if (TREE_TYPE (expr) && INDIRECT_TYPE_P (TREE_TYPE (expr)))
-   return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr);
+   {
+ if (expr == current_class_ptr
+ || (TREE_CODE (expr) == NOP_EXPR
+ && TREE_OPERAND (expr, 0) == current_class_ptr
+ && (same_type_ignoring_top_level_qualifiers_p
+   (TREE_TYPE (expr), TREE_TYPE (current_class_ptr)
+   return current_class_ref;
+ return build_min (INDIRECT_REF, TREE_TYPE (TREE_TYPE (expr)), expr);
+   }
   if (type_dependent_expression_p (expr))
return build_min_nt_loc (loc, INDIRECT_REF, expr);
   expr = build_non_dependent_expr (expr);
--- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-28 16:15:05.820256255 
+0100
+++ gcc/testsuite/g++.dg/warn/effc5.C   2021-01-28 16:15:05.820256255 +0100
@@ -0,0 +1,17 @@
+// PR c++/98841
+// { dg-do compile }
+// { dg-options "-Weffc++" }
+
+struct S {
+  template 
+  S& operator=(const T&) { return *this; } // { dg-bogus "should return a 
reference to" }
+  S& operator=(const S&) { return *this; }
+};
+
+void
+foo ()
+{
+  S s, t;
+  s = 1;
+  s = t;
+}


Jakub



Re: [PATCH] c++: fix string literal member initializer bug [PR90926]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 9:54 AM, Tom Greenslade (thomgree) wrote:

While trying to fix the suggested overload resolution issue I run into another 
bug:

struct A
{
   char str[4];
};

void f(A) {};

int main ()
{
   f({"foo"});
}

Does not compile on GCC: "error: could not convert ‘{"foo"}’ from ‘’ to ‘A’", but works fine on Clang.
Is this a known bug or a new one?


That looks like the same bug you're fixing with this patch.


-Original Message-
From: Jason Merrill 
Sent: 22 January 2021 16:30
To: Tom Greenslade (thomgree) ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] c++: fix string literal member initializer bug [PR90926]

On 12/17/20 5:12 PM, Thomas Greenslade (thomgree) via Gcc-patches wrote:

build_aggr_conv did not correctly handle string literal member initializers.
Extended can_convert_array to handle this case. The additional checks
of compatibility of character types, and whether string literal will
fit, would be quite complicated, so are deferred until the actual conversion 
takes place.


It seems that we need to check the type, though not the length.

[over.ics.list]: "Otherwise, if the parameter type is a character array
125 and the initializer list has a single element that is an appropriately-typed 
string-literal (9.4.2), the implicit conversion sequence is the identity 
conversion."

So this should be unambiguous:

struct A
{
char str[10];
};

struct B
{
char16_t str[10];
};

void f(A);
void f(B);

int
main ()
{
f({"foo"});  // calls A overload

f({u"foo"}); // calls B overload

}

You could factor the type matching code out of digest_init_r and use it here.


Testcase added for this.

Bootstrapped/regtested on x86_64-pc-linux-gnu.

gcc/cp/ChangeLog:

  PR c++/90926
  * call.c (can_convert_array): Extend to handle all valid aggregate
  initializers of an array; including by string literals, not just by
  brace-init-list.
  (build_aggr_conv): Call can_convert_array more often, not just in
  brace-init-list case.
  * g++.dg/cpp1y/nsdmi-aggr12.C: New test.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c index
c2d62e582bf..e4ba31f3f2b 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -887,28 +887,41 @@ strip_standard_conversion (conversion *conv)
 return conv;
   }

-/* Subroutine of build_aggr_conv: check whether CTOR, a braced-init-list,
-   is a valid aggregate initializer for array type ATYPE.  */
+/* Subroutine of build_aggr_conv: check whether FROM is a valid aggregate
+   initializer for array type ATYPE.  */

   static bool
-can_convert_array (tree atype, tree ctor, int flags, tsubst_flags_t
complain)
+can_convert_array (tree atype, tree from, int flags, tsubst_flags_t
+complain)
   {
-  unsigned i;
 tree elttype = TREE_TYPE (atype);
-  for (i = 0; i < CONSTRUCTOR_NELTS (ctor); ++i)
+  unsigned i;
+
+  if (TREE_CODE (from) == CONSTRUCTOR)
   {
-  tree val = CONSTRUCTOR_ELT (ctor, i)->value;
-  bool ok;
-  if (TREE_CODE (elttype) == ARRAY_TYPE
- && TREE_CODE (val) == CONSTRUCTOR)
-   ok = can_convert_array (elttype, val, flags, complain);
-  else
-   ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags,
- complain);
-  if (!ok)
-   return false;
+  for (i = 0; i < CONSTRUCTOR_NELTS (from); ++i)
+   {
+ tree val = CONSTRUCTOR_ELT (from, i)->value;
+ bool ok;
+ if (TREE_CODE (elttype) == ARRAY_TYPE)
+   ok = can_convert_array (elttype, val, flags, complain);
+ else
+   ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags,
+ complain);
+ if (!ok)
+   return false;
+   }
+  return true;
   }
-  return true;
+
+  if (   char_type_p (TYPE_MAIN_VARIANT (elttype))
+  && TREE_CODE (tree_strip_any_location_wrapper (from)) == STRING_CST)
+/* Defer the other necessary checks (compatibility of character types and
+   whether string literal will fit) until the conversion actually takes
+   place.  */
+return true;
+
+  /* No other valid way to aggregate initialize an array.  */  return
+ false;
   }

   /* Helper for build_aggr_conv.  Return true if FIELD is in PSET, or
if @@ -965,8 +978,7 @@ build_aggr_conv (tree type, tree ctor, int flags, 
tsubst_flags_t complain)
tree ftype = TREE_TYPE (idx);
bool ok;

- if (TREE_CODE (ftype) == ARRAY_TYPE
- && TREE_CODE (val) == CONSTRUCTOR)
+ if (TREE_CODE (ftype) == ARRAY_TYPE)
  ok = can_convert_array (ftype, val, flags, complain);
else
  ok = can_convert_arg (ftype, TREE_TYPE (val), val,
flags, @@ -1013,9 +1025,8 @@ build_aggr_conv (tree type, tree ctor, int flags, 
tsubst_flags_t complain)
val = empty_ctor;
  }

-  if (TREE_CODE (ftype) == ARRAY_TYPE
- && TREE_CODE (val) == CONSTRUCTOR)
-   ok = can_convert_array 

Re: [PATCH] c++: Fix -Weffc++ in templates [PR98841]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 3:58 AM, Jakub Jelinek wrote:

Hi!

We emit a bogus warning on the following testcase, suggesting that the
operator should return *this even when it does that already.
The problem is that normally cp_build_indirect_ref_1 ensures that *this
is folded as current_class_ref, but in templates (if return type is
non-dependent, otherwise check_return_expr doesn't check it) it didn't
go through cp_build_indirect_ref_1, but just built another INDIRECT_REF.


That seems like a bug in build_x_indirect_ref.


Which means it then doesn't compare pointer-equal to current_class_ref.

The following patch just checks if it is the same thing like
cp_build_indirect_ref_1 would check if it would be called.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-27  Jakub Jelinek  

PR c++/98841
* typeck.c (check_return_expr): If processing_template_decl, don't
rely on *this expression being always pointer-equal to
current_class_ref.

* g++.dg/warn/effc5.C: New test.

--- gcc/cp/typeck.c.jj  2021-01-25 10:02:28.380126847 +0100
+++ gcc/cp/typeck.c 2021-01-27 11:21:13.117875551 +0100
@@ -10188,6 +10188,19 @@ check_return_expr (tree retval, bool *no
  /* Returning '*this' is obviously OK.  */
  if (retval == current_class_ref)
warn = false;
+ /* In templates, '*this' might not compare pointer equal to
+current_class_ref.  */
+ else if (processing_template_decl
+  && retval
+  && TREE_CODE (retval) == INDIRECT_REF
+  && (TREE_OPERAND (retval, 0) == current_class_ptr
+  || (TREE_CODE (TREE_OPERAND (retval, 0)) == NOP_EXPR
+  && (TREE_OPERAND (TREE_OPERAND (retval, 0), 0)
+  == current_class_ptr)
+  && (same_type_ignoring_top_level_qualifiers_p
+(TREE_TYPE (TREE_OPERAND (retval, 0)),
+ TREE_TYPE (current_class_ptr))
+   warn = false;
  /* If we are calling a function whose return type is the same of
 the current class reference, it is ok.  */
  else if (INDIRECT_REF_P (retval)
--- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-27 11:24:34.451565246 
+0100
+++ gcc/testsuite/g++.dg/warn/effc5.C   2021-01-27 11:23:23.836375556 +0100
@@ -0,0 +1,17 @@
+// PR c++/98841
+// { dg-do compile }
+// { dg-options "-Weffc++" }
+
+struct S {
+  template 
+  S& operator=(const T&) { return *this; } // { dg-bogus "should return a 
reference to" }
+  S& operator=(const S&) { return *this; }
+};
+
+void
+foo ()
+{
+  S s, t;
+  s = 1;
+  s = t;
+}

Jakub





RE: [PATCH] c++: fix string literal member initializer bug [PR90926]

2021-01-28 Thread Tom Greenslade (thomgree) via Gcc-patches
While trying to fix the suggested overload resolution issue I run into another 
bug:

struct A
{
  char str[4];
};

void f(A) {};

int main ()
{
  f({"foo"});
}

Does not compile on GCC: "error: could not convert ‘{"foo"}’ from 
‘’ to ‘A’", but works fine on Clang.
Is this a known bug or a new one?

-Original Message-
From: Jason Merrill  
Sent: 22 January 2021 16:30
To: Tom Greenslade (thomgree) ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] c++: fix string literal member initializer bug [PR90926]

On 12/17/20 5:12 PM, Thomas Greenslade (thomgree) via Gcc-patches wrote:
> build_aggr_conv did not correctly handle string literal member initializers.
> Extended can_convert_array to handle this case. The additional checks 
> of compatibility of character types, and whether string literal will 
> fit, would be quite complicated, so are deferred until the actual conversion 
> takes place.

It seems that we need to check the type, though not the length.

[over.ics.list]: "Otherwise, if the parameter type is a character array
125 and the initializer list has a single element that is an 
appropriately-typed string-literal (9.4.2), the implicit conversion sequence is 
the identity conversion."

So this should be unambiguous:

struct A
{
   char str[10];
};

struct B
{
   char16_t str[10];
};

void f(A);
void f(B);

int
main ()
{
   f({"foo"});  // calls A overload 

   f({u"foo"}); // calls B overload 

}

You could factor the type matching code out of digest_init_r and use it here.

> Testcase added for this.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu.
> 
> gcc/cp/ChangeLog:
> 
>  PR c++/90926
>  * call.c (can_convert_array): Extend to handle all valid aggregate
>  initializers of an array; including by string literals, not just by
>  brace-init-list.
>  (build_aggr_conv): Call can_convert_array more often, not just in
>  brace-init-list case.
>  * g++.dg/cpp1y/nsdmi-aggr12.C: New test.
> 
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 
> c2d62e582bf..e4ba31f3f2b 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -887,28 +887,41 @@ strip_standard_conversion (conversion *conv)
> return conv;
>   }
> 
> -/* Subroutine of build_aggr_conv: check whether CTOR, a braced-init-list,
> -   is a valid aggregate initializer for array type ATYPE.  */
> +/* Subroutine of build_aggr_conv: check whether FROM is a valid aggregate
> +   initializer for array type ATYPE.  */
> 
>   static bool
> -can_convert_array (tree atype, tree ctor, int flags, tsubst_flags_t 
> complain)
> +can_convert_array (tree atype, tree from, int flags, tsubst_flags_t 
> +complain)
>   {
> -  unsigned i;
> tree elttype = TREE_TYPE (atype);
> -  for (i = 0; i < CONSTRUCTOR_NELTS (ctor); ++i)
> +  unsigned i;
> +
> +  if (TREE_CODE (from) == CONSTRUCTOR)
>   {
> -  tree val = CONSTRUCTOR_ELT (ctor, i)->value;
> -  bool ok;
> -  if (TREE_CODE (elttype) == ARRAY_TYPE
> - && TREE_CODE (val) == CONSTRUCTOR)
> -   ok = can_convert_array (elttype, val, flags, complain);
> -  else
> -   ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags,
> - complain);
> -  if (!ok)
> -   return false;
> +  for (i = 0; i < CONSTRUCTOR_NELTS (from); ++i)
> +   {
> + tree val = CONSTRUCTOR_ELT (from, i)->value;
> + bool ok;
> + if (TREE_CODE (elttype) == ARRAY_TYPE)
> +   ok = can_convert_array (elttype, val, flags, complain);
> + else
> +   ok = can_convert_arg (elttype, TREE_TYPE (val), val, flags,
> + complain);
> + if (!ok)
> +   return false;
> +   }
> +  return true;
>   }
> -  return true;
> +
> +  if (   char_type_p (TYPE_MAIN_VARIANT (elttype))
> +  && TREE_CODE (tree_strip_any_location_wrapper (from)) == STRING_CST)
> +/* Defer the other necessary checks (compatibility of character types and
> +   whether string literal will fit) until the conversion actually takes
> +   place.  */
> +return true;
> +
> +  /* No other valid way to aggregate initialize an array.  */  return 
> + false;
>   }
> 
>   /* Helper for build_aggr_conv.  Return true if FIELD is in PSET, or 
> if @@ -965,8 +978,7 @@ build_aggr_conv (tree type, tree ctor, int flags, 
> tsubst_flags_t complain)
>tree ftype = TREE_TYPE (idx);
>bool ok;
> 
> - if (TREE_CODE (ftype) == ARRAY_TYPE
> - && TREE_CODE (val) == CONSTRUCTOR)
> + if (TREE_CODE (ftype) == ARRAY_TYPE)
>  ok = can_convert_array (ftype, val, flags, complain);
>else
>  ok = can_convert_arg (ftype, TREE_TYPE (val), val, 
> flags, @@ -1013,9 +1025,8 @@ build_aggr_conv (tree type, tree ctor, int 
> flags, tsubst_flags_t complain)
>val = empty_ctor;
>  }
> 
> -  if (TREE_CODE (ftype) == ARRAY_TYPE
> - && 

RE: [PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics

2021-01-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 28 January 2021 13:23
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics
> 
> Hi,
> 
> As subject, this patch rewrites [su]mlsl_n Neon intrinsics to use RTL builtins
> rather than inline assembly code, allowing for better scheduling and
> optimization.
> 
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
> 
> Ok for master?
> 

Ok.
Thanks,
Kyrill

> Thanks,
> Jonathan
> 
> ---
> 
> gcc/ChangeLog:
> 
> 2021-01-27  Jonathan Wright  
> 
> * config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_n
> builtin generator macros.
> * config/aarch64/aarch64-simd.md (aarch64_mlsl_n):
> Define.
> * config/aarch64/arm_neon.h (vmlsl_n_s16): Use RTL builtin
> instead of inline asm.
> (vmlsl_n_s32): Likewise.
> (vmlsl_n_u16): Likewise.
> (vmlsl_n_u32): Likewise.



[PATCH] aarch64: Fix gcc.target/aarch64/narrow_high-intrinsics.c testism

2021-01-28 Thread Kyrylo Tkachov via Gcc-patches
Pushing to fix recently-updated assembly generation.

gcc/testsuite/

* gcc.target/aarch64/narrow_high-intrinsics.c: Fix shrn2 scan.


shrn2.patch
Description: shrn2.patch


[PATCH] aarch64: Use RTL builtins for [su]mlsl_n intrinsics

2021-01-28 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch rewrites [su]mlsl_n Neon intrinsics to use RTL builtins
rather than inline assembly code, allowing for better scheduling and
optimization.

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-01-27  Jonathan Wright  

* config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_n
builtin generator macros.
* config/aarch64/aarch64-simd.md (aarch64_mlsl_n):
Define.
* config/aarch64/arm_neon.h (vmlsl_n_s16): Use RTL builtin
instead of inline asm.
(vmlsl_n_s32): Likewise.
(vmlsl_n_u16): Likewise.
(vmlsl_n_u32): Likewise.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 4f8e28dc3c8478ea50aad333b21bd83f4a4b750e..2b582bee9133039b05b4fdbef92766a30caeab20 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -192,6 +192,10 @@
   BUILTIN_VD_BHSI (TERNOP, smlsl, 0, NONE)
   BUILTIN_VD_BHSI (TERNOPU, umlsl, 0, NONE)
 
+  /* Implemented by aarch64_mlsl_n.  */
+  BUILTIN_VD_HSI (TERNOP, smlsl_n, 0, NONE)
+  BUILTIN_VD_HSI (TERNOPU, umlsl_n, 0, NONE)
+
   /* Implemented by aarch64_mlal.  */
   BUILTIN_VD_BHSI (TERNOP, smlal, 0, NONE)
   BUILTIN_VD_BHSI (TERNOPU, umlal, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d78f26be19a16163eb1b8f661c6100ac290e6c6b..f2539cf84e30032ed609c12de7530d3e9be77b60 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1868,6 +1868,21 @@
   [(set_attr "type" "neon_mla__long")]
 )
 
+(define_insn "aarch64_mlsl_n"
+  [(set (match_operand: 0 "register_operand" "=w")
+(minus:
+  (match_operand: 1 "register_operand" "0")
+  (mult:
+(ANY_EXTEND:
+  (vec_duplicate:VD_HSI
+	  (match_operand: 3 "register_operand" "")))
+(ANY_EXTEND:
+  (match_operand:VD_HSI 2 "register_operand" "w")]
+  "TARGET_SIMD"
+  "mlsl\t%0., %2., %3.[0]"
+  [(set_attr "type" "neon_mla__long")]
+)
+
 (define_insn "aarch64_simd_vec_mult_lo_"
  [(set (match_operand: 0 "register_operand" "=w")
(mult: (ANY_EXTEND: (vec_select:
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 004c73d9e0ec4c33e24968d17e4307f858b51263..95c5e36530f1a3b72672f62737ced45704323fff 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -8166,48 +8166,28 @@ __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsl_n_s16 (int32x4_t __a, int16x4_t __b, int16_t __c)
 {
-  int32x4_t __result;
-  __asm__ ("smlsl %0.4s, %2.4h, %3.h[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "x"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_smlsl_nv4hi (__a, __b, __c);
 }
 
 __extension__ extern __inline int64x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsl_n_s32 (int64x2_t __a, int32x2_t __b, int32_t __c)
 {
-  int64x2_t __result;
-  __asm__ ("smlsl %0.2d, %2.2s, %3.s[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_smlsl_nv2si (__a, __b, __c);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsl_n_u16 (uint32x4_t __a, uint16x4_t __b, uint16_t __c)
 {
-  uint32x4_t __result;
-  __asm__ ("umlsl %0.4s, %2.4h, %3.h[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "x"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_umlsl_nv4hi_ (__a, __b, __c);
 }
 
 __extension__ extern __inline uint64x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlsl_n_u32 (uint64x2_t __a, uint32x2_t __b, uint32_t __c)
 {
-  uint64x2_t __result;
-  __asm__ ("umlsl %0.2d, %2.2s, %3.s[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_umlsl_nv2si_ (__a, __b, __c);
 }
 
 __extension__ extern __inline int16x8_t


Re: [PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD, STORE}_OPTIMAL for generic tune [PR target/98172]

2021-01-28 Thread H.J. Lu via Gcc-patches
On Thu, Jan 28, 2021 at 1:21 AM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Jan 28, 2021 at 7:32 AM Hongtao Liu via Gcc-patches
>  wrote:
> >
> > Hi:
> >GCC11 will be the system GCC 2 years from now, and for the
> > processors then, they shouldn't even need to split a 256-bit vector
> > into 2 128-bits vectors.
> >.i.e. Test SPEC2017 with the below 2 options on Zen3/ICL show
> > option B is better than Option A.
> > Option A:
> > -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast
> >
> > Option B:
> > Option A + 
> > -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal"
> >
> >   Bootstrapped and regtested on x86-64_iinux-gnu{-m32,}.
>
> Given the explicit list for unaligned loads it's a no-brainer to change that
> for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL.  Given both
> BDVER and ZNVER1 are listed for X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL
> we should try to benchmark the effect on ZNVER1 - Martin, do we still
> have a znver1 machine around?

They are also turned on for Sandybridge.  I don't believe we should keep it
in GCC 11 to penalize today's CPUs as well as CPUs in 2024.

> Note that with the settings differing in a way to split stores but not to 
> split
> loads, loading a just stored value can cause bad STLF and quite a
> performance hit (since znver1 has 128bit data paths that shouldn't
> be an issue there but it would have an issue for actually aligned data
> on CPUs with 256bit data paths).
>
> Thanks,
> Richard.
>
> >   Ok for trunk?
> >
> >
> >
> >
> > --
> > BR,
> > Hongtao



-- 
H.J.


Re: [PATCH] c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]

2021-01-28 Thread Jason Merrill via Gcc-patches

On 1/28/21 3:52 AM, Jakub Jelinek wrote:

Hi!

As the testcase shows, for vars appearing in templates, we don't attach
the asm spec string to the pattern decls, nor pass it back to cp_finish_decl
during instantiation.

The following patch does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-01-27  Jakub Jelinek  

PR c++/33661
PR c++/98847
* decl.c (cp_finish_decl): For register vars with asmspec in templates
call set_user_assembler_name and set DECL_HARD_REGISTER.
* pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars,
pass asmspec_tree to cp_finish_decl.

* g++.target/i386/pr98847.C: New test.

--- gcc/cp/decl.c.jj2021-01-07 22:54:34.0 +0100
+++ gcc/cp/decl.c   2021-01-27 10:48:14.399582310 +0100
@@ -7840,6 +7840,12 @@ cp_finish_decl (tree decl, tree init, bo
  retrofit_lang_decl (decl);
  SET_DECL_DEPENDENT_INIT_P (decl, true);
}
+
+  if (VAR_P (decl) && DECL_REGISTER (decl) && asmspec)
+   {
+ set_user_assembler_name (decl, asmspec);
+ DECL_HARD_REGISTER (decl) = 1;
+   }
return;
  }
  
--- gcc/cp/pt.c.jj	2021-01-22 10:07:53.643466723 +0100

+++ gcc/cp/pt.c 2021-01-27 10:59:04.109126313 +0100
@@ -18227,6 +18227,7 @@ tsubst_expr (tree t, tree args, tsubst_f
bool const_init = false;
unsigned int cnt = 0;
tree first = NULL_TREE, ndecl = error_mark_node;
+   tree asmspec_tree = NULL_TREE;
maybe_push_decl (decl);
  
  		if (VAR_P (decl)

@@ -18250,7 +18251,18 @@ tsubst_expr (tree t, tree args, tsubst_f
   now.  */
predeclare_vla (decl);
  
-		cp_finish_decl (decl, init, const_init, NULL_TREE, 0);

+   if (VAR_P (decl) && DECL_HARD_REGISTER (pattern_decl))
+ {
+   tree id = DECL_ASSEMBLER_NAME (pattern_decl);
+   const char *asmspec = IDENTIFIER_POINTER (id);
+   gcc_assert (asmspec[0] == '*');
+   asmspec_tree
+ = build_string (IDENTIFIER_LENGTH (id) - 1,
+ asmspec + 1);
+   TREE_TYPE (asmspec_tree) = char_array_type_node;
+ }
+
+   cp_finish_decl (decl, init, const_init, asmspec_tree, 0);
  
  		if (ndecl != error_mark_node)

  cp_finish_decomp (ndecl, first, cnt);
--- gcc/testsuite/g++.target/i386/pr98847.C.jj  2021-01-27 11:01:55.312161637 
+0100
+++ gcc/testsuite/g++.target/i386/pr98847.C 2021-01-27 11:00:26.601179659 
+0100
@@ -0,0 +1,20 @@
+// PR c++/98847
+// { dg-do run }
+// { dg-options "-O2 -masm=att" }
+
+template 
+int
+foo ()
+{
+  register int edx asm ("edx");
+  asm ("movl $1234, %%edx" : "=r" (edx));
+  return edx;
+}
+
+int
+main ()
+{
+  if (foo<0> () != 1234)
+__builtin_abort ();
+  return 0;
+}

Jakub





RE: aarch64: Use RTL builtins for [su]mlal_n intrinsics

2021-01-28 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 28 January 2021 12:04
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: aarch64: Use RTL builtins for [su]mlal_n intrinsics
> 
> Hi,
> 
> As subject, this patch rewrites [su]mlal_n Neon intrinsics to use RTL builtins
> rather than inline assembly code, allowing for better scheduling and
> optimization.
> 
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/ChangeLog:
> 
> 2021-01-26  Jonathan Wright  
> 
> * config/aarch64/aarch64-simd-builtins.def: Add [su]mlal_n
> builtin generator macros.
> * config/aarch64/aarch64-simd.md (aarch64_mlal_n):
> Define.
> * config/aarch64/arm_neon.h (vmlal_n_s16): Use RTL builtin
> instead of inline asm.
> (vmlal_n_s32): Likewise.
> (vmlal_n_u16): Likewise.
> (vmlal_n_u32): Likewise.



Re: c++: header unit template alias merging [PR 98770]

2021-01-28 Thread Nathan Sidwell

On 1/28/21 7:54 AM, Nathan Sidwell wrote:


Typedefs are streamed by streaming the underlying type, and then
recreating the typedef.  But this breaks checking a duplicate is the
same as the original when it is a template alias -- we end up checking
a template alias (eg __void_t) against the underlying type (void).
And those are not the same template alias.  This stops pretendig that
the underlying type is the typedef for that checking and tells
is_matching_decl 'you have a typedef', so it knows what to do.  (We do
not want to recreate the typedef of the duplicate, because that whole
set of nodes is going to go away.)


d'oh!
PR c++/98770
gcc/cp/
* module.cc (trees_out::decl_value): Swap is_typedef & TYPE_NAME
check order.
(trees_in::decl_value): Do typedef frobbing only when installing
a new typedef, adjust is_matching_decl call.  Swap is_typedef
& TYPE_NAME check.
(trees_in::is_matching_decl): Add is_typedef parm. Adjust variable
names and deal with typedef checking.
gcc/testsuite/
* g++.dg/modules/pr98770_a.C: New.
* g++.dg/modules/pr98770_b.C: New.


--
Nathan Sidwell


c++: header unit template alias merging [PR 98770]

2021-01-28 Thread Nathan Sidwell


Typedefs are streamed by streaming the underlying type, and then
recreating the typedef.  But this breaks checking a duplicate is the
same as the original when it is a template alias -- we end up checking
a template alias (eg __void_t) against the underlying type (void).
And those are not the same template alias.  This stops pretendig that
the underlying type is the typedef for that checking and tells
is_matching_decl 'you have a typedef', so it knows what to do.  (We do
not want to recreate the typedef of the duplicate, because that whole
set of nodes is going to go away.)

PR c++/98770
gcc/cp/
gcc/testsuite/
* g++.dg/modules/pr98770_a.C: New.
* g++.dg/modules/pr98770_b.C: New.


--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 18f5de8724b..daf75b16007 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -3029,7 +3029,7 @@ public:
   bool read_definition (tree decl);
   
 private:
-  bool is_matching_decl (tree existing, tree decl);
+  bool is_matching_decl (tree existing, tree decl, bool is_typedef);
   static bool install_implicit_member (tree decl);
   bool read_function_def (tree decl, tree maybe_template);
   bool read_var_def (tree decl, tree maybe_template);
@@ -7864,8 +7864,8 @@ trees_out::decl_value (tree decl, depset *dep)
 			 || !dep == (VAR_OR_FUNCTION_DECL_P (inner)
  && DECL_LOCAL_DECL_P (inner)));
   else if ((TREE_CODE (inner) == TYPE_DECL
-	&& TYPE_NAME (TREE_TYPE (inner)) == inner
-	&& !is_typedef)
+	&& !is_typedef
+	&& TYPE_NAME (TREE_TYPE (inner)) == inner)
 	   || TREE_CODE (inner) == FUNCTION_DECL)
 {
   bool write_defn = !dep && has_definition (decl);
@@ -8088,12 +8088,6 @@ trees_in::decl_value ()
 		 && TREE_CODE (inner) == TYPE_DECL
 		 && DECL_ORIGINAL_TYPE (inner)
 		 && !TREE_TYPE (inner));
-  if (is_typedef)
-{
-  /* Frob it to be ready for cloning.  */
-  TREE_TYPE (inner) = DECL_ORIGINAL_TYPE (inner);
-  DECL_ORIGINAL_TYPE (inner) = NULL_TREE;
-}
 
   existing = back_refs[~tag];
   bool installed = install_entity (existing);
@@ -8156,7 +8150,12 @@ trees_in::decl_value ()
 	}
 
   if (is_typedef)
-	set_underlying_type (inner);
+	{
+	  /* Frob it to be ready for cloning.  */
+	  TREE_TYPE (inner) = DECL_ORIGINAL_TYPE (inner);
+	  DECL_ORIGINAL_TYPE (inner) = NULL_TREE;
+	  set_underlying_type (inner);
+	}
 
   if (inner_tag)
 	/* Set the TEMPLATE_DECL's type.  */
@@ -8218,7 +8217,7 @@ trees_in::decl_value ()
 	/* Set the TEMPLATE_DECL's type.  */
 	TREE_TYPE (decl) = TREE_TYPE (inner);
 
-  if (!is_matching_decl (existing, decl))
+  if (!is_matching_decl (existing, decl, is_typedef))
 	unmatched_duplicate (existing);
 
   /* And our result is the existing node.  */
@@ -8257,8 +8256,8 @@ trees_in::decl_value ()
   if (inner
   && !NAMESPACE_SCOPE_P (inner)
   && ((TREE_CODE (inner) == TYPE_DECL
-	   && TYPE_NAME (TREE_TYPE (inner)) == inner
-	   && !is_typedef)
+	   && !is_typedef
+	   && TYPE_NAME (TREE_TYPE (inner)) == inner)
 	  || TREE_CODE (inner) == FUNCTION_DECL)
   && u ())
 read_definition (decl);
@@ -11088,7 +11087,7 @@ trees_in::binfo_mergeable (tree *type)
decls_match because it can cause instantiations of constraints.  */
 
 bool
-trees_in::is_matching_decl (tree existing, tree decl)
+trees_in::is_matching_decl (tree existing, tree decl, bool is_typedef)
 {
   // FIXME: We should probably do some duplicate decl-like stuff here
   // (beware, default parms should be the same?)  Can we just call
@@ -11099,35 +11098,36 @@ trees_in::is_matching_decl (tree existing, tree decl)
   // can elide some of the checking
   gcc_checking_assert (TREE_CODE (existing) == TREE_CODE (decl));
 
-  tree inner = decl;
+  tree d_inner = decl;
+  tree e_inner = existing;
   if (TREE_CODE (decl) == TEMPLATE_DECL)
 {
-  inner = DECL_TEMPLATE_RESULT (decl);
-  gcc_checking_assert (TREE_CODE (DECL_TEMPLATE_RESULT (existing))
-			   == TREE_CODE (inner));
+  d_inner = DECL_TEMPLATE_RESULT (d_inner);
+  e_inner = DECL_TEMPLATE_RESULT (e_inner);
+  gcc_checking_assert (TREE_CODE (e_inner) == TREE_CODE (d_inner));
 }
 
   gcc_checking_assert (!map_context_from);
   /* This mapping requres the new decl on the lhs and the existing
  entity on the rhs of the comparitors below.  */
-  map_context_from = inner;
-  map_context_to = STRIP_TEMPLATE (existing);
+  map_context_from = d_inner;
+  map_context_to = e_inner;
 
-  if (TREE_CODE (inner) == FUNCTION_DECL)
+  if (TREE_CODE (d_inner) == FUNCTION_DECL)
 {
   tree e_ret = fndecl_declared_return_type (existing);
   tree d_ret = fndecl_declared_return_type (decl);
 
-  if (decl != inner && DECL_NAME (inner) == fun_identifier
-	  && LAMBDA_TYPE_P (DECL_CONTEXT (inner)))
+  if (decl != d_inner && DECL_NAME (d_inner) == fun_identifier
+	  && LAMBDA_TYPE_P (DECL_CONTEXT (d_inner)))
 	/* This has a recursive type that will compare 

aarch64: Use RTL builtins for [su]mlal_n intrinsics

2021-01-28 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch rewrites [su]mlal_n Neon intrinsics to use RTL builtins
rather than inline assembly code, allowing for better scheduling and
optimization.

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-01-26  Jonathan Wright  

* config/aarch64/aarch64-simd-builtins.def: Add [su]mlal_n
builtin generator macros.
* config/aarch64/aarch64-simd.md (aarch64_mlal_n):
Define.
* config/aarch64/arm_neon.h (vmlal_n_s16): Use RTL builtin
instead of inline asm.
(vmlal_n_s32): Likewise.
(vmlal_n_u16): Likewise.
(vmlal_n_u32): Likewise.
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index a71ae4d724136c8b626d397bf6187e8b595a2b8a..4f8e28dc3c8478ea50aad333b21bd83f4a4b750e 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -196,6 +196,10 @@
   BUILTIN_VD_BHSI (TERNOP, smlal, 0, NONE)
   BUILTIN_VD_BHSI (TERNOPU, umlal, 0, NONE)
 
+  /* Implemented by aarch64_mlal_n.  */
+  BUILTIN_VD_HSI (TERNOP, smlal_n, 0, NONE)
+  BUILTIN_VD_HSI (TERNOPU, umlal_n, 0, NONE)
+
   /* Implemented by aarch64_mlsl_hi.  */
   BUILTIN_VQW (TERNOP, smlsl_hi, 0, NONE)
   BUILTIN_VQW (TERNOPU, umlsl_hi, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index db56b61baf2093c88d8757b25580b3032f00a355..d78f26be19a16163eb1b8f661c6100ac290e6c6b 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1839,6 +1839,21 @@
   [(set_attr "type" "neon_mla__long")]
 )
 
+(define_insn "aarch64_mlal_n"
+  [(set (match_operand: 0 "register_operand" "=w")
+(plus:
+  (mult:
+(ANY_EXTEND:
+  (vec_duplicate:VD_HSI
+	  (match_operand: 3 "register_operand" "")))
+(ANY_EXTEND:
+  (match_operand:VD_HSI 2 "register_operand" "w")))
+  (match_operand: 1 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "mlal\t%0., %2., %3.[0]"
+  [(set_attr "type" "neon_mla__long")]
+)
+
 (define_insn "aarch64_mlsl"
   [(set (match_operand: 0 "register_operand" "=w")
 (minus:
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 674ccc63b69ca1945dc684d2b06c1e31f52bfdb3..004c73d9e0ec4c33e24968d17e4307f858b51263 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -7608,48 +7608,28 @@ __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlal_n_s16 (int32x4_t __a, int16x4_t __b, int16_t __c)
 {
-  int32x4_t __result;
-  __asm__ ("smlal %0.4s,%2.4h,%3.h[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "x"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_smlal_nv4hi (__a, __b, __c);
 }
 
 __extension__ extern __inline int64x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlal_n_s32 (int64x2_t __a, int32x2_t __b, int32_t __c)
 {
-  int64x2_t __result;
-  __asm__ ("smlal %0.2d,%2.2s,%3.s[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_smlal_nv2si (__a, __b, __c);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlal_n_u16 (uint32x4_t __a, uint16x4_t __b, uint16_t __c)
 {
-  uint32x4_t __result;
-  __asm__ ("umlal %0.4s,%2.4h,%3.h[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "x"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_umlal_nv4hi_ (__a, __b, __c);
 }
 
 __extension__ extern __inline uint64x2_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vmlal_n_u32 (uint64x2_t __a, uint32x2_t __b, uint32_t __c)
 {
-  uint64x2_t __result;
-  __asm__ ("umlal %0.2d,%2.2s,%3.s[0]"
-   : "=w"(__result)
-   : "0"(__a), "w"(__b), "w"(__c)
-   : /* No clobbers */);
-  return __result;
+  return __builtin_aarch64_umlal_nv2si_ (__a, __b, __c);
 }
 
 __extension__ extern __inline int16x8_t


[PATCH] aarch64: Reimplement vshrn_high_n* intrinsics using builtins

2021-01-28 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch reimplements the vshrn_high_n* intrinsics that generate the SHRN2 
instruction.
It is a vec_concat of the narrowing shift with the bottom part of the 
destination register,
so we need a little-endian and a big-endian version and an expander to pick 
between them.

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (shrn2): Define builtin.
* config/aarch64/aarch64-simd.md (aarch64_shrn2_insn_le): Define.
(aarch64_shrn2_insn_be): Likewise.
(aarch64_shrn2): Likewise.
* config/aarch64/arm_neon.h (vshrn_high_n_s16): Reimlplement using
builtins.
(vshrn_high_n_s32): Likewise.
(vshrn_high_n_s64): Likewise.
(vshrn_high_n_u16): Likewise.
(vshrn_high_n_u32): Likewise.
(vshrn_high_n_u64): Likewise.


vshrn-hi.patch
Description: vshrn-hi.patch


[PATCH] aarch64: Reimplement vshrn_n* intrinsics using builtins

2021-01-28 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

This patch reimplements the vshrn_n* intrinsics to use RTL builtins.
These perform a narrowing right shift.

Although the intrinsic generates the half-width mode (e.g. V8HI -> V8QI), the 
new pattern
generates a full 128-bit mode (V8HI -> V16QI) by representing the 
fill-with-zeroes semantics
of the SHRN instruction. The narrower (V8QI) result is extracted with a lowpart 
subreg.
I found this allows the RTL optimisers to do a better job at optimising 
redundant moves away
in frequently-occurring SHRN+SRHN2 pairs, like in:
uint8x16_t
foo (uint16x8_t in1, uint16x8_t in2)
{
  uint8x8_t tmp = vshrn_n_u16 (in2, 7);
  uint8x16_t tmp2 = vshrn_high_n_u16 (tmp, in1, 4);
  return tmp2;
}

Bootstrapped and tested on aarch64-none-linux-gnu and aarch64_be-none-elf.
Pushing to trunk.
Thanks,
Kyrill

gcc/ChangeLog:

* config/aarch64/aarch64-simd-builtins.def (shrn): Define builtin.
* config/aarch64/aarch64-simd.md (aarch64_shrn_insn_le): Define.
(aarch64_shrn_insn_be): Likewise.
(aarch64_shrn): Likewise.
* config/aarch64/arm_neon.h (vshrn_n_s16): Reimplement using builtins.
(vshrn_n_s32): Likewise.
(vshrn_n_s64): Likewise.
(vshrn_n_u16): Likewise.
(vshrn_n_u32): Likewise.
(vshrn_n_u64): Likewise.
* config/aarch64/iterators.md (vn_mode): New mode attribute.


vshrn.patch
Description: vshrn.patch


[PATCH] PING lra: clear lra_insn_recog_data after simplifying a mem subreg

2021-01-28 Thread Ilya Leoshkevich via Gcc-patches
Hello,

I would like to ping the following patch:

lra: clear lra_insn_recog_data after simplifying a mem subreg
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563428.html

Best regards,
Ilya



Re: [PATCH PR97627]Avoid computing niters info for fake edges

2021-01-28 Thread Bin.Cheng via Gcc-patches
On Thu, Jan 28, 2021 at 5:08 PM Richard Biener via Gcc-patches
 wrote:
>
> On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches
>  wrote:
> >
> > Hi,
> > As described in commit message, we need to avoid computing niters info for 
> > fake
> > edges.  This simple patch does this by two changes.
> >
> > Bootstrap and test on X86_64, is it ok?
>
> Hmm, so I think the patch is a bit complicated and avoiding niter compute
> for fake edges would be easier when just returning false for
> fake edges in number_of_iterations_exit_assumptions?
I just grepped calls to get_loop_exit_edges, and thought there might
be cases other than niters analysis that would also like to skip fake
edges.  But I didn't check the calls one by one.
>
> Which pass was the problematical that had infinite loops connected to exit?
>
> I guess the cfgloop code should simply ignore fake exits - they mostly
> exist to make reverse CFG walks easy.  Specifically single_exit
> and single_likely_exit but also exit edge recording should ignore them.
>
> That said, the testcase seems to be fixed with just
>
> diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
> index 7d61ef080eb..7775bc7275c 100644
> --- a/gcc/tree-ssa-loop-niter.c
> +++ b/gcc/tree-ssa-loop-niter.c
> @@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class
> loop *loop, edge exit,
>affine_iv iv0, iv1;
>bool safe;
>
> +  /* The condition at a fake exit (if it exists) does not control its
> + execution.  */
> +  if (exit->flags & EDGE_FAKE)
> +return false;
> +
>/* Nothing to analyze if the loop is known to be infinite.  */
>if (loop_constraint_set_p (loop, LOOP_C_INFINITE))
>  return false;
>
> Your dfs_find_deadend change likely "breaks" post-dominance DFS order
> (this is a very fragile area).
>
> So any objection to just simplify the patch to the above hunk?
Considering we are in late stage3? No objection to this change.  But I
do think dfs_find_deadend needs to be improved, if not as this patch
does.  For a loop nest with the outermost loop as the infinite one,
the function adds fake (exit) edges for inner loops, which is
counter-intuitive.

Thanks,
bin
>
> Thanks,
> Richard.
>
> > Thanks,
> > bin


Re: [PATCH] libstdc++: implement locale support for AIX

2021-01-28 Thread CHIGOT, CLEMENT via Gcc-patches
>> * While you now define _GLIBCXX_C_LOCALE_XPG7 in
>>  config/locale/xpg7/c_locale.h, config/os/aix/ctype_configure_char.cc
>>   still tests the previous _GLIBCXX_C_LOCALE_IEEE_2008.
>
> Arf, I've missed that. it might not be the last patch then. 
> (I've made so much versions of it as I've tried to backport it to our 
> old versions. It seems that I've mixed up things...) 

Well, after several tries this morning, I can tell that I still don't 
understand at all how to setup this ctype for char... 
I'm sure it have improved things at a moment. But now, whether 
I'm removing or adding it, nothing changes.. 
Anyway, _GLIBCXX_C_LOCALE_XPG7 is needed for gnu or for the new 
testsuite check. So I'm just going to update aix/ctype_configure_char.c 
locally. I'll see later if I can figure what's wrong. 

Clément
going to update it locally. 

[committed] c++: Some C++20 and C++23 option help fixes

2021-01-28 Thread Jakub Jelinek via Gcc-patches
Hi!

I've noticed we still refer to C++20 as draft standard, and there is a pasto
in C++23 description.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk as obvious.

2021-01-28  Jakub Jelinek  

* c.opt (-std=c++2a, -std=c++20, -std=gnu++2a, -std=gnu++20): Remove
draft from description.
(-std=c++2b): Fix a pasto, 2020 -> 2023.

--- gcc/c-family/c.opt.jj   2021-01-27 10:05:35.231942979 +0100
+++ gcc/c-family/c.opt  2021-01-27 21:29:45.398365022 +0100
@@ -2208,15 +2208,15 @@ Conform to the ISO 2017 C++ standard.
 
 std=c++2a
 C++ ObjC++ Alias(std=c++20) Undocumented
-Conform to the ISO 2020 C++ draft standard (experimental and incomplete 
support).
+Conform to the ISO 2020 C++ standard (experimental and incomplete support).
 
 std=c++20
 C++ ObjC++
-Conform to the ISO 2020 C++ draft standard (experimental and incomplete 
support).
+Conform to the ISO 2020 C++ standard (experimental and incomplete support).
 
 std=c++2b
 C++ ObjC++ Alias(std=c++23)
-Conform to the ISO 2020 C++ draft standard (experimental and incomplete 
support).
+Conform to the ISO 2023 C++ draft standard (experimental and incomplete 
support).
 
 std=c++23
 C++ ObjC++ Undocumented
@@ -2294,11 +2294,11 @@ Conform to the ISO 2017 C++ standard wit
 
 std=gnu++2a
 C++ ObjC++ Alias(std=gnu++20) Undocumented
-Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
+Conform to the ISO 2020 C++ standard with GNU extensions (experimental and 
incomplete support).
 
 std=gnu++20
 C++ ObjC++
-Conform to the ISO 2020 C++ draft standard with GNU extensions (experimental 
and incomplete support).
+Conform to the ISO 2020 C++ standard with GNU extensions (experimental and 
incomplete support).
 
 std=gnu++2b
 C++ ObjC++ Alias(std=gnu++23)


Jakub



Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-01-28 Thread Jakub Jelinek via Gcc-patches
On Thu, Jan 28, 2021 at 09:31:46AM +0100, Richard Biener via Gcc-patches wrote:
> +  if (TREE_CODE (bound) == PLUS_EXPR
> +  && integer_all_onesp (TREE_OPERAND (bound, 1)))
> +{
> +  bound = TREE_OPERAND (bound, 0);
> +  if (TREE_CODE (bound) == NOP_EXPR)
> +   bound = TREE_OPERAND (bound, 0);
> +}
> 
> so it either does or does not strip a -1 but has no indication on
> whether it did that?  That looks fragile and broken.

Yeah.  Plus again STRIP_NOPs in there should be used.  But certainly it
needs to remember whether there was + -1 or not and compensate for it.

> The free-lang-data parts are OK.

But is fld the right spot to do it?
If it is only the FE that emits the warnings, shouldn't it be stripped
already before gimplification so that it isn't actually corrupted?

I mean in c_parse_final_cleanups or c_common_parse_file depending on
if it is just C or C++ too?

Plus, if the expressions in the attribute don't contain SAVE_EXPRs, why
there isn't unshare_expr called on them (and if they do, I don't see how
it would be guaranteed, can't one e.g. do
int bar (void);
void foo (int n, int a[n + 1][bar () + 2], int b[sizeof (a[0]) + 32])
{
}
?) the unsharing variant I've pasted into the PR.

Jakub



Re: [PATCH] [8/9/10/11 Regression] [OOP] PR fortran/86470 - ICE with OpenMP

2021-01-28 Thread Harald Anlauf via Gcc-patches
Hi Thomas,

> > Should the testcase be moved to the gomp/ subdirectory?
> Yes. It's a compile-time test, and it will then only be run
> if the the compiler can do OpenMP.
>
> You will not need the
>
> +! { dg-options "-fopenmp" }
>
> line, then.

Adjusted and committed as r11-6950-g33a7a93218b1393d0135e3c4a9ad9ced87808f5e

> Thanks for the patch!

Thanks,
Harald



Re: Fix LTO bootstrap on Windows (PR lto/85574)

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 9:35 AM Eric Botcazou  wrote:
>
> The last fix made for PR lto/85574 introduced a comparison of executables and
> this cannot directly work on Windows because they are timestamped.  Moreover
> nobody sets $(exeext) at top level, at least on MinGW, so you get a weird
> behavior because some tools add the implicit .exe suffix and others don't.
>
> Bootstrapped in LTO mode on Windows, OK for all active branches?

OK and sorry for the breakage.

Thanks,
Richard.

>
> 2021-01-28  Eric Botcazou  
>
> contrib/
> PR lto/85574
> * compare-lto: Deal with PE-COFF executables specifically.
>
> --
> Eric Botcazou


Re: [PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD, STORE}_OPTIMAL for generic tune [PR target/98172]

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 7:32 AM Hongtao Liu via Gcc-patches
 wrote:
>
> Hi:
>GCC11 will be the system GCC 2 years from now, and for the
> processors then, they shouldn't even need to split a 256-bit vector
> into 2 128-bits vectors.
>.i.e. Test SPEC2017 with the below 2 options on Zen3/ICL show
> option B is better than Option A.
> Option A:
> -march=x86-64 -mtune=generic -mavx2 -mfma -Ofast
>
> Option B:
> Option A + 
> -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal"
>
>   Bootstrapped and regtested on x86-64_iinux-gnu{-m32,}.

Given the explicit list for unaligned loads it's a no-brainer to change that
for X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL.  Given both
BDVER and ZNVER1 are listed for X86_TUNE_AVX256_UNALIGNED_STORE_OPTIMAL
we should try to benchmark the effect on ZNVER1 - Martin, do we still
have a znver1 machine around?

Note that with the settings differing in a way to split stores but not to split
loads, loading a just stored value can cause bad STLF and quite a
performance hit (since znver1 has 128bit data paths that shouldn't
be an issue there but it would have an issue for actually aligned data
on CPUs with 256bit data paths).

Thanks,
Richard.

>   Ok for trunk?
>
>
>
>
> --
> BR,
> Hongtao


Re: [RFC] test builtin ratio for loop distribution

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 6:28 AM Alexandre Oliva  wrote:
>
> On Jan 27, 2021, Richard Biener  wrote:
>
> > That said, rather than not transforming the loop as you do I'd
> > say we want to re-inline small copies more forcefully during
> > loop distribution code-gen so we turn a loop that sets
> > 3 'short int' to zero into a 'int' store and a 'short' store for example
> > (we have more code-size leeway here because we formerly had
> > a loop).
>
> Ok, that makes sense to me, if there's a chance of growing the access
> stride.
>
> > Since you don't add a testcase
>
> Uhh, sorry, I mentioned TFmode emulation calls, but I wasn't explicit I
> meant the soft-fp ones from libgcc.
>
> ./xgcc -B./ -O2 $srcdir/libgcc/soft-fp/fixtfdi.c \
>   -I$srcdir/libgcc/config/riscv -I$srcdir/include \
>   -S -o - | grep memset
>
> > I can't see whether the actual case would be fixed by setting SSA
> > pointer alignment on the memset arguments
>
> The dest pointer alignment is known at the builtin expansion time,
> that's not a problem.  What isn't known (*) is that the length is a
> multiple of the word size: what gets to the expander is that it's
> between 4 and 12 bytes (targeting 32-bit risc-v), but not that it's
> either 4, 8 or 12.  Coming up with an efficient inline expansion becomes
> a bit of a challenge without that extra knowledge.

Ah, yes - that's also useful information.  I guess for memset
an enhanced builtin would then have calloc style
size and nmemb arguments where the destination is expected
to have at least 'size' alignment.  That would allow turning
back the memset into the original loop (but with optimal IVs, etc.).
So kind of the x86 rep mov{bwlq}.  Now think of sth similarly
useful for memcpy/move.

Richard.

>
>
> (*) at least before my related patch for get_nonzero_bits
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564344.html
>
> --
> Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
>Free Software Activist GNU Toolchain Engineer
> Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[PATCH] c++: Fix -Weffc++ in templates [PR98841]

2021-01-28 Thread Jakub Jelinek via Gcc-patches
Hi!

We emit a bogus warning on the following testcase, suggesting that the
operator should return *this even when it does that already.
The problem is that normally cp_build_indirect_ref_1 ensures that *this
is folded as current_class_ref, but in templates (if return type is
non-dependent, otherwise check_return_expr doesn't check it) it didn't
go through cp_build_indirect_ref_1, but just built another INDIRECT_REF.
Which means it then doesn't compare pointer-equal to current_class_ref.

The following patch just checks if it is the same thing like
cp_build_indirect_ref_1 would check if it would be called.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-27  Jakub Jelinek  

PR c++/98841
* typeck.c (check_return_expr): If processing_template_decl, don't
rely on *this expression being always pointer-equal to
current_class_ref.

* g++.dg/warn/effc5.C: New test.

--- gcc/cp/typeck.c.jj  2021-01-25 10:02:28.380126847 +0100
+++ gcc/cp/typeck.c 2021-01-27 11:21:13.117875551 +0100
@@ -10188,6 +10188,19 @@ check_return_expr (tree retval, bool *no
  /* Returning '*this' is obviously OK.  */
  if (retval == current_class_ref)
warn = false;
+ /* In templates, '*this' might not compare pointer equal to
+current_class_ref.  */
+ else if (processing_template_decl
+  && retval
+  && TREE_CODE (retval) == INDIRECT_REF
+  && (TREE_OPERAND (retval, 0) == current_class_ptr
+  || (TREE_CODE (TREE_OPERAND (retval, 0)) == NOP_EXPR
+  && (TREE_OPERAND (TREE_OPERAND (retval, 0), 0)
+  == current_class_ptr)
+  && (same_type_ignoring_top_level_qualifiers_p
+(TREE_TYPE (TREE_OPERAND (retval, 0)),
+ TREE_TYPE (current_class_ptr))
+   warn = false;
  /* If we are calling a function whose return type is the same of
 the current class reference, it is ok.  */
  else if (INDIRECT_REF_P (retval)
--- gcc/testsuite/g++.dg/warn/effc5.C.jj2021-01-27 11:24:34.451565246 
+0100
+++ gcc/testsuite/g++.dg/warn/effc5.C   2021-01-27 11:23:23.836375556 +0100
@@ -0,0 +1,17 @@
+// PR c++/98841
+// { dg-do compile }
+// { dg-options "-Weffc++" }
+
+struct S {
+  template 
+  S& operator=(const T&) { return *this; } // { dg-bogus "should return a 
reference to" }
+  S& operator=(const S&) { return *this; }
+};
+
+void
+foo ()
+{
+  S s, t;
+  s = 1;
+  s = t;
+}

Jakub



[PATCH] c++: Fix up handling of register ... asm ("...") vars in templates [PR33661, PR98847]

2021-01-28 Thread Jakub Jelinek via Gcc-patches
Hi!

As the testcase shows, for vars appearing in templates, we don't attach
the asm spec string to the pattern decls, nor pass it back to cp_finish_decl
during instantiation.

The following patch does that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-01-27  Jakub Jelinek  

PR c++/33661
PR c++/98847
* decl.c (cp_finish_decl): For register vars with asmspec in templates
call set_user_assembler_name and set DECL_HARD_REGISTER.
* pt.c (tsubst_expr): When instantiating DECL_HARD_REGISTER vars,
pass asmspec_tree to cp_finish_decl.

* g++.target/i386/pr98847.C: New test.

--- gcc/cp/decl.c.jj2021-01-07 22:54:34.0 +0100
+++ gcc/cp/decl.c   2021-01-27 10:48:14.399582310 +0100
@@ -7840,6 +7840,12 @@ cp_finish_decl (tree decl, tree init, bo
  retrofit_lang_decl (decl);
  SET_DECL_DEPENDENT_INIT_P (decl, true);
}
+
+  if (VAR_P (decl) && DECL_REGISTER (decl) && asmspec)
+   {
+ set_user_assembler_name (decl, asmspec);
+ DECL_HARD_REGISTER (decl) = 1;
+   }
   return;
 }
 
--- gcc/cp/pt.c.jj  2021-01-22 10:07:53.643466723 +0100
+++ gcc/cp/pt.c 2021-01-27 10:59:04.109126313 +0100
@@ -18227,6 +18227,7 @@ tsubst_expr (tree t, tree args, tsubst_f
bool const_init = false;
unsigned int cnt = 0;
tree first = NULL_TREE, ndecl = error_mark_node;
+   tree asmspec_tree = NULL_TREE;
maybe_push_decl (decl);
 
if (VAR_P (decl)
@@ -18250,7 +18251,18 @@ tsubst_expr (tree t, tree args, tsubst_f
   now.  */
predeclare_vla (decl);
 
-   cp_finish_decl (decl, init, const_init, NULL_TREE, 0);
+   if (VAR_P (decl) && DECL_HARD_REGISTER (pattern_decl))
+ {
+   tree id = DECL_ASSEMBLER_NAME (pattern_decl);
+   const char *asmspec = IDENTIFIER_POINTER (id);
+   gcc_assert (asmspec[0] == '*');
+   asmspec_tree
+ = build_string (IDENTIFIER_LENGTH (id) - 1,
+ asmspec + 1);
+   TREE_TYPE (asmspec_tree) = char_array_type_node;
+ }
+
+   cp_finish_decl (decl, init, const_init, asmspec_tree, 0);
 
if (ndecl != error_mark_node)
  cp_finish_decomp (ndecl, first, cnt);
--- gcc/testsuite/g++.target/i386/pr98847.C.jj  2021-01-27 11:01:55.312161637 
+0100
+++ gcc/testsuite/g++.target/i386/pr98847.C 2021-01-27 11:00:26.601179659 
+0100
@@ -0,0 +1,20 @@
+// PR c++/98847
+// { dg-do run }
+// { dg-options "-O2 -masm=att" }
+
+template 
+int
+foo ()
+{
+  register int edx asm ("edx");
+  asm ("movl $1234, %%edx" : "=r" (edx));
+  return edx;
+}
+
+int
+main ()
+{
+  if (foo<0> () != 1234)
+__builtin_abort ();
+  return 0;
+}

Jakub



Re: [PATCH PR97627]Avoid computing niters info for fake edges

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 3:49 AM bin.cheng via Gcc-patches
 wrote:
>
> Hi,
> As described in commit message, we need to avoid computing niters info for 
> fake
> edges.  This simple patch does this by two changes.
>
> Bootstrap and test on X86_64, is it ok?

Hmm, so I think the patch is a bit complicated and avoiding niter compute
for fake edges would be easier when just returning false for
fake edges in number_of_iterations_exit_assumptions?

Which pass was the problematical that had infinite loops connected to exit?

I guess the cfgloop code should simply ignore fake exits - they mostly
exist to make reverse CFG walks easy.  Specifically single_exit
and single_likely_exit but also exit edge recording should ignore them.

That said, the testcase seems to be fixed with just

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 7d61ef080eb..7775bc7275c 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2407,6 +2407,11 @@ number_of_iterations_exit_assumptions (class
loop *loop, edge exit,
   affine_iv iv0, iv1;
   bool safe;

+  /* The condition at a fake exit (if it exists) does not control its
+ execution.  */
+  if (exit->flags & EDGE_FAKE)
+return false;
+
   /* Nothing to analyze if the loop is known to be infinite.  */
   if (loop_constraint_set_p (loop, LOOP_C_INFINITE))
 return false;

Your dfs_find_deadend change likely "breaks" post-dominance DFS order
(this is a very fragile area).

So any objection to just simplify the patch to the above hunk?

Thanks,
Richard.

> Thanks,
> bin


[PATCH] testsuite: Run vec_insert case on P8 and P9 with option specified

2021-01-28 Thread Xionghu Luo via Gcc-patches
Move common functions to header file for cleanup.

gcc/testsuite/ChangeLog:

2021-01-27  Xionghu Luo  

* gcc.target/powerpc/pr79251.p8.c: Move definition to ...
* gcc.target/powerpc/pr79251.h: ...this.
* gcc.target/powerpc/pr79251.p9.c: Likewise.
* gcc.target/powerpc/pr79251-run.c: Rename to...
* gcc.target/powerpc/pr79251-run.p8.c: ...this.
* gcc.target/powerpc/pr79251-run.p9.c: New test.
---
 .../gcc.target/powerpc/pr79251-run.c  | 30 ---
 .../gcc.target/powerpc/pr79251-run.p8.c   | 14 +
 .../gcc.target/powerpc/pr79251-run.p9.c   | 14 +
 gcc/testsuite/gcc.target/powerpc/pr79251.h| 17 +++
 gcc/testsuite/gcc.target/powerpc/pr79251.p8.c |  2 --
 gcc/testsuite/gcc.target/powerpc/pr79251.p9.c |  2 --
 6 files changed, 45 insertions(+), 34 deletions(-)
 delete mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c

diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.c 
b/gcc/testsuite/gcc.target/powerpc/pr79251-run.c
deleted file mode 100644
index 6afd357c7ba..000
--- a/gcc/testsuite/gcc.target/powerpc/pr79251-run.c
+++ /dev/null
@@ -1,30 +0,0 @@
-/* { dg-do run } */
-/* { dg-require-effective-target vsx_hw } */
-/* { dg-options "-O2 -mvsx" } */
-
-#include 
-#include 
-#include "pr79251.h"
-
-TEST_VEC_INSERT_ALL (test)
-
-#define run_test(TYPE, num)
\
-  {
\
-vector TYPE v; 
\
-vector TYPE u = {0x0}; 
\
-for (long k = 0; k < 16 / sizeof (TYPE); k++)  
\
-  v[k] = 0xaa; 
\
-for (long k = 0; k < 16 / sizeof (TYPE); k++)  
\
-  {
\
-   u = test##num (v, 254, k); \
-   if (u[k] != (TYPE) 254)\
- __builtin_abort ();  \
-  }
\
-  }
-
-int
-main (void)
-{
-  TEST_VEC_INSERT_ALL (run_test)
-  return 0;
-}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c 
b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c
new file mode 100644
index 000..47d4d288f3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p8.c
@@ -0,0 +1,14 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p8vector_hw } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power8" } */
+
+#include 
+#include 
+#include "pr79251.h"
+
+int
+main (void)
+{
+  TEST_VEC_INSERT_ALL (run_test)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c 
b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c
new file mode 100644
index 000..fd56b2356f4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr79251-run.p9.c
@@ -0,0 +1,14 @@
+/* { dg-do run } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-O2 -mvsx -mdejagnu-cpu=power9" } */
+
+#include 
+#include 
+#include "pr79251.h"
+
+int
+main (void)
+{
+  TEST_VEC_INSERT_ALL (run_test)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr79251.h 
b/gcc/testsuite/gcc.target/powerpc/pr79251.h
index addb067f9ed..2684b660966 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr79251.h
+++ b/gcc/testsuite/gcc.target/powerpc/pr79251.h
@@ -17,3 +17,20 @@
   T (unsigned long long, 7)
\
   T (float, 8) 
\
   T (double, 9)
+
+TEST_VEC_INSERT_ALL (test)
+
+#define run_test(TYPE, num)
\
+  {
\
+vector TYPE v; 
\
+vector TYPE u = {0x0}; 
\
+for (long k = 0; k < 16 / sizeof (TYPE); k++)  
\
+  v[k] = 0xaa; 
\
+for (long k = 0; k < 16 / sizeof (TYPE); k++)  
\
+  {
\
+   u = test##num (v, 254, k); \
+   if (u[k] != (TYPE) 254)\
+ __builtin_abort ();  \
+  } 

Re: [PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-01-28 Thread Richard Biener via Gcc-patches
On Thu, Jan 28, 2021 at 12:08 AM Martin Sebor via Gcc-patches
 wrote:
>
> Attached is another attempt to fix the problem caused by allowing
> front-end trees representing nontrivial VLA bound expressions to
> stay in attribute access attached to functions.  Since removing
> these trees seems to be everyone's preference this patch does that
> by extending the free_lang_data pass to look for and zero out these
> trees.
>
> Because free_lang_data only frees anything when LTO is enabled and
> we want these trees cleared regardless to keep them from getting
> clobbered during gimplification, this change also modifies the pass
> to do the clearing even when the pass is otherwise inactive.

  if (TREE_CODE (bound) == NOP_EXPR)
+bound = TREE_OPERAND (bound, 0);
+
+  if (TREE_CODE (bound) == CONVERT_EXPR)
+{
+  tree op0 = TREE_OPERAND (bound, 0);
+  tree bndtyp = TREE_TYPE (bound);
+  tree op0typ = TREE_TYPE (op0);
+  if (TYPE_PRECISION (bndtyp) == TYPE_PRECISION (op0typ))
+   bound = op0;
+}
+
+  if (TREE_CODE (bound) == NON_LVALUE_EXPR)
+bound = TREE_OPERAND (bound, 0);

all of the above can be just

   STRIP_NOPS (bound);

which also handles nesting of the above in any order.

+  if (TREE_CODE (bound) == PLUS_EXPR
+  && integer_all_onesp (TREE_OPERAND (bound, 1)))
+{
+  bound = TREE_OPERAND (bound, 0);
+  if (TREE_CODE (bound) == NOP_EXPR)
+   bound = TREE_OPERAND (bound, 0);
+}

so it either does or does not strip a -1 but has no indication on
whether it did that?  That looks fragile and broken.

Anyway, the split out of this function seems unrelated to the
original problem and should be submitted separately.

+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+   {
+ tree *pvbnd = _VALUE (vblist);
+ if (!*pvbnd || DECL_P (*pvbnd))
+   continue;

so this doesn't let constant bounds prevail but only symbolical ones?  Not
that I care but I'd have expected || CONSTANT_CLASS_P (*pvbnd)

The free-lang-data parts are OK.

Richard.

> Tested on x86_64-linux.
>
> Martin


Fix LTO bootstrap on Windows (PR lto/85574)

2021-01-28 Thread Eric Botcazou
The last fix made for PR lto/85574 introduced a comparison of executables and 
this cannot directly work on Windows because they are timestamped.  Moreover 
nobody sets $(exeext) at top level, at least on MinGW, so you get a weird 
behavior because some tools add the implicit .exe suffix and others don't.

Bootstrapped in LTO mode on Windows, OK for all active branches?


2021-01-28  Eric Botcazou  

contrib/
PR lto/85574
* compare-lto: Deal with PE-COFF executables specifically.

-- 
Eric Botcazoudiff --git a/contrib/compare-lto b/contrib/compare-lto
index 17379e196a7..c0bb71c0765 100755
--- a/contrib/compare-lto
+++ b/contrib/compare-lto
@@ -32,7 +32,7 @@ case $1 in
 esac
 
 if test $# != 2; then
-  echo 'usage: compare-lto file1.o file2.o' >&2
+  echo 'usage: compare-lto file1 file2' >&2
   exit 1
 fi
 
@@ -101,6 +101,25 @@ else
 else
   status=1
 fi
+
+  # PE-COFF executables are timestamped so skip leading bytes for them.
+  else
+case "$1" in
+  *.exe)
+if cmp -i 256 "$1" "$2"; then
+  status=0
+else
+  status=1
+fi
+;;
+  *)
+if test -f "$1.exe" && cmp -i 256 "$1.exe" "$2.exe"; then
+  status=0
+else
+  status=1
+fi
+;;
+esac
   fi
 fi
 


Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-01-28 Thread Richard Biener
On Wed, 27 Jan 2021, Jakub Jelinek wrote:

> On Wed, Jan 27, 2021 at 03:40:38PM +0100, Richard Biener wrote:
> > The following avoids repeatedly turning VALUE RTXen into
> > sth useful and re-applying a constant offset through get_addr
> > via DSE check_mem_read_rtx.  Instead perform this once for
> > all stores to be visited in check_mem_read_rtx.  This avoids
> > allocating 1.6GB of garbage PLUS RTXen on the PR80960
> > testcase, fixing the memory usage regression from old GCC.
> > 
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu, OK?
> > 
> > Thanks,
> > Richard.
> > 
> > 2021-01-27  Richard Biener  
> > 
> > PR rtl-optimization/80960
> > * dse.c (check_mem_read_rtx): Call get_addr on the
> > offsetted address.
> > ---
> >  gcc/dse.c | 5 +
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/gcc/dse.c b/gcc/dse.c
> > index c88587e7d94..da0df54a2dd 100644
> > --- a/gcc/dse.c
> > +++ b/gcc/dse.c
> > @@ -2219,6 +2219,11 @@ check_mem_read_rtx (rtx *loc, bb_info_t bb_info)
> >  }
> >if (maybe_ne (offset, 0))
> >  mem_addr = plus_constant (get_address_mode (mem), mem_addr, offset);
> > +  /* Avoid passing VALUE RTXen as mem_addr to canon_true_dependence
> > + which will over and over re-create proper RTL and re-apply the
> > + offset above.  See PR80960 where we almost allocate 1.6GB of PLUS
> > + RTXen that way.  */
> > +  mem_addr = get_addr (mem_addr);
> >  
> >if (group_id >= 0)
> >  {
> 
> Does that result in any changes on how much does DSE optimize?
> I mean, if you do 2 bootstraps/regtests, one with this patch and another one
> without it, and at the end of rest_of_handle_dse dump
> locally_deleted, globally_deleted
> for each CU/function, do you get the same counts except perhaps for dse.c?

So I see no difference for stage2-gcc/*.o dse1/dse2 with/without the
patch but counts are _extremely_ small.  Statistics:

  70148 dse: local deletions = 0, global deletions = 0
 32 dse: local deletions = 0, global deletions = 1
  9 dse: local deletions = 0, global deletions = 2
  7 dse: local deletions = 0, global deletions = 3
  2 dse: local deletions = 0, global deletions = 4
  2 dse: local deletions = 0, global deletions = 5
  3 dse: local deletions = 0, global deletions = 7
 67 dse: local deletions = 1, global deletions = 0
  1 dse: local deletions = 1, global deletions = 2
 12 dse: local deletions = 2, global deletions = 0
  1 dse: local deletions = 24, global deletions = 1
  2 dse: local deletions = 3, global deletions = 0
  4 dse: local deletions = 4, global deletions = 0
  4 dse: local deletions = 6, global deletions = 0
  1 dse: local deletions = 7, global deletions = 0
  1 dse: local deletions = 8, global deletions = 0

so not sure how much confidence this brings over the analytical
reasoning that it shouldn't make a difference ...

stats on just dse2 are even more depressing (given it's cost)

  35123 dse: local deletions = 0, global deletions = 0
  2 dse: local deletions = 0, global deletions = 1
 20 dse: local deletions = 1, global deletions = 0
  1 dse: local deletions = 2, global deletions = 0
  1 dse: local deletions = 3, global deletions = 0
  1 dse: local deletions = 4, global deletions = 0

Richard.