Re: Avoids std::distance calls

2018-06-04 Thread François Dumont

On 04/06/2018 23:09, Jonathan Wakely wrote:

On 04/06/18 22:13 +0200, François Dumont wrote:

Hi

I'd like to propose this patch to avoid std::distance calls. In a 
number of situations in algos we already have the size of the buffer 
we need, so we shouldn't have to compute it again.


Just one place, in __inplace_merge, no?


For the moment yes but I'll submit another patch soon.

Now committed.
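
To illustrate the idea behind the patch, here is a minimal sketch. It is not the libstdc++ code: copy_with_known_length and copy_recomputing_length are hypothetical helpers that only show why passing an already-known length is cheaper than calling std::distance again on non-random-access iterators.

```cpp
#include <cstddef>
#include <forward_list>
#include <iterator>
#include <vector>

// Hypothetical helper, not the libstdc++ code: copies [first, last) into a
// buffer whose length the caller already knows, so no extra traversal is
// needed just to count the elements.
template <typename ForwardIt>
std::vector<typename std::iterator_traits<ForwardIt>::value_type>
copy_with_known_length(ForwardIt first, ForwardIt last, std::size_t len)
{
  std::vector<typename std::iterator_traits<ForwardIt>::value_type> buf;
  buf.reserve(len);  // uses the length we were handed
  for (; first != last; ++first)
    buf.push_back(*first);
  return buf;
}

// The variant to avoid: std::distance walks the whole range for
// non-random-access iterators before any copying starts.
template <typename ForwardIt>
std::vector<typename std::iterator_traits<ForwardIt>::value_type>
copy_recomputing_length(ForwardIt first, ForwardIt last)
{
  const auto len = static_cast<std::size_t>(std::distance(first, last));
  return copy_with_known_length(first, last, len);
}
```

Both helpers return the same buffer; the first one simply skips a second pass over the range when the caller already has the count.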



[PATCH, rs6000] Fix PR target/63177: Powerpc no-vfa-vect-depend-2.c and no-vfa-vect-depend-3.c failures

2018-06-04 Thread Peter Bergner
PR63177 shows a bug in how we determine which gas options we decide to pass to
the assembler.  Normally, we pass the corresponding -m option to the assembler
if we used the -mcpu= option.  However, if we don't compile with -mcpu=, then
we will check some of the other -m options and pass an appropriate -m option
to the assembler.  This is all fine and good except for when we compile with
-mpower9-vector -mcpu=power8.  The problem here is that POWER9 added new
lxvx/stxvx instructions which already existed in POWER8 as extended mnemonics
of lxvd2x/stxvd2x, which are different instructions and behave differently in
LE mode.  The "bug" is that -mpower9-vector enables the generation of the
POWER9 lxvx instruction, but the -mcpu=power8 option causes us to use the
-mpower8 assembler option so we get the wrong instruction. :-(

The fix used here is to catch the special case when we use -mpower9-vector and
-mcpu=power8 together and then force ourselves to use the -mpower9 gas option.
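
As a rough model of the intended selection logic (a sketch only, covering just the power8 and power9 cases): assembler_cpu_option is a hypothetical function, and the returned strings assume that %(asm_cpu_power8) and %(asm_cpu_power9) expand to -mpower8 and -mpower9.

```cpp
#include <optional>
#include <string>

// Hypothetical model of the ASM_CPU_SPEC lines touched by the patch.
// `mcpu` is the value given to -mcpu= (empty if the option was not used) and
// `power9_vector` says whether -mpower9-vector was passed.
std::optional<std::string>
assembler_cpu_option(const std::optional<std::string> &mcpu, bool power9_vector)
{
  // %{mpower9-vector: %{!mcpu*|mcpu=power8: %(asm_cpu_power9)}}
  if (power9_vector && (!mcpu || *mcpu == "power8"))
    return "-mpower9";
  // %{mcpu=power8: %{!mpower9-vector: %(asm_cpu_power8)}}
  if (mcpu && *mcpu == "power8" && !power9_vector)
    return "-mpower8";
  // %{mcpu=power9: %(asm_cpu_power9)}
  if (mcpu && *mcpu == "power9")
    return "-mpower9";
  return std::nullopt;  // other CPUs are outside this sketch
}
```

With this model, -mpower9-vector -mcpu=power8 yields -mpower9, while plain -mcpu=power8 still yields -mpower8.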

This passed bootstrap and regtesting with no regressions and fixes a little over
150 testsuite failures.  Ok for trunk and the appropriate release branches once
it's baked on trunk for a while?

Peter

PR target/63177
* config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mpower9.
Don't handle -mcpu=power8 if -mpower9-vector is also used.

Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 260913)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -120,7 +120,7 @@
 %{mcpu=power6: %(asm_cpu_power6) -maltivec} \
 %{mcpu=power6x: %(asm_cpu_power6) -maltivec} \
 %{mcpu=power7: %(asm_cpu_power7)} \
-%{mcpu=power8: %(asm_cpu_power8)} \
+%{mcpu=power8: %{!mpower9-vector: %(asm_cpu_power8)}} \
 %{mcpu=power9: %(asm_cpu_power9)} \
 %{mcpu=a2: -ma2} \
 %{mcpu=powerpc: -mppc} \
@@ -169,6 +169,7 @@
 %{maltivec: -maltivec} \
 %{mvsx: -mvsx %{!maltivec: -maltivec} %{!mcpu*: %(asm_cpu_power7)}} \
 %{mpower8-vector|mcrypto|mdirect-move|mhtm: %{!mcpu*: %(asm_cpu_power8)}} \
+%{mpower9-vector: %{!mcpu*|mcpu=power8: %(asm_cpu_power9)}} \
 -many"
 
 #define CPP_DEFAULT_SPEC ""



[PATCH] PR fortran/86045 -- test for P=0 in mod(, 0)

2018-06-04 Thread Steve Kargl
The attached patch re-arranges the code in gfc_simplify_mod().
This allows gfortran to test if the 2nd argument is zero.
Tested on i586-*-freebsd and x86_64-*-freebsd.  OK to commit?
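
The reordering can be pictured with a small stand-alone sketch. This is not gfortran code: fold_mod is a hypothetical function in which an empty optional stands for "not a compile-time constant", and the exception stands in for the compiler diagnostic.

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for a constant-folding routine.  An empty optional
// means the operand is not a compile-time constant.
std::optional<long> fold_mod(std::optional<long> a, std::optional<long> p)
{
  // Look at p alone first, so a zero divisor is reported even when a is
  // not a constant.
  if (!p)
    return std::nullopt;
  if (*p == 0)
    throw std::domain_error("Argument 'P' of MOD shall not be zero");

  // Only now require a to be constant as well.
  if (!a)
    return std::nullopt;

  if (*p == -1)
    return 0;  // avoids overflow for the most negative a

  // C++ % truncates toward zero, the same convention as mpz_tdiv_r.
  return *a % *p;
}
```

Checking the divisor before requiring the first argument to be constant is what lets a zero P be diagnosed in more situations.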

2018-06-04  Steven G. Kargl  

PR fortran/86045
* simplify.c (gfc_simplify_mod): Re-arrange code to test whether
'P' is zero and issue an error if it is.

2018-06-04  Steven G. Kargl  

PR fortran/86045
* gfortran.dg/pr86045.f90: New test.

-- 
Steve
Index: gcc/fortran/simplify.c
===
--- gcc/fortran/simplify.c	(revision 261171)
+++ gcc/fortran/simplify.c	(working copy)
@@ -5473,41 +5473,46 @@ gfc_simplify_mod (gfc_expr *a, gfc_expr *p)
   gfc_expr *result;
   int kind;
 
-  if (a->expr_type != EXPR_CONSTANT || p->expr_type != EXPR_CONSTANT)
+  /* First check p.  */
+  if (p->expr_type != EXPR_CONSTANT)
 return NULL;
 
-  kind = a->ts.kind > p->ts.kind ? a->ts.kind : p->ts.kind;
-  result = gfc_get_constant_expr (a->ts.type, kind, >where);
-
-  switch (a->ts.type)
+  /* p shall not be 0.  */
+  switch (p->ts.type)
 {
   case BT_INTEGER:
 	if (mpz_cmp_ui (p->value.integer, 0) == 0)
 	  {
-	/* Result is processor-dependent.  */
-	gfc_error ("Second argument MOD at %L is zero", >where);
-	gfc_free_expr (result);
+	gfc_error ("Argument %qs of MOD at %L shall not be zero",
+			"P", >where);
 	return _bad_expr;
 	  }
-	mpz_tdiv_r (result->value.integer, a->value.integer, p->value.integer);
 	break;
-
   case BT_REAL:
 	if (mpfr_cmp_ui (p->value.real, 0) == 0)
 	  {
-	/* Result is processor-dependent.  */
-	gfc_error ("Second argument of MOD at %L is zero", >where);
-	gfc_free_expr (result);
+	gfc_error ("Argument %qs of MOD at %L shall not be zero",
+			"P", >where);
 	return _bad_expr;
 	  }
-
-	gfc_set_model_kind (kind);
-	mpfr_fmod (result->value.real, a->value.real, p->value.real,
-		   GFC_RND_MODE);
 	break;
-
   default:
 	gfc_internal_error ("gfc_simplify_mod(): Bad arguments");
+}
+
+  if (a->expr_type != EXPR_CONSTANT)
+return NULL;
+
+  kind = a->ts.kind > p->ts.kind ? a->ts.kind : p->ts.kind;
+  result = gfc_get_constant_expr (a->ts.type, kind, >where);
+
+  if (a->ts.type == BT_INTEGER)
+mpz_tdiv_r (result->value.integer, a->value.integer, p->value.integer);
+  else
+{
+  gfc_set_model_kind (kind);
+  mpfr_fmod (result->value.real, a->value.real, p->value.real,
+		 GFC_RND_MODE);
 }
 
   return range_check (result, "MOD");
Index: gcc/testsuite/gfortran.dg/pr86045.f90
===
--- gcc/testsuite/gfortran.dg/pr86045.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr86045.f90	(working copy)
@@ -0,0 +1,7 @@
+! { dg-do compile }
+program p
+   logical :: a(2) = (mod([2,3],0) == 0) ! { dg-error "shall not be zero" }
+   integer :: b = count(mod([2,3],0) == 0)   ! { dg-error "shall not be zero" }
+   integer :: c = all(mod([2,3],0) == 0) ! { dg-error "shall not be zero" }
+   integer :: d = any(mod([2,3],0) == 0) ! { dg-error "shall not be zero" }
+end


PING [PATCH] relax -Wsizeof-pointer-memaccess for strncpy with size of source (PR 85931)

2018-06-04 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01698.html

On 05/29/2018 08:57 PM, Martin Sebor wrote:

Warning for a strncpy call whose bound is the same as the size
of the source and suggesting to use the size of the source is
less than helpful when both sizes are the same, as in:

  char a[4], b[4];
  strncpy (a, b, sizeof b);

The attached patch suppresses the -Wsizeof-pointer-memaccess
warning for these cases.  To do that even for VLAs (in some
cases), the patch enhances operand_equal_p() to handle
SAVE_EXPR to detect when VLA in sizeof VLA refers to the size
of a variable-length array.
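
A minimal sketch of the case in question, written as C++ so the equal sizes can be checked at compile time. copy_same_size is a hypothetical helper, not part of the patch.

```cpp
#include <cstring>

// Illustrative only: destination and source are arrays of the same size, so
// using sizeof the source as the bound cannot overflow the destination.
// Note that strncpy does not append a terminating NUL when the source fills
// the whole bound.
void copy_same_size(char (&dst)[4], const char (&src)[4])
{
  static_assert(sizeof dst == sizeof src, "sizes must match");
  std::strncpy(dst, src, sizeof src);
}
```

Here sizeof src and sizeof dst are interchangeable, which is why suggesting one over the other adds nothing.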

Is this okay for trunk and GCC 8?

Martin




PING 2 [PATCH] refine -Wstringop-truncation and -Wsizeof-pointer-memaccess for strncat of nonstrings (PR 85602)

2018-06-04 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01671.html

(In IRC last week Franz reported successfully testing the patch.
Thanks again.)

On 05/29/2018 01:21 PM, Martin Sebor wrote:

To make review and testing easier (thank you, Franz), attached
is an updated patch rebased on top of today's trunk.

(Note that the patch intentionally doesn't suppress the warning
for the submitted test case without adding the nonstring attribute.)

On 05/25/2018 02:59 PM, Martin Sebor wrote:

Ping:
https://gcc.gnu.org/ml/gcc-patches/2018-05/msg00869.html

On 05/17/2018 08:01 PM, Martin Sebor wrote:

The -Wstringop-truncation and -Wsizeof-pointer-memaccess warnings
I added and enhanced, respectively, in GCC 8 are arguably overly
strict for source arguments declared with the nonstring attribute.

For example, -Wsizeof-pointer-memaccess triggers for the strncat
call below:

  __attribute__ ((nonstring)) char nonstr[8];
  extern char *d;
  strncat (d, nonstr, sizeof nonstr);

even though it's still a fairly common (if unsafe) idiom from
the early UNIX days (V7 from 1979 to be exact) where strncat
was introduced.  (This use case, modulo the attribute, was
reduced from coreutils.)

Similarly, -Wstringop-truncation warns for some strncat calls that
are actually safe, such as in:

  strcpy (nonstr, "123");
  strncat (d, nonstr, 32);

To help with the adoption of the warnings and the attribute and
avoid unnecessary churn the attached patch relaxes both warnings
to accept code like this without diagnostics.
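
For reference, a small sketch of the idiom with a fixed-size field that need not be NUL-terminated. The record type and append_name are hypothetical names, and the nonstring attribute is omitted to keep the code portable.

```cpp
#include <cstddef>
#include <cstring>

// An 8-byte field that is not necessarily NUL-terminated.  With GCC it could
// carry __attribute__ ((nonstring)); the attribute is left out here.
struct record { char name[8]; };

// Appends the field to dst.  strncat reads at most sizeof r.name characters
// from the source and always writes a terminating NUL, so the caller must
// guarantee strlen (dst) + sizeof r.name + 1 bytes of room in dst.
void append_name(char *dst, const record &r)
{
  std::strncat(dst, r.name, sizeof r.name);
}
```

The bound here limits how much is read from the source, not how much room is left in the destination, which is the unsafe part of the idiom.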

The patch doesn't add any new warnings so I'd like it considered
for GCC 8 in addition to trunk.

Thanks
Martin








PING 2 [PATCH] allow more strncat calls with -Wstringop-truncation (PR 85700)

2018-06-04 Thread Martin Sebor

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01189.html

On 05/29/2018 10:19 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01189.html

On 05/22/2018 07:40 PM, Martin Sebor wrote:

Here's another small refinement to -Wstringop-truncation to
avoid diagnosing more arguably "safe" cases of strncat() that
match the expected pattern of

  strncat (d, s, sizeof d - strlen (d) - 1);

such as

  extern char a[4];
  strncat (a, "12", sizeof a - strlen (a) - 1);

Since the bound is derived from the length of the destination
as GCC documents is the expected use, the call should probably
not be diagnosed even though truncation is possible.

The trouble with strncat is that it specifies a single count
that can be (and has been) used to specify either the remaining
space in the destination or the maximum number of characters
to append, but not both.  It's nearly impossible to tell for
certain which the author meant, and if it's safe, hence all
this fine-tuning.  I suspect this isn't the last tweak, either.
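
The "remaining space" reading of the bound can be sketched as follows. append_bounded is a hypothetical helper and assumes dst already holds a NUL-terminated string.

```cpp
#include <cstddef>
#include <cstring>

// Appends src to the array dst, truncating if necessary.  The bound is the
// space left in dst, minus one for the NUL that strncat always appends.
template <std::size_t N>
void append_bounded(char (&dst)[N], const char *src)
{
  std::strncat(dst, src, sizeof dst - std::strlen(dst) - 1);
}
```

With a 4-byte destination that already holds "1", appending "2345" leaves "123": the call is memory-safe even though the source is truncated.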

In any event, I'd like to commit the patch to both trunk and
gcc-8-branch.  The bug isn't marked regression but I suppose
it could (should) well be considered one.

Martin






[PATCH] RISC-V: Don't clobber retval when __builtin_eh_return called.

2018-06-04 Thread Jim Wilson
This fixes a problem reported on the RISC-V foundation sw-dev mailing list.  In
a function that calls __builtin_eh_return, such as _Unwind_RaiseException, the
return value gets clobbered when we restore the EH_RETURN_DATA_REGNO registers.
The RISC-V port is using arg registers for EH_RETURN_DATA_REGNO that are also
used for the return value.  The implementation was copied from the MIPS port,
but the MIPS port has separate registers for return values and arguments.  This
causes return values to be clobbered for RISC-V code when __builtin_eh_return
is called.

The solution is pulled from the x86 port which also uses return value registers
for EH_RETURN_DATA_REGNO.  A function that calls __builtin_eh_return now has
two epilogues.  On the normal return path, we don't restore the eh data regs
as we don't need them, which avoids clobbering the return value.  On the EH
return path, we do restore the eh data regs as we need them, and we don't need
the return value.
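
The per-register decision can be modeled in isolation. This is a sketch with a hypothetical should_handle_reg function; the parameter names mirror the patch, and the register numbers a caller passes in are purely illustrative.

```cpp
#include <vector>

// Hypothetical model of the per-register decision; not the GCC code.
bool should_handle_reg(unsigned regno, bool epilogue, bool maybe_eh_return,
                       bool calls_eh_return,
                       const std::vector<unsigned> &eh_data_regs)
{
  // Normal return in a function that calls __builtin_eh_return: skip
  // restoring the EH data registers so the return value survives.
  if (epilogue && !maybe_eh_return && calls_eh_return)
    for (unsigned r : eh_data_regs)
      if (r == regno)
        return false;
  // Prologue saves and EH-return restores handle every saved register.
  return true;
}
```

Only the combination "epilogue, normal return, function calls eh_return" skips a register; every other path behaves as before.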

Unfortunately, I don't have a testcase for this; it was found while porting
the D compiler.  But the problem can be seen by disassembling
_Unwind_RaiseException.  The code is obviously wrong without the patch, and
looks correct with the patch.

This was tested cross for riscv32-elf and riscv64-linux, and native for
riscv64-linux.  There were no regressions.

Committed.

Jim

gcc/
* config/riscv/riscv-protos.h (riscv_expand_epilogue): Change bool arg
to int.
* config/riscv/riscv.c (riscv_for_each_saved_reg): New args epilogue
and maybe_eh_return.  Change regno to unsigned int.  Use new args to
handle EH_RETURN_DATA_REGNO registers properly.
(riscv_expand_prologue): Pass new args to riscv_for_each_saved_reg.
(riscv_expand_epilogue): Update comment.  Change argument name and
type.  Update code to use new name and type.  Pass new args to
riscv_for_each_saved_reg.  Only use EH_RETURN_STACKADJ_RTX when
EXCEPTION_RETURN.
* config/riscv/riscv.md (NORMAL_RETURN): New.
(SIBCALL_RETURN, EXCEPTION_RETURN): New.
(epilogue, sibcall_epilogue): Update riscv_expand_epilogue arg.
(eh_return): Call gen_eh_return_internal and emit barrier.
(eh_return_internal): Call riscv_expand_epilogue.
---
 gcc/config/riscv/riscv-protos.h |  2 +-
 gcc/config/riscv/riscv.c| 52 -
 gcc/config/riscv/riscv.md   | 19 ++--
 3 files changed, 56 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index a194b192a2b..f158ed007dd 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -66,7 +66,7 @@ extern bool riscv_expand_block_move (rtx, rtx, rtx);
 extern rtx riscv_return_addr (int, rtx);
 extern HOST_WIDE_INT riscv_initial_elimination_offset (int, int);
 extern void riscv_expand_prologue (void);
-extern void riscv_expand_epilogue (bool);
+extern void riscv_expand_epilogue (int);
 extern bool riscv_epilogue_uses (unsigned int);
 extern bool riscv_can_use_return_insn (void);
 extern rtx riscv_function_value (const_tree, const_tree, enum machine_mode);
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 6e389fa0102..c418dc1ec2e 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3502,23 +3502,45 @@ riscv_save_restore_reg (machine_mode mode, int regno,
of the frame.  */
 
 static void
-riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, riscv_save_restore_fn fn)
+riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, riscv_save_restore_fn fn,
+ bool epilogue, bool maybe_eh_return)
 {
   HOST_WIDE_INT offset;
 
   /* Save the link register and s-registers. */
   offset = cfun->machine->frame.gp_sp_offset - sp_offset;
-  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
-   riscv_save_restore_reg (word_mode, regno, offset, fn);
+   bool handle_reg = TRUE;
+
+   /* If this is a normal return in a function that calls the eh_return
+  builtin, then do not restore the eh return data registers as that
+  would clobber the return value.  But we do still need to save them
+  in the prologue, and restore them for an exception return, so we
+  need special handling here.  */
+   if (epilogue && !maybe_eh_return && crtl->calls_eh_return)
+ {
+   unsigned int i, regnum;
+
+   for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM;
+i++)
+ if (regno == regnum)
+   {
+ handle_reg = FALSE;
+ break;
+   }
+ }
+
+   if (handle_reg)
+ riscv_save_restore_reg (word_mode, regno, offset, fn);
offset -= UNITS_PER_WORD;
   }
 
   /* This 

Re: [PATCH] rtlanal: Fix nonzero_bits for non-load paradoxical subregs (PR85925)

2018-06-04 Thread Eric Botcazou
> I can see why WORD_REGISTER_OPERATIONS allows some REG cases,
> but why does LOAD_EXTEND_OP have an effect on them?

LOAD_EXTEND_OP has an effect on all paradoxical SUBREGs because of spilling.  
This was even originally decoupled from WORD_REGISTER_OPERATIONS in reload, 
see this comment from find_reloads:

 On machines that extend byte operations and we have a
 SUBREG where both the inner and outer modes are no wider
 than a word and the inner mode is narrower, is integral,
 and gets extended when loaded from memory, combine.c has
 made assumptions about the behavior of the machine in such
 register access.  If the data is, in fact, in memory we
 must always load using the size assumed to be in the
 register and let the insn do the different-sized
 accesses.

 This is doubly true if WORD_REGISTER_OPERATIONS.  In
 this case eliminate_regs has left non-paradoxical
 subregs for push_reload to see.  Make sure it does
 by forcing the reload.

> Ah well.  I guess I'm just glad that AArch64 doesn't define this :-)

Nothing even remotely approaching the complexity of the SVE stuff... ;-)

-- 
Eric Botcazou


Re: [Patch, Fortran] PR25829: Asynchronous I/O

2018-06-04 Thread JonY
On 06/03/2018 06:59 PM, Nicolas Koenig wrote:
> Hello everyone,
> 
> this patch adds asynchronous I/O support. Thomas and I finally finished
> a feature-complete and debugged version, so here it is. In order to use
> asynchronous I/O, it is still necessary to link against libpthread,
> libgomp or another library linked against any of the aforementioned two.
> While it might not be the nicest way, it at least keeps in line with the
> likes of ifort. Two of the tests I sent deal with asynchronous error
> handling, so they will fail if not linked accordingly.
> 
> Since the implementation relies on pthreads, it would be great if
> somebody could try the patch on non-linux targets, to see whether it
> causes any problems there.
> 
> Let the rain of regressions begin ;)
> 
> Nicolas
> 
> P.S.: I would very much recommend removing the #undef DEBUG in async.h.
> I have to admit, I am quite proud of the debug printouts. They even
> build a data structure in the background telling you where a locked mutex
> was locked.

I'm not too familiar with Fortran, but I'll test it out over the
weekends with the asynchronous_9.f90 program.






Re: [PATCH], Add weak references to bridge old/new PowerPC ieee 128-bit name mangling

2018-06-04 Thread Michael Meissner
On Mon, Jun 04, 2018 at 05:19:07PM -0500, Segher Boessenkool wrote:
> You can still have overloads, just not two between the same type.  So you
> can have both __ibm128 and __ieee128.

Yes of course, but the test is explicitly testing whether you can have overloads
with the same type.

> > The test in question was new and written when I did the previous changes.  It
> > is not a long standing problem.  The test was explicitly written to make sure
> > all three types were different.  Since we now have only two types, we need to
> > adjust the test.
> 
> The test currently ICEs.  That needs to be fixed.

Well that is a C++ front end bug.

> > > > +/* Generate old manged name, not new name.  */
> > > > +static bool old_mangling;
> > > 
> > > As Andreas said, this is not a good name.  Please at least mention for
> > > what type this old mangling is.
> > 
> > I will consider this.
> 
> Thanks.  Naming is hard :-(

My current name is: ieee128_mangling_gcc_8_1

> > > I wonder how hard it would be to alias the long double type to either
> > > __ibm128 or __ieee128, instead of the other way around?  This would
> > > simplify code a lot.  Esp. if we can do the same thing for the modes,
> > > too.
> > 
> > It depends on whether other parts of the compiler already have links to long
> > double before the hook in rs6000.c gets called.  I frankly don't see it as
> > simplifying the code.
> 
> Not even if we can also get rid of TFmode as separate mode?

We have to keep TFmode for the non float128 ports, so I don't see it as buying
us much of anything.

But also it is a matter of how much you want to destabilize things.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [Patch, Fortran] PR25829: Asynchronous I/O (v2)

2018-06-04 Thread Nicolas Koenig

Hi Dominique and Rainer,

First of all thanks for testing!


Hi Dominique, Nicolas,


I have applied your patch on top of revision r261130 on
x86_64-apple-darwin17 (SSD with APFS file system).


I've tried it on i386-pc-solaris2.11 and sparc-sun-solaris2.11.


I also see two regressions

FAIL: gfortran.dg/f2003_inquire_1.f03   -O1  execution test

only with -m32 and -O1 (STOP 5), and


It fails for me at -O[0s] (i386) resp. -O[01] (sparc), 64-bit only.


This seems to be a bug in the test suite. It tries to find out whether 
an id is pending that is never initialized.





FAIL: gfortran.dg/f2003_io_1.f03   -O*

with both -m32 and -m64 (STOP 1).


Same here: FAILs at -O[0-3s] for both 32 and 64-bit.


And another bug in the test suite. This time the wait after the read is 
missing.





There are also typos in the added tests

s/libgfomp/libgomp/


Will fix.



Why do the tests start at asynchronous_6.f90?


Because they were originally intended for the gfortran test suite, but I 
couldn't run it there because of libpthread. I will change the numbering 
scheme.




... and asynchronous_9.f90 is missing from the ChangeLog, while
..._7.f90 is missing from the sequence.



asynchronous_7.f90 is a test for an error, but dg-shouldfail is not 
working in libgomp. Dominique is looking into this.



Besides, I see

+FAIL: libgomp.fortran/asynchronous_6.f90   -O1  execution test

STOP 2

32-bit i386 only.



I have trouble replicating this bug even with -m32. Could you get some 
more debugging info for the test on your machine?



+FAIL: libgomp.fortran/asynchronous_9.f90   -O  execution test

32 and 64-bit i386 and sparc, no error message.



This file wasn't supposed to be a test case, that's why it is not in the 
ChangeLog. It is a benchmark program, so it takes some time. Maybe a 
time out? Could you maybe try running it outside the test suite?



Rainer



Dominique wrote:
> "Treat asynchronous variables the same as volatile, for now." could probably be simplified as

> "Treat asynchronous variables as volatile, for now."

Will do.

>
> I also wonder if
>
> +wrap_scalar_transfer (dtp, BT_INTEGER, p, kind, kind, 1);
>
> is correct without a cast to size_t for the last two arguments (and for the
> last argument in other instances). Note that I am C challenged, so forgive
> the question if it is stupid.


The arguments are converted automatically based on the type information in the
prototype in io.h.
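
A small sketch of that conversion rule. bytes_to_transfer and transfer_scalar are hypothetical stand-ins, not the libgfortran functions; the only point is that a visible prototype with size_t parameters converts int arguments at the call.

```cpp
#include <cstddef>

// Hypothetical callee whose prototype declares size_t parameters, standing
// in for a declaration such as the one in io.h.
std::size_t bytes_to_transfer(std::size_t size, std::size_t nelems)
{
  return size * nelems;
}

// The int arguments are converted to size_t at the call because the
// prototype is visible; no explicit cast is required.
std::size_t transfer_scalar(int kind)
{
  return bytes_to_transfer(kind, 1);
}
```

The conversion is value-preserving only for non-negative arguments; a negative int would wrap to a large size_t.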


>
> Thanks for the nice work.

With pleasure! :)

>
> Dominique


Re: [PATCH], Add weak references to bridge old/new PowerPC ieee 128-bit name mangling

2018-06-04 Thread Segher Boessenkool
On Mon, Jun 04, 2018 at 06:05:59PM -0400, Michael Meissner wrote:
> > > This
> > > allows templates to work again with those types (the template code aborts if
> > > you have two distinct types use the same mangling).  However, overloaded types
> > > won't work, but I suspect these days people use templates over overloaded
> > > functions.  I fixed up the test case for this (pr85657.C) so that it only tests
> > > for templates.
> > 
> > If tests fail, do not delete the test.  If we have a problem, we have
> > a problem, and it needs to be fixed (not necessarily today of course,
> > and there is xfail for long-standing problems).
> 
> This is a fundamental detail of the current changes.  There is no way it can be
> 'fixed'.
> 
> If __float128/__ibm128 each use the long double type internally when long
> double uses that representation, then you cannot have overloaded functions that
> use the two types.
> 
> I.e.
> 
>   class foo {
>   // ...
>   long double arith (long double);
>   __float128 arith (__float128);
>   __ibm128 arith (__ibm128);
>   }
> 
> In the previous changes that the test was written for, we had 3 types within
> the compiler.  We had a __float128 type, we had a long double type, and we had
> an __ibm128 type.  We had different manglings for each of the different types.
> 
> Now that we only have two types, you can't have explicit overloading of the
> same type.  You can have templates, because only one version of the template is
> created for the two types.

You can still have overloads, just not two between the same type.  So you
can have both __ibm128 and __ieee128.

> The test in question was new and written when I did the previous changes.  It
> is not a long standing problem.  The test was explicitly written to make sure
> all three types were different.  Since we now have only two types, we need to
> adjust the test.

The test currently ICEs.  That needs to be fixed.

> > > +/* Generate old manged name, not new name.  */
> > > +static bool old_mangling;
> > 
> > As Andreas said, this is not a good name.  Please at least mention for
> > what type this old mangling is.
> 
> I will consider this.

Thanks.  Naming is hard :-(

> > I wonder how hard it would be to alias the long double type to either
> > __ibm128 or __ieee128, instead of the other way around?  This would
> > simplify code a lot.  Esp. if we can do the same thing for the modes,
> > too.
> 
> It depends on whether other parts of the compiler already have links to long
> double before the hook in rs6000.c gets called.  I frankly don't see it as
> simplifying the code.

Not even if we can also get rid of TFmode as separate mode?


Segher


Re: [PATCH] rtlanal: Fix nonzero_bits for non-load paradoxical subregs (PR85925)

2018-06-04 Thread Richard Sandiford
Eric Botcazou  writes:
>> In the PR we have insns:
>> 
>> Trying 23 -> 24:
>>23: r123:SI=zero_extend(r122:HI)
>>   REG_DEAD r122:HI
>>24: [r115:SI]=r123:SI
>>   REG_DEAD r123:SI
>> 
>> which should be combined to
>> 
>> (set (mem:SI (reg/f:SI 115 [ pretmp_19 ]) [1 *pretmp_19+0 S4 A32])
>> (and:SI (subreg:SI (reg:HI 122) 0)
>> (const_int 32767 [0x7fff])))
>> 
>> But nonzero_bits of reg:HI 122 is 0x7fff, and nonzero_bits1 thinks it
>> then also has that same nonzero_bits for the subreg.  This is not
>> correct: the bits outside of HImode are undefined.  load_extend_op
>> applies to loads from memory only, not anything else.  Which means the
>> whole AND is optimised away.
>
> No, this is done on purpose for WORD_REGISTER_OPERATIONS targets and your 
> patch will pessimize them.  I'm going to have a look at the PR then.

I can see why WORD_REGISTER_OPERATIONS allows some REG cases,
but why does LOAD_EXTEND_OP have an effect on them?  The doc says:

@defmac WORD_REGISTER_OPERATIONS
Define this macro to 1 if operations between registers with integral mode
smaller than a word are always performed on the entire register.  To be
more explicit, if you start with a pair of @code{word_mode} registers with
known values and you do a subword, for example @code{QImode}, addition on
the low part of the registers, then the compiler may consider that the
result has a known value in @code{word_mode} too if the macro is defined
to 1.  Most RISC machines have this property and most CISC machines do not.
@end defmac

which doesn't make it sound like LOAD_EXTEND_OP has a direct effect
on the (cached) lhs of an arithmetic operation.
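
To make the documented property concrete, here is a toy model of a 32-bit register. It is not GCC code, and add_qi_word_ops and add_qi_partial are hypothetical names; it only contrasts a subword add done on the whole register with one that writes just the low byte.

```cpp
#include <cstdint>

// Toy model: a QImode (8-bit) add on a machine with word register
// operations is carried out on the whole register, so every bit of the
// result is determined by the inputs.
std::uint32_t add_qi_word_ops(std::uint32_t x, std::uint32_t y)
{
  return x + y;
}

// On a machine that only writes the low byte, the upper 24 bits keep
// whatever the destination register held before.
std::uint32_t add_qi_partial(std::uint32_t dest_before, std::uint32_t x,
                             std::uint32_t y)
{
  const std::uint32_t low = (x + y) & 0xffu;
  return (dest_before & ~std::uint32_t{0xff}) | low;
}
```

Both functions agree on the low 8 bits; only in the first case can anything be concluded about the upper bits from the inputs alone.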

Ah well.  I guess I'm just glad that AArch64 doesn't define this :-)

Richard


Re: [PATCH], Add weak references to bridge old/new PowerPC ieee 128-bit name mangling

2018-06-04 Thread Michael Meissner
On Mon, Jun 04, 2018 at 12:46:42PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Fri, Jun 01, 2018 at 07:28:40PM -0400, Michael Meissner wrote:
> > This patch also makes __ibm128 or __float128 use the long double mode if long
> > double uses the IBM extended double or IEEE 128-bit representations.
> 
> Does that need to be the same patch?  Please split such things out when
> you can.

Well it can be split, but it will then have the test failure still.

> > This
> > allows templates to work again with those types (the template code aborts if
> > you have two distinct types use the same mangling).  However, overloaded types
> > won't work, but I suspect these days people use templates over overloaded
> > functions.  I fixed up the test case for this (pr85657.C) so that it only tests
> > for templates.
> 
> If tests fail, do not delete the test.  If we have a problem, we have
> a problem, and it needs to be fixed (not necessarily today of course,
> and there is xfail for long-standing problems).

This is a fundamental detail of the current changes.  There is no way it can be
'fixed'.

If __float128/__ibm128 each use the long double type internally when long
double uses that representation, then you cannot have overloaded functions that
use the two types.

I.e.

class foo {
// ...
long double arith (long double);
__float128 arith (__float128);
__ibm128 arith (__ibm128);
}

In the previous changes that the test was written for, we had 3 types within
the compiler.  We had a __float128 type, we had a long double type, and we had
an __ibm128 type.  We had different manglings for each of the different types.

Now that we only have two types, you can't have explicit overloading of the
same type.  You can have templates, because only one version of the template is
created for the two types.
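
The difference between overloads and templates here can be shown with ordinary C++. In this sketch ibm128_like is a hypothetical alias that plays the role of __ibm128 on a configuration where it is the same type as long double; twice and arith are illustrative names.

```cpp
#include <type_traits>

// Assumption for illustration: this alias stands in for __ibm128 when it is
// made the same type as long double.
using ibm128_like = long double;

// One template covers both spellings, because they name a single type and
// therefore produce a single instantiation.
template <typename T>
T twice(T x) { return x + x; }

static_assert(std::is_same<ibm128_like, long double>::value,
              "alias and long double are one type");

long double arith(long double x) { return twice(x); }
// Adding a second definition
//   ibm128_like arith(ibm128_like x) { ... }
// would be rejected as a redefinition of the function above.
```

twice<long double> and twice<ibm128_like> are the same instantiation, so nothing clashes, whereas two overloads on the one type cannot coexist.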

The test in question was new and written when I did the previous changes.  It
is not a long standing problem.  The test was explicitly written to make sure
all three types were different.  Since we now have only two types, we need to
adjust the test.

> 
> > +/* Generate old manged name, not new name.  */
> > +static bool old_mangling;
> 
> As Andreas said, this is not a good name.  Please at least mention for
> what type this old mangling is.

I will consider this.

> 
> > @@ -16355,14 +16394,21 @@ rs6000_init_builtins (void)
> >   __ieee128.  */
> >if (TARGET_FLOAT128_TYPE)
> >  {
> > -  ibm128_float_type_node = make_node (REAL_TYPE);
> > -  TYPE_PRECISION (ibm128_float_type_node) = 128;
> > -  SET_TYPE_MODE (ibm128_float_type_node, IFmode);
> > -  layout_type (ibm128_float_type_node);
> > +  if (TARGET_IEEEQUAD || !TARGET_LONG_DOUBLE_128)
> > +   {
> > + ibm128_float_type_node = make_node (REAL_TYPE);
> > + TYPE_PRECISION (ibm128_float_type_node) = 128;
> > + SET_TYPE_MODE (ibm128_float_type_node, IFmode);
> > + layout_type (ibm128_float_type_node);
> > +   }
> > +  else
> > +   ibm128_float_type_node = long_double_type_node;
> 
> I wonder how hard it would be to alias the long double type to either
> __ibm128 or __ieee128, instead of the other way around?  This would
> simplify code a lot.  Esp. if we can do the same thing for the modes,
> too.

It depends on whether other parts of the compiler already have links to long
double before the hook in rs6000.c gets called.  I frankly don't see it as
simplifying the code.

> 
> > @@ -32117,7 +32163,7 @@ rs6000_mangle_type (const_tree type)
> >if (SCALAR_FLOAT_TYPE_P (type) && FLOAT128_IBM_P (TYPE_MODE (type)))
> >  return "g";
> >if (SCALAR_FLOAT_TYPE_P (type) && FLOAT128_IEEE_P (TYPE_MODE (type)))
> > -return "u9__ieee128";
> > +return old_mangling ? "U10__float128" : "u9__ieee128";
> 
> > +#if TARGET_ELF && RS6000_WEAK
> > +static void
> > +rs6000_globalize_decl_name (FILE * stream, tree decl)
> > +{
> > +  const char *name = XSTR (XEXP (DECL_RTL (decl), 0), 0);
> > +
> > +  targetm.asm_out.globalize_label (stream, name);
> > +
> > +  if (TARGET_OLD_NAME_MANGLING && rs6000_passes_ieee128
> > +  && name[0] == '_' && name[1] == 'Z')
> > +{
> > +  tree save_asm_name = DECL_ASSEMBLER_NAME (decl);
> > +  const char *old_name;
> > +
> > +  old_mangling = true;
> > +  lang_hooks.set_decl_assembler_name (decl);
> > +  old_name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
> > +  SET_DECL_ASSEMBLER_NAME (decl, save_asm_name);
> > +  old_mangling = false;
> 
> Eww.  Can't we just create old_name directly?  Maybe there already is
> some nice helper for this?

No, there is no way to do this, without having to write a parser for the
mangled names to change the names.  I actually looked at that, and it was
getting to be complex.  In addition, it is fragile, in that if C++ decides to
tweak the mangling ABI (like they did in C++17 if I read the code correctly),
we would have to make 

Re: [PATCH 2/2] [ARC] Update warning messges.

2018-06-04 Thread Andrew Burgess
* Claudiu Zissulescu  [2018-05-21 13:20:29 +0300]:

> From: claziss 
> 
> When we pass an mcpu to the compiler we have two types of (hardware
> configuration) flags that are set:
> 
> 1. Architecture specific, for example code-density is always enabled
> for ARCHS architectures. These options overwrite the
> corresponding user options with the preset ones.
> 
> 2. CPU specific, for example archs is using LL64 option by
> default. These options can be freely enabled or disabled.
> 
> Because of the above complexity, we need to throw some errors for the
> user to know when he/she does something which goes against the above
> rules. Thus, I came up with the following set of rules:
> 
> 1. Overwriting default architecture specific hardware option: it is
> ignored, a warning is thrown;
> 
> 2. Overwriting default CPU specific hardware option: it is taken into
> account, a warning is thrown.
> 
> Ok to apply?
> Claudiu

This seems fine.

Thanks,
Andrew
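
For readers skimming the thread, the two rules quoted above can be sketched as a small decision function. This is a hypothetical model: resolve_option, OptionKind and the warning texts are illustrative and are not taken from the patch.

```cpp
#include <string>
#include <vector>

enum class OptionKind { ArchSpecific, CpuSpecific };

struct Resolution {
  bool value;                         // setting that ends up in effect
  std::vector<std::string> warnings;  // diagnostics issued on the way
};

// Hypothetical model of the two rules; not the ARC back end code.
Resolution resolve_option(OptionKind kind, bool preset, bool user_value)
{
  Resolution r{preset, {}};
  if (user_value == preset)
    return r;                         // nothing is overridden
  if (kind == OptionKind::ArchSpecific) {
    r.warnings.push_back("architecture default kept, user setting ignored");
  } else {
    r.value = user_value;             // CPU defaults may be changed
    r.warnings.push_back("CPU default overridden by user setting");
  }
  return r;
}
```

In both override cases exactly one warning is produced; the only difference is whether the user's value takes effect.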

> 
> gcc/
> 2017-09-26  Claudiu Zissulescu  
> 
>   * config/arc/arc-arch.h: Update ARC_OPTX macro.
>   * config/arc/arc-options.def (ARC_OPTX): Introduce a new doc
>   field.
>   * config/arc/arc.c (arc_init): Update pic warning.
>   (irq_range): Update irq range parsing warnings.
>   (arc_override_options): Update various warning messages.
>   (arc_handle_aux_attribute): Likewise.
> 
> gcc/testsuite
> 2017-09-26  Claudiu Zissulescu  
> 
>   * gcc.target/arc/builtin_simdarc.c: Update test.
>   * gcc.target/arc/mulsi3_highpart-2.c: Likewise.
>   * gcc.target/arc/tumaddsidi4.c: Likewise.
> ---
>  gcc/config/arc/arc-arch.h|  2 +-
>  gcc/config/arc/arc-options.def   | 44 ++--
>  gcc/config/arc/arc.c | 90 ++--
>  gcc/testsuite/gcc.target/arc/builtin_simdarc.c   |  2 +-
>  gcc/testsuite/gcc.target/arc/mulsi3_highpart-2.c |  2 +-
>  gcc/testsuite/gcc.target/arc/tumaddsidi4.c   |  2 +-
>  6 files changed, 80 insertions(+), 62 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-arch.h b/gcc/config/arc/arc-arch.h
> index f90be66e200..a3b7dc87be4 100644
> --- a/gcc/config/arc/arc-arch.h
> +++ b/gcc/config/arc/arc-arch.h
> @@ -28,7 +28,7 @@ along with GCC; see the file COPYING3.  If not see
>  enum cpu_flags
>{
>  #define ARC_OPT(NAME, CODE, MASK, DOC)   NAME = CODE,
> -#define ARC_OPTX(NAME, CODE, VAR, VAL, DOC) NAME = CODE,
> +#define ARC_OPTX(NAME, CODE, VAR, VAL, DOC0, DOC1) NAME = CODE,
>  #include "arc-options.def"
>  #undef ARC_OPT
>  #undef ARC_OPTX
> diff --git a/gcc/config/arc/arc-options.def b/gcc/config/arc/arc-options.def
> index 61ecd1de150..fefb3081c81 100644
> --- a/gcc/config/arc/arc-options.def
> +++ b/gcc/config/arc/arc-options.def
> @@ -74,32 +74,32 @@ ARC_OPT (FL_DPFP, (1ULL << 13), 
> MASK_DPFP_COMPACT_SET, "double precission FP
>  ARC_OPT (FL_ARGONAUT, (1ULL << 14), MASK_ARGONAUT_SET,  "argonaut")
>  ARC_OPT (FL_SIMD, (1ULL << 15), MASK_SIMD_SET,  "simd")
>  
> -ARC_OPTX (FL_MPYOPT_1, (1ULL << 17), arc_mpy_option, 1, "mpy option w")
> -ARC_OPTX (FL_MPYOPT_2, (1ULL << 18), arc_mpy_option, 2, "mpy option wlh1")
> -ARC_OPTX (FL_MPYOPT_3, (1ULL << 19), arc_mpy_option, 3, "mpy option wlh2")
> -ARC_OPTX (FL_MPYOPT_4, (1ULL << 20), arc_mpy_option, 4, "mpy option wlh3")
> -ARC_OPTX (FL_MPYOPT_5, (1ULL << 21), arc_mpy_option, 5, "mpy option wlh4")
> -ARC_OPTX (FL_MPYOPT_6, (1ULL << 22), arc_mpy_option, 6, "mpy option wlh5")
> -ARC_OPTX (FL_MPYOPT_7, (1ULL << 23), arc_mpy_option, 7, "mpy option plus_dmpy")
> -ARC_OPTX (FL_MPYOPT_8, (1ULL << 24), arc_mpy_option, 8, "mpy option plus_macd")
> -ARC_OPTX (FL_MPYOPT_9, (1ULL << 25), arc_mpy_option, 9, "mpy option plus_qmacw")
> +ARC_OPTX (FL_MPYOPT_1, (1ULL << 17), arc_mpy_option, 1, "mpy-option", "w")
> +ARC_OPTX (FL_MPYOPT_2, (1ULL << 18), arc_mpy_option, 2, "mpy-option", "wlh1")
> +ARC_OPTX (FL_MPYOPT_3, (1ULL << 19), arc_mpy_option, 3, "mpy-option", "wlh2")
> +ARC_OPTX (FL_MPYOPT_4, (1ULL << 20), arc_mpy_option, 4, "mpy-option", "wlh3")
> +ARC_OPTX (FL_MPYOPT_5, (1ULL << 21), arc_mpy_option, 5, "mpy-option", "wlh4")
> +ARC_OPTX (FL_MPYOPT_6, (1ULL << 22), arc_mpy_option, 6, "mpy-option", "wlh5")
> +ARC_OPTX (FL_MPYOPT_7, (1ULL << 23), arc_mpy_option, 7, "mpy-option", "plus_dmpy")
> +ARC_OPTX (FL_MPYOPT_8, (1ULL << 24), arc_mpy_option, 8, "mpy-option", "plus_macd")
> +ARC_OPTX (FL_MPYOPT_9, (1ULL << 25), arc_mpy_option, 9, "mpy-option", "plus_qmacw")
>  
>  ARC_OPT (FL_MPYOPT_7_9, (0x01c2ULL << 17), 0, "mpy option")
>  ARC_OPT (FL_MPYOPT_1_6, (0x003fULL << 17), 0, "mpy option")
>  
> -ARC_OPTX (FL_FPU_FPUS,   (1ULL << 26), arc_fpu_build, FPU_FPUS,  
> "mfpu=fpus")
> -ARC_OPTX (FL_FPU_FPUS_DIV,  (1ULL << 27), arc_fpu_build, FPU_FPUS_DIV,   
> "mfpu=fpus_div")
> -ARC_OPTX (FL_FPU_FPUS_FMA,  (1ULL << 28), arc_fpu_build, FPU_FPUS_FMA,   
> "mfpu=fpus_fma")
> -ARC_OPTX (FL_FPU_FPUS_ALL,  (1ULL << 

Re: [PATCH 1/2] [ARC][QuarkSE] Update

2018-06-04 Thread Andrew Burgess
* Claudiu Zissulescu  [2018-05-21 13:20:28 +0300]:

> From: claziss 
> 
> QuarkSE has lp_count width set to 16 bits. Update the compiler to
> consider it.
> 
> Ok to apply?
> Claudiu
> 
> gcc/
> 2017-07-11  Claudiu Zissulescu  
> 
>   * config/arc/arc-arch.h (arc_extras): New enum.
>   (arc_cpu_t): Add field extra.
>   (arc_cpu_types): Consider the extras.
>   * config/arc/arc-cpus.def: Add extras info.
>   * config/arc/arc-opts.h (processor_type): Consider extra field.
>   * config/arc/arc.c (arc_override_options): Handle extra field.

The field name 'extra' seems a little vague.  Can this field ever be
used for more than just lp_count size?  Would a name related more to
lp_count be better?  Or if the intention is that other flags might get
added, maybe there's a name that could suggest that usage?

As a minimum maybe the comment on the new field could describe what
the field is for?

Thanks,
Andrew


> ---
>  gcc/config/arc/arc-arch.h   | 19 ---
>  gcc/config/arc/arc-cpus.def | 59 -
>  gcc/config/arc/arc-opts.h   |  2 +-
>  gcc/config/arc/arc.c| 10 
>  4 files changed, 57 insertions(+), 33 deletions(-)
> 
> diff --git a/gcc/config/arc/arc-arch.h b/gcc/config/arc/arc-arch.h
> index 01f95946623..f90be66e200 100644
> --- a/gcc/config/arc/arc-arch.h
> +++ b/gcc/config/arc/arc-arch.h
> @@ -79,6 +79,14 @@ enum arc_tune_attr
>  ARC_TUNE_ARCHS4XD_SLOW
>};
>  
> +/* Extra options for a processor.  */
> +
> +enum arc_extras
> +{
> +  HAS_NONE,
> +  HAS_LPCOUNT_16
> +};
> +
>  /* CPU specific properties.  */
>  
>  typedef struct
> @@ -95,6 +103,9 @@ typedef struct
>/* Specific flags.  */
>const unsigned long long flags;
>  
> +  /* Extra value.  */
> +  enum arc_extras extra;
> +
>/* Tune value.  */
>enum arc_tune_attr tune;
>  
> @@ -112,12 +123,12 @@ const arc_arch_t arc_arch_types[] =
>  
>  const arc_cpu_t arc_cpu_types[] =
>{
> -{"none", NULL, PROCESSOR_NONE, 0, ARC_TUNE_NONE},
> -#define ARC_CPU(NAME, ARCH, FLAGS, TUNE) \
> -{#NAME, &arc_arch_types [BASE_ARCH_##ARCH], PROCESSOR_##NAME, FLAGS, ARC_TUNE_##TUNE },
> +{"none", NULL, PROCESSOR_NONE, 0, HAS_NONE, ARC_TUNE_NONE},
> +#define ARC_CPU(NAME, ARCH, FLAGS, EXTRA, TUNE) \
> +{#NAME, &arc_arch_types [BASE_ARCH_##ARCH], PROCESSOR_##NAME, FLAGS, HAS_##EXTRA, ARC_TUNE_##TUNE },
>  #include "arc-cpus.def"
>  #undef ARC_CPU
> -{NULL, NULL, PROCESSOR_NONE, 0, ARC_TUNE_NONE}
> +{NULL, NULL, PROCESSOR_NONE, 0, HAS_NONE, ARC_TUNE_NONE}
>};
>  
>  /* Currently selected cpu type.  */
> diff --git a/gcc/config/arc/arc-cpus.def b/gcc/config/arc/arc-cpus.def
> index 4aa422f1a39..eea83917b42 100644
> --- a/gcc/config/arc/arc-cpus.def
> +++ b/gcc/config/arc/arc-cpus.def
> @@ -43,36 +43,39 @@
> as defined in arc-options.def file, and allowed by arc-arches.def
> file.  The specific hardware flags are enumerated without using
> spaces between the '|' character and consequtive flags.
> +   EXTRA  Extra hardware flags, different than the ones in
> +   arc-arches.def.  Here we can specify the width of lp_count,
> +   for example.
> TUNETune value for the given configuration, otherwise NONE.  */
>  
> -ARC_CPU (em, em, 0, NONE)
> -ARC_CPU (em_mini,   em, FL_RF16, NONE)
> -ARC_CPU (arcem,  em, FL_MPYOPT_2|FL_CD|FL_BS, NONE)
> -ARC_CPU (em4,em, FL_CD, NONE)
> -ARC_CPU (em4_dmips, em, FL_MPYOPT_2|FL_CD|FL_DIVREM|FL_NORM|FL_SWAP|FL_BS, NONE)
> -ARC_CPU (em4_fpus,  em, FL_MPYOPT_2|FL_CD|FL_DIVREM|FL_NORM|FL_SWAP|FL_BS|FL_FPU_FPUS, NONE)
> -ARC_CPU (em4_fpuda, em, FL_MPYOPT_2|FL_CD|FL_DIVREM|FL_NORM|FL_SWAP|FL_BS|FL_FPU_FPUDA, NONE)
> -ARC_CPU (quarkse_em, em, FL_MPYOPT_3|FL_CD|FL_DIVREM|FL_NORM|FL_SWAP|FL_BS|FL_FPX_QUARK|FL_SPFP|FL_DPFP, NONE)
> -
> -ARC_CPU (hs,  hs, 0, NONE)
> -ARC_CPU (archs,   hs, FL_MPYOPT_2|FL_DIVREM|FL_LL64, NONE)
> -ARC_CPU (hs34,hs, FL_MPYOPT_2, NONE)
> -ARC_CPU (hs38,hs, FL_MPYOPT_9|FL_DIVREM|FL_LL64, NONE)
> -ARC_CPU (hs38_linux, hs, FL_MPYOPT_9|FL_DIVREM|FL_LL64|FL_FPU_FPUD_ALL, NONE)
> -ARC_CPU (hs4x,  hs, FL_MPYOPT_9|FL_DIVREM|FL_LL64, ARCHS4X)
> -ARC_CPU (hs4xd, hs, FL_MPYOPT_9|FL_DIVREM|FL_LL64, ARCHS4XD)
> -
> -ARC_CPU (arc600,  6xx, FL_BS, ARC600)
> -ARC_CPU (arc600_norm, 6xx, FL_BS|FL_NORM, ARC600)
> -ARC_CPU (arc600_mul64,6xx, FL_BS|FL_NORM|FL_MUL64, ARC600)
> -ARC_CPU (arc600_mul32x16, 6xx, FL_BS|FL_NORM|FL_MUL32x16, ARC600)
> -ARC_CPU (arc601,   6xx, 0, ARC600)
> -ARC_CPU (arc601_norm,  6xx, FL_NORM, ARC600)
> -ARC_CPU (arc601_mul64, 6xx, FL_NORM|FL_MUL64, ARC600)
> -ARC_CPU (arc601_mul32x16, 6xx, FL_NORM|FL_MUL32x16, ARC600)
> -
> -ARC_CPU (arc700, 700, 0, ARC700_4_2_STD)
> -ARC_CPU (nps400, 700, 0, ARC700_4_2_STD)
> +ARC_CPU (em, em, 0, NONE, NONE)
> +ARC_CPU (em_mini,   em, FL_RF16, NONE, NONE)
> 

Re: [PATCH] PR libstdc++/85930 fix misaligned reference

2018-06-04 Thread Jonathan Wakely

On 04/06/18 17:06 +0100, Jonathan Wakely wrote:

PR libstdc++/85930
* include/bits/shared_ptr_base.h (_Sp_make_shared_tag::_S_ti): Align
the static variable correctly.



And _really_ align it this time, so the alignment-specifier isn't
ignored (with the warning suppressed because it's a system header).

Tested powerpc64le-linux, committed to trunk.

commit 1efe054e03151e0fb53cacb937979f238e531ae1
Author: Jonathan Wakely 
Date:   Mon Jun 4 22:25:50 2018 +0100

PR libstdc++/85930 fix misplaced alignment-specifier

PR libstdc++/85930
* include/bits/shared_ptr_base.h [!__cpp_rtti]: Include <typeinfo>
unconditionally. Remove redundant declaration.
[!__cpp_rtti] (_Sp_make_shared_tag::_S_ti): Fix location of
alignment-specifier.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index 22cb7eb46b1..6c5089afdda 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -49,9 +49,7 @@
 #ifndef _SHARED_PTR_BASE_H
 #define _SHARED_PTR_BASE_H 1
 
-#if __cpp_rtti
-# include <typeinfo>
-#endif
+#include <typeinfo>
 #include 
 #include 
 #include 
@@ -59,10 +57,6 @@
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
-#if !__cpp_rtti
-  class type_info;
-#endif
-
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if _GLIBCXX_USE_DEPRECATED
@@ -516,7 +510,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 static const type_info&
 _S_ti() noexcept
 {
-  static constexpr alignas(type_info) _Sp_make_shared_tag __tag;
+  alignas(type_info) static constexpr _Sp_make_shared_tag __tag;
   return reinterpret_cast<const type_info&>(__tag);
 }
 #endif
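
For readers following along, here is a minimal sketch of the corrected declaration order, assuming a hypothetical empty type tag_type standing in for _Sp_make_shared_tag. The helper names are made up for illustration and are not part of libstdc++. As the message above says, the point of moving the alignment-specifier to the front is that it is then applied to the variable rather than ignored.

```cpp
#include <cstdint>
#include <typeinfo>

// Hypothetical stand-in for _Sp_make_shared_tag: an empty type whose
// natural alignment is 1.
struct tag_type { };

// Returns the address of a function-local static tag object.  The
// alignment-specifier comes first in the declaration, as in the committed
// patch, so it applies to the variable being declared.
inline const void*
aligned_tag_address() noexcept
{
  alignas(std::type_info) static constexpr tag_type tag{};
  return &tag;
}

// True if the tag object is suitably aligned to be viewed as a type_info.
inline bool
tag_is_aligned_for_type_info() noexcept
{
  const auto addr = reinterpret_cast<std::uintptr_t>(aligned_tag_address());
  return addr % alignof(std::type_info) == 0;
}
```

This only checks the alignment; it does not reproduce the reinterpret_cast that the real _S_ti performs.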


Re: Avoids std::distance calls

2018-06-04 Thread Jonathan Wakely

On 04/06/18 22:13 +0200, François Dumont wrote:

Hi

I'd like to propose this patch to avoid std::distance calls. In a 
number of situations in the algorithms we already have the size of the 
buffer we need, so we shouldn't have to compute it again.


Just one place, in __inplace_merge, no?

I don't think there is any ABI concern for this inline constructor, 
is there?


No concerns, users can't explicitly instantiate that type, and it will
be inlined or generate an instantiation everywhere it's used.


    * include/bits/stl_tempbuf.h
    (_Temporary_buffer(_FwdIte, _FwdIte)): Delete, replaced by...
    (_Temporary_buffer(_FwdIte, size_type)): ...this, new.
    * include/ext/memory (temporary_buffer<>(_FwdIte, _FwdIte)): Adapt.
    * include/bits/stl_algo.h (__stable_partition): Adapt.
    (__inplace_merge): Adapt.
    (__stable_sort): Adapt.

Tested under Linux x86_64 normal mode.

Ok to commit ?


OK, thanks.




Re: C++ PATCH for c++/85976, ICE with USING_DECL in cp_tree_equal

2018-06-04 Thread Marek Polacek
On Mon, Jun 04, 2018 at 01:28:01PM -0400, Jason Merrill wrote:
> On Mon, Jun 4, 2018 at 11:44 AM, Marek Polacek  wrote:
> > I've had no luck in reducing the testcase in this PR, creduce won't get even
> > past the initial passes, and reducing by hand didn't get me very far, either.
> >
> > But the problem seems to be merely that we're not handling USING_DECLs in
> > cp_tree_equal, and we can get there via comp_template_arguments.  In this
> > case we have two USING_DECLs with different full names.
> >
> > So this patch just adds the USING_DECL case, similarly to e.g.
> > https://gcc.gnu.org/ml/gcc-patches/2012-10/msg00799.html
> >
> > Bootstrapped/regtested on x86_64-linux, ok for trunk/8?  I verified manually
> > that this fixes the testcase from the PR.
> 
> Hmm, do these USING_DECLs have DECL_DEPENDENT_P set?  What do they
> represent?  In the case of dependent USING_DECL I'd think we want to
> compare the scope and name rather than just return false.

They represent a using-declaration in this testcase.

Yep, they're DECL_DEPENDENT_P.  So let's check their scope and name.

> I think we do want a reduced testcase.  Maybe add the needs-reduction
> tag if you're having trouble reducing it yourself?

I got lucky: I managed to remove some random unnecessary code which then
allowed creduce to do its job, the result is pretty nice!

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?

2018-06-04  Marek Polacek  

PR c++/85976
* tree.c (cp_tree_equal): Handle USING_DECL.

* g++.dg/cpp0x/alias-decl-64.C: New test.

diff --git gcc/cp/tree.c gcc/cp/tree.c
index c5b6e9689b6..bbbda7e98b6 100644
--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -3878,6 +3878,14 @@ cp_tree_equal (tree t1, tree t2)
 DEFERRED_NOEXCEPT_ARGS (t2)));
   break;
 
+case USING_DECL:
+  if (DECL_DEPENDENT_P (t1) && DECL_DEPENDENT_P (t2))
+   return (cp_tree_equal (USING_DECL_SCOPE (t1),
+  USING_DECL_SCOPE (t2))
+   && cp_tree_equal (DECL_NAME (t1),
+ DECL_NAME (t2)));
+  return false;
+
 default:
   break;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/alias-decl-64.C gcc/testsuite/g++.dg/cpp0x/alias-decl-64.C
index e69de29bb2d..019eb269750 100644
--- gcc/testsuite/g++.dg/cpp0x/alias-decl-64.C
+++ gcc/testsuite/g++.dg/cpp0x/alias-decl-64.C
@@ -0,0 +1,15 @@
+// PR c++/85976
+// { dg-do compile { target c++11 } }
+
+template  class A;
+template  class B;
+template  struct C;
+template  class D {
+  using B::rank_;
+  void operator()(typename C>::i);
+};
+
+template  class F {
+  using B::rank_;
+  void operator()(typename C>::i);
+};

Marek
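
To make the comparison rule concrete, here is a small sketch assuming a hypothetical struct using_decl_sketch that keeps only the three properties the new case looks at. GCC's real tree nodes and cp_tree_equal are far richer; the strings below merely play the roles of USING_DECL_SCOPE and DECL_NAME.

```cpp
#include <string>

// Hypothetical miniature of a using-declaration node.
struct using_decl_sketch
{
  bool dependent;      // plays the role of DECL_DEPENDENT_P
  std::string scope;   // plays the role of USING_DECL_SCOPE
  std::string name;    // plays the role of DECL_NAME
};

// Two dependent using-declarations are equal when scope and name both
// match; anything else compares unequal, as in the new case label.
inline bool
using_decls_equal(const using_decl_sketch& a, const using_decl_sketch& b)
{
  if (a.dependent && b.dependent)
    return a.scope == b.scope && a.name == b.name;
  return false;
}
```

Note that, like the patch, the sketch returns false whenever either side is not dependent, even if scope and name happen to match.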


Re: [PATCH] rtlanal: Fix nonzero_bits for non-load paradoxical subregs (PR85925)

2018-06-04 Thread Eric Botcazou
> In the PR we have insns:
> 
> Trying 23 -> 24:
>23: r123:SI=zero_extend(r122:HI)
>   REG_DEAD r122:HI
>24: [r115:SI]=r123:SI
>   REG_DEAD r123:SI
> 
> which should be combined to
> 
> (set (mem:SI (reg/f:SI 115 [ pretmp_19 ]) [1 *pretmp_19+0 S4 A32])
> (and:SI (subreg:SI (reg:HI 122) 0)
> (const_int 32767 [0x7fff])))
> 
> But nonzero_bits of reg:HI 122 is 0x7fff, and nonzero_bits1 thinks it
> then also has that same nonzero_bits for the subreg.  This is not
> correct: the bits outside of HImode are undefined.  load_extend_op
> applies to loads from memory only, not anything else.  Which means the
> whole AND is optimised away.

No, this is done on purpose for WORD_REGISTER_OPERATIONS targets and your 
patch will pessimize them.  I'm going to have a look at the PR then.

-- 
Eric Botcazou
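
The arithmetic behind the two positions can be sketched as follows, assuming 16-bit and 32-bit modes and two hypothetical helpers. This is not the rtlanal.c code, only an illustration of why the AND disappears when the high half of the subreg is assumed to be zero and stays when it is treated as undefined.

```cpp
#include <cstdint>

// Nonzero-bits mask for a 32-bit paradoxical subreg of a 16-bit register.
// If the high half is treated as undefined, every high bit may be set.
inline std::uint32_t
subreg_nonzero_bits(std::uint32_t inner_nonzero, bool high_bits_undefined)
{
  const std::uint32_t hi_mode_mask = 0xffffu;
  std::uint32_t nz = inner_nonzero & hi_mode_mask;
  if (high_bits_undefined)
    nz |= ~hi_mode_mask;
  return nz;
}

// An AND with `mask` is redundant when no possibly-set bit lies outside it.
inline bool
and_is_redundant(std::uint32_t nonzero, std::uint32_t mask)
{
  return (nonzero & ~mask) == 0;
}
```

With an inner mask of 0x7fff, the AND with 0x7fff from the PR is judged redundant only under the first assumption.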


[PATCH] wwwdocs: port from MetaHTML to a Python 3 script

2018-06-04 Thread David Malcolm
On Sat, 2018-06-02 at 13:27 +0200, Gerald Pfeifer wrote:
> On Mon, 14 May 2018, Martin Sebor wrote:
> > > Martin, what do you think?  Would that have avoided the challenges
> > > you ran into?  Anything to better clarify or otherwise improve?
> >
> > Thanks for the improvement!  I think it will help going forward
>
> Great, I (finally) committed this now, business travel not exactly
> having been helpful with my GCC engagements...
>
> > assuming one knows about the page and remembers to check it.
> > I have to confess I forgot about it so I didn't check it before
> > running the validator the last time.  Would it be possible to add
> > the snippet to each page permanently?
>
> You mean, avoiding preprocessing and making the change permanent,
> directly as part of every .html file?
>
> I've been hesitant to do that to keep things simple and easier to
> change (and not bother folks), but think this makes sense and plan
> on doing that as we migrate our pages to HTML 5 which will drastically
> reduce the size of that snippet also.  I expect to find time to do
> so in the coming few months, and things should indeed be simpler
> then.
>
>
> One improvement I just made is to add a reference to
>   https://gcc.gnu.org/about.html
> to the automated checker that runs after each commit.  That provides
> more background then.
>
> > Alternatively, what do you think of the idea to have a script (or
> > makefile) to post-process changes to these pages on the client side,
> > before checking them in?  I.e., adding the example annotation David
> > Malcolm prefers (black background) and also validating the HTML.
>
> That's a great idea.  Sadly our post-processing relies on MetaHTML,
> a GNU project that was handled in, umm, a less than optimal manner,
> and has been in a coma for years and thus does not build with current
> compilers on current operating system versions.
>
> At the last GCC Summit in Praha David and I chatted about it, and he
> was thinking to (re)implement the key functionality we use in Python,
> so let me include him here.
>
> (In any case, with the move to CSS and the forthcoming move to HTML 5
> that dependency has been and will be reducing over time.)
>
> Gerald

Caveat/confession: I've never managed to successfully build MetaHTML,
so I've never run it.  All of my editing of the GCC website has involved
looking at the input HTML locally, hacking up the header so that the W3C
validator can run on it, then committing (and crossing my fingers).

Here's an attempt at porting the website from MetaHTML to a Python 3
script.  I've tested it lightly so far by comparing it to reference
pages from the built site (index.html, c99status.html, and install/test.html
so far).

It's not particularly elegant Python 3 (though it does have unit tests!),
and the parsing is a kludge (but seems good enough).  The goal is purely
to drop our dependency on MetaHTML so that all of the logic is expressed
in a well-supported, living programming language.  It has no dependencies
besides Python 3 (tested with 3.3.2).  It might even work with Python 2.

Does this look like the right idea?  (it's not quite perfect yet, but
is pretty close for the pages I tested on).

Are there any other pages which have their own uses of MetaHTML that I
didn't spot yet?

For now, I've attempted to faithfully reproduce the strange whitespace that
MetaHTML introduces, to make it easier to compare diffs.  A followup would
be to remove that, to give more natural whitespace (thus keeping the
whitespace changes separated from non-whitespace changes, as it were).

Dave
---
 bin/preprocess   |  24 ++--
 bin/process-html-file.py | 321 +++
 htdocs/style.mhtml   | 223 
 3 files changed, 329 insertions(+), 239 deletions(-)
 create mode 100755 bin/process-html-file.py
 delete mode 100644 htdocs/style.mhtml

diff --git a/bin/preprocess b/bin/preprocess
index 2d09d54..f894af4 100755
--- a/bin/preprocess
+++ b/bin/preprocess
@@ -33,11 +33,11 @@
 #
 # By Gerald Pfeifer  1999-12-29.
 
-MHC=${MHC-/usr/local/bin/mhc}
-
 SOURCETREE=${SOURCETREE-/www/gcc/htdocs-preformatted}
 DESTTREE=${DESTTREE-/www/gcc/htdocs}
 
+BINDIR=${SOURCETREE}/../bin
+
 STYLE=$SOURCETREE/style.mhtml
 IGNORE_DIFF_MARKER="IGNORE DIFF"
 
@@ -102,29 +102,26 @@ process_html_file()
 fdest=$2
 
 # Set environment variable PREPROCESS_FILE, which will then be
-# accessible by MetaHTML.
+# accessible by bin/process-html-file.py.
 
 PREPROCESS_FILE=$f
 export PREPROCESS_FILE
 
-# Prepend the MetaHTML style, set the MetaHTML include directory,
+# Prepend the MetaHTML include directory,
 # and process the page.
-
-cat $STYLE > $TMPDIR/input
 printf '\n' `pwd` >> $TMPDIR/input
 cat $f >> $TMPDIR/input
 
-if ! ${MHC} $TMPDIR/input > $TMPDIR/output.raw; then
-echo "${MHC} failed; aborting."
+# Process the page.
+if ! $BINDIR/process-html-file.py 

Re: [PATCH, rs6000] Improve indentation of prototype documentation

2018-06-04 Thread Segher Boessenkool
On Mon, Jun 04, 2018 at 02:37:20PM -0500, Kelvin Nilsen wrote:
> 
> This patch removes extraneous line breaks, shortening the "PowerPC AltiVec 
> Built-in Functions" section of the gcc.pdf manual by about 7 pages.  Besides 
> improving the appearance of this documentation, there are two additional 
> benefits:

What does line wrap now look like, both in the pdf and in the "plain" info
doc?  (And in html?)

> 1. Subsequent patches that move prototype definitions in order to alphabetize 
> definitions or in order to group definitions requiring the same target 
> options together are easier to understand if each prototype description is 
> represented on a single line.
> 
> 2. Enclosing the group of 8 vec_xl prototypes and 8 vec_xst prototypes 
> between @smallexample and @end smallexample allows these prototypes to be 
> automatically parsed by a tool that validates consistency between 
> implementation and documentation of built-in functions.

Yes, excellent :-)

> Segher: if you prefer, I can break this into multiple smaller patches.  What 
> would be the ideal size of each patch?

No, this is fine: it is doing just one thing (well, also the smallexample
thing), and the patch is readable enough.

What you shouldn't do is patches that interleave many things, or that do
multiple (big) things that can just as well be separate patches, or that
move things as well as modify them (in that case, do a patch that just
moves things, and one that does the modifications).

> Is this ok for trunk?

Yes.  Thank you!


Segher


> 2018-06-04  Kelvin Nilsen  
> 
>   * doc/extend.texi (PowerPC AltiVec Built-in Functions): Adjust
>   indentation and line wrap for many prototypes.  Add missing
>   @smallexample directives around block of prototypes for vec_xl and
>   vec_xst.


Avoids std::distance calls

2018-06-04 Thread François Dumont

Hi

I'd like to propose this patch to avoid std::distance calls. In a number 
of situations in the algorithms we already have the size of the buffer we 
need, so we shouldn't have to compute it again.


I don't think there is any ABI concern for this inline constructor, 
is there?


    * include/bits/stl_tempbuf.h
    (_Temporary_buffer(_FwdIte, _FwdIte)): Delete, replaced by...
    (_Temporary_buffer(_FwdIte, size_type)): ...this, new.
    * include/ext/memory (temporary_buffer<>(_FwdIte, _FwdIte)): Adapt.
    * include/bits/stl_algo.h (__stable_partition): Adapt.
    (__inplace_merge): Adapt.
    (__stable_sort): Adapt.

Tested under Linux x86_64 normal mode.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 914a6b6..d8488a7 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -1615,7 +1615,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   typedef typename iterator_traits<_ForwardIterator>::difference_type
 	_DistanceType;
 
-  _Temporary_buffer<_ForwardIterator, _ValueType> __buf(__first, __last);
+  _Temporary_buffer<_ForwardIterator, _ValueType>
+	__buf(__first, std::distance(__first, __last));
   return
 	std::__stable_partition_adaptive(__first, __last, __pred,
 	 _DistanceType(__buf.requested_size()),
@@ -2534,7 +2535,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   const _DistanceType __len2 = std::distance(__middle, __last);
 
   typedef _Temporary_buffer<_BidirectionalIterator, _ValueType> _TmpBuf;
-  _TmpBuf __buf(__first, __last);
+  _TmpBuf __buf(__first, __len1 + __len2);
 
   if (__buf.begin() == 0)
 	std::__merge_without_buffer
@@ -4992,7 +4993,7 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO
 	_DistanceType;
 
   typedef _Temporary_buffer<_RandomAccessIterator, _ValueType> _TmpBuf;
-  _TmpBuf __buf(__first, __last);
+  _TmpBuf __buf(__first, std::distance(__first, __last));
 
   if (__buf.begin() == 0)
 	std::__inplace_stable_sort(__first, __last, __comp);
diff --git a/libstdc++-v3/include/bits/stl_tempbuf.h b/libstdc++-v3/include/bits/stl_tempbuf.h
index 56c4ac5..159ee27 100644
--- a/libstdc++-v3/include/bits/stl_tempbuf.h
+++ b/libstdc++-v3/include/bits/stl_tempbuf.h
@@ -158,9 +158,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /**
* Constructs a temporary buffer of a size somewhere between
-   * zero and the size of the given range.
+   * zero and the given length.
*/
-  _Temporary_buffer(_ForwardIterator __first, _ForwardIterator __last);
+  _Temporary_buffer(_ForwardIterator __seed, size_type __original_len);
 
   ~_Temporary_buffer()
   {
@@ -241,9 +241,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 _Temporary_buffer<_ForwardIterator, _Tp>::
-_Temporary_buffer(_ForwardIterator __first, _ForwardIterator __last)
-: _M_original_len(std::distance(__first, __last)),
-  _M_len(0), _M_buffer(0)
+_Temporary_buffer(_ForwardIterator __seed, size_type __original_len)
+: _M_original_len(__original_len), _M_len(0), _M_buffer(0)
 {
   __try
 	{
@@ -253,7 +252,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	  _M_len = __p.second;
 	  if (_M_buffer)
 	std::__uninitialized_construct_buf(_M_buffer, _M_buffer + _M_len,
-	   __first);
+	   __seed);
 	}
   __catch(...)
 	{
@@ -268,4 +267,3 @@ _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
 #endif /* _STL_TEMPBUF_H */
-
diff --git a/libstdc++-v3/include/ext/memory b/libstdc++-v3/include/ext/memory
index c5d526e..fcc4948 100644
--- a/libstdc++-v3/include/ext/memory
+++ b/libstdc++-v3/include/ext/memory
@@ -184,7 +184,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 {
   /// Requests storage large enough to hold a copy of [first,last).
   temporary_buffer(_ForwardIterator __first, _ForwardIterator __last)
-  : _Temporary_buffer<_ForwardIterator, _Tp>(__first, __last) { }
+  : _Temporary_buffer<_ForwardIterator, _Tp>(__first,
+		 std::distance(__first, __last))
+  { }
   
   /// Destroys objects and frees storage.
   ~temporary_buffer() { }


[PATCH, rs6000] Improve indentation of prototype documentation

2018-06-04 Thread Kelvin Nilsen


This patch removes extraneous line breaks, shortening the "PowerPC AltiVec 
Built-in Functions" section of the gcc.pdf manual by about 7 pages.  Besides 
improving the appearance of this documentation, there are two additional 
benefits:

1. Subsequent patches that move prototype definitions in order to alphabetize 
definitions or in order to group definitions requiring the same target options 
together are easier to understand if each prototype description is represented 
on a single line.

2. Enclosing the group of 8 vec_xl prototypes and 8 vec_xst prototypes between 
@smallexample and @end smallexample allows these prototypes to be automatically 
parsed by a tool that validates consistency between implementation and 
documentation of built-in functions.

This patch has been bootstrapped and tested without regressions on 
powerpc64le-unknown-linux (P8).  I have also built the gcc.pdf file and 
reviewed its contents.

Segher: if you prefer, I can break this into multiple smaller patches.  What 
would be the ideal size of each patch?

Is this ok for trunk?

gcc/ChangeLog:

2018-06-04  Kelvin Nilsen  

* doc/extend.texi (PowerPC AltiVec Built-in Functions): Adjust
indentation and line wrap for many prototypes.  Add missing
@smallexample directives around block of prototypes for vec_xl and
vec_xst.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 261067)
+++ gcc/doc/extend.texi (working copy)
@@ -16200,17 +16200,13 @@ vector signed char vec_add (vector signed char, ve
 vector signed char vec_add (vector signed char, vector signed char);
 vector unsigned char vec_add (vector bool char, vector unsigned char);
 vector unsigned char vec_add (vector unsigned char, vector bool char);
-vector unsigned char vec_add (vector unsigned char,
-  vector unsigned char);
+vector unsigned char vec_add (vector unsigned char, vector unsigned char);
 vector signed short vec_add (vector bool short, vector signed short);
 vector signed short vec_add (vector signed short, vector bool short);
 vector signed short vec_add (vector signed short, vector signed short);
-vector unsigned short vec_add (vector bool short,
-   vector unsigned short);
-vector unsigned short vec_add (vector unsigned short,
-   vector bool short);
-vector unsigned short vec_add (vector unsigned short,
-   vector unsigned short);
+vector unsigned short vec_add (vector bool short, vector unsigned short);
+vector unsigned short vec_add (vector unsigned short, vector bool short);
+vector unsigned short vec_add (vector unsigned short, vector unsigned short);
 vector signed int vec_add (vector bool int, vector signed int);
 vector signed int vec_add (vector signed int, vector bool int);
 vector signed int vec_add (vector signed int, vector signed int);
@@ -16226,47 +16222,33 @@ vector signed int vec_vadduwm (vector signed int,
 vector signed int vec_vadduwm (vector signed int, vector signed int);
 vector unsigned int vec_vadduwm (vector bool int, vector unsigned int);
 vector unsigned int vec_vadduwm (vector unsigned int, vector bool int);
-vector unsigned int vec_vadduwm (vector unsigned int,
- vector unsigned int);
+vector unsigned int vec_vadduwm (vector unsigned int, vector unsigned int);
 
-vector signed short vec_vadduhm (vector bool short,
- vector signed short);
-vector signed short vec_vadduhm (vector signed short,
- vector bool short);
-vector signed short vec_vadduhm (vector signed short,
- vector signed short);
-vector unsigned short vec_vadduhm (vector bool short,
-   vector unsigned short);
-vector unsigned short vec_vadduhm (vector unsigned short,
-   vector bool short);
-vector unsigned short vec_vadduhm (vector unsigned short,
-   vector unsigned short);
+vector signed short vec_vadduhm (vector bool short, vector signed short);
+vector signed short vec_vadduhm (vector signed short, vector bool short);
+vector signed short vec_vadduhm (vector signed short, vector signed short);
+vector unsigned short vec_vadduhm (vector bool short, vector unsigned short);
+vector unsigned short vec_vadduhm (vector unsigned short, vector bool short);
+vector unsigned short vec_vadduhm (vector unsigned short, vector unsigned short);
 
 vector signed char vec_vaddubm (vector bool char, vector signed char);
 vector signed char vec_vaddubm (vector signed char, vector bool char);
 vector signed char vec_vaddubm (vector signed char, vector signed char);
-vector unsigned char vec_vaddubm (vector bool char,
-  vector unsigned char);
-vector unsigned char vec_vaddubm (vector 

Re: [PATCH, rs6000] Correct documentation of vec_lvsl and vec_lvsr arguments

2018-06-04 Thread Segher Boessenkool
On Mon, Jun 04, 2018 at 11:03:49AM -0500, Kelvin Nilsen wrote:
> The existing documentation incorrectly specifies that the second argument of 
> the vec_lvsl and vec_lvsr instructions is volatile  *.  This patch removes 
> the volatile qualifier from the documentation of these arguments.
> 
> This patch has been bootstrapped and tested without regressions on 
> powerpc64le-unknown-linux (P8).  I have built the gcc.pdf file and reviewed 
> its contents.
> 
> Is this ok for trunk?

Yes please.  Thanks!


Segher


> 2018-06-04  Kelvin Nilsen  
> 
>   * doc/extend.texi (PowerPC AltiVec Built-in Functions): Remove
>   volatile qualifier from vec_lvsl and vec_lvsr argument prototypes.


Go patch committed: Don't crash on cyclic var/type dependencies

2018-06-04 Thread Ian Lance Taylor
This patch by Than McIntosh fixes type traversal in the Go frontend to
avoid compiler crashes for test cases where a type T includes an
expression that refers back to the type without actually explicitly
mentioning T. Examples include:

  var x [uintptr(unsafe.Sizeof())]byte
  var a [len(a)]int

The fix involves expanding the set of types that the traversal code
"remembers" (to avoid cycles) to include array types, and introducing
an additional guard in Builtin_call_expression::do_is_constant to
catch cyclic type constructs.

Fixes https://golang.org/issue/{25299,25679,25315,25680}.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 261041)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-79eca4fd642724d89e9bec8f79889451f6632a46
+8e74a218e11ef6eaaf7014a3ad1cd0b13359c607
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 261041)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -8061,9 +8061,13 @@ Builtin_call_expression::do_is_constant(
  arg_type = arg_type->points_to();
 
if (arg_type->array_type() != NULL
-   && arg_type->array_type()->length() != NULL
-   && Builtin_call_expression::array_len_is_constant(arg))
- return true;
+   && arg_type->array_type()->length() != NULL)
+  {
+   this->seen_ = true;
+   bool ret = Builtin_call_expression::array_len_is_constant(arg);
+   this->seen_ = false;
+   return ret;
+  }
 
if (this->code_ == BUILTIN_LEN && arg_type->is_string_type())
  {
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 261041)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -8258,8 +8258,16 @@ Traverse::remember_type(const Type* type
   // We mostly only have to remember named types.  But it turns out
   // that an interface type can refer to itself without using a name
   // by relying on interface inheritance, as in
-  // type I interface { F() interface{I} }
+  //
+  // type I interface { F() interface{I} }
+  //
+  // Similarly it is possible for array types to refer to themselves
+  // without a name, e.g.
+  //
+  // var x [uintptr(unsafe.Sizeof())]byte
+  //
   if (type->classification() != Type::TYPE_NAMED
+  && type->classification() != Type::TYPE_ARRAY
   && type->classification() != Type::TYPE_INTERFACE)
 return false;
   if (this->types_seen_ == NULL)


Re: [Patch][Aarch64][PR target/79924] Cannot translate diagnostics

2018-06-04 Thread Steve Ellcey
Ping.

Steve Ellcey
sell...@cavium.com


On Thu, 2018-01-11 at 15:44 -0800, Steve Ellcey wrote:
> This is a patch for PR target/79924, which says the error messages
> called from aarch64_err_no_fpadvsimd cannot be translated due to
> how they are constructed.  To make them translatable and not change
> the actual messages would have required creating 16 individual
> messages, which seemed a bit excessive, so I simplified them a bit and
> fixed up the test cases that scan for them.
> 
> Instead of having 16 messages with:
> 
> ['-mgeneral-regs-only'|'+nofp' feature modifier] is incompatible with
> [vector|floating-point] [argument|return type|varargs|code]
> 
> I changed it to four messages with
> 
> [-mgeneral-regs-only|+nofp feature modifier] is incompatible with the
> use of [vector|floating point] types
> 
> The changes I made in the actual messages were:
> 
> Remove the quotes from around '-mgeneral-regs-only' and '+nofp',
> because error messages I saw on other platforms did not use quotes
> around compiler flags.
> 
> Replace the specific [argument|return type|varargs|code] terms with
> 'use of [vector|floating point] types'.  This got me from 16 to 4
> messages.  I think that since the error message will point at the
> actual line where the problem is, having the message be a bit more
> generic is not a problem.  Also, I chose to include 'the use of'
> because merely having these types (and not doing anything with them)
> will not trigger one of these errors; only using them in code (as an
> argument, return type, etc.) will trigger the error.  Also, I got rid
> of the '-' in floating-point since I didn't see any reason for it to
> be there.
> 
> Tested with no regressions, OK to checkin?
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 2018-01-11  Steve Ellcey  
> 
>   PR target/79924
>   * config/aarch64/aarch64-protos.h (aarch64_err_no_fpadvsimd):
> Remove
>   second argument.
>   * config/aarch64/aarch64-protos..c (aarch64_err_no_fpadvsimd):
>   Remove second argument, change how error is called.
>   (aarch64_layout_arg): Remove second argument from
>   aarch64_err_no_fpadvsimd call.
>   (aarch64_init_cumulative_args): Ditto.
>   (aarch64_gimplify_va_arg_expr): Ditto.
>   * config/aarch64/aarch64.md (mov): Ditto.
> 
> 
> 2018-01-11  Steve Ellcey  
> 
>   PR target/79924
>   * gcc.target/aarch64/mgeneral-regs_1.c: Update error message.
>   * gcc.target/aarch64/mgeneral-regs_2.c: Ditto.
>   * gcc.target/aarch64/mgeneral-regs_3.c: Ditto.
>   * gcc.target/aarch64/nofp_1.c: Ditto.


Re: [PATCH] [MSP430] Fix device name symbol defined for MSP430i series of devices

2018-06-04 Thread Jozef Lawrynowicz

On 04/06/18 18:52, DJ Delorie wrote:
> Jozef Lawrynowicz  writes:
> > For the currently released msp430i* devices, only digits follow the i, so no
> > upper or lower case conversion is needed.
>
> Thinking of the future... do we expect any new devices with letters?
> Should we plan for them?  Or better to wait, in case there are more
> lower-case-letter symbols?
>
> Just trying to think of all the possibilities here.


It appears that the general scheme for the letters in the device name symbols
is to have them all upper case, but if it's part of the MSP430i branding (or
some other reason) to have the letters lowercase then I could imagine any
additional letters could end up as either upper or lower case.

When a new version of a device with errata fixed is released, an 'A' is put on
the end of the device name, so it's possible there'll be a new MSP430i device
with this format at some point.

I'll check with TI if there are any specific conventions for the format of
these device symbols so we can hopefully future proof this a bit.



Re: [Patch] Do not call the linker if we are creating precompiled header files

2018-06-04 Thread Steve Ellcey
Ping^2

Steve Ellcey
sell...@cavium.com

On Thu, 2018-05-17 at 14:50 -0700, Steve Ellcey wrote:
> Ping.
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> On Wed, 2018-05-02 at 12:47 -0700, Steve Ellcey wrote:
> > 
> > This is a new version of a patch I sent out last year to stop gcc from
> > trying to do a link when creating precompiled headers and a linker
> > flag is also given.
> > 
> > When I build and test GCC I also build glibc and then I run the GCC tests
> > with -Wl,-rpath and -Wl,--dynamic-linker so that I don't have to install
> > glibc and the compiler in the default locations.  When I do this some
> > > precompiled header tests fail because the existence of the linker flags
> > causes the compiler to try and call the linker when we really just want to
> > create pch files.
> > 
> > I tracked this down to driver::maybe_run_linker where it sees the linker
> > flags and increments num_linker_inputs, this causes the routine to call
> > the linker.   This patch checks to see if we are creating precompiled
> > header files and avoids calling the linker in that case.
> > 
> > I tested this with the GCC testsuite and got no regressions, OK to
> > checkin?
> > 
> > Steve Ellcey
> > sell...@cavium.com
> > 
> > 
> > 2018-05-02  Steve Ellcey  
> > 
> > * gcc.c (create_pch_flag): New variable.
> > (driver::prepare_infiles): Set create_pch_flag
> > when we are creating precompiled headers.
> > (driver::maybe_run_linker): Do not link if
> > create_pch_flag is set.
> > (driver::finalize): Reset create_pch_flag.
> > 
> > 
> > diff --git a/gcc/gcc.c b/gcc/gcc.c
> > index a716f70..ca986cf 100644
> > --- a/gcc/gcc.c
> > +++ b/gcc/gcc.c
> > @@ -208,6 +208,9 @@ int is_cpp_driver;
> >  /* Flag set to nonzero if an @file argument has been supplied to
> > gcc.  */
> >  static bool at_file_supplied;
> >  
> > +/* Flag set to nonzero if we are generating a precompiled
> > header.  */
> > +static bool create_pch_flag;
> > +
> >  /* Definition of string containing the arguments given to
> > configure.  */
> >  #include "configargs.h"
> >  
> > @@ -8095,8 +8098,15 @@ driver::prepare_infiles ()
> >        strlen (name),
> >        infiles[i].lang
> > ua
> > ge);
> >  
> > -  if (compiler && !(compiler->combinable))
> > -   combine_inputs = false;
> > +  if (compiler)
> > +   {
> > +     if (!(compiler->combinable))
> > +   combine_inputs = false;
> > +
> > +     if ((strcmp(compiler->suffix, "@c-header") == 0)
> > +     || (strcmp(compiler->suffix, "@c++-header") == 0))
> > +   create_pch_flag = true;
> > +   }
> >  
> >    if (lang_n_infiles > 0 && compiler != input_file_compiler
> >       && infiles[i].language && infiles[i].language[0] != '*')
> > @@ -8282,6 +8292,10 @@ driver::maybe_run_linker (const char *argv0)
> > const
> >    int linker_was_run = 0;
> >    int num_linker_inputs;
> >  
> > +  /* If we are creating a precompiled header, do not run the
> > linker.  */
> > +  if (create_pch_flag)
> > +return;
> > +
> >    /* Determine if there are any linker input files.  */
> >    num_linker_inputs = 0;
> >    for (i = 0; (int) i < n_infiles; i++)
> > @@ -10052,6 +10066,7 @@ driver::finalize ()
> >  
> >    is_cpp_driver = 0;
> >    at_file_supplied = 0;
> > +  create_pch_flag = 0;
> >    print_help_list = 0;
> >    print_version = 0;
> >    verbose_only_flag = 0;


Re: [PATCH] [MSP430] Fix device name symbol defined for MSP430i series of devices

2018-06-04 Thread DJ Delorie
Jozef Lawrynowicz  writes:
> For the currently released msp430i* devices, only digits follow the i, so no
> upper or lower case conversion is needed.

Thinking of the future... do we expect any new devices with letters?
Should we plan for them?  Or better to wait, in case there are more
lower-case-letter symbols?

Just trying to think of all the possibilities here.


Re: [PATCH], Add weak references to bridge old/new PowerPC ieee 128-bit name mangling

2018-06-04 Thread Segher Boessenkool
Hi Mike,

On Fri, Jun 01, 2018 at 07:28:40PM -0400, Michael Meissner wrote:
> This patch also makes __ibm128 or __float128 use the long double mode if long
> double uses the IBM extended double or IEEE 128-bit representations.

Does that need to be the same patch?  Please split such things out when
you can.

> This
> allows templates to work again with those types (the template code aborts if
> you have two distinct types use the same mangling).  However, overloaded types
> won't work, but I suspect these days people use templates over overloaded
> functions.  I fixed up the test case for this (pr85657.C) so that it only
> tests for templates.

If tests fail, do not delete the test.  If we have a problem, we have
a problem, and it needs to be fixed (not necessarily today of course,
and there is xfail for long-standing problems).


> +/* Generate old manged name, not new name.  */
> +static bool old_mangling;

As Andreas said, this is not a good name.  Please at least mention for
what type this old mangling is.

> @@ -16355,14 +16394,21 @@ rs6000_init_builtins (void)
>   __ieee128.  */
>if (TARGET_FLOAT128_TYPE)
>  {
> -  ibm128_float_type_node = make_node (REAL_TYPE);
> -  TYPE_PRECISION (ibm128_float_type_node) = 128;
> -  SET_TYPE_MODE (ibm128_float_type_node, IFmode);
> -  layout_type (ibm128_float_type_node);
> +  if (TARGET_IEEEQUAD || !TARGET_LONG_DOUBLE_128)
> + {
> +   ibm128_float_type_node = make_node (REAL_TYPE);
> +   TYPE_PRECISION (ibm128_float_type_node) = 128;
> +   SET_TYPE_MODE (ibm128_float_type_node, IFmode);
> +   layout_type (ibm128_float_type_node);
> + }
> +  else
> + ibm128_float_type_node = long_double_type_node;

I wonder how hard it would be to alias the long double type to either
__ibm128 or __ieee128, instead of the other way around?  This would
simplify code a lot.  Esp. if we can do the same thing for the modes,
too.

> @@ -32117,7 +32163,7 @@ rs6000_mangle_type (const_tree type)
>if (SCALAR_FLOAT_TYPE_P (type) && FLOAT128_IBM_P (TYPE_MODE (type)))
>  return "g";
>if (SCALAR_FLOAT_TYPE_P (type) && FLOAT128_IEEE_P (TYPE_MODE (type)))
> -return "u9__ieee128";
> +return old_mangling ? "U10__float128" : "u9__ieee128";

> +#if TARGET_ELF && RS6000_WEAK
> +static void
> +rs6000_globalize_decl_name (FILE * stream, tree decl)
> +{
> +  const char *name = XSTR (XEXP (DECL_RTL (decl), 0), 0);
> +
> +  targetm.asm_out.globalize_label (stream, name);
> +
> +  if (TARGET_OLD_NAME_MANGLING && rs6000_passes_ieee128
> +  && name[0] == '_' && name[1] == 'Z')
> +{
> +  tree save_asm_name = DECL_ASSEMBLER_NAME (decl);
> +  const char *old_name;
> +
> +  old_mangling = true;
> +  lang_hooks.set_decl_assembler_name (decl);
> +  old_name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
> +  SET_DECL_ASSEMBLER_NAME (decl, save_asm_name);
> +  old_mangling = false;

Eww.  Can't we just create old_name directly?  Maybe there already is
some nice helper for this?

> +;; -malias-old-name-mangling: implement an alias to map old C++ mangled names
> +;; to the new version of the name
> +malias-old-name-mangling
> +Target Undocumented Var(TARGET_OLD_NAME_MANGLING) Init(-1) Save

As said, better name please.  Or, just drop the option (always enable it).


Segher


[PATCH][AArch64] Support for LDP/STP of Q-registers

2018-06-04 Thread Kyrill Tkachov

Hi all,

This patch adds support for generating LDPs and STPs of Q-registers.
This allows for more compact code generation and makes better use of the ISA.

It's implemented in a straightforward way by allowing 16-byte modes in the
sched-fusion machinery and adding appropriate peepholes in aarch64-ldpstp.md
as well as the patterns themselves in aarch64-simd.md.
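
As an illustration of the kind of source that could benefit, here is a
small sketch that copies two adjacent 16-byte blocks.  The Pair128 type
and both function names are hypothetical, and whether a given GCC build
actually emits "ldp q"/"stp q" for it depends on the compiler version,
target and options; that has not been verified here.

```cpp
#include <cstdint>
#include <cstring>

// Two adjacent 16-byte blocks, each the size of one Q register.
struct alignas(16) Pair128 {
  std::uint8_t lo[16];
  std::uint8_t hi[16];
};

// Two consecutive 16-byte loads followed by two consecutive 16-byte
// stores: candidates for pairing into a single LDP and a single STP.
void copy_pair(Pair128* dst, const Pair128* src) {
  std::memcpy(dst->lo, src->lo, sizeof src->lo);
  std::memcpy(dst->hi, src->hi, sizeof src->hi);
}

// Fills a source pair, copies it, and reports whether the copy matches.
bool copy_pair_roundtrip() {
  Pair128 src{};
  for (int i = 0; i < 16; ++i) {
    src.lo[i] = static_cast<std::uint8_t>(i);
    src.hi[i] = static_cast<std::uint8_t>(0xF0 + i);
  }
  Pair128 dst{};
  copy_pair(&dst, &src);
  return std::memcmp(&dst, &src, sizeof src) == 0;
}
```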

I didn't see any non-noise performance effect on SPEC2017 on Cortex-A72 and 
Cortex-A53.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2018-06-04  Kyrylo Tkachov  

* config/aarch64/aarch64.c (aarch64_mode_valid_for_sched_fusion_p):
Allow 16-byte modes.
(aarch64_classify_address): Allow 16-byte modes for load_store_pair_p.
* config/aarch64/aarch64-ldpstp.md: Add peepholes for LDP STP of
128-bit modes.
* config/aarch64/aarch64-simd.md (load_pair):
New pattern.
(vec_store_pair): Likewise.
* config/aarch64/iterators.md (VQ2): New mode iterator.

2018-06-04  Kyrylo Tkachov  

* gcc.target/aarch64/ldp_stp_q.c: New test.
* gcc.target/aarch64/stp_vec_128_1.c: Likewise.
diff --git a/gcc/config/aarch64/aarch64-ldpstp.md b/gcc/config/aarch64/aarch64-ldpstp.md
index 7f1031dc80fab31f691c0b03d6a485c1b6fd7e53..12d89fd2ef5db5e0d3828d75ae244fdc04438f45 100644
--- a/gcc/config/aarch64/aarch64-ldpstp.md
+++ b/gcc/config/aarch64/aarch64-ldpstp.md
@@ -91,6 +91,32 @@ (define_peephole2
   aarch64_swap_ldrstr_operands (operands, false);
 })
 
+(define_peephole2
+  [(set (match_operand:VQ 0 "register_operand" "")
+	(match_operand:VQ 1 "memory_operand" ""))
+   (set (match_operand:VQ2 2 "register_operand" "")
+	(match_operand:VQ2 3 "memory_operand" ""))]
+  "aarch64_operands_ok_for_ldpstp (operands, true, mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+	  (set (match_dup 2) (match_dup 3))])]
+{
+  aarch64_swap_ldrstr_operands (operands, true);
+})
+
+(define_peephole2
+  [(set (match_operand:VQ 0 "memory_operand" "")
+	(match_operand:VQ 1 "register_operand" ""))
+   (set (match_operand:VQ2 2 "memory_operand" "")
+	(match_operand:VQ2 3 "register_operand" ""))]
+  "TARGET_SIMD
+   && aarch64_operands_ok_for_ldpstp (operands, false, mode)"
+  [(parallel [(set (match_dup 0) (match_dup 1))
+	  (set (match_dup 2) (match_dup 3))])]
+{
+  aarch64_swap_ldrstr_operands (operands, false);
+})
+
+
 ;; Handle sign/zero extended consecutive load/store.
 
 (define_peephole2
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index d5803998c60bf9422dbc4481bac1590f4d209a4a..740a3414a8d9c80addbfa611d530d9f56da11100 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -205,6 +205,34 @@ (define_insn "vec_store_pair"
   [(set_attr "type" "neon_stp")]
 )
 
+(define_insn "load_pair"
+  [(set (match_operand:VQ 0 "register_operand" "=w")
+	(match_operand:VQ 1 "aarch64_mem_pair_operand" "Ump"))
+   (set (match_operand:VQ2 2 "register_operand" "=w")
+	(match_operand:VQ2 3 "memory_operand" "m"))]
+  "TARGET_SIMD
+&& rtx_equal_p (XEXP (operands[3], 0),
+		plus_constant (Pmode,
+			   XEXP (operands[1], 0),
+			   GET_MODE_SIZE (mode)))"
+  "ldp\\t%q0, %q2, %1"
+  [(set_attr "type" "neon_ldp_q")]
+)
+
+(define_insn "vec_store_pair"
+  [(set (match_operand:VQ 0 "aarch64_mem_pair_operand" "=Ump")
+	(match_operand:VQ 1 "register_operand" "w"))
+   (set (match_operand:VQ2 2 "memory_operand" "=m")
+	(match_operand:VQ2 3 "register_operand" "w"))]
+  "TARGET_SIMD && rtx_equal_p (XEXP (operands[2], 0),
+		plus_constant (Pmode,
+			   XEXP (operands[0], 0),
+			   GET_MODE_SIZE (mode)))"
+  "stp\\t%q1, %q3, %0"
+  [(set_attr "type" "neon_stp_q")]
+)
+
+
 (define_split
   [(set (match_operand:VQ 0 "register_operand" "")
   (match_operand:VQ 1 "register_operand" ""))]
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 89fce15e0194365a6c0a85236c3ea6b26d26e89e..77f9f8adef6155e51be7aa5551e71c688128ecfc 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5681,7 +5681,8 @@ aarch64_mode_valid_for_sched_fusion_p (machine_mode mode)
   return mode == SImode || mode == DImode
 	 || mode == SFmode || mode == DFmode
 	 || (aarch64_vector_mode_supported_p (mode)
-	 && known_eq (GET_MODE_SIZE (mode), 8));
+	 && (known_eq (GET_MODE_SIZE (mode), 8)
+		 || known_eq (GET_MODE_SIZE (mode), 16)));
 }
 
 /* Return true if REGNO is a virtual pointer register, or an eliminable
@@ -5838,7 +5839,8 @@ aarch64_classify_address (struct aarch64_address_info *info,
 
 	  if (load_store_pair_p)
 	return ((known_eq (GET_MODE_SIZE (mode), 4)
-		 || known_eq (GET_MODE_SIZE (mode), 8))
+		 || known_eq (GET_MODE_SIZE (mode), 8)
+		 || known_eq (GET_MODE_SIZE (mode), 16))
 		&& aarch64_offset_7bit_signed_scaled_p (mode, offset));
 	  else
 	return (offset_9bit_signed_unscaled_p (mode, offset)
@@ -5898,7 +5900,8 @@ aarch64_classify_address (struct 

Re: [PATCH] [MSP430] Fix device name symbol defined for MSP430i series of devices

2018-06-04 Thread Jozef Lawrynowicz

On 04/06/18 18:26, DJ Delorie wrote:
> Jozef Lawrynowicz  writes:
> > +  if (strncmp (target_mcu, "msp430i", 7) == 0)
> > +   snprintf (mcu_name, sizeof (mcu_name) - 1, "__MSP430i%s__",
> > + target_mcu + 7);
> > +  else
>
> Do you need to TOUPPER the parts of target_mcu after char 7 ?


For the currently released msp430i* devices, only digits follow the i, so no
upper or lower case conversion is needed.



Re: C++ PATCH for c++/85976, ICE with USING_DECL in cp_tree_equal

2018-06-04 Thread Jason Merrill
On Mon, Jun 4, 2018 at 11:44 AM, Marek Polacek  wrote:
> I've had no luck in reducing the testcase in this PR, creduce won't get even
> past the initial passes, and reducing by hand didn't get me very far, either.
>
> But the problem seems to be merely that we're not handling USING_DECLs in
> cp_tree_equal, and we can get there via comp_template_arguments.  In this case
> we have two USING_DECLs with different full names.
>
> So this patch just adds the USING_DECL case, similarly to e.g.
> https://gcc.gnu.org/ml/gcc-patches/2012-10/msg00799.html
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk/8?  I verified manually
> that this fixes the testcase from the PR.

Hmm, do these USING_DECLs have DECL_DEPENDENT_P set?  What do they
represent?  In the case of dependent USING_DECL I'd think we want to
compare the scope and name rather than just return false.
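
One way to read that suggestion, as a rough sketch only: the UsingDecl
struct and the function below are hypothetical and do not correspond to
the tree accessors in the C++ front end.

```cpp
#include <string>

// Hypothetical flattened view of a using-declaration.
struct UsingDecl {
  const void* scope;  // identity of the scope named in the declaration
  std::string name;   // the name being brought in
  bool dependent;     // assumption: models a "dependent" flag
};

// Dependent using-declarations compare equal when they name the same
// member of the same scope; anything else falls back to identity.
bool using_decls_equal(const UsingDecl& a, const UsingDecl& b) {
  if (!a.dependent || !b.dependent)
    return &a == &b;
  return a.scope == b.scope && a.name == b.name;
}
```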

I think we do want a reduced testcase.  Maybe add the needs-reduction
tag if you're having trouble reducing it yourself?

Jason


Re: [PATCH] [MSP430] Fix device name symbol defined for MSP430i series of devices

2018-06-04 Thread DJ Delorie


Jozef Lawrynowicz  writes:
> +  if (strncmp (target_mcu, "msp430i", 7) == 0)
> + snprintf (mcu_name, sizeof (mcu_name) - 1, "__MSP430i%s__",
> +   target_mcu + 7);
> +  else

Do you need to TOUPPER the parts of target_mcu after char 7 ?


Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-06-04 Thread Michael Meissner
On Mon, Jun 04, 2018 at 08:46:42AM +0200, Richard Biener wrote:
> On Fri, 1 Jun 2018, Joseph Myers wrote:
> 
> > On Fri, 1 Jun 2018, Michael Meissner wrote:
> > 
> > > > I'm wondering if there are other suggestions to make this patch
> > > > acceptable.
> > > > 
> > > > As I mentioned previously, the initialization process needs to go through
> > > > all of the widening tables in order to initialize all FP types, so we
> > > > can't just arbitrarily eliminate IFmode from the widening table.
> > 
> > Initialization that's meant to cover all floating-point modes logically 
> > should not rely on everything being reachable by the "wider" relation.
> > 
> > That is, I'd expect it to do something equivalent to 
> > FOR_EACH_MODE_IN_CLASS (mode, MODE_FLOAT) rather than something based on 
> > "wider".
> 
> The unfortunate thing is that all iterators are wrapped around
> the "wider" relationship - we do not have any other way to find
> related modes.  So the concept that all modes in a class can be
> ordered after their width is baked in very deeply.  That's IMHO
> something we need to eventually fix to avoid this kind of
> target-hook "hacks".
> 
> Richard.

Given the related modes are in sequential order now, I could imagine ways to do
the initialization and iterators using those values.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



New Spanish PO file for 'gcc' (version 8.1.0)

2018-06-04 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Spanish team of translators.  The file is available at:

http://translationproject.org/latest/gcc/es.po

(This file, 'gcc-8.1.0.es.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




[PATCH] rs6000: Fix absif2

2018-06-04 Thread Segher Boessenkool
Without this patch absif2 always FAILs.  There is no testcase for
that, nor do we see it during bootstrap, but it is obvious.
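
The shape of the bug is easier to see outside the machine description.
The sketch below is a stand-alone model, not rs6000 code: the Mode and
Expansion enums and both functions are hypothetical.

```cpp
// Hypothetical stand-ins for the modes and the expander's outcomes.
enum class Mode { TF, IF, Other };
enum class Expansion { AbsTF, AbsIF, Fail };

// Before the fix: the second test repeats the first, so the IF branch
// can never be taken and IF falls through to Fail.
Expansion expand_abs_buggy(Mode mode) {
  if (mode == Mode::TF)
    return Expansion::AbsTF;
  else if (mode == Mode::TF)  // duplicate condition: unreachable branch
    return Expansion::AbsIF;
  else
    return Expansion::Fail;
}

// After the fix: the second test checks IF, so both modes are handled.
Expansion expand_abs_fixed(Mode mode) {
  if (mode == Mode::TF)
    return Expansion::AbsTF;
  else if (mode == Mode::IF)
    return Expansion::AbsIF;
  else
    return Expansion::Fail;
}
```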

Bootstrapped and tested on powerpc64-linux {-m32,-m64}; committing
to trunk.


Segher


2018-06-04  Segher Boessenkool  

* config/rs6000/rs6000.md (abs2): Handle IFmode.

---
 gcc/config/rs6000/rs6000.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 01bd6ba..e082fae 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -8030,7 +8030,7 @@ (define_expand "abs2"
   label = gen_label_rtx ();
   if (mode == TFmode)
 emit_insn (gen_abstf2_internal (operands[0], operands[1], label));
-  else if (mode == TFmode)
+  else if (mode == IFmode)
 emit_insn (gen_absif2_internal (operands[0], operands[1], label));
   else
 FAIL;
-- 
1.8.3.1



[PATCH] Define std::fpos special members as defaulted

2018-06-04 Thread Jonathan Wakely

* include/bits/postypes.h (fpos): Define special members as defaulted.

Tested powerpc64le-linux, committed to trunk.
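
For readers unfamiliar with the idiom, here is a small sketch of what
explicitly defaulted special members look like on a similar class.  Pos
is a hypothetical type loosely modelled on the shape of fpos, not the
libstdc++ definition.  Members defaulted on their first declaration are
not user-provided, so the type stays trivially copyable.

```cpp
#include <type_traits>

// Hypothetical position type: an offset plus some conversion state.
class Pos {
 public:
  Pos() : off_(0), state_(0) {}
  Pos(long long off) : off_(off), state_(0) {}

  // Explicitly defaulted: same behaviour as the implicit versions.
  Pos(const Pos&) = default;
  Pos& operator=(const Pos&) = default;
  ~Pos() = default;

  operator long long() const { return off_; }

 private:
  long long off_;
  int state_;
};

static_assert(std::is_trivially_copyable<Pos>::value,
              "defaulted special members keep Pos trivially copyable");

// Copy-constructs and copy-assigns a Pos, returning the final offset.
long long copy_offset(long long off) {
  Pos a(off);
  Pos b(a);
  Pos c;
  c = b;
  return c;
}
```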

commit 24dddbfdccf816b749601980ed05d2bce72e3d24
Author: Jonathan Wakely 
Date:   Mon Jun 4 16:50:29 2018 +0100

Define std::fpos special members as defaulted

* include/bits/postypes.h (fpos): Define special members as defaulted.

diff --git a/libstdc++-v3/include/bits/postypes.h b/libstdc++-v3/include/bits/postypes.h
index ba7c54781e8..8abdbf02322 100644
--- a/libstdc++-v3/include/bits/postypes.h
+++ b/libstdc++-v3/include/bits/postypes.h
@@ -133,6 +133,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   fpos(streamoff __off)
   : _M_off(__off), _M_state() { }
 
+#if __cplusplus >= 201103L
+  fpos(const fpos&) = default;
+  fpos& operator=(const fpos&) = default;
+  ~fpos() = default;
+#endif
+
   /// Convert to streamoff.
   operator streamoff() const { return _M_off; }
 


[PATCH] PR libstdc++/85930 fix misaligned reference

2018-06-04 Thread Jonathan Wakely

PR libstdc++/85930
* include/bits/shared_ptr_base.h (_Sp_make_shared_tag::_S_ti): Align
the static variable correctly.

Tested powerpc64le-linux, committed to trunk.
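
The alignment issue can be shown with a stand-alone sketch.  Target and
Tag below are hypothetical substitutes for type_info and the tag type;
the point is only that an empty class normally has alignment 1, so a
reference of a more strictly aligned type bound to it may be misaligned
unless the object is over-aligned.

```cpp
#include <cstdint>

// Hypothetical target type with pointer alignment.
struct Target { void* p; };

// Empty tag type; its natural alignment is typically 1.
struct Tag {};

// Returns the address of a static Tag that is aligned as a Target.
const Tag* aligned_tag() {
  alignas(Target) static const Tag tag{};
  return &tag;
}

// True if P satisfies the alignment requirement of Target.
bool is_aligned_for_target(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(Target) == 0;
}
```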

commit b0aa443f66d21f904ea0144b16df4b14ef9df09c
Author: Jonathan Wakely 
Date:   Mon Jun 4 16:45:42 2018 +0100

PR libstdc++/85930 fix misaligned reference

PR libstdc++/85930
* include/bits/shared_ptr_base.h (_Sp_make_shared_tag::_S_ti): Align
the static variable correctly.

diff --git a/libstdc++-v3/include/bits/shared_ptr_base.h b/libstdc++-v3/include/bits/shared_ptr_base.h
index b58273a79c5..22cb7eb46b1 100644
--- a/libstdc++-v3/include/bits/shared_ptr_base.h
+++ b/libstdc++-v3/include/bits/shared_ptr_base.h
@@ -516,7 +516,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 static const type_info&
 _S_ti() noexcept
 {
-  static constexpr _Sp_make_shared_tag __tag;
+  static constexpr alignas(type_info) _Sp_make_shared_tag __tag;
   return reinterpret_cast(__tag);
 }
 #endif


[PATCH, rs6000] Correct documentation of vec_lvsl and vec_lvsr arguments

2018-06-04 Thread Kelvin Nilsen
The existing documentation incorrectly specifies that the second argument of 
vec_lvsl and vec_lvsr instructions are volatile  *.  This patch removes 
the volatile qualifier from the documentation of these arguments.

This patch has been bootstrapped and tested without regressions on 
powerpc64le-unknown-linux (P8).  I have built the gcc.pdf file and reviewed its 
contents.

Is this ok for trunk?

 gcc/ChangeLog:

2018-06-04  Kelvin Nilsen  

* doc/extend.texi (PowerPC AltiVec Built-in Functions): Remove
volatile qualifier from vec_lvsl and vec_lvsr argument prototypes.

Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi (revision 261067)
+++ gcc/doc/extend.texi (working copy)
@@ -16662,25 +16662,25 @@ vector unsigned char vec_ldl (int, const unsigned
 
 vector float vec_loge (vector float);
 
-vector unsigned char vec_lvsl (int, const volatile unsigned char *);
-vector unsigned char vec_lvsl (int, const volatile signed char *);
-vector unsigned char vec_lvsl (int, const volatile unsigned short *);
-vector unsigned char vec_lvsl (int, const volatile short *);
-vector unsigned char vec_lvsl (int, const volatile unsigned int *);
-vector unsigned char vec_lvsl (int, const volatile int *);
-vector unsigned char vec_lvsl (int, const volatile unsigned long *);
-vector unsigned char vec_lvsl (int, const volatile long *);
-vector unsigned char vec_lvsl (int, const volatile float *);
+vector unsigned char vec_lvsl (int, const unsigned char *);
+vector unsigned char vec_lvsl (int, const signed char *);
+vector unsigned char vec_lvsl (int, const unsigned short *);
+vector unsigned char vec_lvsl (int, const short *);
+vector unsigned char vec_lvsl (int, const unsigned int *);
+vector unsigned char vec_lvsl (int, const int *);
+vector unsigned char vec_lvsl (int, const unsigned long *);
+vector unsigned char vec_lvsl (int, const long *);
+vector unsigned char vec_lvsl (int, const float *);
 
-vector unsigned char vec_lvsr (int, const volatile unsigned char *);
-vector unsigned char vec_lvsr (int, const volatile signed char *);
-vector unsigned char vec_lvsr (int, const volatile unsigned short *);
-vector unsigned char vec_lvsr (int, const volatile short *);
-vector unsigned char vec_lvsr (int, const volatile unsigned int *);
-vector unsigned char vec_lvsr (int, const volatile int *);
-vector unsigned char vec_lvsr (int, const volatile unsigned long *);
-vector unsigned char vec_lvsr (int, const volatile long *);
-vector unsigned char vec_lvsr (int, const volatile float *);
+vector unsigned char vec_lvsr (int, const unsigned char *);
+vector unsigned char vec_lvsr (int, const signed char *);
+vector unsigned char vec_lvsr (int, const unsigned short *);
+vector unsigned char vec_lvsr (int, const short *);
+vector unsigned char vec_lvsr (int, const unsigned int *);
+vector unsigned char vec_lvsr (int, const int *);
+vector unsigned char vec_lvsr (int, const unsigned long *);
+vector unsigned char vec_lvsr (int, const long *);
+vector unsigned char vec_lvsr (int, const float *);
 
 vector float vec_madd (vector float, vector float, vector float);
 
@@ -18210,8 +18210,8 @@ vector double vec_ld (int, const vector double *);
 vector double vec_ld (int, const double *);
 vector double vec_ldl (int, const vector double *);
 vector double vec_ldl (int, const double *);
-vector unsigned char vec_lvsl (int, const volatile double *);
-vector unsigned char vec_lvsr (int, const volatile double *);
+vector unsigned char vec_lvsl (int, const double *);
+vector unsigned char vec_lvsr (int, const double *);
 vector double vec_madd (vector double, vector double, vector double);
 vector double vec_max (vector double, vector double);
 vector signed long vec_mergeh (vector signed long, vector signed long);



[PATCH] [MSP430] Fix device name symbol defined for MSP430i series of devices

2018-06-04 Thread Jozef Lawrynowicz

In the GCC board support package distributed by TI for msp430 devices, the
preprocessor symbol definition for the MSP430i series of devices expected by
the msp430.h header file differs from the standard template.

For other devices, the expected symbol has all characters in upper case, but
for the MSP430i devices, the 'i' must be lower case.

TI can't change the format of the symbol expected in msp430.h in the BSP, so
this workaround is needed to ensure the -mmcu option works for MSP430i*
devices.
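
The naming rule described above can be sketched as a stand-alone
function.  This is an illustration in C++ rather than the code in
msp430.c; the function name mcu_symbol is hypothetical, and "msp430f5529"
is used only as an example of a non-i device name.

```cpp
#include <cctype>
#include <string>

// Builds the preprocessor symbol for a device name given to -mmcu.
// msp430i* devices keep a lower-case 'i'; all other names are
// upper-cased in full.
std::string mcu_symbol(const std::string& target_mcu) {
  const std::string prefix = "msp430i";
  if (target_mcu.compare(0, prefix.size(), prefix) == 0)
    return "__MSP430i" + target_mcu.substr(prefix.size()) + "__";

  std::string name = "__" + target_mcu + "__";
  for (char& c : name)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  return name;
}
```

As in the patch, the characters after the "msp430i" prefix are copied
unchanged, which is sufficient while only digits follow the 'i'.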

Ok for trunk?

From 6ceb89b2955472b83d1cd1734231b46188e0c9bf Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Mon, 4 Jun 2018 12:04:54 +0100
Subject: [PATCH] MSP430: Fix device name symbol defined for msp430i* devices

2018-06-04  Jozef Lawrynowicz  

	* gcc/config/msp430/msp430.c (msp430_mcu_name): Set the "i" in the
	symbol defined for msp430i* devices to be lower case.

	gcc/testsuite/gcc.target/msp430/
	* msp430i-device-symbol.c: New test.
	* msp430.h: New test header file.

---
 gcc/config/msp430/msp430.c  | 16 
 gcc/testsuite/gcc.target/msp430/msp430.h|  6 ++
 gcc/testsuite/gcc.target/msp430/msp430i-device-symbol.c |  4 
 3 files changed, 22 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/msp430/msp430.h
 create mode 100644 gcc/testsuite/gcc.target/msp430/msp430i-device-symbol.c

diff --git a/gcc/config/msp430/msp430.c b/gcc/config/msp430/msp430.c
index a8fed12..c8ca58d 100644
--- a/gcc/config/msp430/msp430.c
+++ b/gcc/config/msp430/msp430.c
@@ -725,11 +725,19 @@ msp430_mcu_name (void)
   if (target_mcu)
 {
   unsigned int i;
-  static char mcu_name [64];
+  static char mcu_name[64];
 
-  snprintf (mcu_name, sizeof (mcu_name) - 1, "__%s__", target_mcu);
-  for (i = strlen (mcu_name); i--;)
-	mcu_name[i] = TOUPPER (mcu_name[i]);
+  /* The 'i' in the device name symbol for msp430i* devices must be lower
+	 case, to match the expected symbol in msp430.h.  */
+  if (strncmp (target_mcu, "msp430i", 7) == 0)
+	snprintf (mcu_name, sizeof (mcu_name) - 1, "__MSP430i%s__",
+		  target_mcu + 7);
+  else
+	{
+	  snprintf (mcu_name, sizeof (mcu_name) - 1, "__%s__", target_mcu);
+	  for (i = strlen (mcu_name); i--;)
+	mcu_name[i] = TOUPPER (mcu_name[i]);
+	}
   return mcu_name;
 }
 
diff --git a/gcc/testsuite/gcc.target/msp430/msp430.h b/gcc/testsuite/gcc.target/msp430/msp430.h
new file mode 100644
index 000..ee44f6b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/msp430.h
@@ -0,0 +1,6 @@
+#if defined (__MSP430i2020__)
+
+#else
+#error "Device not supported by msp430.h"
+
+#endif
diff --git a/gcc/testsuite/gcc.target/msp430/msp430i-device-symbol.c b/gcc/testsuite/gcc.target/msp430/msp430i-device-symbol.c
new file mode 100644
index 000..36a86c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/msp430i-device-symbol.c
@@ -0,0 +1,4 @@
+/* { dg-do preprocess } */
+/* { dg-skip-if "" { "*-*-*" } { "-mlarge" "-mcpu=msp430x*" "-mmcu=*" "-mhwmult=32bit" "-mhwmult=f5series" } { "" } } */
+/* { dg-options "-mmcu=msp430i2020" } */
+#include "msp430.h"
-- 
2.7.4
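
For anyone who wants to try the naming rule outside the compiler, here is a minimal stand-alone sketch of the logic in the msp430_mcu_name hunk above. The helper name mcu_symbol and the use of std::string are assumptions made for illustration; the real code writes into a static buffer inside GCC.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper mirroring the patched logic: msp430i* devices keep
// a lower-case 'i', every other device name is upper-cased as before.
std::string mcu_symbol(const char *target_mcu)
{
  assert(target_mcu != nullptr);
  char buf[64];
  if (std::strncmp(target_mcu, "msp430i", 7) == 0) {
    // Emit the fixed "MSP430i" prefix and append the rest unchanged.
    std::snprintf(buf, sizeof buf, "__MSP430i%s__", target_mcu + 7);
  } else {
    std::snprintf(buf, sizeof buf, "__%s__", target_mcu);
    for (std::size_t i = std::strlen(buf); i--;)
      buf[i] = static_cast<char>(
          std::toupper(static_cast<unsigned char>(buf[i])));
  }
  return buf;
}
```

With this sketch, "msp430i2020" maps to the symbol the new msp430.h test header checks for, while a non-i device name is still fully upper-cased.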



C++ PATCH for c++/85976, ICE with USING_DECL in cp_tree_equal

2018-06-04 Thread Marek Polacek
I've had no luck in reducing the testcase in this PR, creduce won't get even
past the initial passes, and reducing by hand didn't get me very far, either.

But the problem seems to be merely that we're not handling USING_DECLs in
cp_tree_equal, and we can get there via comp_template_arguments.  In this case
we have two USING_DECLs with different full names.

So this patch just adds the USING_DECL case, similarly to e.g.
https://gcc.gnu.org/ml/gcc-patches/2012-10/msg00799.html

Bootstrapped/regtested on x86_64-linux, ok for trunk/8?  I verified manually
that this fixes the testcase from the PR.

2018-06-04  Marek Polacek  

PR c++/85976
* tree.c (cp_tree_equal): Handle USING_DECL.

--- gcc/cp/tree.c
+++ gcc/cp/tree.c
@@ -3752,6 +3752,7 @@ cp_tree_equal (tree t1, tree t2)
 case TEMPLATE_DECL:
 case IDENTIFIER_NODE:
 case SSA_NAME:
+case USING_DECL:
   return false;
 
 case BASELINK:

Marek


[PATCH,committed] [MAINTAINERS] Update email address

2018-06-04 Thread Matthew Fortune
Updating my email address, apologies for being out of date for a while.

Matthew

* MAINTAINERS: Update my email address.

---
 ChangeLog   | 4 
 MAINTAINERS | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index f9f376a..54b7958 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -77,7 +77,7 @@ m68k port Andreas Schwab

 m68k-motorola-sysv portPhilippe De Muyter  
 mcore port Nick Clifton
 microblaze Michael Eager   
-mips port  Matthew Fortune 
+mips port  Matthew Fortune 
 mmix port  Hans-Peter Nilsson  
 mn10300 port   Jeff Law
 mn10300 port   Alexandre Oliva 
-- 
2.2.1




Re: backport fix for PR 85623 to GCC 8

2018-06-04 Thread Martin Sebor

On 05/29/2018 10:11 AM, Martin Sebor wrote:

As discussed at (*) I'd like to backport the following patch
to GCC 8 to suppress a class of -Wstringop-truncation warnings.
If there are no concerns/objections I will go ahead and commit
it this week.

https://gcc.gnu.org/ml/gcc-patches/2018-05/msg00509.html

Thanks
Martin

*) https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01170.html


I've committed this change in r261152.

Martin


Re: [PATCH][RFC] Fix CFG cleanup compile-time hog, PR85964

2018-06-04 Thread Richard Biener
On Tue, 29 May 2018, Richard Biener wrote:

> 
> The following fixes the situation where the initial sweep over the
> CFG to remove trivially dead code regions causes excessive compile-time
> because of using remove_edge_and_dominated_blocks and thus
> iterate_fix_dominators.
> 
> The good thing is that I added cleanup_control_flow_pre doing this
> initial sweep in PRE order.  So we can simply remove the entry edges
> into the dead code regions and use the visited bitmap kept by the PRE
> walk to remove unreachable blocks afterwards.
> 
> For dominators we then re-compute them from scratch which is way faster
> for the testcase (a reduced one gets CFG cleanup time down from
> 19s to 0.16s).  The testcase still runs into a backwards jump-threading
> scalability issue.
> 
> Note the patch will be slower for the case of not many removed edges
> but it looks hard to find a good cut-off upfront.
> 
> Note we unfortunately cannot merge this with the unreachable block
> removal because of the intermediate code to insert loop entry
> forwarders which needs dominators to identify natural loop backedges.
> 
> Any opinions?

This is a less intrusive variant also factoring in the unreachable
block removal at the start of CFG cleanup.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

I'll go ahead with this tomorrow unless there are any comments.

Richard.
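
To make the approach easier to follow, here is a toy sketch of the idea: walk the graph depth-first from the entry, only drop edges while walking, and afterwards delete every block the walk never marked as visited. ToyCfg, prune_unreachable and the edge_is_dead predicate are hypothetical names for illustration only; none of this is GCC's CFG API, and dominator handling is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy CFG: block 0 is the entry; succs[b] lists successor block indices.
struct ToyCfg {
  std::vector<std::vector<int>> succs;
  std::vector<bool> alive;
};

// Returns the number of blocks deleted as unreachable.
int prune_unreachable(ToyCfg &cfg,
                      const std::function<bool(int, int)> &edge_is_dead)
{
  assert(!cfg.succs.empty() && cfg.succs.size() == cfg.alive.size());
  std::vector<bool> visited(cfg.succs.size(), false);
  std::vector<int> stack{0};
  visited[0] = true;
  while (!stack.empty()) {
    int bb = stack.back();
    stack.pop_back();
    // Edge removal only: no other block is touched at this point.
    std::vector<int> kept;
    for (int dest : cfg.succs[bb])
      if (!edge_is_dead(bb, dest))
        kept.push_back(dest);
    cfg.succs[bb] = kept;
    // Continue the walk over the edges that survived.
    for (int dest : kept)
      if (!visited[dest]) {
        visited[dest] = true;
        stack.push_back(dest);
      }
  }
  // Everything the walk did not reach is unreachable now (or was before).
  int deleted = 0;
  for (std::size_t i = 0; i < cfg.succs.size(); ++i)
    if (cfg.alive[i] && !visited[i]) {
      cfg.alive[i] = false;
      cfg.succs[i].clear();
      ++deleted;
    }
  return deleted;
}
```

The point of the design is that the per-edge work stays cheap; any expensive whole-graph information would be recomputed once after the sweep rather than updated for each removed edge.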

2018-06-04  Richard Biener  

* tree-cfgcleanup.c (cleanup_control_flow_pre): For edge
removal pretend DOM info isn't available so we do not update
it and only remove edges, not dominated blocks.  Actually free
DOM info in case we removed something.  Remove unreachable blocks.
(mfb_keep_latches): Work with either DOM info or marked backedges.
(cleanup_tree_cfg_noloop): Do not remove unreachable blocks
first.  Mark backedges if DOM info isn't available.
(Re-)compute DOM info after cleanup_control_flow_pre.

Index: gcc/tree-cfgcleanup.c
===
--- gcc/tree-cfgcleanup.c   (revision 261145)
+++ gcc/tree-cfgcleanup.c   (working copy)
@@ -684,8 +684,8 @@ want_merge_blocks_p (basic_block bb1, ba
 }
 
 
-/* Tries to cleanup cfg in basic block BB.  Returns true if anything
-   changes.  */
+/* Tries to cleanup cfg in basic block BB by merging blocks.  Returns
+   true if anything changes.  */
 
 static bool
 cleanup_tree_cfg_bb (basic_block bb)
@@ -725,6 +725,12 @@ cleanup_control_flow_pre ()
 {
   bool retval = false;
 
+  /* We want remove_edge_and_dominated_blocks to only remove edges,
+ not dominated blocks which it does when dom info isn't available.
+ Pretend so.  */
+  dom_state saved_state = dom_info_state (CDI_DOMINATORS);
+  set_dom_info_availability (CDI_DOMINATORS, DOM_NONE);
+
   auto_vec stack (n_basic_blocks_for_fn (cfun) + 1);
   auto_sbitmap visited (last_basic_block_for_fn (cfun));
   bitmap_clear (visited);
@@ -741,6 +747,8 @@ cleanup_control_flow_pre ()
  && ! bitmap_bit_p (visited, dest->index))
{
  bitmap_set_bit (visited, dest->index);
+ /* We only possibly remove edges from DEST here, leaving
+possibly unreachable code in the IL.  */
  retval |= cleanup_control_flow_bb (dest);
  if (EDGE_COUNT (dest->succs) > 0)
stack.quick_push (ei_start (dest->succs));
@@ -754,13 +762,35 @@ cleanup_control_flow_pre ()
}
 }
 
+  set_dom_info_availability (CDI_DOMINATORS, saved_state);
+
+  /* We are deleting BBs in non-reverse dominator order, make sure
+ insert_debug_temps_for_defs is prepared for that.  */
+  if (retval)
+free_dominance_info (CDI_DOMINATORS);
+
+  /* Remove all now (and previously) unreachable blocks.  */
+  for (int i = NUM_FIXED_BLOCKS; i < last_basic_block_for_fn (cfun); ++i)
+{
+  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
+  if (bb && !bitmap_bit_p (visited, bb->index))
+   {
+ if (!retval)
+   free_dominance_info (CDI_DOMINATORS);
+ delete_basic_block (bb);
+ retval = true;
+   }
+}
+
   return retval;
 }
 
 static bool
 mfb_keep_latches (edge e)
 {
-  return ! dominated_by_p (CDI_DOMINATORS, e->src, e->dest);
+  return !((dom_info_available_p (CDI_DOMINATORS)
+   && dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
+  || (e->flags & EDGE_DFS_BACK));
 }
 
 /* Remove unreachable blocks and other miscellaneous clean up work.
@@ -769,23 +799,8 @@ mfb_keep_latches (edge e)
 static bool
 cleanup_tree_cfg_noloop (void)
 {
-  bool changed;
-
   timevar_push (TV_TREE_CLEANUP_CFG);
 
-  /* If dominance information is available, there cannot be any unreachable
- blocks.  */
-  if (!dom_info_available_p (CDI_DOMINATORS))
-{
-  changed = delete_unreachable_blocks ();
-  calculate_dominance_info (CDI_DOMINATORS);
-}
-  else
-{
-  checking_verify_dominators (CDI_DOMINATORS);
-  changed = false;
-}
-
   /* Ensure 

C++ PATCH for c++/61806, SFINAE, access control, and partial specialization.

2018-06-04 Thread Jason Merrill
Here we were missing SFINAE when choosing a partial specialization
because we requested the instantiation from an access-deferred context
and didn't push into checking context until too late.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 77b60d23c71206b7ac3c9d9817db3787383ea2fd
Author: Jason Merrill 
Date:   Mon Jun 4 16:51:15 2018 +0200

PR c++/61806 - missed SFINAE with partial specialization.

* cp-tree.h (deferring_access_check_sentinel): Add deferring_kind
parameter to constructor.
* pt.c (instantiate_class_template_1): Enable access checking
before call to most_specialized_partial_spec.

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6a97abbe4e3..f2016f173bd 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6807,9 +6807,9 @@ extern bool perform_or_defer_access_check	(tree, tree, tree,
 
 struct deferring_access_check_sentinel
 {
-  deferring_access_check_sentinel ()
+  deferring_access_check_sentinel (enum deferring_kind kind = dk_deferred)
   {
-push_deferring_access_checks (dk_deferred);
+push_deferring_access_checks (kind);
   }
   ~deferring_access_check_sentinel ()
   {
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index d3915ddcddf..aad68a32643 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -10850,6 +10850,10 @@ instantiate_class_template_1 (tree type)
   /* Mark the type as in the process of being defined.  */
   TYPE_BEING_DEFINED (type) = 1;
 
+  /* We may be in the middle of deferred access check.  Disable
+ it now.  */
+  deferring_access_check_sentinel acs (dk_no_deferred);
+
   /* Determine what specialization of the original template to
  instantiate.  */
   t = most_specialized_partial_spec (type, tf_warning_or_error);
@@ -10889,10 +10893,6 @@ instantiate_class_template_1 (tree type)
   if (! push_tinst_level (type))
 return type;
 
-  /* We may be in the middle of deferred access check.  Disable
- it now.  */
-  push_deferring_access_checks (dk_no_deferred);
-
   int saved_unevaluated_operand = cp_unevaluated_operand;
   int saved_inhibit_evaluation_warnings = c_inhibit_evaluation_warnings;
 
@@ -11373,7 +11373,6 @@ instantiate_class_template_1 (tree type)
   maximum_field_alignment = saved_maximum_field_alignment;
   if (!fn_context)
 pop_from_top_level ();
-  pop_deferring_access_checks ();
   pop_tinst_level ();
 
   /* The vtable for a template class can be emitted in any translation
diff --git a/gcc/testsuite/g++.dg/cpp0x/sfinae63.C b/gcc/testsuite/g++.dg/cpp0x/sfinae63.C
new file mode 100644
index 000..7ad38497c0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/sfinae63.C
@@ -0,0 +1,39 @@
+// PR c++/61806
+// { dg-do compile { target c++11 } }
+
+struct true_type 
+{
+  static const bool value = true;
+};
+
+struct false_type 
+{
+  static const bool value = false;
+};
+
+template
+T&& declval();
+
+template struct check { typedef void type; };
+
+template
+struct has_public_f : false_type {};
+
+template
+struct has_public_f<
+T,
+typename check<
+decltype(
+declval().f()
+)
+>::type
+> : true_type {};
+
+
+struct Spub  { public: void f(); };
+struct Spriv { private: void f(); };
+
+static_assert( has_public_f::value, "Ouch");
+static_assert(!has_public_f::value, "Ouch");
+
+int main() {}
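
The archive has dropped the template parameter lists from the testcase above, so here is a self-contained sketch of the same detection idiom. The parameter lists, the defaulted second parameter and the use of std::declval are my assumptions, not a reconstruction of the committed file.

```cpp
#include <type_traits>
#include <utility>

// Maps any type to void; it exists only so that the partial
// specialization depends on the decltype expression being valid.
template <typename T> struct check { typedef void type; };

// Primary template: used when the specialization below drops out.
template <typename T, typename = void>
struct has_public_f : std::false_type {};

// Chosen only if calling f() on a T is well-formed AND accessible.
template <typename T>
struct has_public_f<
    T, typename check<decltype(std::declval<T>().f())>::type>
    : std::true_type {};

struct Spub  { public:  void f(); };
struct Spriv { private: void f(); };

static_assert(has_public_f<Spub>::value, "public f() is detected");
static_assert(!has_public_f<Spriv>::value, "private f() is rejected quietly");
```

The second static_assert is the interesting one: the access violation has to be treated as a substitution failure when the partial specialization is considered, rather than as a hard error.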


[PATCH] PR fortran/85138,85996 -- Use private namespace

2018-06-04 Thread Steve Kargl
All,

The attached patch fixes a regression introduced by r258347.  In that
revision, an attempt is made to reduce a scalar integer expression
to a constant in a CHARACTER(LEN=...) type declaration.  If successful,
life is good.  If unsuccessful, the original scalar integer expression is
simply saved for later resolution. 

Unfortunately, this created a regression, because an attempt to reduce
the scalar integer expression mucks up the namespace.  In particular,
this occurs for user-defined generic functions with the module procedures
declared after the use of the generic name (see testcases).  Note, this
may actually be invalid Fortran, but I'm not going to try to unravel:

   A variable in a specification expression shall have its
   type and type parameters, if any, specified by a previous
   declaration in the same scoping unit, by the implicit
   typing rules in effect for the scoping unit, or by host
   or use association.

It seems to be a catch-22.  The module procedures should be available
through host association except the generic name is used before the
module procedures are parsed.

The fix is to use a private namespace.  Whether an attempt to
reduce the scalar integer expression is successful or not,
the private namespace is discarded.

Built and regression tested on x86_64-*-freebsd and i586-*-freebsd.
OK to commit?

2018-06-04  Steven G. Kargl  

PR fortran/85138
PR fortran/85996
* decl.c (gfc_match_char_spec): Use private namespace in attempt to
reduce a charlen to a constant.

2018-06-04  Steven G. Kargl  

PR fortran/85138
PR fortran/85996
* gfortran.dg/pr85138_1.f90: New test.
* gfortran.dg/pr85138_2.f90: Ditto.
* gfortran.dg/pr85996.f90: Ditto.

-- 
Steve
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c	(revision 261145)
+++ gcc/fortran/decl.c	(working copy)
@@ -3238,12 +3238,20 @@ done:
 cl->length = gfc_get_int_expr (gfc_charlen_int_kind, NULL, 1);
   else
 {
-  /* If gfortran ends up here, then the len may be reducible to a
-	 constant.  Try to do that here.  If it does not reduce, simply
-	 assign len to the charlen.  */
+  /* If gfortran ends up here, then len may be reducible to a constant.
+	 Try to do that here.  If it does not reduce, simply assign len to
+	 charlen.  A complication occurs with user-defined generic functions,
+	 which are not resolved.  Use a private namespace to deal with
+	 generic functions.  */
+
   if (len && len->expr_type != EXPR_CONSTANT)
 	{
+	  gfc_namespace *old_ns;
 	  gfc_expr *e;
+
+	  old_ns = gfc_current_ns;
+	  gfc_current_ns = gfc_get_namespace (NULL, 0);
+
 	  e = gfc_copy_expr (len);
 	  gfc_reduce_init_expr (e);
 	  if (e->expr_type == EXPR_CONSTANT)
@@ -3254,10 +3262,12 @@ done:
 	}
 	  else
 	gfc_free_expr (e);
-	  cl->length = len;
+
+	  gfc_free_namespace (gfc_current_ns);
+	  gfc_current_ns = old_ns;
 	}
-  else
-	cl->length = len;
+
+  cl->length = len;
 }
 
   ts->u.cl = cl;
Index: gcc/testsuite/gfortran.dg/pr85138_1.f90
===
--- gcc/testsuite/gfortran.dg/pr85138_1.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr85138_1.f90	(working copy)
@@ -0,0 +1,29 @@
+! { dg-do compile }
+module fox_m_fsys_format
+
+  interface len
+ module procedure str_real_sp_len, str_real_sp_fmt_len
+  end interface
+
+contains
+
+  pure function str_real_sp_fmt_len(x, fmt) result(n)
+real, intent(in) :: x
+character(len=*), intent(in) :: fmt
+if (.not.checkFmt(fmt)) then
+endif
+  end function str_real_sp_fmt_len
+  pure function str_real_sp_len(x) result(n)
+real, intent(in) :: x
+n = len(x, "")
+  end function str_real_sp_len
+  pure function str_real_dp_matrix(xa) result(s)
+real, intent(in) :: xa
+character(len=len(xa)) :: s
+  end function str_real_dp_matrix
+
+  pure function checkfmt(s) result(a)
+   logical a
+   character(len=*), intent(in) :: s
+  end function checkfmt
+end module fox_m_fsys_format
Index: gcc/testsuite/gfortran.dg/pr85138_2.f90
===
--- gcc/testsuite/gfortran.dg/pr85138_2.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr85138_2.f90	(working copy)
@@ -0,0 +1,32 @@
+! { dg-do compile }
+module fox_m_fsys_format
+  interface len
+module procedure str_real_dp_len, str_real_dp_fmt_len
+  end interface
+contains
+  pure function str_real_dp_fmt_len(x, fmt) result(n)
+real, intent(in) :: x
+character(len=*), intent(in) :: fmt
+if (.not.checkFmt(fmt)) then
+endif
+  end function str_real_dp_fmt_len
+  pure function str_real_dp_len(x) result(n)
+real, intent(in) :: x
+  end function str_real_dp_len
+  pure function str_real_dp_array_len(xa) result(n)
+real, dimension(:), intent(in) :: xa
+  end function str_real_dp_array_len
+  pure function str_real_dp_array_fmt_len(xa, 

Re: [Patch, Fortran] PR25829: Asynchronous I/O

2018-06-04 Thread Rainer Orth
Hi Nicolas,

> P.S.: I would very much recommend removing the #undef DEBUG in async.h. I
> have to admit, I am quite proud of the debug printouts. They even build a
> data structure in the background telling you where a locked mutex was
> locked.

however, doing so breaks quite a number of tests in gfortran.dg where
the additional output is unexpected.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [Patch, Fortran] PR25829: Asynchronous I/O (v2)

2018-06-04 Thread Rainer Orth
Hi Dominique, Nicolas,

> I have applied your patch on top of revision r261130 on
> x86_64-apple-darwin17 (SSD with APFS file system).

I've tried it on i386-pc-solaris2.11 and sparc-sun-solaris2.11.

> I also see two regressions
>
> FAIL: gfortran.dg/f2003_inquire_1.f03   -O1  execution test
>
> only with -m32 and -O1 (STOP 5), and

It fails for me at -O[0s] (i386) resp. -O[01] (sparc), 64-bit only.

> FAIL: gfortran.dg/f2003_io_1.f03   -O*
>
> with both -m32 and -m64 (STOP 1).

Same here: FAILs at -O[0-3s] for both 32 and 64-bit.

> There are also typos in the added tests
>
> s/libgfomp/libgomp/
>
> Why do the tests start at asynchronous_6.f90?

... and asynchronous_9.f90 is missing from the ChangeLog, while
..._7.f90 is missing from the sequence.

Besides, I see

+FAIL: libgomp.fortran/asynchronous_6.f90   -O1  execution test

STOP 2

32-bit i386 only.

+FAIL: libgomp.fortran/asynchronous_9.f90   -O  execution test

32 and 64-bit i386 and sparc, no error message.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Fix PR85955

2018-06-04 Thread Richard Biener


I am testing the following patch to fix an ICE with sincos
folding with mismatched arguments.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-06-04  Richard Biener  

PR tree-optimization/85955
* builtins.c (fold_builtin_sincos): Convert pointers to
destination to appropriate type before dereferencing.

* gcc.dg/pr85955.c: New testcase.

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 261145)
+++ gcc/builtins.c  (working copy)
@@ -8266,6 +8266,9 @@ fold_builtin_sincos (location_t loc,
   call = builtin_save_expr (call);
 }
 
+  tree ptype = build_pointer_type (type);
+  arg1 = fold_convert (ptype, arg1);
+  arg2 = fold_convert (ptype, arg2);
   return build2 (COMPOUND_EXPR, void_type_node,
 build2 (MODIFY_EXPR, void_type_node,
 build_fold_indirect_ref_loc (loc, arg1),
Index: gcc/testsuite/gcc.dg/pr85955.c
===
--- gcc/testsuite/gcc.dg/pr85955.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr85955.c  (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -w" } */
+
+extern void sincos(double x, double *sinx, double *cosx);
+void apply(void (*f)(double, double *, double *),
+  double x, double *sinx, double *cosx)
+{
+  f(x, sinx, cosx);
+  return;
+}
+void apply_sincos(double x, double **sinx, double **cosx)
+{
+  apply(sincos, x, sinx, cosx);
+  return;
+}


Re: [PATCH] Optimize AVX512 vpcmpeq* against 0 into vptestnm* rather than vptestm* (PR target/85832, PR target/86036)

2018-06-04 Thread Uros Bizjak
On Mon, Jun 4, 2018 at 3:08 PM, Jakub Jelinek  wrote:
> Hi!
>
> On Wed, May 23, 2018 at 08:45:19AM +0200, Jakub Jelinek wrote:
>> As mentioned in the PR, vptestm* instructions with the same input operand
>> used twice perform the same comparison as vpcmpeq* against zero vector, with
>> the advantage that a register holding CONST0_RTX (mode) is not needed.
>>
>> 2018-05-23  Jakub Jelinek  
>>
>>   PR target/85832
>>   * config/i386/sse.md (_eq3_1):
>>   Add (=Yk,v,C) variant using vptestm insn.  Use TARGET_AVX512BW
>>   in test instead of TARGET_AVX512F for VI12_AVX512VL iterator.
>>
>>   * gcc.target/i386/avx512f-pr85832.c: New test.
>>   * gcc.target/i386/avx512vl-pr85832.c: New test.
>>   * gcc.target/i386/avx512bw-pr85832.c: New test.
>>   * gcc.target/i386/avx512vlbw-pr85832.c: New test.
>
> Unfortunately, I neither added an executable testcase nor tested the patch
> under sde, so I missed that e.g. vpcmpeqw with a zero vector as one of the
> operands does not do what vptestmw with the other argument repeated does; it
> does exactly the opposite.  vpcmpeqw sets bits in the mask register for
> elements that are equal to 0, but vptestmw sets bits in the mask register for
> elements where the AND of the first and second arguments (i.e. the argument
> that is repeated) is non-zero.  Fortunately there is vptestnmw, which does
> what we want.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux + tested on the
> testcase with sde.  Ok for trunk?
>
> 2018-06-04  Jakub Jelinek  
>
> PR target/85832
> PR target/86036
> * config/i386/sse.md (_eq3_1):
> Use vptestnm rather than vptestm in (=Yc,v,C) variant.
>
> * gcc.target/i386/avx512f-pr85832.c: Expect vptestnm rather than
> vptestm.
> * gcc.target/i386/avx512vl-pr85832.c: Likewise.
> * gcc.target/i386/avx512vlbw-pr85832.c: Likewise.
> * gcc.target/i386/avx512bw-pr85832.c: Likewise.
> * gcc.target/i386/avx512bw-pr86036.c: New test.

OK (I'd say it is an obvious patch).

Thanks,
Uros.
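
For the archive, a scalar model of the three mask semantics described in the quoted text may help. This is a sketch under my own assumptions: an 8-lane vector of 16-bit elements, plain C++ instead of intrinsics, and function names (cmpeq_zero_mask, testm_mask, testnm_mask) invented for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec = std::array<std::uint16_t, 8>;

// Model of vpcmpeqw against a zero vector: bit i is set when a[i] == 0.
constexpr std::uint8_t cmpeq_zero_mask(const Vec &a)
{
  std::uint8_t k = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] == 0)
      k = static_cast<std::uint8_t>(k | (1u << i));
  return k;
}

// Model of vptestmw: bit i is set when (a[i] & b[i]) != 0.
constexpr std::uint8_t testm_mask(const Vec &a, const Vec &b)
{
  std::uint8_t k = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if ((a[i] & b[i]) != 0)
      k = static_cast<std::uint8_t>(k | (1u << i));
  return k;
}

// Model of vptestnmw: bit i is set when (a[i] & b[i]) == 0.
constexpr std::uint8_t testnm_mask(const Vec &a, const Vec &b)
{
  std::uint8_t k = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if ((a[i] & b[i]) == 0)
      k = static_cast<std::uint8_t>(k | (1u << i));
  return k;
}

constexpr Vec sample{{0, 5, 0, 0xFFFF, 1, 0, 0, 2}};

// Repeating the operand makes vptestnm agree with "compare equal to zero",
// while vptestm yields the complementary mask.
static_assert(cmpeq_zero_mask(sample) == testnm_mask(sample, sample),
              "vptestnm(a, a) matches vpcmpeq(a, 0)");
static_assert(testm_mask(sample, sample) ==
                  static_cast<std::uint8_t>(~cmpeq_zero_mask(sample)),
              "vptestm(a, a) is the inverse");
```

Since a & a is just a, the "test" forms with a repeated operand reduce to a zero / non-zero check per element, which is why only the negated variant can stand in for the compare.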

> --- gcc/config/i386/sse.md.jj   2018-05-31 20:53:41.933453308 +0200
> +++ gcc/config/i386/sse.md  2018-06-04 10:29:02.667720644 +0200
> @@ -11287,7 +11287,7 @@ (define_insn "_eq3"TARGET_AVX512BW && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
>"@
> vpcmpeq\t{%2, %1, 
> %0|%0, %1, %2}
> -   vptestm\t{%1, %1, 
> %0|%0, %1, %1}"
> +   vptestnm\t{%1, %1, 
> %0|%0, %1, %1}"
>[(set_attr "type" "ssecmp")
> (set_attr "prefix_extra" "1")
> (set_attr "prefix" "evex")
> @@ -11302,7 +11302,7 @@ (define_insn "_eq3"TARGET_AVX512F && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
>"@
> vpcmpeq\t{%2, %1, 
> %0|%0, %1, %2}
> -   vptestm\t{%1, %1, 
> %0|%0, %1, %1}"
> +   vptestnm\t{%1, %1, 
> %0|%0, %1, %1}"
>[(set_attr "type" "ssecmp")
> (set_attr "prefix_extra" "1")
> (set_attr "prefix" "evex")
> --- gcc/testsuite/gcc.target/i386/avx512f-pr85832.c.jj  2018-05-25 14:35:23.123416639 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr85832.c 2018-06-04 11:00:00.773880446 +0200
> @@ -1,8 +1,8 @@
>  /* PR target/85832 */
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512bw -masm=att" } */
> -/* { dg-final { scan-assembler-times {\mvptestmd\M} 1 } } */
> -/* { dg-final { scan-assembler-times {\mvptestmq\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmd\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmq\M} 1 } } */
>
>  #include 
>
> --- gcc/testsuite/gcc.target/i386/avx512vl-pr85832.c.jj 2018-05-25 14:35:23.123416639 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512vl-pr85832.c    2018-06-04 11:00:09.995895313 +0200
> @@ -1,8 +1,8 @@
>  /* PR target/85832 */
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512vl -mno-avx512bw -masm=att" } */
> -/* { dg-final { scan-assembler-times {\mvptestmd\M} 2 } } */
> -/* { dg-final { scan-assembler-times {\mvptestmq\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmd\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmq\M} 2 } } */
>
>  #include 
>
> --- gcc/testsuite/gcc.target/i386/avx512vlbw-pr85832.c.jj   2018-05-25 14:35:23.124416640 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512vlbw-pr85832.c  2018-06-04 11:00:06.02098 +0200
> @@ -1,8 +1,8 @@
>  /* PR target/85832 */
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
> -/* { dg-final { scan-assembler-times {\mvptestmb\M} 2 } } */
> -/* { dg-final { scan-assembler-times {\mvptestmw\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmb\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mvptestnmw\M} 2 } } */
>
>  #include 
>
> --- gcc/testsuite/gcc.target/i386/avx512bw-pr85832.c.jj 2018-05-25 14:35:23.124416640 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512bw-pr85832.c    2018-06-04 10:59:53.015867934 +0200
> @@ -1,8 +1,8 @@
>  /* PR target/85832 */
>  /* { dg-do compile } 

Re: [patch] Enhance GIMPLE store-merging pass for bit-fields (2)

2018-06-04 Thread Richard Biener
On Mon, Jun 4, 2018 at 8:32 AM Eric Botcazou  wrote:
>
> Hi,
>
> the previous patch makes it possible to merge bit-field stores whose RHS is a
> constant or a SSA name, but there is a hitch: if the SSA name is the result of
> an "interesting" load, then the optimization is blocked.  That's because the
> GIMPLE store-merging pass not only attempts to merge stores but also loads if
> they directly feed subsequent stores.  Therefore the code generated for:
>
> struct S {
>   unsigned int flag : 1;
>   unsigned int size : 31;
> };
>
> void foo (struct S *s, struct S *m)
> {
>   s->flag = 1;
>   s->size = m->size;
> }
>
> is still abysmal at -O2:
>
> orb $1, (%rdi)
> movl(%rsi), %eax
> andl$-2, %eax
> movl%eax, %edx
> movl(%rdi), %eax
> andl$1, %eax
> orl %edx, %eax
> movl%eax, (%rdi)
> ret
>
> The attached patch changes it into the optimal:
>
> movl(%rsi), %eax
> orl $1, %eax
> movl%eax, (%rdi)
> ret
>
> The patch doesn't modify the overall logic of the pass but just turns MEM_REF
> stores into BIT_INSERT_EXPR stores when there is a preceding or subsequent
> BIT_INSERT_EXPR or INTEGER_CST store in the same bit-field region.
>
> Tested on x86-64/Linux, OK for the mainline?

OK.

Richard.
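
As a source-level picture of what the merged form amounts to, here is a small sketch: copy the whole word from *m, then set the flag bit. The names foo_merged and same are hypothetical, and this only illustrates the observable equivalence; it says nothing about how the pass itself is implemented.

```cpp
#include <cassert>

struct S {
  unsigned int flag : 1;
  unsigned int size : 31;
};

// The function from the message: two separate bit-field stores.
void foo(S *s, const S *m)
{
  assert(s && m);
  s->flag = 1;
  s->size = m->size;
}

// What the optimal code effectively does: one load of *m, one OR for
// the flag bit, one store to *s.
void foo_merged(S *s, const S *m)
{
  assert(s && m);
  *s = *m;
  s->flag = 1;
}

// Field-wise comparison, independent of the bit-field layout.
bool same(const S &x, const S &y)
{
  return x.flag == y.flag && x.size == y.size;
}
```

Both functions leave *s with flag set and size taken from *m, which is why the read-modify-write of (%rdi) in the first assembly listing is unnecessary.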

>
> 2018-06-04  Eric Botcazou  
>
> * gimple-ssa-store-merging.c (struct merged_store_group): Move up
> bit_insertion field and declare can_be_merged_into method.
> (merged_store_group::can_be_merged_into): New method.
> (imm_store_chain_info::coalesce_immediate): Call it to decide whether
> consecutive non-overlapping stores can be merged.  Turn MEM_REF stores
> into BIT_INSERT_EXPR stores if the group contains a non-MEM_REF store.
>
>
> 2018-06-04  Eric Botcazou  
>
> * gcc.dg/store_merging_21.c: New test.
> * gnat.dg/opt71b.adb: Likewise.
>
> --
> Eric Botcazou


[gomp5] Allow OpenMP simd loop iter vars to be private or lastprivate

2018-06-04 Thread Jakub Jelinek
Hi!

In OpenMP 5.0, although the simd loop iteration vars are still predetermined
linear (for non-collapsed loops or collapse(1)) or lastprivate (collapse>1),
one can explicitly make the iteration variable private or lastprivate;
especially the former is useful to make it clear nothing needs to be
propagated to the original list item.

Tested on x86_64-linux, committed to gomp-5_0-branch.
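
A minimal usage sketch of what is now accepted, modelled on the f1 function in the new libgomp test below. The function name scale_then_accumulate is hypothetical, and I am assuming a compiler with OpenMP 5.0 support for these clauses; without OpenMP enabled the pragmas are simply ignored and the loops run serially.

```cpp
#include <cassert>

constexpr int N = 1024;
int a[N], b[N];

int scale_then_accumulate()
{
  int i;
  // private: makes it explicit that nothing is propagated back to i.
  #pragma omp simd private (i)
  for (i = 0; i < N; i++)
    a[i] = b[i] * 2;
  // lastprivate: i gets the value it would have after a sequential loop.
  #pragma omp simd lastprivate (i)
  for (i = 0; i < N; i++)
    a[i] += b[i] * 2;
  assert(i == N);
  return i;
}
```

Only the value after the lastprivate loop is relied upon here; after the private loop the original i is not meant to be read.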

2018-06-04  Jakub Jelinek  

* gimplify.c (omp_is_private): Allow simd iterator to be lastprivate
or private.  Fix up diagnostics if linear is used on collapse>1 simd
iterator.

* gcc.dg/gomp/simd-1.c: New test.
* g++.dg/gomp/simd-2.C: New test.

* testsuite/libgomp.c-c++-common/simd-1.c: New test.

--- gcc/gimplify.c.jj   2018-05-31 17:16:02.925643457 +0200
+++ gcc/gimplify.c  2018-06-04 12:40:32.232748770 +0200
@@ -7463,18 +7463,9 @@ omp_is_private (struct gimplify_omp_ctx
  else if ((n->value & GOVD_REDUCTION) != 0)
error ("iteration variable %qE should not be reduction",
   DECL_NAME (decl));
- else if (simd == 0 && (n->value & GOVD_LINEAR) != 0)
+ else if (simd != 1 && (n->value & GOVD_LINEAR) != 0)
error ("iteration variable %qE should not be linear",
   DECL_NAME (decl));
- else if (simd == 1 && (n->value & GOVD_LASTPRIVATE) != 0)
-   error ("iteration variable %qE should not be lastprivate",
-  DECL_NAME (decl));
- else if (simd && (n->value & GOVD_PRIVATE) != 0)
-   error ("iteration variable %qE should not be private",
-  DECL_NAME (decl));
- else if (simd == 2 && (n->value & GOVD_LINEAR) != 0)
-   error ("iteration variable %qE is predetermined linear",
-  DECL_NAME (decl));
}
   return (ctx == gimplify_omp_ctxp
  || (ctx->region_type == ORT_COMBINED_PARALLEL
--- gcc/testsuite/gcc.dg/gomp/simd-1.c.jj   2018-06-04 14:32:12.406269289 +0200
+++ gcc/testsuite/gcc.dg/gomp/simd-1.c  2018-06-04 14:13:31.279028667 +0200
@@ -0,0 +1,21 @@
+int a[32], b[32];
+
+void
+foo (void)
+{
+  int i, j;
+  #pragma omp simd linear(i, j) collapse(2)/* { dg-error "iteration variable 'i' should not be linear" } */
+  for (i = 0; i < 32; ++i) /* { dg-error "iteration variable 'j' should not be linear" "" { target *-*-* } .-1 } */
+for (j = 0; j < 32; ++j)
+  a[i] += b[j];
+}
+
+void
+bar (void)
+{
+  static int i, j;
+  #pragma omp for simd linear(i, j) collapse(2)/* { dg-error "iteration variable 'i' should not be linear" } */
+  for (i = 0; i < 32; ++i) /* { dg-error "iteration variable 'j' should not be linear" "" { target *-*-* } .-1 } */
+for (j = 0; j < 32; ++j)
+  a[i] += b[j];
+}
--- gcc/testsuite/g++.dg/gomp/simd-2.C.jj   2018-06-04 14:32:28.938287583 +0200
+++ gcc/testsuite/g++.dg/gomp/simd-2.C  2018-06-04 14:37:43.931636150 +0200
@@ -0,0 +1,21 @@
+int a[32], b[32];
+
+void
+foo (void)
+{
+  int i, j;
+  #pragma omp simd linear(i, j) collapse(2)// { dg-error "iteration variable 'i' should not be linear" }
+  for (i = 0; i < 32; ++i) // { dg-error "iteration variable 'j' should not be linear" "" { target *-*-* } .-1 }
+for (j = 0; j < 32; ++j)
+  a[i] += b[j];
+}
+
+void
+bar (void)
+{
+  static int i, j;
+  #pragma omp for simd linear(i, j) collapse(2)// { dg-error "iteration variable 'i' should not be linear" "" { target *-*-* } .+1 }
+  for (i = 0; i < 32; ++i) // { dg-error "iteration variable 'j' should not be linear" }
+for (j = 0; j < 32; ++j)
+  a[i] += b[j];
+}
--- libgomp/testsuite/libgomp.c-c++-common/simd-1.c.jj  2018-06-04 14:18:27.803356796 +0200
+++ libgomp/testsuite/libgomp.c-c++-common/simd-1.c 2018-06-04 14:19:02.968395710 +0200
@@ -0,0 +1,71 @@
+// { dg-do run }
+// { dg-additional-options "-msse2" { target sse2_runtime } }
+// { dg-additional-options "-mavx" { target avx_runtime } }
+
+#define N 1024
+int a[N], b[N];
+
+int
+f1 (void)
+{
+  int i;
+  #pragma omp simd private (i)
+  for (i = 0; i < N; i++)
+a[i] = b[i] * 2;
+  #pragma omp simd lastprivate (i)
+  for (i = 0; i < N; i++)
+a[i] += b[i] * 2;
+  return i;
+}
+
+int
+f2 (void)
+{
+  int i, j;
+  #pragma omp simd private (i), collapse (2), lastprivate (j)
+  for (i = 0; i < 32; i++)
+for (j = 0; j < 32; ++j)
+  a[i * 32 + j] += b[i * 32 + j] * 2;
+  return j;
+}
+
+int
+f3 (void)
+{
+  static int i;
+  #pragma omp for simd private (i)
+  for (i = 0; i < N; ++i)
+a[i] = b[i] * 2;
+  #pragma omp for simd lastprivate (i)
+  for (i = 0; i < N; ++i)
+a[i] += b[i] * 2;
+  return i;
+}
+
+int
+f4 (void)
+{
+  static int i, j;
+  #pragma omp for simd private (i)collapse (2)lastprivate (j)
+  for (i = 0; i < 32; ++i)
+for (j = 0; j < 32; j++)
+  a[i * 32 + j] += b[i * 32 + j] * 2;
+  return j;
+}
+
+int
+main ()
+{
+  

C++ PATCH for c++/85765, SFINAE and non-type default template argument

2018-06-04 Thread Jason Merrill
My patch for 84785 started setting processing_template_decl while
substituting into a default template argument that might use template
parameters we don't yet have arguments for.  That broke this testcase
because when processing_template_decl is set, we don't actually try to
perform the conversion:

  /* In a template, there is no need to perform all of the work that
 is normally done.  We are only interested in the type of the call
 expression, i.e., the return type of the function.  Any semantic
 errors will be deferred until the template is instantiated.  */

But in this case we aren't actually in a template, so we want to get
the error for SFINAE.

With this patch, if the initial substitution replaces all template
parameters, we then do a full substitution.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit a2223328af8276d1796501e8706ca3c6b559eff6
Author: Jason Merrill 
Date:   Mon Jun 4 09:41:04 2018 +0200

PR c++/85765 - SFINAE and non-type default template arg.

* pt.c (type_unification_real): Do full semantic processing if
substituting a partial args list replaces all template parms.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4c5890deeb8..d3915ddcddf 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -20387,8 +20387,6 @@ type_unification_real (tree tparms,
 	  location_t save_loc = input_location;
 	  if (DECL_P (parm))
 	input_location = DECL_SOURCE_LOCATION (parm);
-	  if (saw_undeduced == 1)
-	++processing_template_decl;
 
 	  if (saw_undeduced == 1
 	  && TREE_CODE (parm) == PARM_DECL
@@ -20396,11 +20394,14 @@ type_unification_real (tree tparms,
 	{
 	  /* The type of this non-type parameter depends on undeduced
 		 parameters.  Don't try to use its default argument yet,
+		 since we might deduce an argument for it on the next pass,
 		 but do check whether the arguments we already have cause
 		 substitution failure, so that that happens before we try
 		 later default arguments (78489).  */
+	  ++processing_template_decl;
 	  tree type = tsubst (TREE_TYPE (parm), full_targs, complain,
   NULL_TREE);
+	  --processing_template_decl;
 	  if (type == error_mark_node)
 		arg = error_mark_node;
 	  else
@@ -20408,10 +20409,27 @@ type_unification_real (tree tparms,
 	}
 	  else
 	{
-	  arg = tsubst_template_arg (arg, full_targs, complain, NULL_TREE);
+	  tree substed = NULL_TREE;
+	  if (saw_undeduced == 1 && processing_template_decl == 0)
+		{
+		  /* First instatiate in template context, in case we still
+		 depend on undeduced template parameters.  */
+		  ++processing_template_decl;
+		  substed = tsubst_template_arg (arg, full_targs, complain,
+		 NULL_TREE);
+		  --processing_template_decl;
+		  if (substed != error_mark_node
+		  && !uses_template_parms (substed))
+		/* We replaced all the tparms, substitute again out of
+		   template context.  */
+		substed = NULL_TREE;
+		}
+	  if (!substed)
+		substed = tsubst_template_arg (arg, full_targs, complain,
+	   NULL_TREE);
 
-	  if (!uses_template_parms (arg))
-		arg = convert_template_argument (parm, arg, full_targs,
+	  if (!uses_template_parms (substed))
+		arg = convert_template_argument (parm, substed, full_targs,
 		 complain, i, NULL_TREE);
 	  else if (saw_undeduced == 1)
 		arg = NULL_TREE;
@@ -20419,8 +20437,6 @@ type_unification_real (tree tparms,
 		arg = error_mark_node;
 	}
 
-	  if (saw_undeduced == 1)
-	--processing_template_decl;
 	  input_location = save_loc;
 	  *checks = get_deferred_access_checks ();
 	  pop_deferring_access_checks ();
diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg10.C b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg10.C
new file mode 100644
index 000..f0f9dfb73dc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/fntmpdefarg10.C
@@ -0,0 +1,18 @@
+// PR c++/85765
+// { dg-do compile { target c++11 } }
+
+struct il { il(); il(const il&); };
+
+int* begin(il);
+
+template<class T> T&& declval();
+
+template<class T, class U = decltype(begin(declval<T&>())), decltype(*U(),0) = 0>
+U g(T& t, long) { return begin(t); } // #1
+
+template<class T>
+int g(T& t, ...); // #2
+
+volatile il a;
+
+auto n = g(a, 0); // calls #1 and ends with a hard error, should call #2


Re: [PATCH 09/10] Experiment with using optinfo in gimple-loop-interchange.cc

2018-06-04 Thread Richard Biener
On Sat, Jun 2, 2018 at 12:22 AM David Malcolm  wrote:
>
> On Fri, 2018-06-01 at 17:31 +0200, Richard Biener wrote:
> > On June 1, 2018 3:40:15 PM GMT+02:00, David Malcolm wrote:
> > > On Fri, 2018-06-01 at 11:50 +0200, Richard Biener wrote:
> > > > On Tue, May 29, 2018 at 10:33 PM David Malcolm wrote:
> > > > >
> > > > > This was an experiment to try to capture information on a
> > > > > loop optimization.
> > > > >
> > > > > gcc/ChangeLog:
> > > > > * gimple-loop-interchange.cc
> > > > > (should_interchange_loops):
> > > > > Add
> > > > > optinfo note when interchange gives better data
> > > > > locality
> > > > > behavior.
> > > > > (tree_loop_interchange::interchange): Add
> > > > > OPTINFO_SCOPE.
> > > > > Add optinfo for successful and unsuccessful
> > > > > interchanges.
> > > > > (prepare_perfect_loop_nest): Add OPTINFO_SCOPE.  Add
> > > > > optinfo note.
> > > > > (pass_linterchange::execute): Add OPTINFO_SCOPE.
> > > > > ---
> > > > >  gcc/gimple-loop-interchange.cc | 36
> > > > > +++-
> > > > >  1 file changed, 35 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-
> > > > > interchange.cc
> > > > > index eb35263..cd32288 100644
> > > > > --- a/gcc/gimple-loop-interchange.cc
> > > > > +++ b/gcc/gimple-loop-interchange.cc
> > > > > @@ -1556,7 +1556,19 @@ should_interchange_loops (unsigned
> > > > > i_idx,
> > > > > unsigned o_idx,
> > > > >ratio = innermost_loops_p ? INNER_STRIDE_RATIO :
> > > > > OUTER_STRIDE_RATIO;
> > > > >/* Do interchange if it gives better data locality
> > > > > behavior.  */
> > > > >if (wi::gtu_p (iloop_strides, wi::mul (oloop_strides,
> > > > > ratio)))
> > > > > -return true;
> > > > > +{
> > > > > +  if (optinfo_enabled_p ())
> > > > > +   OPTINFO_NOTE ((gimple *)NULL) // FIXME
> > > > > + << "interchange gives better data locality behavior:
> > > > > "
> > > > > + << "iloop_strides: "
> > > > > + << decu (iloop_strides)
> > > > > + << " > (oloop_strides: "
> > > > > + << decu (oloop_strides)
> > > > > + << " * ratio: "
> > > > > + << decu (ratio)
> > > > > + << ")";
> > > >
> > > > Just randomly inside the thread.
> > > >
> > > > NOO!
> > > >
> > > > :/
> > > > Please do _not_ add more stream-like APIs.  How do you expect
> > > > translators to deal with those?
> > > >
> > > > Yes, I'm aware of the graphite-* ones and I dislike those very
> > > > much.
> > > >
> > > > What's wrong with the existing dump API?
> > >
> > > The existing API suffers from a "wall of text" problem:
> > >
> > > * although it's possible to filter based on various criteria (the
> > > optgroup tags, specific passes, success vs failure), it's not
> > > possible
> > > to filter based on code hotness: the -fopt-info API works purely in
> > > terms of location_t.  So all of the messages about the hottest
> > > functions in the workload are intermingled with all of the other
> > > messages about all of the other functions.
> >
> > True
> >
> > > * some of the text notes refer to function entry, but all of these
> > > are
> > > emitted "at the same level": there's no way to see the nesting of
> > > these
> > > function-entry logs, and where other notes are in relation to
> > > them.
> > > For example, in:
> > >
> > >  test.c:8:3: note: === analyzing loop ===
> > >  test.c:8:3: note: === analyze_loop_nest ===
> > >  test.c:8:3: note: === vect_analyze_loop_form ===
> > >  test.c:8:3: note: === get_loop_niters ===
> > > test.c:8:3: note: symbolic number of iterations is (unsigned int)
> > > n_9(D)
> > > test.c:8:3: note: not vectorized: loop contains function calls or
> > > data
> > > references that cannot be analyzed
> > >  test.c:8:3: note: vectorized 0 loops in function
> > >
> > > there's no way to tell that the "vect_analyze_loop_form" is in fact
> > > inside the call to "analyze_loop_nest", and where the "symbolic
> > > number
> > > of iterations" message is coming from in relation to them.  This
> > > may
> > > not seem significant here, but I'm quoting a small example;
> > > vectorization typically leads to dozens of messages, with a deep
> > > nesting structure (where that structure isn't visible in the
> > > -fopt-info output).
> >
> > True. The same issue exists for diagnostics BTW. Indeed, being able
> > to collapse 'sections' in dump files, opt-info or diagnostics sounds
> > useful.
> >
> > Note that for dump files and opt-info the level argument was sort of
> > designed to do that.
>
> Are you referring to the indentation argument here?

No, to MSG_NOTE vs. MSG_MISSED_OPTIMIZATION, etc.
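
To make the nesting complaint quoted above concrete, here is a small sketch of a logger that records scope depth through an RAII guard.  Every name in it (NestedLog, Scope, demo_nested_log) is hypothetical; this is neither the existing dump API nor the proposed optinfo one, and the nesting shown in the demo is assumed purely for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical names throughout; this is not GCC's dump or optinfo API.
class NestedLog {
 public:
  // RAII guard: entering a scope logs its name and indents what follows;
  // leaving the scope restores the previous indentation.
  class Scope {
   public:
    Scope(NestedLog& log, const std::string& name) : log_(log) {
      log_.note("=== " + name + " ===");
      ++log_.depth_;
    }
    ~Scope() { --log_.depth_; }
    Scope(const Scope&) = delete;
    Scope& operator=(const Scope&) = delete;

   private:
    NestedLog& log_;
  };

  // Record one message, indented by two spaces per enclosing scope.
  void note(const std::string& text) {
    lines_.push_back(std::string(2 * depth_, ' ') + text);
  }

  const std::vector<std::string>& lines() const { return lines_; }

 private:
  std::vector<std::string> lines_;
  unsigned depth_ = 0;
};

// Demo with an assumed nesting: one analysis routine called from another.
std::vector<std::string> demo_nested_log() {
  NestedLog log;
  {
    NestedLog::Scope outer(log, "analyze_loop_nest");
    {
      NestedLog::Scope inner(log, "vect_analyze_loop_form");
      log.note("symbolic number of iterations is n");
    }
    log.note("not vectorized");
  }
  return log.lines();
}
```

Because the depth travels with each message, a consumer could collapse or filter whole sections, which a flat list of location_t-tagged notes does not allow.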

> > >
> > > The existing API is throwing data away:
> > >
> > > * as noted above, by working purely with a location_t, the
> > > execution
> > > count isn't associated with the messages.  The output format purely
> > > 

PING^1: [PATCH] libitm/x86: Correct offsets of __private_tm and pointer_guard

2018-06-04 Thread H.J. Lu
On Wed, May 30, 2018 at 7:54 AM, H.J. Lu  wrote:
> In glibc, sysdeps/i386/nptl/tls.h has
>
> typedef struct
> {
>   void *tcb;/* Pointer to the TCB.  Not necessarily the
>thread descriptor used by libpthread.  */
>   dtv_t *dtv;
>   void *self;   /* Pointer to the thread descriptor.  */
>   int multiple_threads;
>   uintptr_t sysinfo;
>   uintptr_t stack_guard;
>   uintptr_t pointer_guard;
>   int gscope_flag;
>   int __glibc_reserved1;
>   /* Reservation of some values for the TM ABI.  */
>   void *__private_tm[4];
>   /* GCC split stack support.  */
>   void *__private_ss;
> } tcbhead_t;
>
> and sysdeps/x86_64/nptl/tls.h has
>
> typedef struct
> {
>   void *tcb;/* Pointer to the TCB.  Not necessarily the
>thread descriptor used by libpthread.  */
>   dtv_t *dtv;
>   void *self;   /* Pointer to the thread descriptor.  */
>   int multiple_threads;
>   int gscope_flag;
>   uintptr_t sysinfo;
>   uintptr_t stack_guard;
>   uintptr_t pointer_guard;
>   unsigned long int vgetcpu_cache[2];
>   int __glibc_reserved1;
>   int __glibc_unused1;
>   /* Reservation of some values for the TM ABI.  */
>   void *__private_tm[4];
>   /* GCC split stack support.  */
>   void *__private_ss;
>   long int __glibc_reserved2;
>   /* Must be kept even if it is no longer used by glibc since programs,
>  like AddressSanitizer, depend on the size of tcbhead_t.  */
>   __128bits __glibc_unused2[8][4] __attribute__ ((aligned (32)));
>
>   void *__padding[8];
> } tcbhead_t;
>
> The offsets of __private_tm are
>
> i386:   36 bytes
> x32:    48 bytes
> x86_64: 80 bytes
>
> and the offsets of pointer_guard are:
>
> i386:   24 bytes
> x32:    28 bytes
> x86_64: 48 bytes
>
> Update SEG_READ and SEG_WRITE to use the offset of __private_tm as base
> and correct the offset of pointer_guard for x32.
>
> Tested on i686, x86-64 and x32.  OK for trunk and release branches?
>

Any comments, objections?

BTW, this patch doesn't change ABI of libitm.
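
The quoted offsets can be cross-checked with a small model of the two struct layouts.  This is only a sketch: the struct names are made up, fixed-width integers stand in for every pointer, uintptr_t and unsigned long field, and the trailing members after __private_tm are omitted because they do not affect these offsets.  The last two helpers compare the old OFS * word-size addressing with the new base + OFS * word-size form used by the patch's macros.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for the i386 tcbhead_t: all pointer-sized fields are 4 bytes.
struct TcbI386 {
  std::uint32_t tcb, dtv, self;
  std::int32_t multiple_threads;
  std::uint32_t sysinfo, stack_guard, pointer_guard;
  std::int32_t gscope_flag, glibc_reserved1;
  std::uint32_t private_tm[4];
};

// Stand-in for the x86_64 tcbhead_t.  Word is 8 bytes for LP64 and
// 4 bytes for x32, where pointers and long are both 32 bits wide.
template <typename Word>
struct TcbX86_64 {
  Word tcb, dtv, self;
  std::int32_t multiple_threads, gscope_flag;
  Word sysinfo, stack_guard, pointer_guard;
  Word vgetcpu_cache[2];
  std::int32_t glibc_reserved1, glibc_unused1;
  Word private_tm[4];
};

using TcbLP64 = TcbX86_64<std::uint64_t>;
using TcbX32 = TcbX86_64<std::uint32_t>;

static_assert(offsetof(TcbI386, private_tm) == 36, "i386 __private_tm");
static_assert(offsetof(TcbX32, private_tm) == 48, "x32 __private_tm");
static_assert(offsetof(TcbLP64, private_tm) == 80, "x86_64 __private_tm");
static_assert(offsetof(TcbI386, pointer_guard) == 24, "i386 pointer_guard");
static_assert(offsetof(TcbX32, pointer_guard) == 28, "x32 pointer_guard");
static_assert(offsetof(TcbLP64, pointer_guard) == 48, "x86_64 pointer_guard");

// Old addressing: slot index scaled by the word size, counted from the
// start of the TCB.
constexpr std::size_t old_slot(std::size_t ofs, std::size_t word) {
  return ofs * word;
}

// New addressing: slot index relative to the start of __private_tm.
constexpr std::size_t new_slot(std::size_t base, std::size_t ofs,
                               std::size_t word) {
  return base + ofs * word;
}
```

Under this model the old SEG_READ(10) form lands on 80 for x86_64, which matches __private_tm, but on 40 for both x32 and i386, where the offsets listed above are 48 and 36.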

> H.J.
> ---
> PR libitm/85988
> * config/linux/x86/tls.h (SEG_READ): Use the offset of
> __private_tm as base.
> (SEG_WRITE): Likewise.
> (SEG_ENCODE_WRITE): Correct the offset of pointer_guard for x32.
> (gtm_thr): Replace SEG_READ(10) with SEG_READ(0).
> (set_gtm_thr): Replace SEG_WRITE(10) with SEG_WRITE(0).
> (abi_disp): Replace SEG_DECODE_READ(11) with SEG_DECODE_READ(1).
> (set_abi_disp): Replace SEG_ENCODE_WRITE(11) with
> SEG_ENCODE_WRITE(1).
> ---
>  libitm/config/linux/x86/tls.h | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/libitm/config/linux/x86/tls.h b/libitm/config/linux/x86/tls.h
> index 5f3fd273c0e..ca6a5af3d4f 100644
> --- a/libitm/config/linux/x86/tls.h
> +++ b/libitm/config/linux/x86/tls.h
> @@ -42,8 +42,8 @@ namespace GTM HIDDEN {
>
>  #ifdef __x86_64__
>  #ifdef __LP64__
> -# define SEG_READ(OFS) "movq\t%%fs:(" #OFS "*8),%0"
> -# define SEG_WRITE(OFS)"movq\t%0,%%fs:(" #OFS "*8)"
> +# define SEG_READ(OFS) "movq\t%%fs:(80+" #OFS "*8),%0"
> +# define SEG_WRITE(OFS)"movq\t%0,%%fs:(80+" #OFS "*8)"
>  # define SEG_DECODE_READ(OFS)  SEG_READ(OFS) "\n\t" \
> "rorq\t$17,%0\n\t" \
> "xorq\t%%fs:48,%0"
> @@ -52,18 +52,18 @@ namespace GTM HIDDEN {
> SEG_WRITE(OFS)
>  #else
>  // For X32.
> -# define SEG_READ(OFS)  "movl\t%%fs:(" #OFS "*4),%0"
> -# define SEG_WRITE(OFS) "movl\t%0,%%fs:(" #OFS "*4)"
> +# define SEG_READ(OFS)  "movl\t%%fs:(48+" #OFS "*4),%0"
> +# define SEG_WRITE(OFS) "movl\t%0,%%fs:(48+" #OFS "*4)"
>  # define SEG_DECODE_READ(OFS)   SEG_READ(OFS) "\n\t" \
> "rorl\t$9,%0\n\t" \
> -   "xorl\t%%fs:24,%0"
> -# define SEG_ENCODE_WRITE(OFS)  "xorl\t%%fs:24,%0\n\t" \
> +   "xorl\t%%fs:28,%0"
> +# define SEG_ENCODE_WRITE(OFS)  "xorl\t%%fs:28,%0\n\t" \
> "roll\t$9,%0\n\t" \
> SEG_WRITE(OFS)
>  #endif
>  #else
> -# define SEG_READ(OFS)  "movl\t%%gs:(" #OFS "*4),%0"
> -# define SEG_WRITE(OFS) "movl\t%0,%%gs:(" #OFS "*4)"
> +# define SEG_READ(OFS)  "movl\t%%gs:(36+" #OFS "*4),%0"
> +# define SEG_WRITE(OFS) "movl\t%0,%%gs:(36+" #OFS "*4)"
>  # define SEG_DECODE_READ(OFS)  SEG_READ(OFS) "\n\t" \
> "rorl\t$9,%0\n\t" \
> "xorl\t%%gs:24,%0"
> @@ -75,26 +75,26 @@ namespace GTM HIDDEN {
>  static inline struct gtm_thread *gtm_thr(void)
>  {
>struct gtm_thread *r;
> -  asm volatile (SEG_READ(10) : "=r"(r));
> +  asm volatile (SEG_READ(0) : "=r"(r));
>return r;
>  }
>
>  static inline void set_gtm_thr(struct gtm_thread *x)
>  {
> -  asm volatile (SEG_WRITE(10) : : "r"(x));
> 

Re: Fix expand_expr_real_1 handling of BLKmode bitfield references

2018-06-04 Thread Richard Biener
On Wed, May 30, 2018 at 8:46 AM Richard Sandiford wrote:
>
> The handling of bitfield references in expand_expr_real_1 includes:
>
> machine_mode ext_mode = mode;
>
> if (ext_mode == BLKmode
> && ! (target != 0 && MEM_P (op0)
>   && MEM_P (target)
>   && multiple_p (bitpos, BITS_PER_UNIT)))
>   ext_mode = int_mode_for_size (bitsize, 1).else_blk ();
>
> if (ext_mode == BLKmode)
>   {
> [...]
> gcc_assert (MEM_P (op0)
>
> Here "mode" is the TYPE_MODE of the result, so when mode == BLKmode,
> the target must be a MEM if nonnull, since no other rtl objects can
> have BLKmode.  But there's no guarantee that the source value op0 is also
> BLKmode and thus also a MEM: we can reach the assert for any source if
> the bitsize being extracted is larger than the largest integer mode
> (or larger than MAX_FIXED_MODE_SIZE).
>
> This triggered for SVE with -msve-vector-bits=512, where we could
> sometimes try to extract a BLKmode value from a 512-bit vector,
> and where int_mode_for_size would rightly fail for large bitsizes.
>
> The patch reuses the existing:
>
> /* Otherwise, if this is a constant or the object is not in memory
>and need be, put it there.  */
> else if (CONSTANT_P (op0) || (!MEM_P (op0) && must_force_mem))
>   {
> memloc = assign_temp (TREE_TYPE (tem), 1, 1);
> emit_move_insn (memloc, op0);
> op0 = memloc;
> clear_mem_expr = true;
>   }
>
> to handle this case.
>
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and
> x86_64-linux-gnu.  OK to install?

Ok.

Richard.
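
As a reading aid for the condition discussed in the quoted message, here is a toy model of the must_force_mem test with and without the extra clause.  It is heavily simplified and rests on assumptions: kToyMaxIntModeBits, toy_int_mode_exists and the Extraction struct are invented for this sketch, and the 128-bit limit merely stands in for whatever the widest usable integer mode is on a given target.

```cpp
// Assumed limit standing in for the widest integer mode; the real value
// is target-dependent.
constexpr unsigned kToyMaxIntModeBits = 128;

// Toy replacement for int_mode_for_size (bitsize, 1).exists ().
bool toy_int_mode_exists(unsigned bitsize) {
  return bitsize <= kToyMaxIntModeBits;
}

// Hypothetical summary of the values the real condition looks at.
struct Extraction {
  bool has_offset;      // a variable offset is present
  bool mode1_is_blk;    // mode1 == BLKmode
  bool mode_is_blk;     // TYPE_MODE of the result is BLKmode
  unsigned bitpos;      // first bit being extracted
  unsigned bitsize;     // number of bits being extracted
  unsigned mode2_bits;  // size in bits of the source operand's mode
};

// Mirrors the shape of the patched condition; with_patch toggles the
// clause added by the patch.
bool must_force_mem(const Extraction& e, bool with_patch) {
  return e.has_offset
         || e.mode1_is_blk
         || (with_patch && e.mode_is_blk && !toy_int_mode_exists(e.bitsize))
         || e.bitpos + e.bitsize > e.mode2_bits;
}
```

In this model a 256-bit BLKmode extraction from a 512-bit register-held vector is forced to memory only when the new clause is enabled, which is the case the assert used to trip on.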

> Richard
>
>
> 2018-05-30  Richard Sandiford  
>
> gcc/
> * expr.c (expand_expr_real_1): Force the operand into memory if
> its TYPE_MODE is BLKmode and if there is no integer mode for
> the number of bits being extracted.
>
> gcc/testsuite/
> * gcc.target/aarch64/sve/extract_5.c: New test.
>
> Index: gcc/expr.c
> ===
> --- gcc/expr.c  2018-05-30 07:33:11.652009370 +0100
> +++ gcc/expr.c  2018-05-30 07:44:31.856060230 +0100
> @@ -10582,6 +10582,8 @@ expand_expr_real_1 (tree exp, rtx target
>to a larger size.  */
> must_force_mem = (offset
>   || mode1 == BLKmode
> + || (mode == BLKmode
> + && !int_mode_for_size (bitsize, 1).exists ())
>   || maybe_gt (bitpos + bitsize,
>GET_MODE_BITSIZE (mode2)));
>
> Index: gcc/testsuite/gcc.target/aarch64/sve/extract_5.c
> ===
> --- /dev/null   2018-04-20 16:19:46.369131350 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/extract_5.c    2018-05-30 07:44:31.857060190 +0100
> @@ -0,0 +1,71 @@
> +/* Originally from gcc.dg/vect/vect-alias-check-10.c.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize -msve-vector-bits=512" } */
> +
> +#define N 87
> +#define M 6
> +
> +typedef signed char sc;
> +typedef unsigned char uc;
> +typedef signed short ss;
> +typedef unsigned short us;
> +typedef int si;
> +typedef unsigned int ui;
> +typedef signed long long sll;
> +typedef unsigned long long ull;
> +
> +#define FOR_EACH_TYPE(M) \
> +  M (sc) M (uc) \
> +  M (ss) M (us) \
> +  M (si) M (ui) \
> +  M (sll) M (ull) \
> +  M (float) M (double)
> +
> +#define TEST_VALUE(I) ((I) * 5 / 2)
> +
> +#define ADD_TEST(TYPE) \
> +  void __attribute__((noinline, noclone))  \
> +  test_##TYPE (TYPE *a, int step)  \
> +  {\
> +for (int i = 0; i < N; ++i)\
> +  {\
> +   a[i * step + 0] = a[i * step + 0] + 1;  \
> +   a[i * step + 1] = a[i * step + 1] + 2;  \
> +   a[i * step + 2] = a[i * step + 2] + 4;  \
> +   a[i * step + 3] = a[i * step + 3] + 8;  \
> +  }\
> +  }\
> +  void __attribute__((noinline, noclone))  \
> +  ref_##TYPE (TYPE *a, int step)   \
> +  {\
> +for (int i = 0; i < N; ++i)\
> +  {\
> +   a[i * step + 0] = a[i * step + 0] + 1;  \
> +   a[i * step + 1] = a[i * step + 1] + 2;  \
> +   a[i * step + 2] = a[i * step + 2] + 4;  \
> +   a[i * step + 3] = a[i * step + 3] + 8;  \
> +   asm volatile ("");  \
> +  }\
> +  }
> +
> +#define DO_TEST(TYPE)  \
> +  for (int j = -M; j <= M; ++j) 

[PATCH] Optimize AVX512 vpcmpeq* against 0 into vptestnm* rather than vptestm* (PR target/85832, PR target/86036)

2018-06-04 Thread Jakub Jelinek
Hi!

On Wed, May 23, 2018 at 08:45:19AM +0200, Jakub Jelinek wrote:
> As mentioned in the PR, vptestm* instructions with the same input operand used
> twice perform the same comparison as vpcmpeq* against zero vector, with the
> advantage that a register holding CONST0_RTX (mode) is not needed.
> 
> 2018-05-23  Jakub Jelinek  
> 
>   PR target/85832
>   * config/i386/sse.md (_eq3_1):
>   Add (=Yk,v,C) variant using vptestm insn.  Use TARGET_AVX512BW
>   in test instead of TARGET_AVX512F for VI12_AVX512VL iterator.
> 
>   * gcc.target/i386/avx512f-pr85832.c: New test.
>   * gcc.target/i386/avx512vl-pr85832.c: New test.
>   * gcc.target/i386/avx512bw-pr85832.c: New test.
>   * gcc.target/i386/avx512vlbw-pr85832.c: New test.

I've unfortunately not added an executable testcase nor tested it under sde,
so I missed that, say, vpcmpeqw with a 0 vector as one of the operands
actually doesn't do what vptestmw with the other argument repeated does; it
does exactly the opposite.  vpcmpeqw sets bits in the mask register for
elements that are equal to 0, but vptestmw sets bits in the mask register for
elements where the AND of the first arg and second arg (i.e. the argument
that is repeated) is non-zero.  Fortunately there is vptestnmw, which does
what we want.
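
A scalar model makes the difference easy to see.  The sketch below works on eight 16-bit lanes and returns the resulting mask as an integer; the function names and the lane count are choices made for this illustration, not anything taken from sse.md.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec8x16 = std::array<std::uint16_t, 8>;

// Model of vpcmpeqw against a zero vector: bit i is set when a[i] == 0.
std::uint8_t cmpeq_zero_mask(const Vec8x16& a) {
  unsigned mask = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i] == 0) mask |= 1u << i;
  return static_cast<std::uint8_t>(mask);
}

// Model of vptestmw: bit i is set when (a[i] & b[i]) != 0.
std::uint8_t testm_mask(const Vec8x16& a, const Vec8x16& b) {
  unsigned mask = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if ((a[i] & b[i]) != 0) mask |= 1u << i;
  return static_cast<std::uint8_t>(mask);
}

// Model of vptestnmw: bit i is set when (a[i] & b[i]) == 0.
std::uint8_t testnm_mask(const Vec8x16& a, const Vec8x16& b) {
  unsigned mask = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if ((a[i] & b[i]) == 0) mask |= 1u << i;
  return static_cast<std::uint8_t>(mask);
}
```

For any input v, testnm_mask(v, v) equals cmpeq_zero_mask(v), while testm_mask(v, v) is its bitwise complement.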

Bootstrapped/regtested on x86_64-linux and i686-linux + tested on the
testcase with sde.  Ok for trunk?

2018-06-04  Jakub Jelinek  

PR target/85832
PR target/86036
* config/i386/sse.md (_eq3_1):
Use vptestnm rather than vptestm in (=Yc,v,C) variant.

* gcc.target/i386/avx512f-pr85832.c: Expect vptestnm rather than
vptestm.
* gcc.target/i386/avx512vl-pr85832.c: Likewise.
* gcc.target/i386/avx512vlbw-pr85832.c: Likewise.
* gcc.target/i386/avx512bw-pr85832.c: Likewise.
* gcc.target/i386/avx512bw-pr86036.c: New test.

--- gcc/config/i386/sse.md.jj   2018-05-31 20:53:41.933453308 +0200
+++ gcc/config/i386/sse.md  2018-06-04 10:29:02.667720644 +0200
@@ -11287,7 +11287,7 @@ (define_insn "_eq3\t{%2, %1, 
%0|%0, %1, %2}
-   vptestm\t{%1, %1, 
%0|%0, %1, %1}"
+   vptestnm\t{%1, %1, 
%0|%0, %1, %1}"
   [(set_attr "type" "ssecmp")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "evex")
@@ -11302,7 +11302,7 @@ (define_insn "_eq3\t{%2, %1, 
%0|%0, %1, %2}
-   vptestm\t{%1, %1, 
%0|%0, %1, %1}"
+   vptestnm\t{%1, %1, 
%0|%0, %1, %1}"
   [(set_attr "type" "ssecmp")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "evex")
--- gcc/testsuite/gcc.target/i386/avx512f-pr85832.c.jj  2018-05-25 14:35:23.123416639 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-pr85832.c     2018-06-04 11:00:00.773880446 +0200
@@ -1,8 +1,8 @@
 /* PR target/85832 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512bw -masm=att" } */
-/* { dg-final { scan-assembler-times {\mvptestmd\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mvptestmq\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmq\M} 1 } } */
 
 #include 
 
--- gcc/testsuite/gcc.target/i386/avx512vl-pr85832.c.jj 2018-05-25 14:35:23.123416639 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vl-pr85832.c    2018-06-04 11:00:09.995895313 +0200
@@ -1,8 +1,8 @@
 /* PR target/85832 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512vl -mno-avx512bw -masm=att" } */
-/* { dg-final { scan-assembler-times {\mvptestmd\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mvptestmq\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmd\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmq\M} 2 } } */
 
 #include 
 
--- gcc/testsuite/gcc.target/i386/avx512vlbw-pr85832.c.jj   2018-05-25 14:35:23.124416640 +0200
+++ gcc/testsuite/gcc.target/i386/avx512vlbw-pr85832.c  2018-06-04 11:00:06.02098 +0200
@@ -1,8 +1,8 @@
 /* PR target/85832 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512vl -mavx512bw -masm=att" } */
-/* { dg-final { scan-assembler-times {\mvptestmb\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mvptestmw\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmb\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmw\M} 2 } } */
 
 #include 
 
--- gcc/testsuite/gcc.target/i386/avx512bw-pr85832.c.jj 2018-05-25 14:35:23.124416640 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr85832.c    2018-06-04 10:59:53.015867934 +0200
@@ -1,8 +1,8 @@
 /* PR target/85832 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -mavx512bw -mno-avx512vl -masm=att" } */
-/* { dg-final { scan-assembler-times {\mvptestmb\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mvptestmw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmb\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvptestnmw\M} 1 } } */
 
 #include 
 
--- gcc/testsuite/gcc.target/i386/avx512bw-pr86036.c.jj 2018-06-04 11:04:24.860193859 +0200
+++ gcc/testsuite/gcc.target/i386/avx512bw-pr86036.c

[PATCH] rtlanal: Fix nonzero_bits for non-load paradoxical subregs (PR85925)

2018-06-04 Thread Segher Boessenkool
Hi!

In the PR we have insns:

Trying 23 -> 24:
   23: r123:SI=zero_extend(r122:HI)
  REG_DEAD r122:HI
   24: [r115:SI]=r123:SI
  REG_DEAD r123:SI

which should be combined to

(set (mem:SI (reg/f:SI 115 [ pretmp_19 ]) [1 *pretmp_19+0 S4 A32])
(and:SI (subreg:SI (reg:HI 122) 0)
(const_int 32767 [0x7fff])))

But nonzero_bits of reg:HI 122 is 0x7fff, and nonzero_bits1 thinks the
subreg then also has that same nonzero_bits, which means the whole AND
is optimised away.  This is not correct: the bits outside of HImode are
undefined.  load_extend_op applies to loads from memory only, not
anything else.
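
The intended rule can be written down as a small bit-mask model.  This is a sketch under stated assumptions: ParadoxicalSubreg, ExtendOp and subreg_nonzero_bits are names invented here, the subreg is assumed to be paradoxical (so the width check is left out), and the sign-bit test is a simplified stand-in for val_signbit_known_set_p.

```cpp
#include <cstdint>

enum class ExtendOp { Unknown, ZeroExtend, SignExtend };

// Hypothetical description of a paradoxical subreg.
struct ParadoxicalSubreg {
  std::uint64_t inner_nonzero;  // nonzero bits known for the inner value
  std::uint64_t inner_mask;     // mode mask of the inner mode, e.g. 0xffff
  std::uint64_t outer_mask;     // mode mask of the outer mode
  bool inner_is_mem;            // true only when the inner object is a MEM
};

// Model of the patched condition: the high bits may be assumed clean only
// for a memory load on a WORD_REGISTER_OPERATIONS target whose loads
// extend in a way that keeps them known.
std::uint64_t subreg_nonzero_bits(const ParadoxicalSubreg& s,
                                  bool word_register_operations,
                                  ExtendOp load_extend) {
  std::uint64_t nonzero = s.inner_nonzero;
  const std::uint64_t sign_bit = (s.inner_mask >> 1) + 1;
  const bool high_bits_unknown =
      !word_register_operations
      || !s.inner_is_mem
      || (load_extend == ExtendOp::SignExtend
              ? (nonzero & sign_bit) != 0
              : load_extend != ExtendOp::ZeroExtend);
  if (high_bits_unknown)
    nonzero |= s.outer_mask & ~s.inner_mask;
  return nonzero;
}
```

With an HImode register whose nonzero bits are 0x7fff viewed in SImode, the model yields 0xffff7fff, so the AND with 0x7fff is not redundant; for a zero-extending memory load it yields 0x7fff.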

Richard, what do you think?

Tested on the testcase for an arm compiler, and bootstrapped and
regression checked on powerpc64-linux {-m32,-m64} (not that that means
much, WORD_REGISTER_OPERATIONS isn't set there).


Segher


2018-06-04  Segher Boessenkool  

PR rtl-optimization/85925
* rtlanal.c (nonzero_bits1): For any paradoxical subreg of
something other than memory the high bits are undefined.

---
 gcc/rtlanal.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index ac3662d..6f171de 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -4742,17 +4742,17 @@ nonzero_bits1 (const_rtx x, scalar_int_mode mode, const_rtx known_x,
  nonzero &= cached_nonzero_bits (SUBREG_REG (x), mode,
  known_x, known_mode, known_ret);
 
-  /* On many CISC machines, accessing an object in a wider mode
+ /* On many CISC machines, accessing an object in a wider mode
 causes the high-order bits to become undefined.  So they are
 not known to be zero.  */
  rtx_code extend_op;
  if ((!WORD_REGISTER_OPERATIONS
   /* If this is a typical RISC machine, we only have to worry
  about the way loads are extended.  */
+  || !MEM_P (SUBREG_REG (x))
   || ((extend_op = load_extend_op (inner_mode)) == SIGN_EXTEND
   ? val_signbit_known_set_p (inner_mode, nonzero)
-  : extend_op != ZERO_EXTEND)
-  || (!MEM_P (SUBREG_REG (x)) && !REG_P (SUBREG_REG (x
+  : extend_op != ZERO_EXTEND))
  && xmode_width > inner_width)
nonzero
  |= (GET_MODE_MASK (GET_MODE (x)) & ~GET_MODE_MASK (inner_mode));
-- 
1.8.3.1



[PATCH, S390] Change mtune default

2018-06-04 Thread Robin Dapp
Hi,

when building with --with-tune=zEC12 and calling the resulting gcc with
-march=z13 (no extra -mtune), the binary would unexpectedly be compiled
with -march=z13 -mtune=zEC12.  This patch avoids using the default tune
parameter if -march is specified as an argument, but the user can still
explicitly state -march=z13 -mtune=zEC12.
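
The effect on the command line can be modelled in a few lines.  This is a sketch of the behaviour described above, not of how the driver evaluates spec strings; apply_tune_default and has_prefix are hypothetical helpers, and the patched flag switches between the old and the new "tune" default.

```cpp
#include <string>
#include <vector>

// True when any argument starts with the given prefix.
static bool has_prefix(const std::vector<std::string>& args,
                       const std::string& prefix) {
  for (const auto& arg : args)
    if (arg.compare(0, prefix.size(), prefix) == 0) return true;
  return false;
}

// Returns the arguments plus whatever the configured tune default adds.
// Old rule: add -mtune=<default> unless -mtune= was given.
// New rule: additionally skip the default when -march= was given.
std::vector<std::string> apply_tune_default(std::vector<std::string> args,
                                            const std::string& configured_tune,
                                            bool patched) {
  const bool has_mtune = has_prefix(args, "-mtune=");
  const bool has_march = has_prefix(args, "-march=");
  if (!has_mtune && !(patched && has_march))
    args.push_back("-mtune=" + configured_tune);
  return args;
}
```

Under the new rule a bare -march=z13 is passed through unchanged, an empty command line still receives the configured -mtune, and an explicit -mtune is never overridden.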

Regards
 Robin

--

gcc/ChangeLog:

2018-06-04  Robin Dapp  

* config/s390/s390.h (enum processor_flags): Do not use
default tune parameter when -march was specified.
diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h
index a372981ff3a..ed0dfa19bb6 100644
--- a/gcc/config/s390/s390.h
+++ b/gcc/config/s390/s390.h
@@ -205,7 +205,7 @@ enum processor_flags
 #define OPTION_DEFAULT_SPECS 	\
   { "mode", "%{!mesa:%{!mzarch:-m%(VALUE)}}" },			\
   { "arch", "%{!march=*:-march=%(VALUE)}" },			\
-  { "tune", "%{!mtune=*:-mtune=%(VALUE)}" }
+  { "tune", "%{!mtune=*:%{!march=*:-mtune=%(VALUE)}}" }
 
 #ifdef __s390__
 extern const char *s390_host_detect_local_cpu (int argc, const char **argv);


Re: [PATCH] Make __FUNCTION__ a mergeable string and do not generate symbol entry.

2018-06-04 Thread Martin Liška
On 05/21/2018 07:19 PM, Jason Merrill wrote:
> On Mon, May 21, 2018 at 9:33 AM, Martin Liška  wrote:
>> On 10/24/2017 10:24 PM, Jason Merrill wrote:
>>> On Thu, Sep 14, 2017 at 5:22 AM, Martin Liška  wrote:
 On 08/10/2017 09:43 PM, Jason Merrill wrote:
> On 07/14/2017 01:35 AM, Martin Liška wrote:
>> On 05/01/2017 09:13 PM, Jason Merrill wrote:
>>> On Wed, Apr 26, 2017 at 6:58 AM, Martin Liška  wrote:
 On 04/25/2017 01:58 PM, Jakub Jelinek wrote:
> On Tue, Apr 25, 2017 at 01:48:05PM +0200, Martin Liška wrote:
>> Hello.
>>
>> This is patch that was originally installed by Jason and later 
>> reverted due to PR70422.
>> In the later PR Richi suggested a fix for that and Segher verified 
>> that it helped him
>> to survive regression tests. That's reason why I'm resending that.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>> tests.
>>
>> Ready to be installed?
>> Martin
>
>> From a34ce0ef37ae00609c9f3ff98a9cb0b7db6a8bd0 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Thu, 20 Apr 2017 14:56:30 +0200
>> Subject: [PATCH] Make __FUNCTION__ a mergeable string and do not 
>> generate
>>   symbol entry.
>>
>> gcc/cp/ChangeLog:
>>
>> 2017-04-20  Jason Merrill  
>>   Martin Liska  
>>   Segher Boessenkool  
>>
>>   PR c++/64266
>>   PR c++/70353
>>   PR bootstrap/70422
>>   Core issue 1962
>>   * decl.c (cp_fname_init): Decay the initializer to pointer.
>>   (cp_make_fname_decl): Set DECL_DECLARED_CONSTEXPR_P,
>>   * pt.c (tsubst_expr) [DECL_EXPR]: Set DECL_VALUE_EXPR,
>>   DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P and
>>   DECL_IGNORED_P.  Don't call cp_finish_decl.
>
> If we don't emit those into the debug info, will the debugger be
> able to handle __FUNCTION__ etc. properly?

 No, debugger with the patch can't handle these. Similar to how clang
 behaves currently. Maybe it can be conditionally enabled with -g3, or 
 -g?

> Admittedly, right now we emit it into debug info only if those decls
> are actually used, say on:
> const char * foo () { return __FUNCTION__; }
> const char * bar () { return ""; }
> we'd emit foo::__FUNCTION__, but not bar::__FUNCTION__, so the 
> debugger
> has to have some handling of it anyway.  But while in functions
> that don't refer to __FUNCTION__ it is always the debugger that needs
> to synthetize those and thus they will be always pointer-equal,
> if there are some uses of it and for other uses the debugger would
> synthetize it, there is the possibility that the debugger synthetized
> string will not be the same object as actually used in the function.

 You're right, currently one has to use a special function to be able to
 print it in debugger. I believe we've already discussed that, according
 to spec, the strings don't have to point to a same string.

 Suggestions what we should do with the patch?
>>>
>>> We need to emit debug information for these variables.  From Jim's
>>> description in 70422 it seems that the problem is that the reference
>>> to the string from the debug information is breaking
>>> function_mergeable_rodata_prefix, which relies on
>>> current_function_decl.  It seems to me that its callers should pass
>>> along their decl parameter so that f_m_r_p can use the decl's
>>> DECL_CONTEXT rather than rely on current_function_decl being set
>>> properly.
>>> Jason
>>>
>>
>> Ok, after some time I returned back to it. I followed your advice and
>> changed the function function_mergeable_rodata_prefix. Apart from that,
>> a small rebase was needed.
>>
>> May I ask Jim to test the patch?
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression 
>> tests.
>
>> +  DECL_IGNORED_P (decl) = 1;
>
> As I said before, we do need to emit debug information for these 
> variables, so this is wrong.

 Hello.

 Sorry for overlooking that.

>
>> -  section *s = targetm.asm_out.function_rodata_section 
>> (current_function_decl);
>> +  tree decl = current_function_decl;
>> +  if (decl && DECL_CONTEXT (decl)
>> +  && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL)
>> +decl = DECL_CONTEXT (decl);
>
> I don't see how this would help; it still relies on current_function_decl 
> being set correctly, which was the problem we were running into before.

 I see, that's what I wanted to discuss 

[patch][i386] Tremont -march/-mtune options

2018-06-04 Thread Makhotina, Olga

Hi,

This patch implements Tremont -march/-mtune.

2018-06-04  Olga Makhotina  

gcc/

* config.gcc: Support "tremont".
* config/i386/driver-i386.c (host_detect_local_cpu): Detect "tremont".
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_TREMONT.
* config/i386/i386.c (m_TREMONT): Define.
(processor_target_table): Add "tremont".
(PTA_TREMONT): Define.
(ix86_lea_outperforms): Add TARGET_TREMONT.
(get_builtin_code_for_version): Handle PROCESSOR_TREMONT.
(fold_builtin_cpu): Add M_INTEL_TREMONT, replace M_INTEL_GOLDMONT
and M_INTEL_GOLDMONT_PLUS.
(fold_builtin_cpu): Add "tremont".
(ix86_add_stmt_cost): Add TARGET_TREMONT.
(ix86_option_override_internal): Add "tremont".
* config/i386/i386.h (processor_costs): Define TARGET_TREMONT.
(processor_type): Add PROCESSOR_TREMONT.
* config/i386/x86-tune.def: Add m_TREMONT.
* doc/invoke.texi: Add tremont as x86 -march=/-mtune= CPU type.

libgcc/

* config/i386/cpuinfo.h (processor_types): Add INTEL_TREMONT.

gcc/testsuite/

* gcc.target/i386/funcspec-56.inc: Test arch=tremont.

Is it OK?

Thanks.


0001-tremont.patch
Description: 0001-tremont.patch


[PATCH] Some CFG cleanup reorg

2018-06-04 Thread Richard Biener


This inlines a function to make the flow clearer.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-06-04  Richard Biener  

* tree-cfgcleanup.c (cleanup_tree_cfg_1): Fold into...
(cleanup_tree_cfg_noloop): ... single caller.  Do
start_recording_case_labels later.

Index: gcc/tree-cfgcleanup.c
===
--- gcc/tree-cfgcleanup.c   (revision 261143)
+++ gcc/tree-cfgcleanup.c   (working copy)
@@ -757,66 +757,6 @@ cleanup_control_flow_pre ()
   return retval;
 }
 
-/* Iterate the cfg cleanups, while anything changes.  */
-
-static bool
-cleanup_tree_cfg_1 (void)
-{
-  bool retval = false;
-  basic_block bb;
-  unsigned i, n;
-
-  /* Prepare the worklists of altered blocks.  */
-  cfgcleanup_altered_bbs = BITMAP_ALLOC (NULL);
-
-  /* During forwarder block cleanup, we may redirect edges out of
- SWITCH_EXPRs, which can get expensive.  So we want to enable
- recording of edge to CASE_LABEL_EXPR.  */
-  start_recording_case_labels ();
-
-  /* We cannot use FOR_EACH_BB_FN for the BB iterations below
- since the basic blocks may get removed.  */
-
-  /* Start by iterating over all basic blocks in PRE order looking for
- edge removal opportunities.  Do this first because incoming SSA form
- may be invalid and we want to avoid performing SSA related tasks such
- as propgating out a PHI node during BB merging in that state.  */
-  retval |= cleanup_control_flow_pre ();
-
-  /* After doing the above SSA form should be valid (or an update SSA
- should be required).  */
-
-  /* Continue by iterating over all basic blocks looking for BB merging
- opportunities.  */
-  n = last_basic_block_for_fn (cfun);
-  for (i = NUM_FIXED_BLOCKS; i < n; i++)
-{
-  bb = BASIC_BLOCK_FOR_FN (cfun, i);
-  if (bb)
-   retval |= cleanup_tree_cfg_bb (bb);
-}
-
-  /* Now process the altered blocks, as long as any are available.  */
-  while (!bitmap_empty_p (cfgcleanup_altered_bbs))
-{
-  i = bitmap_first_set_bit (cfgcleanup_altered_bbs);
-  bitmap_clear_bit (cfgcleanup_altered_bbs, i);
-  if (i < NUM_FIXED_BLOCKS)
-   continue;
-
-  bb = BASIC_BLOCK_FOR_FN (cfun, i);
-  if (!bb)
-   continue;
-
-  retval |= cleanup_control_flow_bb (bb);
-  retval |= cleanup_tree_cfg_bb (bb);
-}
-
-  end_recording_case_labels ();
-  BITMAP_FREE (cfgcleanup_altered_bbs);
-  return retval;
-}
-
 static bool
 mfb_keep_latches (edge e)
 {
@@ -833,10 +773,7 @@ cleanup_tree_cfg_noloop (void)
 
   timevar_push (TV_TREE_CLEANUP_CFG);
 
-  /* Iterate until there are no more cleanups left to do.  If any
- iteration changed the flowgraph, set CHANGED to true.
-
- If dominance information is available, there cannot be any unreachable
+  /* If dominance information is available, there cannot be any unreachable
  blocks.  */
   if (!dom_info_available_p (CDI_DOMINATORS))
 {
@@ -907,7 +844,52 @@ cleanup_tree_cfg_noloop (void)
  }
 }
 
-  changed |= cleanup_tree_cfg_1 ();
+  /* Prepare the worklists of altered blocks.  */
+  cfgcleanup_altered_bbs = BITMAP_ALLOC (NULL);
+
+  /* Start by iterating over all basic blocks in PRE order looking for
+ edge removal opportunities.  Do this first because incoming SSA form
+ may be invalid and we want to avoid performing SSA related tasks such
+ as propgating out a PHI node during BB merging in that state.  */
+  changed |= cleanup_control_flow_pre ();
+
+  /* After doing the above SSA form should be valid (or an update SSA
+ should be required).  */
+
+  /* During forwarder block cleanup, we may redirect edges out of
+ SWITCH_EXPRs, which can get expensive.  So we want to enable
+ recording of edge to CASE_LABEL_EXPR.  */
+  start_recording_case_labels ();
+
+  /* Continue by iterating over all basic blocks looking for BB merging
+ opportunities.  We cannot use FOR_EACH_BB_FN for the BB iteration
+ since the basic blocks may get removed.  */
+  unsigned n = last_basic_block_for_fn (cfun);
+  for (unsigned i = NUM_FIXED_BLOCKS; i < n; i++)
+{
+  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
+  if (bb)
+   changed |= cleanup_tree_cfg_bb (bb);
+}
+
+  /* Now process the altered blocks, as long as any are available.  */
+  while (!bitmap_empty_p (cfgcleanup_altered_bbs))
+{
+  unsigned i = bitmap_first_set_bit (cfgcleanup_altered_bbs);
+  bitmap_clear_bit (cfgcleanup_altered_bbs, i);
+  if (i < NUM_FIXED_BLOCKS)
+   continue;
+
+  basic_block bb = BASIC_BLOCK_FOR_FN (cfun, i);
+  if (!bb)
+   continue;
+
+  changed |= cleanup_control_flow_bb (bb);
+  changed |= cleanup_tree_cfg_bb (bb);
+}
+
+  end_recording_case_labels ();
+  BITMAP_FREE (cfgcleanup_altered_bbs);
 
   gcc_assert (dom_info_available_p (CDI_DOMINATORS));
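
The loop over cfgcleanup_altered_bbs in the hunk above is a worklist drain: take the lowest set bit, clear it, skip fixed or already-removed blocks, and let the cleanup routines queue further blocks. A minimal standalone sketch of that pattern follows. It assumes a std::set as a stand-in for GCC's bitmap and a caller-supplied callback in place of cleanup_control_flow_bb/cleanup_tree_cfg_bb; drain_altered, worklist and the callback signature are hypothetical names, not GCC API.

```cpp
#include <cassert>
#include <functional>
#include <set>
#include <vector>

// Toy stand-in for the bitmap of altered basic-block indices.
using worklist = std::set<unsigned>;

// Drain WORK lowest index first.  Indices below NUM_FIXED and indices of
// removed blocks (alive[i] == false) are skipped.  PROCESS may insert
// further indices into WORK and returns true if it changed anything.
bool drain_altered(worklist &work, unsigned num_fixed,
                   const std::vector<bool> &alive,
                   const std::function<bool(unsigned, worklist &)> &process)
{
  assert(num_fixed <= alive.size());
  bool changed = false;
  while (!work.empty())
    {
      // Equivalent of bitmap_first_set_bit + bitmap_clear_bit.
      unsigned i = *work.begin();
      work.erase(work.begin());
      if (i < num_fixed)
        continue;
      if (i >= alive.size() || !alive[i])
        continue;
      changed |= process(i, work);
    }
  return changed;
}
```

Because the index is removed from the set before the callback runs, the callback can safely re-queue the same block or queue others without invalidating the loop.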
 


Re: [wwwdocs] Buildstat update for 6.x

2018-06-04 Thread Tom G. Christensen

On 04/06/18 00:51, Gerald Pfeifer wrote:

One suggestion:  Since you follow a "latest first" for the entries in
the file, would it make sense to follow the same order for patch
summaries/commit messages as well?



Sure. I'll reverse the sort order in the summary for the next update.

-tgc


[patch] Remove redundant intrinsics

2018-06-04 Thread Koval, Julia
Hi,

Since the pre-Icelake ISA already had 128-bit versions of vpclmul and vaes, we already 
have intrinsics for them (_mm_aesdec_si128, _mm_aesdeclast_si128, 
_mm_aesenc_si128, _mm_aesenclast_si128, _mm_clmulepi64_si128). Therefore the 
intrinsics for them that were introduced with the Icelake instructions are redundant. This 
patch removes them. Ok for trunk?

gcc/
* config/i386/vaesintrin.h (_mm_aesdec_epi128, _mm_aesdeclast_epi128,
_mm_aesenc_epi128, _mm_aesenclast_epi128): Remove.
* config/i386/vpclmulqdqintrin.h (_mm_clmulepi64_epi128): Remove.

gcc/testsuite/
* gcc.target/i386/avx512fvl-vaes-1.c: Remove 128bit versions from test.
* gcc.target/i386/vpclmulqdq.c: Ditto.

Thanks,
Julia


remove_in.patch
Description: remove_in.patch


Re: [Patch, Fortran] PR25829: Asynchronous I/O (v2)

2018-06-04 Thread Dominique d'Humières
Hi Nicolas,

I have applied your patch on top of revision r261130 on x86_64-apple-darwin17 
(SSD with APFS file system).

The only remaining failure on my own tests is for the test (pr35840)

write(10,*, asynchronous="Y"//"E"//trim("S  "))
end

giving at run time

At line 1 of file pr35840.f90 (unit = 10, file = 'fort.10')
Fortran runtime error: ASYNCHRONOUS transfer without ASYHCRONOUS='YES' in OPEN

I also see two regressions

FAIL: gfortran.dg/f2003_inquire_1.f03   -O1  execution test

only with -m32 and -O1 (STOP 5), and

FAIL: gfortran.dg/f2003_io_1.f03   -O*

with both -m32 and -m64 (STOP 1).

There are also typos in the added tests

s/libgfomp/libgomp/

Why do the tests start at asynchronous_6.f90?

"Treat asynchronous variables the same as volatile, for now." could probably 
be simplified to
"Treat asynchronous variables as volatile, for now."

I also wonder if

+wrap_scalar_transfer (dtp, BT_INTEGER, p, kind, kind, 1);

is correct without a cast to size_t for the last two arguments (and for the 
last argument in other instances). Note that I am C challenged, so forgive the 
question if it is stupid.

Thanks for the nice work.

Dominique




[MAINTAINERS, committed] Remove redundant write-after-approval entries

2018-06-04 Thread Tom de Vries

[ was: Re: [MAINTAINERS, committed] Update email address ]

On 06/02/2018 10:28 AM, Gerald Pfeifer wrote:

Hi Tom,

On Fri, 1 Jun 2018, tdevries wrote:

I've updated my email address in the MAINTAINERS file.


it looks like you're still in the Write After Approval section while
also being listed as maintainer for nvptx?  The former is redundant,
I suggest to remove it (and only add it should you ever step down as
maintainer).


Hi,

I've written a script ./maintainers-find-redundant-wap-entries.sh:
...
$ ./maintainers-find-redundant-wap-entries.sh
Redundant in write approval: Segher Boessenkool
Redundant in write approval: Michael Eager
Redundant in write approval: Ilya Enkovich
Redundant in write approval: Tristan Gingold
Redundant in write approval: Anthony Green
Redundant in write approval: Olivier Hainque
Redundant in write approval: Richard Henderson
Redundant in write approval: Martin Jambor
Redundant in write approval: Andrew Jenner
Redundant in write approval: Kaz Kojima
Redundant in write approval: Maxim Kuvyrkov
Redundant in write approval: Sandra Loosemore
Redundant in write approval: David Malcolm
Redundant in write approval: Ramana Radhakrishnan
Redundant in write approval: Pierre-Marie de Rodat
Redundant in write approval: Iain Sandoe
Redundant in write approval: Marcus Shawcroft
Redundant in write approval: Chung-Lin Tang
Redundant in write approval: Caroline Tice
Redundant in write approval: Tom de Vries
Redundant in write approval: Chung-Ju Wu
Redundant in write approval: Kirill Yukhin
...

I've removed all the redundant entries in the attached patch.

Committed as obvious.

Thanks,
- Tom
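
The attached script itself is not reproduced in this message. As a rough illustration of the check it reports on, the sketch below assumes the two MAINTAINERS sections have already been parsed into sets of names and simply intersects them; redundant_wap_entries and the set-based input are assumptions for illustration, not a description of how the shell script works.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Return every name that is listed both as a component maintainer and in
// the Write After Approval section.  std::set keeps both inputs sorted,
// which std::set_intersection requires.
std::vector<std::string>
redundant_wap_entries(const std::set<std::string> &maintainers,
                      const std::set<std::string> &write_after_approval)
{
  std::vector<std::string> redundant;
  std::set_intersection(maintainers.begin(), maintainers.end(),
                        write_after_approval.begin(),
                        write_after_approval.end(),
                        std::back_inserter(redundant));
  return redundant;
}
```

Each returned name would correspond to one "Redundant in write approval:" line in the output quoted above.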


maintainers-find-redundant-wap-entries.sh
Description: application/shellscript
[MAINTAINERS] Remove redundant write-after-approval entries

2018-06-02  Tom de Vries  

	* MAINTAINERS: Remove write-after-approval entries for component
	maintainers.

---
 MAINTAINERS | 22 --
 1 file changed, 22 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 7ddd2e72df0..f9f376a6616 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -323,7 +323,6 @@ Laurynas Biveinis
 Eric Blake	
 Phil Blundell	
 Hans Boehm	
-Segher Boessenkool
 Lynn Boger	
 Ian Bolton	
 Andrea Bona	
@@ -364,13 +363,11 @@ Chris Demetriou	
 Sameera Deshpande
 Wilco Dijkstra	
 Benoit Dupont de Dinechin			
-Michael Eager	
 Jason Eckhardt	
 Bernd Edlinger	
 Phil Edwards	
 Steve Ellcey	
 Mohan Embar	
-Ilya Enkovich	
 Revital Eres	
 Marc Espie	
 Ansgar Esztermann
@@ -393,12 +390,10 @@ Gopalasubramanian Ganesh			
 Kaveh Ghazi	
 Doug Gilmore	
 Matthew Gingell	
-Tristan Gingold	
 Jan-Benedict Glaw
 Marc Glisse	
 Prachi Godbole	
 Torbjorn Granlund
-Anthony Green	
 Doug Gregor	
 Matthew Gretton-Dann
 Yury Gribov	
@@ -407,7 +402,6 @@ Laurent Guerby	
 Xuepeng Guo	
 Wei Guozhi	
 Mostafa Hagog	
-Olivier Hainque	
 Andrew Haley		
 Stuart Hastings	
 Michael Haubenwallner
@@ -418,7 +412,6 @@ Mark Heffernan	
 George Helffrich
 Daniel Hellstrom
 Fergus Henderson
-Richard Henderson
 Stuart Henderson
 Matthew Hiller	
 Kazu Hirata	
@@ -437,8 +430,6 @@ Daniel Jacobowitz
 Andreas Jaeger	
 Harsha Jagasia	
 Fariborz Jahanian
-Martin Jambor	
-Andrew Jenner	
 Janis Johnson	
 Teresa Johnson	
 Kean Johnston	
@@ -452,7 +443,6 @@ Andi Kleen	
 Jeff Knaggs	
 Michael Koch	
 Nicolas Koenig	
-Kaz Kojima	
 Boris Kolpackov 
 Dave Korn	
 Julia Koval	
@@ -461,7 +451,6 @@ Jan Kratochvil	
 Louis Krupp	
 Prathamesh Kulkarni
 Venkataramanan Kumar
-Maxim Kuvyrkov	
 Doug Kwan	
 Scott Robert Ladd
 Razya Ladelsky	
@@ -486,7 +475,6 @@ Sa Liu		
 Ralph Loader	
 Sheldon Lobo	
-Sandra Loosemore
 Manuel López-Ibáñez
 Carl Love	
 Martin v. Löwis	
@@ -494,7 +482,6 @@ H.J. Lu		
 Christophe Lyon	
 Luis Machado	
 Ziga Mahkovec	
-David Malcolm	
 Mikhail Maltsev	
 Jose E. Marchesi
 Patrick Marlier	
@@ -549,14 +536,12 @@ Thomas Preud'homme
 Vladimir Prus	
 Yao Qi		
 Jerry Quinn	
-Ramana Radhakrishnan
 Easwaran Raman	
 Rolf Rasmussen	
 Fritz Reese	
 Volker Reichelt	
 Bernhard Reutner-Fischer			
 Tom Rix		
-Pierre-Marie de Rodat
 Craig Rodrigues	
 Erven Rohou	
 Ira Rosen	
@@ -565,7 +550,6 @@ Maciej W. Rozycki
 Silvius Rus	
 Matthew Sachs	
 Hariharan Sandanagobalane			
-Iain Sandoe	
 Duncan Sands	
 Sujoy Saraswati	
 Trevor Saunders	
@@ -578,7 +562,6 @@ Svein Seldal	
 Senthil Kumar Selvaraj
 Thiemo Seufer	
 Bill Seurer	
-Marcus Shawcroft
 Tim Shen	
 David Sherwood	
 Sharad Singhai	
@@ -600,11 +583,9 @@ Andrew Sutton	
 Gabriele Svelto	
 Toma Tabacu	
 Sriraman Tallam	

Re: [PATCH] PR85964

2018-06-04 Thread Richard Biener
On Wed, 30 May 2018, Jan Hubicka wrote:

> > 
> > This makes tracer not explode with -fno-guess-branch-probabilities.
> > I've settled with find_best_successor/predecessor not returning
> > anything if _any_ edge in the interesting direction doesn't have
> > ->count () initialized (rather than ignoring such edges).
> > 
> > Honza - I suppose it is on purpose that functions like
> > .to_frequency () do not ICE for uninitialized counters?
> > It at least looks like "previous" behavior was more sane
> > for tracer in the counts/frequencies that were exposed.
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > Honza, does this look OK to you?
> > 
> > tracer going wild on this testcase exposes the CFG cleanup
> > scalability issue I've posted the following RFC for:
> > https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01644.html
> > 
> > Thanks,
> > Richard.
> > 
> > 2018-05-30  Richard Biener  
> > 
> > PR tree-optimization/85964
> > * tracer.c (better_p): Drop initialized count check, we only
> > call the function with initialized counts now.
> > (find_best_successor): Do find a best edge if one
> > has uninitialized count.
> > (find_best_predecessor): Likewise.  Do BB frequency check only
> > if count is initialized.
> 
> > 
> > Index: gcc/tracer.c
> > ===
> > --- gcc/tracer.c(revision 260896)
> > +++ gcc/tracer.c(working copy)
> > @@ -132,8 +132,7 @@ count_insns (basic_block bb)
> >  static bool
> >  better_p (const_edge e1, const_edge e2)
> >  {
> > -  if (e1->count ().initialized_p () && e2->count ().initialized_p ()
> > -  && ((e1->count () > e2->count ()) || (e1->count () < e2->count  (
> > +  if ((e1->count () > e2->count ()) || (e1->count () < e2->count ()))
> >  return e1->count () > e2->count ();
> >/* This is needed to avoid changes in the decision after
> >   CFG is modified.  */
> > @@ -152,12 +151,15 @@ find_best_successor (basic_block bb)
> >edge_iterator ei;
> >  
> >FOR_EACH_EDGE (e, ei, bb->succs)
> > -if (!best || better_p (e, best))
> > -  best = e;
> > +{
> > +  if (!e->count ().initialized_p ())
> > +   return NULL;
> > +  if (!best || better_p (e, best))
> > +   best = e;
> > +}
> >if (!best || ignore_bb_p (best->dest))
> >  return NULL;
> > -  if (best->probability.initialized_p ()
> > -  && best->probability.to_reg_br_prob_base () <= probability_cutoff)
> > +  if (best->probability.to_reg_br_prob_base () <= probability_cutoff)
> 
> Technically we could accept the case where one edge has a large known
> probability and the other is unknown, but in practice it won't matter
> because without profile guessing tracer is useless anyway.

So the above hunk requires the fix below.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-06-04  Richard Biener  

PR tree-optimization/86038
* tracer.c (find_best_successor): Check probability for
being initialized, bail out if not.

* gcc.dg/pr86038.c: New testcase.

Index: gcc/tracer.c
===
--- gcc/tracer.c(revision 261136)
+++ gcc/tracer.c(working copy)
@@ -159,7 +159,8 @@ find_best_successor (basic_block bb)
 }
   if (!best || ignore_bb_p (best->dest))
 return NULL;
-  if (best->probability.to_reg_br_prob_base () <= probability_cutoff)
+  if (!best->probability.initialized_p ()
+  || best->probability.to_reg_br_prob_base () <= probability_cutoff)
 return NULL;
   return best;
 }
Index: gcc/testsuite/gcc.dg/pr86038.c
===
--- gcc/testsuite/gcc.dg/pr86038.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr86038.c  (working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target pthread } */
+/* { dg-options "-O2 -ftracer -ftree-parallelize-loops=2 -fno-tree-scev-cprop --param parloops-schedule=dynamic" } */
+
+int
+sd (int lw)
+{
+  while (lw < 1)
+++lw;
+
+  return lw;
+}
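
Taken together, the two tracer.c patches in this thread make find_best_successor give up when any outgoing edge has an uninitialized count, and also when the chosen edge has an uninitialized probability. A simplified standalone sketch of that decision logic is given below. It assumes plain structs with "known" flags in place of profile_count/profile_probability, returns an index instead of an edge, and leaves out the ignore_bb_p check and the tie-breaking in better_p; edge_info and its fields are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy edge: a count and a probability, each of which may be uninitialized.
struct edge_info
{
  bool count_known;
  long count;
  bool prob_known;
  int prob;
};

// Return the index of the best successor edge, or -1 if there is none.
int find_best_successor(const std::vector<edge_info> &succs,
                        int probability_cutoff)
{
  int best = -1;
  for (std::size_t k = 0; k < succs.size(); ++k)
    {
      // Any edge without a count makes the comparison meaningless.
      if (!succs[k].count_known)
        return -1;
      if (best < 0 || succs[k].count > succs[best].count)
        best = static_cast<int>(k);
    }
  if (best < 0)
    return -1;
  // The PR 86038 fix: check initialization before reading the value.
  if (!succs[best].prob_known || succs[best].prob <= probability_cutoff)
    return -1;
  return best;
}
```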


Re: [PATCH] Print working directory only in intermediate format (PR gcov-profile/84846).

2018-06-04 Thread Martin Liška
PING^1

On 05/25/2018 01:36 PM, Martin Liška wrote:
> Hi.
> 
> As requested by Eric, let's print working directory just in intermediate 
> format:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84846#c8
> 
> gcov.exp tests work with the patch.
> 
> Ready for trunk?
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-05-25  Martin Liska  
> 
>   PR gcov-profile/84846
>   * gcov.c (output_lines): Print working directory only
>   in intermediate format.
> ---
>  gcc/gcov.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> 



[arm] PR target/86003 build failures with --with-cpu=xscale

2018-06-04 Thread Richard Earnshaw (lists)
The XScale cpu configuration in GCC has always been somewhat
non-conforming.  Although XScale isn't an architecture (it's simply an
implementation of ARMv5te), we do by tradition emit a specific
pre-define for it.  We achieve this effect by adding an additional
feature bit to the xscale CPU definition that isn't part of the base
architecture.

When I restructured the options last year I overlooked this oddity and
the result, of course, is that this configuration now fails to build
as intended.

What happens is that the driver (correctly) constructs an architecture
for the xscale cpu name (as armv5te) and passes it in addition to the
CPU name.  The backend code, on finding that both a cpu and an architecture
are specified, attempts to correlate the two, finds a difference due to
the additional feature bit and reports an inconsistency (fatally if
-werror is specified).

I think the best fix to this is to treat the xscale feature bit using
the same mechanism that we use for other 'quirks' in CPU
implementations and simply filter it out before comparing the
capabilities.  It has the additional benefit that it's also the
simplest fix.

PR target/86003
* config/arm/arm-cpus.in (ALL_QUIRKS): Add xscale feature to the
list of bits to ignore when comparing architectures.


Committed to trunk and gcc-8 branch
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 96972a0..545321b 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -268,7 +268,9 @@ define fgroup DOTPROD	NEON dotprod
 
 # List of all quirk bits to strip out when comparing CPU features with
 # architectures.
-define fgroup ALL_QUIRKS   quirk_no_volatile_ce quirk_armv6kz quirk_cm3_ldrd
+# xscale isn't really a 'quirk', but it isn't an architecture either and we
+# need to ignore it for matching purposes.
+define fgroup ALL_QUIRKS   quirk_no_volatile_ce quirk_armv6kz quirk_cm3_ldrd xscale
 
 # Architecture entries
 # format:
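
The effect of adding xscale to ALL_QUIRKS can be pictured as masking before comparison. The sketch below is only an illustration under the assumption that feature sets are plain bitmasks; the FEAT_* and QUIRK_* constants and features_match are made-up names, not the arm backend's actual representation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical feature bits for illustration only.
const std::uint32_t FEAT_ARMV5TE = 1u << 0;
const std::uint32_t FEAT_XSCALE = 1u << 1;
const std::uint32_t QUIRK_CM3_LDRD = 1u << 2;

// Bits that must be ignored when comparing a CPU with an architecture.
const std::uint32_t ALL_QUIRKS = QUIRK_CM3_LDRD | FEAT_XSCALE;

// Compare CPU and architecture capabilities with the quirk bits removed.
bool features_match(std::uint32_t cpu, std::uint32_t arch)
{
  return (cpu & ~ALL_QUIRKS) == (arch & ~ALL_QUIRKS);
}
```

With FEAT_XSCALE in the mask, a CPU described as FEAT_ARMV5TE | FEAT_XSCALE compares equal to an architecture that is just FEAT_ARMV5TE, which is the behaviour the patch restores.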


Re: [RFC][PR64946] "abs" vectorization fails for char/short types

2018-06-04 Thread Richard Biener
On Mon, Jun 4, 2018 at 10:18 AM Kugan Vivekanandarajah
 wrote:
>
> Hi Richard,
>
> Thanks for the review.
>
> On 1 June 2018 at 22:20, Richard Biener  wrote:
> > On Fri, Jun 1, 2018 at 4:12 AM Kugan Vivekanandarajah
> >  wrote:
> >>
> >> Hi Richard,
> >>
> >> This is the revised patch based on the review and the discussion in
> >> https://gcc.gnu.org/ml/gcc/2018-05/msg00179.html.
> >>
> >> In summary:
> >> - I skipped  (element_precision (type) < element_precision (TREE_TYPE
> >> (@0))) in the match.pd pattern as this would prevent transformation
> >> for the case in the PR.
> >> That is, what I am interested in is something like:
> >>   char t = (char) ABS_EXPR <(int) x>
> >> and I want to generate
> >> char t = (char) ABSU_EXPR 
> >>
> >> - I also haven't added all the necessary match.pd changes for
> >> ABSU_EXPR. I have a patch for that but will submit it separately based on
> >> this review.
> >>
> >> - I also tried to add ABSU_EXPR support in the places where it is needed by
> >> grepping for ABS_EXPR.
> >>
> >> - I also had to add special casing in the vectorizer for ABSU_EXPR as its
> >> result is an unsigned type.
> >>
> >> Is this OK? The patch bootstraps and the regression test is ongoing.
> >
> > The c/c-typeck.c:build_unary_op change looks unnecessary - the
> > C FE should never generate this directly (the c-common one might
> > be triggered by early folding I guess).
>
> The Gimple FE testcase is running into this.

Ah, OK then.

> >
> > @@ -1761,6 +1762,9 @@ const_unop (enum tree_code code, tree type, tree arg0)
> >if (TREE_CODE (arg0) == INTEGER_CST || TREE_CODE (arg0) == REAL_CST)
> > return fold_abs_const (arg0, type);
> >break;
> > +case ABSU_EXPR:
> > +   return fold_convert (type, fold_abs_const (arg0,
> > +  signed_type_for (type)));
> >
> >  case CONJ_EXPR:
> >
> > I think this will get you bogus TREE_OVERFLOW flags set on ABSU (-INT_MIN).
> >
> > I think you want to change fold_abs_const to properly deal with arg0 being
> > signed and type unsigned.  That is, sth like
> >
> > diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> > index 6f80f1b1d69..f60f9c77e91 100644
> > --- a/gcc/fold-const.c
> > +++ b/gcc/fold-const.c
> > @@ -13843,18 +13843,19 @@ fold_abs_const (tree arg0, tree type)
> >{
> >  /* If the value is unsigned or non-negative, then the absolute value
> >is the same as the ordinary value.  */
> > -   if (!wi::neg_p (wi::to_wide (arg0), TYPE_SIGN (type)))
> > - t = arg0;
> > +   wide_int val = wi::to_wide (arg0);
> > +   bool overflow = false;
> > +   if (!wi::neg_p (val, TYPE_SIGN (TREE_TYPE (arg0
> > + ;
> >
> > /* If the value is negative, then the absolute value is
> >its negation.  */
> > else
> > - {
> > -   bool overflow;
> > -   wide_int val = wi::neg (wi::to_wide (arg0), );
> > -   t = force_fit_type (type, val, -1,
> > -   overflow | TREE_OVERFLOW (arg0));
> > - }
> > + wide_int val = wi::neg (val, );
> > +
> > +   /* Force to the destination type, set TREE_OVERFLOW for signed
> > +  TYPE only.  */
> > +   t = force_fit_type (type, val, 1, overflow | TREE_OVERFLOW (arg0));
> >}
> >break;
> >
> > and then simply share the const_unop code with ABS_EXPR.
>
> Done.
>
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index 14386da..7d7c132 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -102,6 +102,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  (match (nop_convert @0)
> >   @0)
> >
> > +(simplify (abs (convert @0))
> > + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > +  && !TYPE_UNSIGNED (TREE_TYPE (@0))
> > +  && !TYPE_UNSIGNED (type))
> > +  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> > +   (convert (absu:utype @0)
> > +
> > +
> >
> > please put a comment before the pattern.  I believe there's no
> > need to check for !TYPE_UNSIGNED (type).  Note this
> > also converts abs ((char)int-var) to (char)absu(int-var) which
> > doesn't make sense.  The original issue you want to address
> > here is the case where TYPE_PRECISION of @0 is less than
> > the precision of type.  That is, you want to remove language
> > introduced integer promotion of @0 which only is possible
> > with ABSU.  So please do add such precision check
> > (I simply suggested the bogus direction of the test).
>
> Done.
> >
> > diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> > index 68f4fd3..9b62583 100644
> > --- a/gcc/tree-cfg.c
> > +++ b/gcc/tree-cfg.c
> > @@ -3685,6 +3685,12 @@ verify_gimple_assign_unary (gassign *stmt)
> >  case PAREN_EXPR:
> >  case CONJ_EXPR:
> >break;
> > +case ABSU_EXPR:
> > +  if (!TYPE_UNSIGNED (lhs_type)
> > + || !ANY_INTEGRAL_TYPE_P (rhs1_type))
> >
> >  if (!ANY_INTEGRAL_TYPE_P (lhs_type)
> >  || !TYPE_UNSIGNED (lhs_type)
> >  || 
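
The TREE_OVERFLOW concern above comes down to the one input for which a signed absolute value does not fit its type. A small standalone sketch of the difference, assuming a two's complement int and using hypothetical helper names (abs_checked, absu) rather than anything from fold-const.c:

```cpp
#include <cassert>
#include <climits>

// Signed absolute value in the spirit of ABS_EXPR: the result type is int,
// so INT_MIN has no representable answer and is flagged as overflow.
int abs_checked(int x, bool *overflow)
{
  *overflow = (x == INT_MIN);
  if (*overflow)
    return INT_MIN;  // what a wrapping negation would produce
  return x < 0 ? -x : x;
}

// Unsigned absolute value in the spirit of ABSU_EXPR: the result type is
// unsigned, so every input, including INT_MIN, has a representable result.
unsigned absu(int x)
{
  return x < 0 ? 0u - static_cast<unsigned>(x) : static_cast<unsigned>(x);
}
```

Since absu never overflows, a constant folder for it has no reason to set an overflow flag of its own, which matches the suggestion to set TREE_OVERFLOW for signed TYPE only.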

Re: [C++ Patch] Fix some locations

2018-06-04 Thread Paolo Carlini

Hi,

On 01/06/2018, 23:29, Jason Merrill wrote:

On Fri, Jun 1, 2018 at 5:03 PM, Paolo Carlini  wrote:

while working on some bugs I noticed that in a few places in decl.c we could
do better in terms of locations within the current infrastructure, some
simple, straightforward improvements. I'm attaching below a tiny first
patch, more to follow, I hope.

For example, a function which could be improved in many places is
grok_ctor_properties and, before going ahead, I'd like to ask about
something I noticed yesterday:

   /* There can be no default arguments.  */
   for (tree arg = argtypes; arg != void_list_node; arg = TREE_CHAIN (arg))
 if (TREE_PURPOSE (arg))
   {
 TREE_PURPOSE (arg) = NULL_TREE;
 if (operator_code == POSTINCREMENT_EXPR
 || operator_code == POSTDECREMENT_EXPR)
   pedwarn (input_location, OPT_Wpedantic,
"%qD cannot have default arguments", decl);
 else
   {
 error ("%qD cannot have default arguments", decl);
 return false;
   }
   }

the special casing seems weird, so far I haven't been able to find anything
in the standard about it and all the other compilers I have at hand (Oracle,
Intel, Clang) don't seem to have it. Shall we keep it or not? Maybe with an
additional comment explaining the rationale? The affected testcases would be
parse/defarg11.C and g++.jason/operator.C.

I think in the olden days, with a default argument one function could
support both prefix and postfix inc/decrement syntax:

struct A {
   A& operator++(int = 0);
};

int main()
{
   A a;
   ++a;
   a++;
}

...but that hasn't actually worked in forever, so it shouldn't be a
problem to make the ++ case an error, too.
Thanks for the explanation. If we can remove the special casing, fixing 
some other issues will be easier; for example, setting TREE_PURPOSE 
unconditionally to error_mark_node fixes a Bugzilla bug (I don't have the 
number with me, more when I'm back home next week ;)


Thanks,
Paolo.
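
For contrast with the single-function example quoted above, here is a minimal sketch of the usual way to support both forms, with separate prefix and postfix overloads and no default argument. The counter type is made up for illustration.

```cpp
#include <cassert>

struct counter
{
  int value = 0;

  // Prefix form: increment, then return the updated object.
  counter &operator++()
  {
    ++value;
    return *this;
  }

  // Postfix form: the unnamed int parameter only selects this overload.
  counter operator++(int)
  {
    counter old = *this;
    ++value;
    return old;
  }
};
```

With both overloads declared, ++c and c++ each resolve to their own function, so nothing depends on a default argument for the dummy int.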


Re: [RFC][PR64946] "abs" vectorization fails for char/short types

2018-06-04 Thread Kugan Vivekanandarajah
Hi Richard,

Thanks for the review.

On 1 June 2018 at 22:20, Richard Biener  wrote:
> On Fri, Jun 1, 2018 at 4:12 AM Kugan Vivekanandarajah
>  wrote:
>>
>> Hi Richard,
>>
>> This is the revised patch based on the review and the discussion in
>> https://gcc.gnu.org/ml/gcc/2018-05/msg00179.html.
>>
>> In summary:
>> - I skipped  (element_precision (type) < element_precision (TREE_TYPE
>> (@0))) in the match.pd pattern as this would prevent transformation
>> for the case in the PR.
>> That is, what I am interested in is something like:
>>   char t = (char) ABS_EXPR <(int) x>
>> and I want to generate
>> char t = (char) ABSU_EXPR 
>>
>> - I also haven't added all the necessary match.pd changes for
>> ABSU_EXPR. I have a patch for that but will submit it separately based on
>> this review.
>>
>> - I also tried to add ABSU_EXPR support in the places where it is needed by
>> grepping for ABS_EXPR.
>>
>> - I also had to add special casing in the vectorizer for ABSU_EXPR as its
>> result is an unsigned type.
>>
>> Is this OK? The patch bootstraps and the regression test is ongoing.
>
> The c/c-typeck.c:build_unary_op change looks unnecessary - the
> C FE should never generate this directly (the c-common one might
> be triggered by early folding I guess).

The Gimple FE testcase is running into this.

>
> @@ -1761,6 +1762,9 @@ const_unop (enum tree_code code, tree type, tree arg0)
>if (TREE_CODE (arg0) == INTEGER_CST || TREE_CODE (arg0) == REAL_CST)
> return fold_abs_const (arg0, type);
>break;
> +case ABSU_EXPR:
> +   return fold_convert (type, fold_abs_const (arg0,
> +  signed_type_for (type)));
>
>  case CONJ_EXPR:
>
> I think this will get you bogus TREE_OVERFLOW flags set on ABSU (-INT_MIN).
>
> I think you want to change fold_abs_const to properly deal with arg0 being
> signed and type unsigned.  That is, sth like
>
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index 6f80f1b1d69..f60f9c77e91 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -13843,18 +13843,19 @@ fold_abs_const (tree arg0, tree type)
>{
>  /* If the value is unsigned or non-negative, then the absolute value
>is the same as the ordinary value.  */
> -   if (!wi::neg_p (wi::to_wide (arg0), TYPE_SIGN (type)))
> - t = arg0;
> +   wide_int val = wi::to_wide (arg0);
> +   bool overflow = false;
> +   if (!wi::neg_p (val, TYPE_SIGN (TREE_TYPE (arg0
> + ;
>
> /* If the value is negative, then the absolute value is
>its negation.  */
> else
> - {
> -   bool overflow;
> -   wide_int val = wi::neg (wi::to_wide (arg0), );
> -   t = force_fit_type (type, val, -1,
> -   overflow | TREE_OVERFLOW (arg0));
> - }
> + wide_int val = wi::neg (val, );
> +
> +   /* Force to the destination type, set TREE_OVERFLOW for signed
> +  TYPE only.  */
> +   t = force_fit_type (type, val, 1, overflow | TREE_OVERFLOW (arg0));
>}
>break;
>
> and then simply share the const_unop code with ABS_EXPR.

Done.

> diff --git a/gcc/match.pd b/gcc/match.pd
> index 14386da..7d7c132 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -102,6 +102,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (match (nop_convert @0)
>   @0)
>
> +(simplify (abs (convert @0))
> + (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +  && !TYPE_UNSIGNED (TREE_TYPE (@0))
> +  && !TYPE_UNSIGNED (type))
> +  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +   (convert (absu:utype @0)
> +
> +
>
> please put a comment before the pattern.  I believe there's no
> need to check for !TYPE_UNSIGNED (type).  Note this
> also converts abs ((char)int-var) to (char)absu(int-var) which
> doesn't make sense.  The original issue you want to address
> here is the case where TYPE_PRECISION of @0 is less than
> the precision of type.  That is, you want to remove language
> introduced integer promotion of @0 which only is possible
> with ABSU.  So please do add such precision check
> (I simply suggested the bogus direction of the test).

Done.
>
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index 68f4fd3..9b62583 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -3685,6 +3685,12 @@ verify_gimple_assign_unary (gassign *stmt)
>  case PAREN_EXPR:
>  case CONJ_EXPR:
>break;
> +case ABSU_EXPR:
> +  if (!TYPE_UNSIGNED (lhs_type)
> + || !ANY_INTEGRAL_TYPE_P (rhs1_type))
>
>  if (!ANY_INTEGRAL_TYPE_P (lhs_type)
>  || !TYPE_UNSIGNED (lhs_type)
>  || !ANY_INTEGRAL_TYPE_P (rhs1_type)
>  || TYPE_UNSIGNED (rhs1_type)
>  || element_precision (lhs_type) != element_precision (rhs1_type))
>   {
>   error ("invalid types for ABSU_EXPR");
>   debug_generic_expr (lhs_type);
>   debug_generic_expr (rhs1_type);
>  return true;
>   }
>
> +   return true;
> +  return false;
> +  
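
The transformation discussed in this thread replaces an abs computed in the promoted int type with an unsigned abs computed in the narrow type. The sketch below checks that the two agree for every char value once the result is converted back to char. It assumes an 8-bit, two's complement signed char with wrapping narrowing conversions (as on the usual GCC targets), and absu8, abs_via_int and abs_via_absu are hypothetical names for illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>

// Narrow unsigned absolute value: signed char in, unsigned char out.
unsigned char absu8(signed char x)
{
  return x < 0 ? static_cast<unsigned char>(0u - static_cast<unsigned>(x))
               : static_cast<unsigned char>(x);
}

// The original form: promote to int, take abs, truncate back.
signed char abs_via_int(signed char x)
{
  return static_cast<signed char>(std::abs(static_cast<int>(x)));
}

// The rewritten form: unsigned abs in the narrow type, then convert.
signed char abs_via_absu(signed char x)
{
  return static_cast<signed char>(absu8(x));
}
```

The only interesting input is SCHAR_MIN, where both forms produce 128 before the final conversion to char, so the narrow version never needs the wider type.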

[ping] Use response files from the driver in more cases

2018-06-04 Thread Eric Botcazou
https://gcc.gnu.org/ml/gcc-patches/2018-04/msg01172.html

Thanks in advance.

-- 
Eric Botcazou


[PING] [PATCH] Add a comma to comment in lower_eh_constructs_2

2018-06-04 Thread Zhouyi Zhou
Hi,
   The following patch has not been reviewed for two weeks. I think the
patch is trivial; it doesn't affect other parts of GCC.

   https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01005.html

   And I don't have write permission to GCC
Thanks
Zhouyi
On Mon, May 21, 2018 at 12:09 PM, Zhouyi Zhou  wrote:
> Hi
>
> I think the comment in lower_eh_constructs_2 needs a comma.
>
> And I don't have write permission to GCC.
>
> Bootstrapped on x86-64.
>
> Thanks,
> Zhouyi
>
> Signed-off-by: Zhouyi Zhou 
>
> gcc/ChangeLog:
>
> 2018-05-21 Zhouyi Zhou 
> * tree-eh.c (lower_eh_constructs_2): Add a comma to comment.
>
> ---
>  gcc/tree-eh.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-eh.c b/gcc/tree-eh.c
> index 30c6d9e..06fe748 100644
> --- a/gcc/tree-eh.c
> +++ b/gcc/tree-eh.c
> @@ -2028,7 +2028,7 @@ lower_eh_constructs_2 (struct leh_state *state, gimple_stmt_iterator *gsi)
>/* FALLTHRU */
>
>  case GIMPLE_ASSIGN:
> -  /* If the stmt can throw use a new temporary for the assignment
> +  /* If the stmt can throw, use a new temporary for the assignment
>   to a LHS.  This makes sure the old value of the LHS is
>  available on the EH edge.  Only do so for statements that
>  potentially fall through (no noreturn calls e.g.), otherwise
> --
> 2.1.4
>


Re: [PATCH] Improve merge_ranges (PR tree-optimization/69615)

2018-06-04 Thread Richard Biener
On Sun, 3 Jun 2018, Jakub Jelinek wrote:

> Hi!
> 
> We canonicalize x < type_max_val (type (x)) as x != type_max_val (type (x))
> and similarly for x > type_min_val (type (x)).  Unfortunately the former
> form is what is often more beneficial for merge_ranges, if we have as in the
> testcase e.g. x >= 0 && x != __INT_MAX__, in?_p is different and we don't
> optimize it, but it can be optimized into [0, __INT_MAX__-1] range.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.

> 2018-06-03  Jakub Jelinek  
> 
>   PR tree-optimization/69615
>   * fold-const.c (merge_ranges): If range1 is - [x, x] and x is the
>   maximum or minimum of the type, try to merge it also as if
>   range1 is + [-, x - 1] or + [x + 1, -].
> 
>   * gcc.dg/pr69615.c: New test.
> 
> --- gcc/fold-const.c.jj   2018-05-31 20:53:33.0 +0200
> +++ gcc/fold-const.c  2018-06-02 19:20:10.210975502 +0200
> @@ -5084,6 +5084,29 @@ merge_ranges (int *pin_p, tree *plow, tr
>tem = high0, high0 = high1, high1 = tem;
>  }
>  
> +  /* If the second range is != high1 where high1 is the type maximum of
> + the type, try first merging with < high1 range.  */
> +  if (low1
> +  && high1
> +  && TREE_CODE (low1) == INTEGER_CST
> +  && (TREE_CODE (TREE_TYPE (low1)) == INTEGER_TYPE
> +   || (TREE_CODE (TREE_TYPE (low1)) == ENUMERAL_TYPE
> +   && known_eq (TYPE_PRECISION (TREE_TYPE (low1)),
> +GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (low1))
> +  && operand_equal_p (low1, high1, 0))
> +{
> +  if (tree_int_cst_equal (low1, TYPE_MAX_VALUE (TREE_TYPE (low1)))
> +   && merge_ranges (pin_p, plow, phigh, in0_p, low0, high0,
> +!in1_p, NULL_TREE, range_predecessor (low1)))
> + return true;
> +  /* Similarly for the second range != low1 where low1 is the type minimum
> +  of the type, try first merging with > low1 range.  */
> +  if (tree_int_cst_equal (low1, TYPE_MIN_VALUE (TREE_TYPE (low1)))
> +   && merge_ranges (pin_p, plow, phigh, in0_p, low0, high0,
> +!in1_p, range_successor (low1), NULL_TREE))
> + return true;
> +}
> +
>/* Now flag two cases, whether the ranges are disjoint or whether the
>   second range is totally subsumed in the first.  Note that the tests
>   below are simplified by the ones above.  */
> --- gcc/testsuite/gcc.dg/pr69615.c.jj 2018-06-02 19:40:26.282663273 +0200
> +++ gcc/testsuite/gcc.dg/pr69615.c2018-06-02 19:40:01.427633205 +0200
> @@ -0,0 +1,37 @@
> +/* PR tree-optimization/69615 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not " >= 0" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not " < 0" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not " <= 23" "optimized" } } */
> +/* { dg-final { scan-tree-dump-not " > 23" "optimized" } } */
> +
> +extern void foo (void);
> +
> +void
> +f1 (int x)
> +{
> +  if (x >= 0 && x <= __INT_MAX__ - 1)
> +foo ();
> +}
> +
> +void
> +f2 (int x, int y)
> +{
> +  if (x >= 0 && y && x <= __INT_MAX__ - 1)
> +foo ();
> +}
> +
> +void
> +f3 (int x)
> +{
> +  if (x > -__INT_MAX__ - 1 && x <= 23)
> +foo ();
> +}
> +
> +void
> +f4 (int x, int y)
> +{
> +  if (x > -__INT_MAX__ - 1 && y && x <= 23)
> +foo ();
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
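
The idea in the quoted patch is that an excluded single value at the end of the type's range can be re-read as an included range, which then merges with the other condition. A small sketch of that rewrite follows, assuming plain int bounds where the patch uses NULL_TREE for "unbounded" and a simple intersection where merge_ranges handles many more cases; int_range, widen_excluded_extreme and intersect are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// in_p == true means "x is in [low, high]", false means "x is not in it".
struct int_range
{
  bool in_p;
  int low;
  int high;
};

// If R is "- [x, x]" with x the type maximum or minimum, rewrite it as the
// equivalent included range "+ [min, x - 1]" or "+ [x + 1, max]".
bool widen_excluded_extreme(const int_range &r, int_range *out)
{
  if (r.in_p || r.low != r.high)
    return false;
  if (r.low == INT_MAX)
    {
      *out = int_range{true, INT_MIN, INT_MAX - 1};
      return true;
    }
  if (r.low == INT_MIN)
    {
      *out = int_range{true, INT_MIN + 1, INT_MAX};
      return true;
    }
  return false;
}

// Intersect two included ranges; fails if either is excluded or they are
// disjoint.
bool intersect(const int_range &a, const int_range &b, int_range *out)
{
  if (!a.in_p || !b.in_p)
    return false;
  int low = std::max(a.low, b.low);
  int high = std::min(a.high, b.high);
  if (low > high)
    return false;
  *out = int_range{true, low, high};
  return true;
}
```

For x >= 0 && x != __INT_MAX__, the second condition is "- [INT_MAX, INT_MAX]"; after the rewrite it intersects with [0, INT_MAX] to give [0, INT_MAX - 1], the range mentioned in the patch description.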


Re: [PATCH] Fix -Wduplicate-branches ICE in inchash::add_expr (PR c++/86025)

2018-06-04 Thread Richard Biener
On Sun, 3 Jun 2018, Jakub Jelinek wrote:

> Hi!
> 
> The second argument of OMP_CRITICAL is an IDENTIFIER_NODE, but we were ICEing
> on that with -Wduplicated-branches, because we didn't know how to hash it.
> operand_equal_p already handles it the way it should: only pointer-equal
> IDENTIFIER_NODEs are the same.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2018-06-03  Jakub Jelinek  
> 
>   PR c++/86025
>   * tree.c (inchash::add_expr): Handle IDENTIFIER_NODE.
> 
>   * c-c++-common/gomp/pr86025.c: New test.
> 
> --- gcc/tree.c.jj 2018-05-25 14:34:36.869377392 +0200
> +++ gcc/tree.c 2018-06-01 15:36:40.882938758 +0200
> @@ -7360,6 +7360,9 @@ add_expr (const_tree t, inchash::hash 
>for (i = 0; i < TREE_VEC_LENGTH (t); ++i)
>   inchash::add_expr (TREE_VEC_ELT (t, i), hstate, flags);
>return;
> +case IDENTIFIER_NODE:
> +  hstate.add_object (IDENTIFIER_HASH_VALUE (t));
> +  return;
>  case FUNCTION_DECL:
>/* When referring to a built-in FUNCTION_DECL, use the __builtin__ form.
>Otherwise nodes that compare equal according to operand_equal_p might
> --- gcc/testsuite/c-c++-common/gomp/pr86025.c.jj 2018-06-01 15:55:50.377382152 +0200
> +++ gcc/testsuite/c-c++-common/gomp/pr86025.c 2018-06-01 15:55:32.632359904 +0200
> @@ -0,0 +1,20 @@
> +/* PR c++/86025 */
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wduplicated-branches" } */
> +
> +int i;
> +
> +void
> +foo (int x)
> +{
> +  if (x)
> +{
> +  #pragma omp critical (foo)
> +  i++;
> +}
> +  else
> +{
> +  #pragma omp critical
> +  i++;
> +}
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)
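
For readers following the PR c++/86025 fix above: the point is that the hash must agree with the equality test, and for identifiers the equality test is pointer identity. The following is a minimal C++ sketch of that idea only. `Ident`, `IdentTable`, `same_ident` and `hash_ident` are hypothetical names made up for illustration; they are not GCC's tree API, and the cached hash merely plays the role that IDENTIFIER_HASH_VALUE plays in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an interned identifier node: each distinct
// spelling is stored exactly once, so pointer equality means "same name".
struct Ident {
  std::string name;
  std::size_t hash_value;  // cached when the identifier is interned
};

// Hypothetical intern table.  Pointers to unordered_map elements stay
// valid across rehashing, so handing out element addresses is safe.
class IdentTable {
  std::unordered_map<std::string, Ident> table_;

public:
  const Ident *intern(const std::string &s) {
    auto it = table_.find(s);
    if (it == table_.end())
      it = table_.emplace(s, Ident{s, std::hash<std::string>{}(s)}).first;
    return &it->second;
  }
};

// Equality is pointer identity, as described for IDENTIFIER_NODEs.
bool same_ident(const Ident *a, const Ident *b) { return a == b; }

// The hash only needs to be stable for a given node; the cached value
// guarantees that nodes comparing equal also hash equal.
std::size_t hash_ident(const Ident *id) {
  assert(id != nullptr);
  return id->hash_value;
}
```

Interning "foo" twice yields the same pointer and therefore the same hash, while "foo" and "bar" yield different pointers, which is all the duplicated-branches comparison needs.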


Re: [PATCH] PR target/85358: Add target hook to prevent default widening

2018-06-04 Thread Richard Biener
On Fri, 1 Jun 2018, Joseph Myers wrote:

> On Fri, 1 Jun 2018, Michael Meissner wrote:
> 
> > I'm wondering if there are other suggestions to make this patch acceptable.
> > 
> > As I mentioned previously, the initialization process needs to go through
> > all of the widening tables in order to initialize all FP types, so we can't
> > just arbitrarily eliminate IFmode from the widening table.
> 
> Initialization that's meant to cover all floating-point modes logically 
> should not rely on everything being reachable by the "wider" relation.
> 
> That is, I'd expect it to do something equivalent to 
> FOR_EACH_MODE_IN_CLASS (mode, MODE_FLOAT) rather than something based on 
> "wider".

The unfortunate thing is that all iterators are wrapped around
the "wider" relationship - we do not have any other way to find
related modes.  So the concept that all modes in a class can be
ordered by their width is baked in very deeply.  That's IMHO
something we eventually need to fix to avoid this kind of
target-hook "hack".

Richard.

>  But if the initialization relies on some form of partial 
> ordering (initializing some modes relies on other modes having been 
> initialized first, in a way somehow related to the "wider" relation), it 
> may be more complicated than that.


[patch] Enhance GIMPLE store-merging pass for bit-fields (2)

2018-06-04 Thread Eric Botcazou
Hi,

the previous patch makes it possible to merge bit-field stores whose RHS is a 
constant or an SSA name, but there is a hitch: if the SSA name is the result of 
an "interesting" load, then the optimization is blocked.  That's because the 
GIMPLE store-merging pass attempts to merge not only stores but also loads, if 
they directly feed subsequent stores.  Therefore the code generated for:

struct S {
  unsigned int flag : 1;
  unsigned int size : 31;
};

void foo (struct S *s, struct S *m)
{
  s->flag = 1;
  s->size = m->size;
}

is still abysmal at -O2:

orb     $1, (%rdi)
movl    (%rsi), %eax
andl    $-2, %eax
movl    %eax, %edx
movl    (%rdi), %eax
andl    $1, %eax
orl     %edx, %eax
movl    %eax, (%rdi)
ret

The attached patch changes it into the optimal:

movl    (%rsi), %eax
orl     $1, %eax
movl    %eax, (%rdi)
ret
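
Why a single OR is enough can be checked at the source level. The sketch below is only an illustration and assumes that struct S occupies exactly one 32-bit word; the helper names are hypothetical. It compares the field-by-field stores with one whole-word store of the loaded word OR'ed with the flag bit. The flag mask is derived at run time, so the comparison does not depend on where the compiler places the bit-field.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct S {
  unsigned int flag : 1;
  unsigned int size : 31;
};

// Assumption of this sketch: both bit-fields share one 32-bit word.
static_assert(sizeof(S) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit struct");

// View the struct as its underlying 32-bit word.
std::uint32_t to_word(const S &s) {
  std::uint32_t w;
  std::memcpy(&w, &s, sizeof w);
  return w;
}

// The stores as written in the example: two separate bit-field stores.
void store_fieldwise(S *s, const S *m) {
  s->flag = 1;
  s->size = m->size;
}

// The merged form: load the source word, OR in the flag bit, store once.
void store_merged(S *s, const S *m) {
  assert(s != nullptr && m != nullptr);
  const S only_flag{1u, 0u};  // word with just the flag bit set
  const std::uint32_t w = to_word(*m) | to_word(only_flag);
  std::memcpy(s, &w, sizeof w);
}
```

Both functions leave the destination word in the same state: the size bits come from *m, and the flag bit is set whatever its value in *m was, which matches the load/or/store sequence shown above.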

The patch doesn't modify the overall logic of the pass but just turns MEM_REF 
stores into BIT_INSERT_EXPR stores when there is a preceding or subsequent 
BIT_INSERT_EXPR or INTEGER_CST store in the same bit-field region.
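
The compatibility rule described above can be restated as a standalone predicate. This is a reduced C++ sketch, not the pass itself: `RhsCode` and `StoreInfo` are hypothetical stand-ins for the tree codes and for store_immediate_info, and `first` plays the role of stores[0] in the group.

```cpp
#include <cassert>

// Hypothetical stand-ins for the tree codes named in the patch.
enum class RhsCode { IntegerCst, BitInsertExpr, MemRef, LrotateExpr };

// Hypothetical, reduced version of store_immediate_info.
struct StoreInfo {
  RhsCode rhs_code;
  unsigned long bitregion_start;
  unsigned long bitregion_end;
};

bool is_cst_or_insert(RhsCode c) {
  return c == RhsCode::IntegerCst || c == RhsCode::BitInsertExpr;
}

// Can INFO join a group whose first store is FIRST?
bool can_be_merged_into(const StoreInfo &first, const StoreInfo &info) {
  // Sketch-level sanity check on the incoming bit region.
  assert(info.bitregion_start <= info.bitregion_end);

  // bswap patterns are never merged.
  if (info.rhs_code == RhsCode::LrotateExpr)
    return false;

  // The canonical case: same kind of store.
  if (info.rhs_code == first.rhs_code)
    return true;

  // Constant and bit-insertion stores are compatible with each other.
  if (is_cst_or_insert(info.rhs_code) && is_cst_or_insert(first.rhs_code))
    return true;

  // A load-fed store mixes with a constant or bit-insertion store only
  // within the same bit-field region.
  const bool same_region = info.bitregion_start == first.bitregion_start &&
                           info.bitregion_end == first.bitregion_end;
  if (info.rhs_code == RhsCode::MemRef && is_cst_or_insert(first.rhs_code))
    return same_region;
  if (first.rhs_code == RhsCode::MemRef && is_cst_or_insert(info.rhs_code))
    return same_region;

  return false;
}
```

For the struct S example, the constant store to flag and the load-fed store to size share one bit region, so the predicate accepts the pair in either order; a load-fed store in a different region is rejected.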

Tested on x86-64/Linux, OK for the mainline?


2018-06-04  Eric Botcazou  

* gimple-ssa-store-merging.c (struct merged_store_group): Move up
bit_insertion field and declare can_be_merged_into method.
(merged_store_group::can_be_merged_into): New method.
(imm_store_chain_info::coalesce_immediate): Call it to decide whether
consecutive non-overlapping stores can be merged.  Turn MEM_REF stores
into BIT_INSERT_EXPR stores if the group contains a non-MEM_REF store.


2018-06-04  Eric Botcazou  

* gcc.dg/store_merging_21.c: New test.
* gnat.dg/opt71b.adb: Likewise.

-- 
Eric Botcazou

Index: gimple-ssa-store-merging.c
===
--- gimple-ssa-store-merging.c	(revision 261128)
+++ gimple-ssa-store-merging.c	(working copy)
@@ -1426,6 +1426,7 @@ struct merged_store_group
   unsigned int load_align[2];
   unsigned int first_order;
   unsigned int last_order;
+  bool bit_insertion;
 
   auto_vec<store_immediate_info *> stores;
   /* We record the first and last original statements in the sequence because
@@ -1435,10 +1436,10 @@ struct merged_store_group
   gimple *first_stmt;
   unsigned char *val;
   unsigned char *mask;
-  bool bit_insertion;
 
   merged_store_group (store_immediate_info *);
   ~merged_store_group ();
+  bool can_be_merged_into (store_immediate_info *);
   void merge_into (store_immediate_info *);
   void merge_overlapping (store_immediate_info *);
   bool apply_stores ();
@@ -1851,8 +1852,47 @@ merged_store_group::~merged_store_group
 XDELETEVEC (val);
 }
 
+/* Return true if the store described by INFO can be merged into the group.  */
+
+bool
+merged_store_group::can_be_merged_into (store_immediate_info *info)
+{
+  /* Do not merge bswap patterns.  */
+  if (info->rhs_code == LROTATE_EXPR)
+return false;
+
+  /* The canonical case.  */
+  if (info->rhs_code == stores[0]->rhs_code)
+return true;
+
+  /* BIT_INSERT_EXPR is compatible with INTEGER_CST.  */
+  if (info->rhs_code == BIT_INSERT_EXPR && stores[0]->rhs_code == INTEGER_CST)
+return true;
+
+  if (stores[0]->rhs_code == BIT_INSERT_EXPR && info->rhs_code == INTEGER_CST)
+return true;
+
+  /* We can turn MEM_REF into BIT_INSERT_EXPR for bit-field stores.  */
+  if (info->rhs_code == MEM_REF
+  && (stores[0]->rhs_code == INTEGER_CST
+	  || stores[0]->rhs_code == BIT_INSERT_EXPR)
+  && info->bitregion_start == stores[0]->bitregion_start
+  && info->bitregion_end == stores[0]->bitregion_end)
+return true;
+
+  if (stores[0]->rhs_code == MEM_REF
+  && (info->rhs_code == INTEGER_CST
+	  || info->rhs_code == BIT_INSERT_EXPR)
+  && info->bitregion_start == stores[0]->bitregion_start
+  && info->bitregion_end == stores[0]->bitregion_end)
+return true;
+
+  return false;
+}
+
 /* Helper method for merge_into and merge_overlapping to do
the common part.  */
+
 void
 merged_store_group::do_merge (store_immediate_info *info)
 {
@@ -2673,12 +2713,7 @@ imm_store_chain_info::coalesce_immediate
 	 Merge it into the current store group.  There can be gaps in between
 	 the stores, but there can't be gaps in between bitregions.  */
   else if (info->bitregion_start <= merged_store->bitregion_end
-	   && info->rhs_code != LROTATE_EXPR
-	   && (info->rhs_code == merged_store->stores[0]->rhs_code
-		   || (info->rhs_code == INTEGER_CST
-		   && merged_store->stores[0]->rhs_code == BIT_INSERT_EXPR)
-		   || (info->rhs_code == BIT_INSERT_EXPR
-		   && merged_store->stores[0]->rhs_code == INTEGER_CST)))
+	   && merged_store->can_be_merged_into (info))
 	{
 	  store_immediate_info *infof = merged_store->stores[0];
 
@@ -2696,21 +2731,41 @@ imm_store_chain_info::coalesce_immediate
 	  std::swap (info->ops[0], info->ops[1]);