[PATCHv2][PR 57371] Remove useless floating point casts in comparisons

2017-07-06 Thread Yuri Gribov
Hi all,

This is an updated version of the patch in
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00034.html . It should
be much more complete, both in functionality and in tests.

Bootstrapped and regtested on x64. Ok for trunk?

-Y


pr57371-2.patch
Description: Binary data


[PATCH] prevent -Wall from resetting -Wstringop-overflow=2 to 1 (pr 81345)

2017-07-06 Thread Martin Sebor

The -Wstringop-overflow option defaults to 2 (for Object Size
Checking type 1).  But when -Wall is used it resets the default
value to 1.  This happens because when I added the option to
c.opt I assumed it would default to, well, the default value
set by the Init() directive regardless of whether or not -Wall
was used.  The attached patch explicitly specifies the defaults
to correct this.

Btw., I think this behavior is too surprising to be correct or
(I hope) even intended for options with arguments.  -Wstringop-
overflow is specified like this:

  C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++, Wall) IntegerRange(0, 4)


with the LangEnabledBy form used above documented like this:

  LangEnabledBy(language, opt)

When compiling for the given language, the option is set to
the value of -opt,

IMO, it makes little sense for an option that takes an argument
and that specifies a binary option like -Wall in LangEnabledBy
to default to the binary value of the latter option.  I think
it would be more intuitive and convenient for it to default to
the value set by its Init directive for the positive form of
the binary option and to zero for the negative form (or to empty
for strings, if that's ever done).
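
To make the difference concrete, here is a small C model of the two
behaviours.  It is only a sketch: the enum and the function names are
made up for illustration and are not GCC internals, and an explicit
-Wstringop-overflow=N on the command line (which overrides anything
implied by -Wall) is deliberately left out.

```c
/* Hypothetical model of how -Wall influences an option that takes an
   integer argument.  None of these names exist in GCC.  */
enum wall_state { WALL_ABSENT, WALL_ON, WALL_OFF };

/* Two-argument LangEnabledBy(language, opt): the option simply takes
   the binary value of -Wall, which is the surprising behaviour.  */
int level_two_arg_form (enum wall_state wall, int init)
{
  if (wall == WALL_ON)
    return 1;
  if (wall == WALL_OFF)
    return 0;
  return init;   /* -Wall not given: Init() value applies.  */
}

/* Four-argument LangEnabledBy(language, opt, posarg, negarg): posarg
   is used for -Wall and negarg for -Wno-all.  */
int level_four_arg_form (enum wall_state wall, int init,
                         int posarg, int negarg)
{
  if (wall == WALL_ON)
    return posarg;
  if (wall == WALL_OFF)
    return negarg;
  return init;
}
```

With Init(2), the two-argument model yields 1 under -Wall, while the
four-argument model with (2, 0) keeps the level at 2 and drops it to 0
under -Wno-all.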

Martin
PR other/81345 -  -Wall resets -Wstringop-overflow to 1 from the default 2

gcc/c-family/ChangeLog:

	PR other/81345
	* c.opt (-Wstringop-overflow): Set defaults in LangEnabledBy.

gcc/testsuite/ChangeLog:

	PR other/81345
	* gcc.dg/pr81345.c: New test.


diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 05766c4..e0ad3ab 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -732,7 +732,7 @@ Warn about buffer overflow in string manipulation functions like memcpy
 and strcpy.
 
 Wstringop-overflow=
-C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++, Wall) IntegerRange(0, 4)
+C ObjC C++ ObjC++ Joined RejectNegative UInteger Var(warn_stringop_overflow) Init(2) Warning LangEnabledBy(C ObjC C++ ObjC++, Wall, 2, 0) IntegerRange(0, 4)
 Under the control of Object Size type, warn about buffer overflow in string
 manipulation functions like memcpy and strcpy.
 
diff --git a/gcc/testsuite/gcc.dg/pr81345.c b/gcc/testsuite/gcc.dg/pr81345.c
new file mode 100644
index 000..c2cbad7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr81345.c
@@ -0,0 +1,17 @@
+/* PR other/81345 - -Wall resets -Wstringop-overflow to 1 from the default 2
+   { dg-do compile }
+   { dg-options "-O2 -Wall" } */
+
+char a[3];
+
+void f (const char *s)
+{
+  __builtin_strncpy (a, s, sizeof a + 1);   /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+}
+
+struct S { char a[3]; int i; };
+
+void g (struct S *d, const char *s)
+{
+  __builtin_strncpy (d->a, s, sizeof d->a + 1);   /* { dg-warning "\\\[-Wstringop-overflow=]" } */
+}


Re: [PATCH] PR target/81313: Use DRAP only if there are outgoing arguments on stack

2017-07-06 Thread H.J. Lu
On Thu, Jul 6, 2017 at 12:08 PM, H.J. Lu  wrote:
> Since DRAP is needed only if there are outgoing arguments on stack, we
> should track outgoing arguments on stack and avoid setting need_drap to
> true when there are no outgoing arguments on stack.
>
> Tested on i686 and x86-64 with SSE2, AVX and AVX2.  There is no
> regression.  OK for trunk?
>
> H.J.
> ---
> gcc/
>
> PR target/81313
> * config/i386/i386.c (ix86_function_arg_advance): Set
> outgoing_args_on_stack to true if there are outgoing arguments
> on stack.
> (ix86_function_arg): Likewise.
> (ix86_get_drap_rtx): Use DRAP only if there are outgoing
> arguments on stack and ACCUMULATE_OUTGOING_ARGS is false.
> * config/i386/i386.h (machine_function): Add
> outgoing_args_on_stack.
>
> @@ -10473,6 +10479,10 @@ ix86_function_arg (cumulative_args_t cum_v, 
> machine_mode omode,
>else
>  arg = function_arg_32 (cum, mode, omode, type, bytes, words);
>
> +  /* Track if there are outgoing arguments on stack.  */
> +  if (arg == NULL_RTX)
> +cfun->machine->outgoing_args_on_stack = true;

This should be

+  /* Track if there are outgoing arguments on stack.  */
+  if (arg == NULL_RTX && cum->caller)
+cfun->machine->outgoing_args_on_stack = true;

to check outgoing arguments for caller here.

>return arg;
>  }
>
>

I am testing an updated patch with a new testcase.
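
The effect of the change to ix86_get_drap_rtx can be summarised with a
small stand-alone model.  This is a sketch only: struct fn_state and
both functions are hypothetical and merely mirror the two conditions
from the diff.  In GCC the inputs come from ix86_force_drap,
ACCUMULATE_OUTGOING_ARGS and cfun->machine->outgoing_args_on_stack.

```c
#include <stdbool.h>

/* Hypothetical per-function state; not a GCC data structure.  */
struct fn_state
{
  bool force_drap;                /* models ix86_force_drap */
  bool accumulate_outgoing_args;  /* models ACCUMULATE_OUTGOING_ARGS */
  bool outgoing_args_on_stack;    /* models the new machine_function bit */
};

/* Condition before the patch: DRAP whenever outgoing arguments are
   not accumulated, even if the function passes nothing on the stack.  */
bool need_drap_before (const struct fn_state *s)
{
  return s->force_drap || !s->accumulate_outgoing_args;
}

/* Condition after the patch: additionally require that the function
   really places outgoing arguments on the stack.  */
bool need_drap_after (const struct fn_state *s)
{
  return s->force_drap
         || (s->outgoing_args_on_stack && !s->accumulate_outgoing_args);
}
```

For a function like bar () calling foo (void) with
-mno-accumulate-outgoing-args, the old condition is true and the new
one is false, which is the case the new tests are meant to cover.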


-- 
H.J.
From 4d02c433206790e0ae7de4e91c0f412f9bdac7e8 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Thu, 6 Jul 2017 08:58:46 -0700
Subject: [PATCH] x86: Use DRAP only if there are outgoing arguments on stack

Since DRAP is needed only if there are outgoing arguments on stack, we
should track outgoing arguments on stack and avoid setting need_drap to
true when there are no outgoing arguments on stack.

gcc/

	PR target/81313
	* config/i386/i386.c (ix86_function_arg_advance): Set
	outgoing_args_on_stack to true if there are outgoing arguments
	on stack.
	(ix86_function_arg): Likewise.
	(ix86_get_drap_rtx): Use DRAP only if there are outgoing
	arguments on stack and ACCUMULATE_OUTGOING_ARGS is false.
	* config/i386/i386.h (machine_function): Add
	outgoing_args_on_stack.

gcc/testsuite/

	PR target/81313
	* gcc.target/i386/pr81313-1.c: New test.
	* gcc.target/i386/pr81313-2.c: Likewise.
	* gcc.target/i386/pr81313-3.c: Likewise.
	* gcc.target/i386/pr81313-4.c: Likewise.
	* gcc.target/i386/pr81313-5.c: Likewise.
---
 gcc/config/i386/i386.c| 18 --
 gcc/config/i386/i386.h|  3 +++
 gcc/testsuite/gcc.target/i386/pr81313-1.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-2.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-3.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-4.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-5.c | 12 
 7 files changed, 79 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-5.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1a8a3a3..b041524 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10143,7 +10143,13 @@ ix86_function_arg_advance (cumulative_args_t cum_v, machine_mode mode,
   /* For pointers passed in memory we expect bounds passed in Bounds
  Table.  */
   if (!nregs)
-cum->bnds_in_bt = chkp_type_bounds_count (type);
+{
+  /* Track if there are outgoing arguments on stack.  */
+  if (cum->caller)
+	cfun->machine->outgoing_args_on_stack = true;
+
+  cum->bnds_in_bt = chkp_type_bounds_count (type);
+}
 }
 
 /* Define where to put the arguments to a function.
@@ -10473,6 +10479,10 @@ ix86_function_arg (cumulative_args_t cum_v, machine_mode omode,
   else
 arg = function_arg_32 (cum, mode, omode, type, bytes, words);
 
+  /* Track if there are outgoing arguments on stack.  */
+  if (arg == NULL_RTX && cum->caller)
+cfun->machine->outgoing_args_on_stack = true;
+
   return arg;
 }
 
@@ -13646,7 +13656,11 @@ ix86_update_stack_boundary (void)
 static rtx
 ix86_get_drap_rtx (void)
 {
-  if (ix86_force_drap || !ACCUMULATE_OUTGOING_ARGS)
+  /* We must use DRAP if there are outgoing arguments on stack and
+ ACCUMULATE_OUTGOING_ARGS is false.  */
+  if (ix86_force_drap
+  || (cfun->machine->outgoing_args_on_stack
+	  && !ACCUMULATE_OUTGOING_ARGS))
 crtl->need_drap = true;
 
   if (stack_realign_drap)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 08243c1..a2ae9b4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2657,6 +2657,9 @@ struct GTY(()) machine_function {
  frame pointer.) */
   unsigned int call_ms2sysv_extra_regs:3;
 
+  /* Nonzero if the function places outgoing argumen

[PATCH, rs6000] Modify libgcc's float128 IFUNC resolver functions to use __builtin_cpu_supports()

2017-07-06 Thread Peter Bergner
Usage of getauxval() within the float128 libgcc IFUNC resolver functions is
causing problems:

https://sourceware.org/bugzilla/show_bug.cgi?id=21707

Alan describes why we can't have relocations in IFUNC resolver functions here:

https://gcc.gnu.org/PR81193

With the addition of __builtin_cpu_supports (), we no longer need to call
getauxval() to query the HWCAP/HWCAP2 masks, so let's use that instead.

I have verified with some small test cases, that we do call the correct
__{add,sub,...}kf3_hw() functions instead of the *_sw versions.  I did that
by running the test cases in GDB and manually setting the IEEE128 bit in
the HWCAP2 mask stored in the TCB before the resolvers were run.

This bootstraps and regtests with no regressions, ok for trunk?

I will note that this patch causes issues in some tests in the GLIBC testsuite,
which Tulio is working on fixing (it's a GLIBC issue, not a GCC issue), so if
this patch is "ok", I plan on holding off on committing this, until the GLIBC
fix is committed.
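
The shape of the change can be sketched in plain C.  In this sketch
fake_ieee128_hw is a hypothetical stand-in for
__builtin_cpu_supports ("ieee128"), which is only meaningful when
compiling for PowerPC, and impl_sw/impl_hw are placeholders rather
than the real libgcc routines.  A real IFUNC resolver has further
constraints (such as avoiding relocations) that this does not model.

```c
/* Hypothetical stand-in for __builtin_cpu_supports ("ieee128").  */
int fake_ieee128_hw = 0;

typedef int (*impl_fn) (void);

/* Placeholder implementations; each returns a tag so that a caller
   can tell which one the resolver picked.  */
int impl_sw (void) { return 0; }
int impl_hw (void) { return 1; }

/* Same shape as the SW_OR_HW macro in the patch, with the builtin
   replaced by the stand-in flag.  */
#define SW_OR_HW(SW, HW) (fake_ieee128_hw ? HW : SW)

/* Models a resolver: returns the implementation to bind to.  */
impl_fn resolve_impl (void)
{
  return SW_OR_HW (impl_sw, impl_hw);
}
```

Flipping the flag before calling resolve_impl () switches the result
from impl_sw to impl_hw, loosely analogous to setting the IEEE128 bit
in HWCAP2 before the resolvers run.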

Peter

* config/rs6000/float128-ifunc.c: Don't include auxv.h.
(have_ieee_hw_p): Delete function.
(SW_OR_HW): Use __builtin_cpu_supports().

Index: libgcc/config/rs6000/float128-ifunc.c
===
--- libgcc/config/rs6000/float128-ifunc.c   (revision 249850)
+++ libgcc/config/rs6000/float128-ifunc.c   (working copy)
@@ -45,47 +45,7 @@
 #error "This module must not be compiled with IEEE 128-bit hardware support"
 #endif
 
-#include <sys/auxv.h>
-
-/* Use the namespace clean version of getauxval.  However, not all versions of
-   sys/auxv.h declare it, so declare it here.  This code is intended to be
-   temporary until a suitable version of __builtin_cpu_supports is added that
-   allows us to tell quickly if the machine supports IEEE 128-bit hardware.  */
-extern unsigned long __getauxval (unsigned long);
-
-static int
-have_ieee_hw_p (void)
-{
-  static int ieee_hw_p = -1;
-
-  if (ieee_hw_p < 0)
-{
-  char *p = (char *) __getauxval (AT_PLATFORM);
-
-  ieee_hw_p = 0;
-
-  /* Don't use atoi/strtol/strncmp/etc.  These may require the normal
-environment to be setup to set errno to 0, and the ifunc resolvers run
-before the whole glibc environment is initialized.  */
-  if (p && p[0] == 'p' && p[1] == 'o' && p[2] == 'w' && p[3] == 'e'
- && p[4] == 'r')
-   {
- long n = 0;
- char ch;
-
- p += 5;
- while ((ch = *p++) >= '0' && (ch <= '9'))
-   n = (n * 10) + (ch - '0');
-
- if (n >= 9)
-   ieee_hw_p = 1;
-   }
-}
-
-  return ieee_hw_p;
-}
-
-#define SW_OR_HW(SW, HW) (have_ieee_hw_p () ? HW : SW)
+#define SW_OR_HW(SW, HW) (__builtin_cpu_supports ("ieee128") ? HW : SW)
 
 /* Resolvers.  */
 


Re: Ping [Patch, fortran] PR70071

2017-07-06 Thread Janus Weil
Applied to trunk as r250039. Thanks for the patch!

Cheers,
Janus

2017-07-05 22:03 GMT+02:00 Janus Weil :
> Hi Harald,
>
> thanks for the reminder. I can take care of committing the patch for
> you. Just give me a day or two ...
>
> Cheers,
> Janus
>
>
>
> 2017-07-05 20:44 GMT+02:00 Harald Anlauf :
>> The patch below has not been applied to the best of my knowledge.
>>
>> Just a reminder for whoever cares.
>>
>> Harald
>>
>> On 05/04/17 20:19, Harald Anlauf wrote:
>>> On 05/04/17 18:15, Steve Kargl wrote:
>>>> On Thu, May 04, 2017 at 05:26:17PM +0200, Harald Anlauf wrote:
>>>>> While trying to clean up my working copy, I found that the trivial
>>>>> patch for the ICE-on-invalid as described in the PR regtests cleanly
>>>>> for 7-release on i686-pc-linux-gnu.
>>>>>
>>>>> Here's the cleaned-up version (diffs attached).
>>>>>
>>>>> 2017-05-04  Harald Anlauf
>>>>>
>>>>> PR fortran/70071
>>>>> * array.c (gfc_ref_dimen_size): Handle bad subscript triplets.
>>>>>
>>>>> 2017-05-04  Harald Anlauf
>>>>>
>>>>> PR fortran/70071
>>>>> * gfortran.dg/coarray_44.f90: New testcase.
>>>>>
>>>>
>>>> Harald,
>>>>
>>>> The patch looks reasonable.  Do you have a commit privilege?
>>>>
>>>
>>> Steve,
>>>
>>> no, I don't.
>>>
>>> Would you like to take care of the patch?  Then please do so.
>>>
>>> Thanks,
>>> Harald
>>>
>>


[PATCH] PR target/81313: Use DRAP only if there are outgoing arguments on stack

2017-07-06 Thread H.J. Lu
Since DRAP is needed only if there are outgoing arguments on stack, we
should track outgoing arguments on stack and avoid setting need_drap to
true when there are no outgoing arguments on stack.

Tested on i686 and x86-64 with SSE2, AVX and AVX2.  There is no
regression.  OK for trunk?

H.J.
---
gcc/

PR target/81313
* config/i386/i386.c (ix86_function_arg_advance): Set
outgoing_args_on_stack to true if there are outgoing arguments
on stack.
(ix86_function_arg): Likewise.
(ix86_get_drap_rtx): Use DRAP only if there are outgoing
arguments on stack and ACCUMULATE_OUTGOING_ARGS is false.
* config/i386/i386.h (machine_function): Add
outgoing_args_on_stack.

gcc/testsuite/

PR target/81313
* gcc.target/i386/pr81313-1.c: New test.
* gcc.target/i386/pr81313-2.c: Likewise.
* gcc.target/i386/pr81313-3.c: Likewise.
* gcc.target/i386/pr81313-4.c: Likewise.
---
 gcc/config/i386/i386.c| 18 --
 gcc/config/i386/i386.h|  3 +++
 gcc/testsuite/gcc.target/i386/pr81313-1.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-2.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-3.c | 12 
 gcc/testsuite/gcc.target/i386/pr81313-4.c | 12 
 6 files changed, 67 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81313-4.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1a8a3a3..9b64d50 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -10143,7 +10143,13 @@ ix86_function_arg_advance (cumulative_args_t cum_v, 
machine_mode mode,
   /* For pointers passed in memory we expect bounds passed in Bounds
  Table.  */
   if (!nregs)
-cum->bnds_in_bt = chkp_type_bounds_count (type);
+{
+  /* Track if there are outgoing arguments on stack.  */
+  if (cum->caller)
+   cfun->machine->outgoing_args_on_stack = true;
+
+  cum->bnds_in_bt = chkp_type_bounds_count (type);
+}
 }
 
 /* Define where to put the arguments to a function.
@@ -10473,6 +10479,10 @@ ix86_function_arg (cumulative_args_t cum_v, 
machine_mode omode,
   else
 arg = function_arg_32 (cum, mode, omode, type, bytes, words);
 
+  /* Track if there are outgoing arguments on stack.  */
+  if (arg == NULL_RTX)
+cfun->machine->outgoing_args_on_stack = true;
+
   return arg;
 }
 
@@ -13646,7 +13656,11 @@ ix86_update_stack_boundary (void)
 static rtx
 ix86_get_drap_rtx (void)
 {
-  if (ix86_force_drap || !ACCUMULATE_OUTGOING_ARGS)
+  /* We must use DRAP if there are outgoing arguments on stack and
+ ACCUMULATE_OUTGOING_ARGS is false.  */
+  if (ix86_force_drap
+  || (cfun->machine->outgoing_args_on_stack
+ && !ACCUMULATE_OUTGOING_ARGS))
 crtl->need_drap = true;
 
   if (stack_realign_drap)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 08243c1..a2ae9b4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2657,6 +2657,9 @@ struct GTY(()) machine_function {
  frame pointer.) */
   unsigned int call_ms2sysv_extra_regs:3;
 
+  /* Nonzero if the function places outgoing arguments on stack.  */
+  BOOL_BITFIELD outgoing_args_on_stack : 1;
+
   /* During prologue/epilogue generation, the current frame state.
  Otherwise, the frame state at the end of the prologue.  */
   struct machine_frame_state fs;
diff --git a/gcc/testsuite/gcc.target/i386/pr81313-1.c 
b/gcc/testsuite/gcc.target/i386/pr81313-1.c
new file mode 100644
index 000..f765003
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81313-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -mincoming-stack-boundary=4 
-mpreferred-stack-boundary=6" } */
+
+extern void foo (void);
+
+void
+bar (void)
+{
+  foo ();
+}
+
+/* { dg-final { scan-assembler-not "lea\[lq\]?\[\\t 
\]*\[0-9\]*\\(%\[er\]sp\\)" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr81313-2.c 
b/gcc/testsuite/gcc.target/i386/pr81313-2.c
new file mode 100644
index 000..2cdc645
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81313-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mno-accumulate-outgoing-args -mincoming-stack-boundary=4 
-mpreferred-stack-boundary=6 -mno-iamcu" } */
+
+extern void foo (int, int, int);
+
+void
+bar (void)
+{
+  foo (1, 2, 3);
+}
+
+/* { dg-final { scan-assembler "lea\[l\]?\[\\t \]*\[0-9\]*\\(%esp\\)" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr81313-3.c 
b/gcc/testsuite/gcc.target/i386/pr81313-3.c
new file mode 100644
index 000..14bd708
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr81313-3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target ia32 } } */
+/* { dg-options "-O2 -mno-accumulate-

Re: ToT build failure?

2017-07-06 Thread Jakub Jelinek
On Thu, Jul 06, 2017 at 01:45:42PM -0400, David Malcolm wrote:
> Given that the previous status quo of the selftests was to require the
> C frontend, I committed the attached patch (as r250036), under the
> "obvious" rule, retaining the ability to optionally run the selftests
> within the C++ frontend.

You should do something similar to how we handle make check etc.:
CHECK_TARGETS = @check_languages@

check: $(CHECK_TARGETS)

and then each Make-lang.in defining its check- goal.
So similarly to that s-selftest-c++ should be in cp/Make-lang.in
and based on the configured languages should include the s-selftest-
dependencies.

Jakub


Re: [PATCH, VAX] Correct ffs instruction constraint

2017-07-06 Thread Jeff Law
On 07/06/2017 10:59 AM, Felix Deichmann wrote:
> Jeff,
> 
> On 29.06.2017, Jeff Law wrote:
>> Ideally we'd like to have a testcase for this in the regression suite.
>>
>> If you could provide the .i file and options used which generated the
>> incorrect ffs instruction I can use the reduction tools with a cross
>> compiler to produce a nice simple test for the testsuite.
> 
> I put the corresponding .i file at:
> http://www.netbsd.org/~flxd/scsipi_base.i.gz
> 
> See line 7638:
> bit = __builtin_ffs(periph->periph_freetags[word]);
> 
> Command/Options used which generated the incorrect ffs instruction:
> 
> /nb8/obj/tooldir.NetBSD-7.0-amd64/bin/vax--netbsdelf-gcc -fno-pic
> -ffreestanding -fno-zero-initialized-in-bss -Os -fno-strict-aliasing
> -fno-common -std=gnu99 -Werror -Wall -Wno-main -Wno-format-zero-length
> -Wpointer-arith -Wmissing-prototypes -Wstrict-prototypes
> -Wold-style-definition -Wswitch -Wshadow -Wcast-qual -Wwrite-strings
> -Wno-unreachable-code -Wno-pointer-sign -Wno-attributes
> -Wno-sign-compare --sysroot=/nb8/obj/destdir.vax -D_VAX_INLINE_ -I.
> -I/nb8/src/sys/../common/lib/libx86emu -I/nb8/src/sys/../common/include
> -I/nb8/src/sys/arch -I/nb8/src/sys -nostdinc -D_KERNEL -D_KERNEL_OPT
> -std=gnu99 -I/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad
> -I/nb8/src/sys/lib/libkern/../../../common/lib/libc/string
> -I/nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string -c
> /nb8/src/sys/dev/scsipi/scsipi_base.c -o scsipi_base.o
Hmm, unfortunately I consistently get a call into libgcc for the
__builtin_ffs code rather than an ffs instruction.  That's with
gcc-4.8.3 as well as with a trunk compiler.

Can you include "-v" output from compiling scsipi_base?

Thanks.
jeff


Re: C++ PATCHes to dependent template-id parsing

2017-07-06 Thread Jason Merrill
On Wed, Jun 28, 2017 at 3:38 PM, Jason Merrill  wrote:
> 81204 is a regression whereby previously we would accidentally get the
> parsing of res.template set right because when we did the lookup in
> the surrounding context, we found the function template and then
> ignored it.  This patch partially reverts the handling of .template to
> how it was in GCC 6.
>
> But this bug is really a special case of 54769; we should be treating
> that name as dependent and not doing a lookup in the enclosing context
> at all.  As I noted in discussion of 55576, we need to pass
> template_keyword_p into nested_name_specifier_opt.  So this patch does
> that, and also adjusts cp_parser_template_name to consider object
> scope.

I'm reverting the 81204 patch for both 7 and 8, as it is wrong under
DR 141 and the 54769 patch is better.

Jason


Re: ToT build failure?

2017-07-06 Thread David Malcolm
On Thu, 2017-07-06 at 13:18 -0400, David Malcolm wrote:
> On Thu, 2017-07-06 at 10:05 -0700, Steve Ellcey wrote:
> > Is anyone else having problems building a cross-gcc where an initial
> > gcc with C only is built first and used to build glibc?  I am
> > trying this (it worked before) and am getting:
> > 
> > /local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/xgcc
> > -B/local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/ -xc++ -nostdinc
> > /dev/null -S -o /dev/null
> > -fself-test=/local/sellcey/gcc-aarch64/src/gcc/gcc/testsuite/selftests
> > xgcc: error: language c++ not recognized
> > xgcc: error: language c++ not recognized
> > Makefile:1972: recipe for target 's-selftest-c++' failed
> > make[1]: *** [s-selftest-c++] Error 1
> > 
> > The configure I use to build the initial GCC is:
> > 
> > /local/sellcey/gcc-aarch64/src/gcc/configure
> > --prefix=/local/sellcey/gcc-aarch64/install --target=aarch64-cross-linux-gnu
> > --with-newlib --without-headers
> > --with-sysroot=/local/sellcey/gcc-aarch64/install --enable-languages=c
> > --enable-threads=no --disable-shared --disable-decimal-float
> > --disable-libsanitizer --disable-bootstrap
> > 
> > This is an x86 to aarch64 cross compiler.
> > 
> > Steve Ellcey
> > sell...@cavium.com
> 
> This is due to r250030, in which I added C++-specific selftests; looks
> like I need to also conditionalize them on --enable-languages.
> 
> Sorry about this.
> 
> A workaround is presumably to:
>   touch s-selftest-c++
> 
> I'll revert that change shortly.

Given that the previous status quo of the selftests was to require the
C frontend, I committed the attached patch (as r250036), under the
"obvious" rule, retaining the ability to optionally run the selftests
within the C++ frontend.

Sorry about the breakage
Dave

Index: gcc/ChangeLog
===
--- gcc/ChangeLog	(revision 250035)
+++ gcc/ChangeLog	(revision 250036)
@@ -1,3 +1,7 @@
+2017-07-06  David Malcolm  
+
+	* Makefile.in (selftest): Remove dependency on s-selftest-c++.
+
 2017-07-06  Jan Hubicka  
 
 	* lto-wrapper.c (merge_and_complain): Do not merge
Index: gcc/Makefile.in
===
--- gcc/Makefile.in	(revision 250035)
+++ gcc/Makefile.in	(revision 250036)
@@ -1920,8 +1920,10 @@
 # Use "s-selftest-FE" to ensure that we only run the selftests if the
 # driver, frontend, or selftest data change.
 .PHONY: selftest
-selftest: s-selftest-c s-selftest-c++
 
+# By default, only run the selftests within the C frontend
+selftest: s-selftest-c
+
 # C selftests
 s-selftest-c: $(C_SELFTEST_DEPS)
 	$(GCC_FOR_TARGET) $(C_SELFTEST_FLAGS)


Re: [PATCH] Add AddressSanitizer annotations to std::vector

2017-07-06 Thread Ivan Baravy
On 07/05/2017 10:00 PM, Jonathan Wakely wrote:
> This patch adds AddressSanitizer annotations to std::vector, so that
> ASan can detect out-of-bounds accesses to the unused capacity of a
> vector. e.g.
> 
>   std::vector<int> v(2);
>   int* p = v.data();
>   v.pop_back();
>   return p[1];  // ERROR
> 
> This cannot be detected by Debug Mode, but with these annotations ASan
> knows that only v.data()[0] is valid and will give an error.
> 
> The annotations are only enabled for vector<T, std::allocator<T>> and
> only when std::allocator's base class is either malloc_allocator or
> new_allocator. For other allocators the memory might not come from the
> freestore and so isn't tracked by ASan.
> 
> Something similar has been on the google branches for some time:
> https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=207517
> This patch is a complete rewrite from scratch, because the google code
> was not exception safe. If an exception happened while appending
> elements to a vector, so that the size didn't change, the google code
> did not undo the annotation for the increased size. It also didn't
> annotate before deallocating, to mark the unused capacity as valid
> again.
> 
> We can probably do similar annotations for std::deque, so that
> partially filled pages are annotated. I also have a patch for
> shared_ptr so that objects created by make_shared can be marked as
> invalid after they're destroyed.

Could you share your plans on sanitization of other standard containers?
My particular interest is in std::string which I'm working on now.

Also, will you backport the feature to GCC7 and GCC6?

>   * config/allocator/malloc_allocator_base.h [__SANITIZE_ADDRESS__]
>   (_GLIBCXX_SANITIZE_STD_ALLOCATOR): Define.
>   * config/allocator/new_allocator_base.h [__SANITIZE_ADDRESS__]
>   (_GLIBCXX_SANITIZE_STD_ALLOCATOR): Define.
>   * include/bits/stl_vector.h [_GLIBCXX_SANITIZE_STD_ALLOCATOR]
>   (_Vector_impl::_Asan, _Vector_impl::_Asan::_Reinit)
>   (_Vector_impl::_Asan::_Grow, _GLIBCXX_ASAN_ANNOTATE_REINIT)
>   (_GLIBCXX_ASAN_ANNOTATE_GROW, _GLIBCXX_ASAN_ANNOTATE_GREW)
>   (_GLIBCXX_ASAN_ANNOTATE_SHRINK, _GLIBCXX_ASAN_ANNOTATE_BEFORE_DEALLOC):
>   Define annotation helper types and macros.
>   (vector::~vector, vector::push_back, vector::pop_back)
>   (vector::_M_erase_at_end): Add annotations.
>   * include/bits/vector.tcc (vector::reserve, vector::emplace_back)
>   (vector::insert, vector::_M_erase, vector::operator=)
>   (vector::_M_fill_assign, vector::_M_assign_aux)
>   (vector::_M_insert_rval, vector::_M_emplace_aux)
>   (vector::_M_insert_aux, vector::_M_realloc_insert)
>   (vector::_M_fill_insert, vector::_M_default_append)
>   (vector::_M_shrink_to_fit, vector::_M_range_insert): Annotate.
> 
> Tested x86_64-linux (using -fsanitize=address, with some local patches
> to the testsuite) and powerpc64le-linux.
> 
> I plan to commit this to trunk tomorrow.
> 


Re: ToT build failure?

2017-07-06 Thread David Malcolm
On Thu, 2017-07-06 at 10:05 -0700, Steve Ellcey wrote:
> Is anyone else having problems building a cross-gcc where an initial
> gcc with C only is built first and used to build glibc?  I am
> trying this (it worked before) and am getting:
> 
> /local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/xgcc
> -B/local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/ -xc++ -nostdinc
> /dev/null -S -o /dev/null
> -fself-test=/local/sellcey/gcc-aarch64/src/gcc/gcc/testsuite/selftests
> xgcc: error: language c++ not recognized
> xgcc: error: language c++ not recognized
> Makefile:1972: recipe for target 's-selftest-c++' failed
> make[1]: *** [s-selftest-c++] Error 1
> 
> The configure I use to build the initial GCC is:
> 
> /local/sellcey/gcc-aarch64/src/gcc/configure
> --prefix=/local/sellcey/gcc-aarch64/install --target=aarch64-cross-linux-gnu
> --with-newlib --without-headers
> --with-sysroot=/local/sellcey/gcc-aarch64/install --enable-languages=c
> --enable-threads=no --disable-shared --disable-decimal-float
> --disable-libsanitizer --disable-bootstrap
> 
> This is an x86 to aarch64 cross compiler.
> 
> Steve Ellcey
> sell...@cavium.com

This is due to r250030, in which I added C++-specific selftests; looks
like I need to also conditionalize them on --enable-languages.

Sorry about this.

A workaround is presumably to:
  touch s-selftest-c++

I'll revert that change shortly.

Dave


ToT build failure?

2017-07-06 Thread Steve Ellcey
Is anyone else having problems building a cross-gcc where an initial
gcc with C only is built first and used to build glibc?  I am
trying this (it worked before) and am getting:

/local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/xgcc 
-B/local/sellcey/gcc-aarch64/obj/gcc_initial/./gcc/ -xc++ -nostdinc /dev/null 
-S -o /dev/null 
-fself-test=/local/sellcey/gcc-aarch64/src/gcc/gcc/testsuite/selftests
xgcc: error: language c++ not recognized
xgcc: error: language c++ not recognized
Makefile:1972: recipe for target 's-selftest-c++' failed
make[1]: *** [s-selftest-c++] Error 1

The configure I use to build the initial GCC is:

/local/sellcey/gcc-aarch64/src/gcc/configure 
--prefix=/local/sellcey/gcc-aarch64/install --target=aarch64-cross-linux-gnu  
--with-newlib --without-headers 
--with-sysroot=/local/sellcey/gcc-aarch64/install --enable-languages=c 
--enable-threads=no --disable-shared --disable-decimal-float 
--disable-libsanitizer --disable-bootstrap

This is an x86 to aarch64 cross compiler.

Steve Ellcey
sell...@cavium.com


Re: [PATCH, VAX] Correct ffs instruction constraint

2017-07-06 Thread Felix Deichmann
Jeff,

On 29.06.2017, Jeff Law wrote:
> Ideally we'd like to have a testcase for this in the regression suite.
> 
> If you could provide the .i file and options used which generated the
> incorrect ffs instruction I can use the reduction tools with a cross
> compiler to produce a nice simple test for the testsuite.

I put the corresponding .i file at:
http://www.netbsd.org/~flxd/scsipi_base.i.gz

See line 7638:
bit = __builtin_ffs(periph->periph_freetags[word]);
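
For reference, the builtin used in that line returns one plus the index
of the least significant set bit of its argument, or zero when the
argument is zero.  A minimal portable sketch of the same computation
(ref_ffs is a name made up here; it is not taken from the NetBSD source
or from the VAX backend):

```c
/* Reference model of ffs: 1-based position of the least significant
   set bit, or 0 if no bit is set.  */
int ref_ffs (int x)
{
  unsigned int u = (unsigned int) x;
  int pos = 1;

  if (u == 0)
    return 0;              /* No bit set.  */

  while ((u & 1u) == 0)    /* Skip trailing zero bits.  */
    {
      u >>= 1;
      pos++;
    }
  return pos;
}

/* Returns nonzero if the model agrees with the compiler builtin.  */
int ref_ffs_matches_builtin (int x)
{
  return ref_ffs (x) == __builtin_ffs (x);
}
```

The zero-input case is the one to watch in a reduced testcase, since a
hardware find-first-set instruction has to signal "no bit found"
separately from the bit position.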

Command/Options used which generated the incorrect ffs instruction:

/nb8/obj/tooldir.NetBSD-7.0-amd64/bin/vax--netbsdelf-gcc -fno-pic
-ffreestanding -fno-zero-initialized-in-bss -Os -fno-strict-aliasing
-fno-common -std=gnu99 -Werror -Wall -Wno-main -Wno-format-zero-length
-Wpointer-arith -Wmissing-prototypes -Wstrict-prototypes
-Wold-style-definition -Wswitch -Wshadow -Wcast-qual -Wwrite-strings
-Wno-unreachable-code -Wno-pointer-sign -Wno-attributes
-Wno-sign-compare --sysroot=/nb8/obj/destdir.vax -D_VAX_INLINE_ -I.
-I/nb8/src/sys/../common/lib/libx86emu -I/nb8/src/sys/../common/include
-I/nb8/src/sys/arch -I/nb8/src/sys -nostdinc -D_KERNEL -D_KERNEL_OPT
-std=gnu99 -I/nb8/src/sys/lib/libkern/../../../common/lib/libc/quad
-I/nb8/src/sys/lib/libkern/../../../common/lib/libc/string
-I/nb8/src/sys/lib/libkern/../../../common/lib/libc/arch/vax/string -c
/nb8/src/sys/dev/scsipi/scsipi_base.c -o scsipi_base.o

Best regards,
Felix


Unify no_reorder and !flag_toplevel_reorder code

2017-07-06 Thread Jan Hubicka
Hi,
we have flag_toplevel_reorder and also the no_reorder flag on symbols.
They do roughly the same thing, but the implementation is not 100% shared.
This causes problems with LTO, where flag_toplevel_reorder is not very
meaningful (as we don't really have meaningful toplevel args).  This patch
makes !flag_toplevel_reorder simply set the no_reorder flag and commonizes
the rest of the code, except for partition sorting, which I will handle
incrementally.
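
For readers who have not used it, the per-symbol flag corresponds at
the source level to GCC's no_reorder attribute, while
-fno-toplevel-reorder requests the same treatment for a whole
translation unit.  A minimal sketch of the attribute form follows; the
variable and function names are made up, and the emission order itself
is not something the snippet can observe at run time.

```c
/* Symbols marked no_reorder keep their relative order in the output.
   With this patch, -fno-toplevel-reorder effectively marks every
   finalized function and variable this way.  */
__attribute__ ((no_reorder)) int first_var = 1;
__attribute__ ((no_reorder)) int second_var = 2;

__attribute__ ((no_reorder))
int sum_in_order (void)
{
  return first_var + second_var;
}
```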

Bootstrapped/regtested x86_64-linux, committed.

Honza

* cgraphunit.c (cgraph_node::finalize_function): When
!flag_toplevel_reorder set no_reorder flag.
(varpool_node::finalize_decl): Likewise.
(symbol_table::compile): Drop no toplevel reorder path.

* lto-partition.c (lto_balanced_map): Do not check
flag_toplevel_reorder.
Index: lto/lto-partition.c
===
--- lto/lto-partition.c (revision 250021)
+++ lto/lto-partition.c (working copy)
@@ -506,7 +506,7 @@ lto_balanced_map (int n_lto_partitions,
   /* Collect all variables that should not be reordered.  */
   FOR_EACH_VARIABLE (vnode)
 if (vnode->get_partitioning_class () == SYMBOL_PARTITION
-   && (!flag_toplevel_reorder || vnode->no_reorder))
+   && vnode->no_reorder)
   varpool_order.safe_push (vnode);
   n_varpool_nodes = varpool_order.length ();
   varpool_order.qsort (varpool_node_cmp);
@@ -634,7 +634,7 @@ lto_balanced_map (int n_lto_partitions,
vnode = dyn_cast <varpool_node *> (ref->referred);
if (!vnode->definition)
  continue;
-   if (!symbol_partitioned_p (vnode) && flag_toplevel_reorder
+   if (!symbol_partitioned_p (vnode)
&& !vnode->no_reorder
&& vnode->get_partitioning_class () == SYMBOL_PARTITION)
  add_symbol_to_partition (partition, vnode);
@@ -672,7 +672,7 @@ lto_balanced_map (int n_lto_partitions,
   because it allows them to be removed.  Coupling
   with objects they refer to only helps to reduce
   number of symbols promoted to hidden.  */
-   if (!symbol_partitioned_p (vnode) && flag_toplevel_reorder
+   if (!symbol_partitioned_p (vnode)
&& !vnode->no_reorder
&& !vnode->can_remove_if_no_refs_p ()
&& vnode->get_partitioning_class () == SYMBOL_PARTITION)
@@ -767,14 +767,10 @@ lto_balanced_map (int n_lto_partitions,
   next_nodes.truncate (0);
 
   /* Varables that are not reachable from the code go into last partition.  */
-  if (flag_toplevel_reorder)
-{
-  FOR_EACH_VARIABLE (vnode)
-   if (vnode->get_partitioning_class () == SYMBOL_PARTITION
-   && !symbol_partitioned_p (vnode)
-   && !vnode->no_reorder)
- next_nodes.safe_push (vnode);
-}
+  FOR_EACH_VARIABLE (vnode)
+if (vnode->get_partitioning_class () == SYMBOL_PARTITION
+   && !symbol_partitioned_p (vnode))
+  next_nodes.safe_push (vnode);
 
   /* Output remaining ordered symbols.  */
   while (varpool_pos < n_varpool_nodes)
Index: cgraphunit.c
===
--- cgraphunit.c(revision 250021)
+++ cgraphunit.c(working copy)
@@ -449,6 +449,8 @@ cgraph_node::finalize_function (tree dec
   node->definition = true;
   notice_global_symbol (decl);
   node->lowered = DECL_STRUCT_FUNCTION (decl)->cfg != NULL;
+  if (!flag_toplevel_reorder)
+node->no_reorder = true;
 
   /* With -fkeep-inline-functions we are keeping all inline functions except
  for extern inline ones.  */
@@ -471,7 +473,8 @@ cgraph_node::finalize_function (tree dec
  declared inline and nested functions.  These were optimized out
  in the original implementation and it is unclear whether we want
  to change the behavior here.  */
-  if (((!opt_for_fn (decl, optimize) || flag_keep_static_functions)
+  if (((!opt_for_fn (decl, optimize) || flag_keep_static_functions
+   || node->no_reorder)
&& !node->cpp_implicit_alias
&& !DECL_DISREGARD_INLINE_LIMITS (decl)
&& !DECL_DECLARED_INLINE_P (decl)
@@ -840,13 +843,14 @@ varpool_node::finalize_decl (tree decl)
  it is available to notice_global_symbol.  */
   node->definition = true;
   notice_global_symbol (decl);
+  if (!flag_toplevel_reorder)
+node->no_reorder = true;
   if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl)
   /* Traditionally we do not eliminate static variables when not
 optimizing and when not doing toplevel reoder.  */
   || node->no_reorder
-  || ((!flag_toplevel_reorder
-  && !DECL_COMDAT (node->decl)
-  && !DECL_ARTIFICIAL (node->decl
+  || (!DECL_COMDAT (node->decl)
+ && !DECL_ARTIFICIAL (node->decl)))
 node->force_output = true;
 
   if (symtab->state == CONSTRUCTION
@@ -857,8 +861,8 @@ varpool_node::finalize_decl (tree decl)
   /* 

bb-reorder tweak

2017-07-06 Thread Jan Hubicka
Hi,
while reading the bb-reorder code I noticed that it may now prefer an edge with
probability 0 over an edge with undefined probability.  This is not quite
intended.
Also, we never want a trace to go across an abnormal/EH edge.
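
A simplified model of the intended comparison, assuming a probability is just an integer plus an "initialized" flag. The `prob` struct, the `MY_EDGE_*` flags and `candidate_is_better_p` are hypothetical; the real better_edge_p has further tie-breaking that is left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical simplified probability: a value in [0, 10000] plus a flag
   saying whether it was ever set.  */
struct prob
{
  bool initialized;
  int value;
};

/* Hypothetical edge flags, modelled on EDGE_ABNORMAL and EDGE_EH.  */
enum { MY_EDGE_ABNORMAL = 1, MY_EDGE_EH = 2 };

/* Return true if a candidate edge with probability CAND and flags FLAGS
   should replace the current best edge with probability BEST.  DIFF is
   the margin by which CAND must win.  */
static bool
candidate_is_better_p (struct prob cand, int flags, struct prob best, int diff)
{
  assert (diff >= 0);

  /* Never continue a trace across abnormal or EH edges.  */
  if (flags & (MY_EDGE_ABNORMAL | MY_EDGE_EH))
    return false;

  /* An uninitialized best is beaten only by a nonzero candidate.  */
  if (!best.initialized)
    return cand.initialized && cand.value > 0;

  return cand.initialized && cand.value > best.value + diff;
}
```

The second test is the behavioural change: a zero-probability candidate no longer displaces a best edge whose probability was never set.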

Bootstrapped/regtested x86_64-linux, will commit it shortly.

Honza

* bb-reorder.c (better_edge_p): Do not build traces across abnormal/eh
edges; zero probability is not better than uninitialized.

Index: bb-reorder.c
===
--- bb-reorder.c(revision 250021)
+++ bb-reorder.c(working copy)
@@ -957,7 +957,14 @@ better_edge_p (const_basic_block bb, con
 return !cur_best_edge
   || cur_best_edge->dest->index > e->dest->index;
 
-  if (prob > best_prob + diff_prob || !best_prob.initialized_p ())
+  /* Those edges are so expensive that continuing a trace is not useful
+ performance wise.  */
+  if (e->flags & (EDGE_ABNORMAL | EDGE_EH))
+return false;
+
+  if (prob > best_prob + diff_prob
+  || (!best_prob.initialized_p ()
+ && prob > profile_probability::guessed_never ()))
 /* The edge has higher probability than the temporary best edge.  */
 is_better_edge = true;
   else if (prob < best_prob - diff_prob)


Mark auto-FDO counters to afdo quality

2017-07-06 Thread Jan Hubicka
Hi,
now that we actually have quality tracking, we should use it.
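
As a rough illustration of what "quality tracking" means here, the sketch below models a count that carries a reliability tag alongside its value. All names are hypothetical and the three quality levels are an assumption made for the example; the real profile_count type has its own set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical quality levels, ordered from least to most reliable.  */
enum quality { Q_GUESSED, Q_AFDO, Q_PRECISE };

/* A count together with a note on how trustworthy it is.  */
struct count
{
  int64_t value;
  enum quality quality;
};

/* Build a count from a raw counter, assumed precise until told otherwise.  */
static struct count
count_from_raw (int64_t value)
{
  assert (value >= 0);
  struct count c = { value, Q_PRECISE };
  return c;
}

/* Same value, retagged as coming from sampled auto-FDO data.  */
static struct count
count_as_afdo (struct count c)
{
  c.quality = Q_AFDO;
  return c;
}

/* A sum is only as reliable as its weakest operand.  */
static struct count
count_add (struct count a, struct count b)
{
  struct count r;
  r.value = a.value + b.value;
  r.quality = a.quality < b.quality ? a.quality : b.quality;
  return r;
}
```

Tagging the counts where they are created lets later consumers tell sampled data apart from instrumented data without any extra bookkeeping.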

Bootstrapped/regtested x86_64-linux, committed.

Honza

* auto-profile.c (afdo_set_bb_count, afdo_propagate_edge,
afdo_annotate_cfg): Set counts/probabilities as determined by afdo.
Index: auto-profile.c
===
--- auto-profile.c  (revision 250021)
+++ auto-profile.c  (working copy)
@@ -1151,7 +1151,7 @@ afdo_set_bb_count (basic_block bb, const
   FOR_EACH_EDGE (e, ei, bb->succs)
   afdo_source_profile->mark_annotated (e->goto_locus);
 
-  bb->count = profile_count::from_gcov_type (max_count);
+  bb->count = profile_count::from_gcov_type (max_count).afdo ();
   return true;
 }
 
@@ -1228,7 +1228,7 @@ afdo_propagate_edge (bool is_succ, bb_se
 edge e, unknown_edge = NULL;
 edge_iterator ei;
 int num_unknown_edge = 0;
-profile_count total_known_count = profile_count::zero ();
+profile_count total_known_count = profile_count::zero ().afdo ();
 
 FOR_EACH_EDGE (e, ei, is_succ ? bb->succs : bb->preds)
   if (!is_edge_annotated (e, *annotated_edge))
@@ -1350,7 +1350,7 @@ afdo_propagate_circuit (const bb_set &an
  && !is_edge_annotated (ep, *annotated_edge))
 {
   ep->probability = profile_probability::never ();
-  ep->count = profile_count::zero ();
+  ep->count = profile_count::zero ().afdo ();
   set_edge_annotated (ep, annotated_edge);
 }
 }
@@ -1537,9 +1537,9 @@ afdo_annotate_cfg (const stmt_set &promo
   if (s == NULL)
 return;
   cgraph_node::get (current_function_decl)->count
- = profile_count::from_gcov_type (s->head_count ());
+ = profile_count::from_gcov_type (s->head_count ()).afdo ();
   ENTRY_BLOCK_PTR_FOR_FN (cfun)->count
- = profile_count::from_gcov_type (s->head_count ());
+ = profile_count::from_gcov_type (s->head_count ()).afdo ();
   profile_count max_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
 
   FOR_EACH_BB_FN (bb, cfun)


Re: RFC: [PATCH] Add warn_if_not_aligned attribute

2017-07-06 Thread Joseph Myers
On Fri, 16 Jun 2017, H.J. Lu wrote:

> +@code{warning: alignment 8 of 'struct foo' is less than 16}.

I think @samp is better than @code for warnings, throughout, since they 
aren't pieces of program code.

> +This warning can be disabled by @option{-Wno-if-not-aligned}.
> +The @code{warn_if_not_aligned } attribute can also be used for types

Stray space before }.

> +static void
> +handle_warn_if_not_align (tree field, unsigned int record_align)

Missing comment above this function explaining its semantics and those of 
its arguments.

> +  if ((record_align % warn_if_not_align) != 0)
> +warning (opt_w, "alignment %d of %qT is less than %d",
> +  record_align, context, warn_if_not_align);

I'd expect %u for unsigned int alignments, instead of %d.

> +  unsigned int off
> += (tree_to_uhwi (DECL_FIELD_OFFSET (field))
> +   + tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field)) / BITS_PER_UNIT);
> +  if ((off % warn_if_not_align) != 0)
> +warning (opt_w, "%q+D offset %d in %qT isn't aligned to %d",
> +  field, off, context, warn_if_not_align);

And you can have struct offsets that don't fit in unsigned int (i.e. 
structures over 4 GB), so should be using unsigned HOST_WIDE_INT to store 
the offset and %wu to print it.  (Whereas various places in GCC restrict 
alignments to unsigned int.)
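
To make the width concern concrete, here is a small standalone sketch. It is not GCC code: `uint64_t` stands in for unsigned HOST_WIDE_INT, `PRIu64` for GCC's `%wu`, and all three function names are made up for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Byte offset of a field given its byte and bit components, kept in
   64 bits (uint64_t stands in for unsigned HOST_WIDE_INT).  */
static uint64_t
field_offset_wide (uint64_t byte_off, uint64_t bit_off)
{
  return byte_off + bit_off / 8;
}

/* The same computation squeezed into 32 bits; the high bits are lost.  */
static uint32_t
field_offset_narrow (uint64_t byte_off, uint64_t bit_off)
{
  return (uint32_t) (byte_off + bit_off / 8);
}

/* Format the diagnostic text with unsigned conversions throughout.
   PRIu64 plays the role that %wu plays inside GCC.  */
static int
format_misaligned (char *buf, size_t len, uint64_t off, unsigned int align)
{
  assert (buf != NULL && len > 0);
  return snprintf (buf, len, "offset %" PRIu64 " isn't aligned to %u",
                   off, align);
}
```

For a field 8 bytes past the 4 GB mark, the wide helper yields 4294967304 while the narrow one yields 8, so a diagnostic built from the narrow value would name the wrong offset.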

What happens if you specify the attribute on a bit-field, or on a type 
used to declare a bit-field?  I don't think either of those particularly 
makes sense, but I don't see tests for it either.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [ping] don't complain about undefined env vars in self specs on gcc -v

2017-07-06 Thread Olivier Hainque
Hi Joseph,

> On 05 Jul 2017, at 18:09, Joseph Myers  wrote:
>> Ping for patch proposed here:
>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg00579.html
> 
> This patch is OK.

Just checked-in then :) Thanks for your review!

With Kind Regards,

Olivier



[PATCH, rs6000] 1/2 Add x86 MMX intrinsics to GCC PPC64LE target

2017-07-06 Thread Steven Munroe
This is the second major contribution of X86 intrinsic equivalent
headers for PPC64LE.

X86 MMX technology was the earliest integer SIMD and 64-bit scalar
extension for IA32. MMX should have largely been replaced by now with
X86_64 64-bit scalars and SSE 128-bit SIMD operations in modern
applications.  However it is still part of the X86 API and supported
via the mmintrin.h header and numerous GCC built-ins. The mmintrin.h is
included from the SSE instruction headers and x86intrin.h. So it needs
to be there to simplify porting of existing X86 applications to PPC64LE.

In the specific case of X86 MMX (__m64) intrinsics, the PowerPC target
does not support a native __vector_size__ (8) type.  Instead we typedef
__m64 to a 64-bit unsigned long long, which is natively supported in
64-bit mode.  This works well for the _si64 and some _pi32 operations,
but starts to generate long sequences for _pi16 and _pi8 operations.
For those cases it is better (faster and smaller code) to transfer __m64
data to the PowerPC (VMX/VSX) vector 128-bit unit, perform the
operation, and then transfer the result back to the __m64 type. This
implies that the direct register move instructions, introduced with
power8, are available for efficient implementation of these transfers.
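
To illustrate why the narrower element sizes get expensive on a plain 64-bit scalar, here is a small sketch that is not taken from the patch. `m64_model` is a hypothetical stand-in for the __m64 typedef, and both helper names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for __m64 as a 64-bit unsigned scalar.  */
typedef uint64_t m64_model;

/* Packed add of four 16-bit lanes with wraparound: every lane has to be
   extracted, added and reinserted so that carries do not leak across.  */
static m64_model
add_pi16_lanes (m64_model a, m64_model b)
{
  assert (sizeof (m64_model) == 8);
  m64_model r = 0;
  for (int i = 0; i < 4; i++)
    {
      uint16_t x = (uint16_t) (a >> (16 * i));
      uint16_t y = (uint16_t) (b >> (16 * i));
      uint16_t s = (uint16_t) (x + y);
      r |= (m64_model) s << (16 * i);
    }
  return r;
}

/* A full-width 64-bit add needs no lane handling at all.  */
static m64_model
add_si64 (m64_model a, m64_model b)
{
  return a + b;
}
```

With a = 0x0001FFFF00020003 and b = 0x0001000100010001 the lane-wise add gives 0x0002000000030004, whereas the plain 64-bit add gives 0x0003000000030004 because the carry out of the third lane leaks into the fourth. Avoiding that leak is what costs the extra instructions on the scalar side.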

This patch submission includes just the config.gcc and associated MMX
headers changes to make the review more manageable. A separate patch for
the DG test cases will follow.

./gcc/ChangeLog:

2017-07-06  Steven Munroe  

* config.gcc (powerpc*-*-*): Add mmintrin.h.
* config/rs6000/mmintrin.h: New file.
* config/rs6000/x86intrin.h: Include mmintrin.h.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 249663)
+++ gcc/config.gcc  (working copy)
@@ -456,7 +456,8 @@ powerpc*-*-*)
cpu_type=rs6000
extra_objs="rs6000-string.o"
extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
-   extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h x86intrin.h"
+   extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
+   extra_headers="${extra_headers} mmintrin.h x86intrin.h"
extra_headers="${extra_headers} ppu_intrinsics.h spu2vmx.h vec_types.h si2vmx.h"
extra_headers="${extra_headers} paired.h"
case x$with_cpu in
Index: gcc/config/rs6000/mmintrin.h
===
--- gcc/config/rs6000/mmintrin.h(revision 0)
+++ gcc/config/rs6000/mmintrin.h(revision 0)
@@ -0,0 +1,1444 @@
+/* Copyright (C) 2002-2017 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+/* Implemented from the specification included in the Intel C++ Compiler
+   User Guide and Reference, version 9.0.  */
+
+#ifndef NO_WARN_X86_INTRINSICS
+/* This header is distributed to simplify porting x86_64 code that
+   makes explicit use of Intel intrinsics to powerpc64le.
+   It is the user's responsibility to determine if the results are
+   acceptable and make additional changes as necessary.
+   Note that much code that uses Intel intrinsics can be rewritten in
+   standard C or GNU C extensions, which are more portable and better
+   optimized across multiple targets.
+
+   In the specific case of X86 MMX (__m64) intrinsics, the PowerPC
+   target does not support a native __vector_size__ (8) type.  Instead
+   we typedef __m64 to a 64-bit unsigned long long, which is natively
+   supported in 64-bit mode.  This works well for the _si64 and some
+   _pi32 operations, but starts to generate long sequences for _pi16
+   and _pi8 operations.  For those cases it better (faster and
+   smaller code) to transfer __m64 data to the PowerPC vector 128-bit
+   unit, perform the operation, and then transfer the result back to
+   the __m64 type. This implies that the direct register move
+   instructions, introduced with power8, are available for efficient
+   implementation of these transfers.
+
+   Net. Most MMX intrinsic operations can be performed efficie

[committed] v2: diagnostics: fix end-points of ranges within macros (PR c++/79300)

2017-07-06 Thread David Malcolm
On Mon, 2017-07-03 at 10:07 -0600, Jeff Law wrote:
> On 02/02/2017 01:53 PM, David Malcolm wrote:
> > PR c++/79300 identifies an issue in which diagnostics_show_locus
> > prints the wrong end-point for a range within a macro:
> > 
> >assert ((p + val_size) - buf == encoded_len);
> >~^~~~
> > 
> > as opposed to:
> > 
> >assert ((p + val_size) - buf == encoded_len);
> >~^~
> > 
> > The caret, start and finish locations of this compound location are
> > all virtual locations.
> > 
> > The root cause is that when diagnostic-show-locus.c's layout ctor
> > expands the caret and end-points, it calls
> >   linemap_client_expand_location_to_spelling_point
> > which (via expand_location_1) unwinds the macro expansions, and
> > then calls linemap_expand_location.  Doing so implicitly picks the
> > *caret* location for any virtual locations, and so in the above
> > case
> > it picks these spelling locations for the three parts of the
> > location:
> > 
> >assert ((p + val_size) - buf == encoded_len);
> >^^  ^
> >START|  FINISH
> >   CARET
> > 
> > and so erroneously strips the underlining from the final token,
> > apart
> > from its first character.
> > 
> > The fix is for layout's ctor to indicate that it wants the
> > start/finish
> > locations in such a situation, adding a new param to
> > linemap_client_expand_location_to_spelling_point, so that
> > expand_location_1 can handle this case by extracting the relevant
> > part
> > of the unwound compound location, and thus choose:
> > 
> >assert ((p + val_size) - buf == encoded_len);
> >^^^
> >START|FINISH
> >   CARET
> > 
> > Successfully bootstrapped&regtested on x86_64-pc-linux-gnu.
> > 
> > OK for stage 4, or should I wait until stage 1?
> > 
> > gcc/ChangeLog:
> > PR c++/79300
> > * diagnostic-show-locus.c (layout::layout): Use start and
> > finish
> > spelling location for the start and finish of each range.
> > * genmatch.c
> > (linemap_client_expand_location_to_spelling_point):
> > Add unused aspect param.
> > * input.c (expand_location_1): Add "aspect" param, and use it
> > to access the correct part of the location.
> > (expand_location): Pass LOCATION_ASPECT_CARET to new param of
> > expand_location_1.
> > (expand_location_to_spelling_point): Likewise.
> > (linemap_client_expand_location_to_spelling_point): Add
> > "aspect"
> > param, and pass it to expand_location_1.
> > 
> > gcc/testsuite/ChangeLog:
> > PR c++/79300
> > * c-c++-common/Wmisleading-indentation-3.c (fn_14): Update
> > expected underlining within macro expansion.
> > * c-c++-common/pr70264.c: Likewise.
> > * g++.dg/plugin/diagnostic-test-expressions-1.C
> > (test_within_macro_1): New test.
> > (test_within_macro_2): Likewise.
> > (test_within_macro_3): Likewise.
> > (test_within_macro_4): Likewise.
> > * gcc.dg/format/diagnostic-ranges.c (test_macro_3): Update
> > expected underlining within macro expansion.
> > (test_macro_4): Likewise.
> > * gcc.dg/plugin/diagnostic-test-expressions-1.c
> > (test_within_macro_1): New test.
> > (test_within_macro_2): Likewise.
> > (test_within_macro_3): Likewise.
> > (test_within_macro_4): Likewise.
> > * gcc.dg/spellcheck-fields-2.c (test_macro): Update expected
> > underlining within macro expansion.
> > 
> > libcpp/ChangeLog:
> > PR c++/79300
> > * include/line-map.h (enum location_aspect): New enum.
> > (linemap_client_expand_location_to_spelling_point): Add
> > enum location_aspect param.
> > * line-map.c (source_range::intersects_line_p): Update for new
> > param of linemap_client_expand_location_to_spelling_point.
> > (rich_location::get_expanded_location): Likewise.
> > (fixit_insert::affects_line_p): Likewise.
> So we punted this to gcc-8 stage1.   Now that I've finally looked at
> it,
> it looks good to me.
> 
> Sorry for the long wait.

Thanks; looks like I forgot to apply this one when stage 1 reopened.

The libcpp part of the patch needed a bit of reworking due to changes
I've made to the internals of fix-it hints.
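
As a rough model of the "aspect" parameter described in the quoted patch: a compound location has three parts, and the caller says which one it wants instead of always getting the caret. Everything below is a hypothetical sketch; only the CARET name is taken from the ChangeLog, the other two enumerator names are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a compound location as three source columns.  */
struct range_loc
{
  int caret;
  int start;
  int finish;
};

/* Which part of a compound location the caller wants.  */
enum aspect { ASPECT_CARET, ASPECT_START, ASPECT_FINISH };

/* Return the column for the requested part of LOC.  */
static int
pick_column (const struct range_loc *loc, enum aspect aspect)
{
  assert (loc != NULL);
  switch (aspect)
    {
    case ASPECT_START:
      return loc->start;
    case ASPECT_FINISH:
      return loc->finish;
    case ASPECT_CARET:
    default:
      return loc->caret;
    }
}
```

Asking for ASPECT_FINISH when expanding the end-point of a range is what keeps the underline from stopping at the first character of the final token.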

For reference, here's what I committed to trunk (as r250022), after
bootstrap&regtest on x86_64-pc-linux-gnu:

gcc/ChangeLog:
PR c++/79300
* diagnostic-show-locus.c (layout::layout): Use start and finish
spelling location for the start and finish of each range.
* genmatch.c (linemap_client_expand_location_to_spelling_point):
Add unused aspect param.
* input.c (expand_location_1): Add "aspect" param, and use it
to access the correct part of the location.
(expand_location): Pass LOCATION_ASPECT_CARET to new param of
expand_locati

[PATCH v11] add -fpatchable-function-entry=N,M option

2017-07-06 Thread Torsten Duwe
Permit A 38

gcc/c-family/ChangeLog
2017-07-06  Torsten Duwe  

* c-attribs.c (c_common_attribute_table): Add entry for
"patchable_function_entry".

gcc/lto/ChangeLog
2017-07-06  Torsten Duwe  

* lto-lang.c (lto_attribute_table): Add entry for
"patchable_function_entry".

gcc/ChangeLog
2017-07-06  Torsten Duwe  

* common.opt: Introduce -fpatchable-function-entry
command line option, and its variables function_entry_patch_area_size
and function_entry_patch_area_start.
* opts.c (common_handle_option): Add -fpatchable_function_entry_ case,
including a two-value parser.
* target.def (print_patchable_function_entry): New target hook.
* targhooks.h (default_print_patchable_function_entry): New function.
* targhooks.c (default_print_patchable_function_entry): Likewise.
* toplev.c (process_options): Switch off IPA-RA if
patchable function entries are being generated.
* varasm.c (assemble_start_function): Look at the
patchable-function-entry command line switch and current
function attributes and maybe generate NOP instructions by
calling the print_patchable_function_entry hook.
* doc/extend.texi: Document patchable_function_entry attribute.
* doc/invoke.texi: Document -fpatchable_function_entry
command line option.
* doc/tm.texi.in (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
New target hook.
* doc/tm.texi: Likewise.

gcc/testsuite/ChangeLog
2017-07-06  Torsten Duwe  

* c-c++-common/patchable_function_entry-default.c: New test.
* c-c++-common/patchable_function_entry-decl.c: Likewise.
* c-c++-common/patchable_function_entry-definition.c: Likewise.
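
For readers unfamiliar with the N,M form, the following is a standalone sketch of what a two-value parser for such an argument could look like. It is not the code from opts.c; `parse_patch_area` is a hypothetical name, and rejecting M greater than N is an assumption based on M being an offset into the N-instruction area.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parse "N" or "N,M" into *SIZE and *START.  M defaults to 0.  Returns
   false on malformed input, negative values, or M greater than N.  */
static bool
parse_patch_area (const char *arg, long *size, long *start)
{
  assert (arg != NULL && size != NULL && start != NULL);

  char *end;
  errno = 0;
  long n = strtol (arg, &end, 10);
  if (end == arg || errno != 0 || n < 0)
    return false;

  long m = 0;
  if (*end == ',')
    {
      const char *second = end + 1;
      m = strtol (second, &end, 10);
      if (end == second || errno != 0 || m < 0)
        return false;
    }

  /* Reject trailing junk and an entry point outside the patch area.  */
  if (*end != '\0' || m > n)
    return false;

  *size = n;
  *start = m;
  return true;
}
```

Under this sketch "5" yields size 5 and start 0, "5,2" yields size 5 and start 2, and "5,9", "5," and "abc" are all rejected.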

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 626ffa1cde7..ecb00c1d5b9 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -142,6 +142,8 @@ static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool *)
 static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 static tree handle_bnd_instrument (tree *, tree, tree, int, bool *);
 static tree handle_fallthrough_attribute (tree *, tree, tree, int, bool *);
+static tree handle_patchable_function_entry_attribute (tree *, tree, tree,
+  int, bool *);
 
 /* Table of machine-independent attributes common to all C-like languages.
 
@@ -351,6 +353,9 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_bnd_instrument, false },
   { "fallthrough",   0, 0, false, false, false,
  handle_fallthrough_attribute, false },
+  { "patchable_function_entry",1, 2, true, false, false,
+ handle_patchable_function_entry_attribute,
+ false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -3260,3 +3265,10 @@ handle_fallthrough_attribute (tree *, tree name, tree, int,
   *no_add_attrs = true;
   return NULL_TREE;
 }
+
+static tree
+handle_patchable_function_entry_attribute (tree *, tree, tree, int, bool *)
+{
+  /* Nothing to be done here.  */
+  return NULL_TREE;
+}
diff --git a/gcc/common.opt b/gcc/common.opt
index e81165c488b..78cfa568a95 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -163,6 +163,13 @@ bool flag_stack_usage_info = false
 Variable
 int flag_debug_asm
 
+; How many NOP insns to place at each function entry by default
+Variable
+HOST_WIDE_INT function_entry_patch_area_size
+
+; And how far the real asm entry point is into this area
+Variable
+HOST_WIDE_INT function_entry_patch_area_start
 
 ; Balance between GNAT encodings and standard DWARF to emit.
 Variable
@@ -2030,6 +2037,10 @@ fprofile-reorder-functions
 Common Report Var(flag_profile_reorder_functions)
 Enable function reordering that improves code placement.
 
+fpatchable-function-entry=
+Common Joined Optimization
+Insert NOP instructions at each function entry.
+
 frandom-seed
 Common Var(common_deferred_options) Defer
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 03ba8fc436c..86d567783f7 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -3105,6 +3105,27 @@ that affect more than one function.
 This attribute should be used for debugging purposes only.  It is not
 suitable in production code.
 
+@item patchable_function_entry
+@cindex @code{patchable_function_entry} function attribute
+@cindex extra NOP instructions at the function entry point
+In case the target's text segment can be made writable at run time by
+any means, padding the function entry with a number of NOPs can be
+used to provide a universal tool for instrumentation.
+
+The @code{patchable_function_entry} function attribute can be used to
+change the number of NOPs to any desired value.  The two-value syntax
+is the same as for the command-line switch
+@optio

Re: [PATCH][ASAN] Switch off by default allocas/VLA sanitization for KASAN

2017-07-06 Thread Jakub Jelinek
On Thu, Jul 06, 2017 at 04:31:49PM +0300, Maxim Ostapenko wrote:
> Hi,
> 
> since kernel doesn't support __asan_alloca_poison and
> __asan_allocas_unpoison runtime calls so far, the allocas/VLAs sanitization
> patch (https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00258.html) will break
> KASan builds.
> So it was decided to introduce an option --param asan-instrument-allocas=0/1
> (on by default for userspace and off for kernel) to avoid the issue.
> 
> Tested on x86_64-unknown-linux-gnu, OK after
> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00258.html will be applied?
> 
> -Maxim

> gcc/ChangeLog:
> 
> 2017-07-06  Maxim Ostapenko  
> 
>   * asan.h (asan_sanitize_allocas_p): Declare.
>   * asan.c (asan_sanitize_allocas_p): New function.
>   (handle_builtin_stack_restore): Bail out if !asan_sanitize_allocas_p.
>   (handle_builtin_alloca): Likewise.
>   * cfgexpand.c (expand_used_vars): Do not add allocas unpoisoning stuff
>   if !asan_sanitize_allocas_p.
>   * params.def (asan-instrument-allocas): Add new option.
>   * params.h (ASAN_PROTECT_ALLOCAS): Define.
>   * opts.c (common_handle_option): Disable allocas sanitization for
>   KASan by default.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-06  Maxim Ostapenko  
> 
>c-c++-common/asan/kasan-alloca-1.c: New test.
>c-c++-common/asan/kasan-alloca-2.c: Likewise.

Ok.
Jakub


Re: [PATCH] Fix pr80044, -static and -pie insanity, and pr81170

2017-07-06 Thread Matthias Klose
As seen in PR81295, the bootstrap is broken on powerpc-linux-gnu with
--enable-default-pie. Using that patch the bootstrap succeeds.  The bootstrap
works fine on both powerpc64 be and le targets.

Matthias

On 22.06.2017 17:28, Alan Modra wrote:
> PR80044 notes that -static and -pie together behave differently when
> gcc is configured with --enable-default-pie as compared to configuring
> without (or --disable-default-pie).  This patch removes that
> difference.  In both cases you now will have -static completely
> overriding -pie.
> 
> Fixing this wasn't quite as simple as you'd expect, due to poor
> separation of functionality.  PIE_SPEC didn't just mean that -pie was
> on explicitly or by default, but also -r and -shared were *not* on.
> Fortunately the three files touched by this patch are the only places
> PIE_SPEC and NO_PIE_SPEC are used, so it isn't too hard to see that
> the reason PIE_SPEC and NO_PIE_SPEC are not inverses is the use of
> PIE_SPEC in LINK_PIE_SPEC.  So, move the inelegant symmetry breaking
> addition, to LINK_PIE_SPEC where it belongs.  Doing that showed
> another problem in gnu-user.h, with PIE_SPEC and NO_PIE_SPEC selection
> of crtbegin*.o not properly hooked into a chain of if .. elseif ..
> conditions, which required both PIE_SPEC and NO_PIE_SPEC to exclude
> -static and -shared.  Fixing that particular problem finally allows
> PIE_SPEC to serve just one purpose, and NO_PIE_SPEC to disappear.
> 
> Bootstrapped and regression tested powerpc64le-linux c,c++.  No
> regressions and a bunch of --enable-default-pie failures squashed.
> OK mainline and active branches?
> 
> Incidentally, there is a fairly strong case to be made for adding
> -static to the -shared, -pie, -no-pie chain of RejectNegative's in
> common.opt.  Since git 0d6378a9e (svn r48039) 2001-11-15, -static has
> done more than just the traditional "prevent linking with dynamic
> libraries", as -static selects crtbeginT.o rather than crtbegin.o
> on GNU systems.  Realizing this is what led me to close pr80044, which
> I'd opened with the aim of making -pie -static work together (with the
> traditional meaning of -static).  I don't think that is worth doing, but
> mention pr80044 in the changelog due to fixing the insane output
> produced by -pie -static with --disable-default-pie.
> 
>   PR driver/80044
>   PR target/81170
>   * gcc.c (NO_PIE_SPEC): Delete.
>   (PIE_SPEC): Define as !no-pie/pie.  Move static|shared|r exclusion..
>   (LINK_PIE_SPEC): ..to here.
>   * config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Correct
>   chain of crtbegin*.o selection, update for PIE_SPEC changes and format.
>   (GNU_USER_TARGET_ENDFILE_SPEC): Similarly.
>   * config/sol2.h (STARTFILE_CRTBEGIN_SPEC): Similarly.
>   (ENDFILE_CRTEND_SPEC): Similarly.
>   * config/rs6000/sysv4.h (STARTFILE_LINUX_SPEC): Upgrade to
>   match gnu-user.h startfile.
>   (ENDFILE_LINUX_SPEC): Similarly.
> 
> diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
> index 2787a3d..de605b0 100644
> --- a/gcc/config/gnu-user.h
> +++ b/gcc/config/gnu-user.h
> @@ -50,19 +50,28 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  
>  #if defined HAVE_LD_PIE
>  #define GNU_USER_TARGET_STARTFILE_SPEC \
> -  "%{!shared: %{pg|p|profile:gcrt1.o%s;: \
> -%{" PIE_SPEC ":Scrt1.o%s} %{" NO_PIE_SPEC ":crt1.o%s}}} \
> -   crti.o%s %{static:crtbeginT.o%s;: %{shared:crtbeginS.o%s} \
> -   %{" PIE_SPEC ":crtbeginS.o%s} \
> -   %{" NO_PIE_SPEC ":crtbegin.o%s}} \
> +  "%{shared:; \
> + pg|p|profile:gcrt1.o%s; \
> + static:crt1.o%s; \
> + " PIE_SPEC ":Scrt1.o%s; \
> + :crt1.o%s} \
> +   crti.o%s \
> +   %{static:crtbeginT.o%s; \
> + shared|" PIE_SPEC ":crtbeginS.o%s; \
> + :crtbegin.o%s} \
> %{fvtable-verify=none:%s; \
>   fvtable-verify=preinit:vtv_start_preinit.o%s; \
>   fvtable-verify=std:vtv_start.o%s} \
> " CRTOFFLOADBEGIN
>  #else
>  #define GNU_USER_TARGET_STARTFILE_SPEC \
> -  "%{!shared: %{pg|p|profile:gcrt1.o%s;:crt1.o%s}} \
> -   crti.o%s %{static:crtbeginT.o%s;shared|pie:crtbeginS.o%s;:crtbegin.o%s} \
> +  "%{shared:; \
> + pg|p|profile:gcrt1.o%s; \
> + :crt1.o%s} \
> +   crti.o%s \
> +   %{static:crtbeginT.o%s; \
> + shared|pie:crtbeginS.o%s; \
> + :crtbegin.o%s} \
> %{fvtable-verify=none:%s; \
>   fvtable-verify=preinit:vtv_start_preinit.o%s; \
>   fvtable-verify=std:vtv_start.o%s} \
> @@ -82,15 +91,20 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>"%{fvtable-verify=none:%s; \
>   fvtable-verify=preinit:vtv_end_preinit.o%s; \
>   fvtable-verify=std:vtv_end.o%s} \
> -   %{shared:crtendS.o%s;: %{" PIE_SPEC ":crtendS.o%s} \
> -   %{" NO_PIE_SPEC ":crtend.o%s}} crtn.o%s \
> +   %{static:crtend.o%s; \
> + shared|" PIE_SPEC ":crtendS.o%s; \
> + :crtend.o%s} \
> +   crtn.o%s \
> " CRTOFFLOADEND
>  #else
>  #define GNU_USER_TARGET_ENDFILE_SPEC \
>"%{

[PATCH][ASAN] Switch off by default allocas/VLA sanitization for KASAN

2017-07-06 Thread Maxim Ostapenko

Hi,

since the kernel doesn't support the __asan_alloca_poison and 
__asan_allocas_unpoison runtime calls so far, the allocas/VLAs 
sanitization patch 
(https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00258.html) will break 
KASan builds.
So it was decided to introduce an option --param 
asan-instrument-allocas=0/1 (on by default for userspace and off for 
kernel) to avoid the issue.
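
The "on by default for userspace, off for kernel" behaviour can be modelled as a parameter whose default is lowered only when the user has not set it explicitly. The sketch below is a standalone model with hypothetical names, not the GCC params machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a --param: its value and whether the user set it
   explicitly on the command line.  */
struct param
{
  int value;
  bool user_set;
};

struct asan_config
{
  bool sanitize_address;
  struct param stack;     /* models asan-stack */
  struct param allocas;   /* models asan-instrument-allocas */
};

/* Change a default, but never override an explicit user choice.  */
static void
maybe_set (struct param *p, int value)
{
  assert (p != NULL);
  if (!p->user_set)
    p->value = value;
}

/* Record an explicit user choice.  */
static void
user_set (struct param *p, int value)
{
  assert (p != NULL);
  p->value = value;
  p->user_set = true;
}

/* Kernel mode lowers the defaults for stack and alloca instrumentation.  */
static void
select_kernel_mode (struct asan_config *c)
{
  assert (c != NULL);
  maybe_set (&c->stack, 0);
  maybe_set (&c->allocas, 0);
}

/* Allocas are instrumented only if stack sanitization is active too.  */
static bool
sanitize_allocas_p (const struct asan_config *c)
{
  assert (c != NULL);
  return c->sanitize_address && c->stack.value && c->allocas.value;
}
```

In this model a userspace configuration instruments allocas by default, a kernel configuration does not, and a kernel build can opt back in by setting both parameters explicitly.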


Tested on x86_64-unknown-linux-gnu, OK after 
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00258.html will be applied?


-Maxim
gcc/ChangeLog:

2017-07-06  Maxim Ostapenko  

	* asan.h (asan_sanitize_allocas_p): Declare.
	* asan.c (asan_sanitize_allocas_p): New function.
	(handle_builtin_stack_restore): Bail out if !asan_sanitize_allocas_p.
	(handle_builtin_alloca): Likewise.
	* cfgexpand.c (expand_used_vars): Do not add allocas unpoisoning stuff
	if !asan_sanitize_allocas_p.
	* params.def (asan-instrument-allocas): Add new option.
	* params.h (ASAN_PROTECT_ALLOCAS): Define.
	* opts.c (common_handle_option): Disable allocas sanitization for
	KASan by default.

gcc/testsuite/ChangeLog:

2017-07-06  Maxim Ostapenko  

	* c-c++-common/asan/kasan-alloca-1.c: New test.
	* c-c++-common/asan/kasan-alloca-2.c: Likewise.

diff --git a/gcc/asan.c b/gcc/asan.c
index 3ec7341..5b93bfc 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -310,6 +310,12 @@ asan_sanitize_stack_p (void)
   return (sanitize_flags_p (SANITIZE_ADDRESS) && ASAN_STACK);
 }
 
+bool
+asan_sanitize_allocas_p (void)
+{
+  return (asan_sanitize_stack_p () && ASAN_PROTECT_ALLOCAS);
+}
+
 /* Checks whether section SEC should be sanitized.  */
 
 static bool
@@ -569,7 +575,7 @@ get_last_alloca_addr ()
 static void
 handle_builtin_stack_restore (gcall *call, gimple_stmt_iterator *iter)
 {
-  if (!iter)
+  if (!iter || !asan_sanitize_allocas_p ())
 return;
 
   tree last_alloca = get_last_alloca_addr ();
@@ -607,7 +613,7 @@ handle_builtin_stack_restore (gcall *call, gimple_stmt_iterator *iter)
 static void
 handle_builtin_alloca (gcall *call, gimple_stmt_iterator *iter)
 {
-  if (!iter)
+  if (!iter || !asan_sanitize_allocas_p ())
 return;
 
   gassign *g;
diff --git a/gcc/asan.h b/gcc/asan.h
index 4e8120e..c82d4d9 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -108,6 +108,8 @@ extern void set_sanitized_sections (const char *);
 
 extern bool asan_sanitize_stack_p (void);
 
+extern bool asan_sanitize_allocas_p (void);
+
 /* Return TRUE if builtin with given FCODE will be intercepted by
libasan.  */
 
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index a6e4ef0..11bd604 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2241,7 +2241,7 @@ expand_used_vars (void)
   expand_stack_vars (NULL, &data);
 }
 
-  if ((flag_sanitize & SANITIZE_ADDRESS) && cfun->calls_alloca)
+  if (asan_sanitize_allocas_p () && cfun->calls_alloca)
 var_end_seq = asan_emit_allocas_unpoison (virtual_stack_dynamic_rtx,
 	  virtual_stack_vars_rtx,
 	  var_end_seq);
diff --git a/gcc/opts.c b/gcc/opts.c
index 7460c2b..7555ed5 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1909,6 +1909,9 @@ common_handle_option (struct gcc_options *opts,
  opts_set->x_param_values);
 	  maybe_set_param_value (PARAM_ASAN_STACK, 0, opts->x_param_values,
  opts_set->x_param_values);
+	  maybe_set_param_value (PARAM_ASAN_PROTECT_ALLOCAS, 0,
+ opts->x_param_values,
+ opts_set->x_param_values);
 	  maybe_set_param_value (PARAM_ASAN_USE_AFTER_RETURN, 0,
  opts->x_param_values,
  opts_set->x_param_values);
diff --git a/gcc/params.def b/gcc/params.def
index 6b07518..805302b 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1142,6 +1142,11 @@ DEFPARAM (PARAM_ASAN_STACK,
  "Enable asan stack protection.",
  1, 0, 1)
 
+DEFPARAM (PARAM_ASAN_PROTECT_ALLOCAS,
+	"asan-instrument-allocas",
+	"Enable asan allocas/VLAs protection.",
+	1, 0, 1)
+
 DEFPARAM (PARAM_ASAN_GLOBALS,
  "asan-globals",
  "Enable asan globals protection.",
diff --git a/gcc/params.h b/gcc/params.h
index 8b91660..2188e18 100644
--- a/gcc/params.h
+++ b/gcc/params.h
@@ -232,6 +232,8 @@ extern void init_param_values (int *params);
   PARAM_VALUE (PARAM_ALLOW_PACKED_STORE_DATA_RACES)
 #define ASAN_STACK \
   PARAM_VALUE (PARAM_ASAN_STACK)
+#define ASAN_PROTECT_ALLOCAS \
+  PARAM_VALUE (PARAM_ASAN_PROTECT_ALLOCAS)
 #define ASAN_GLOBALS \
   PARAM_VALUE (PARAM_ASAN_GLOBALS)
 #define ASAN_INSTRUMENT_READS \
diff --git a/gcc/testsuite/c-c++-common/asan/kasan-alloca-1.c b/gcc/testsuite/c-c++-common/asan/kasan-alloca-1.c
new file mode 100644
index 000..518d190
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/kasan-alloca-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-sanitize=address -fsanitize=kernel-address -fdump-tree-sanopt" } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
+
+void foo(int index, int len) {
+  char str[len];
+  str[index] = '1'; // BOOM
+}
+
+/* { dg-final { scan-tree-dump-not "__builtin___asan_alloca_poison" "sanopt" } } */
+/* { dg-final { sc

[PATCH] gcc/doc: list what version each attribute was introduced in

2017-07-06 Thread Daniel P. Berrange
There are several hundred named attribute keys that have been
introduced over many GCC releases. Applications typically need
to be compilable with multiple GCC versions, so it is important
for developers to know when GCC introduced support for each
attribute.

This augments the texi docs that list attribute keys with
a note of what version introduced the feature. The version
information was obtained through archaeology of the GCC source
repository release tags, back to gcc-4_0_0-release. For
attributes added in 4.0.0 or later, an explicit version will
be noted. Any attribute that predates 4.0.0 will simply note
that it has existed prior to 4.0.0. It is thought there is
little need to go further back in time than 4.0.0 since few,
if any, apps will still be using such old compiler versions.

Where a named attribute can be used in many contexts (e.g. the
'visibility' attribute can be used for both functions and
variables), it was assumed that the attribute was supported
in all use contexts at the same time.

Future patches that add new attributes to GCC should be
required to follow this new practice, by documenting the
version.
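
As an illustration of why the version notes matter to application code, here is
a minimal sketch of gating an attribute on the compiler version. It relies on
the "Introduced in 4.9.0" note this patch adds for alloc_align. The macro names
GCC_AT_LEAST and MY_ALLOC_ALIGN are hypothetical, and the body of my_memalign
assumes a POSIX system that provides posix_memalign.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign (POSIX, an assumption of this sketch)

// GCC_AT_LEAST is a hypothetical helper macro, not something GCC provides.
#if defined(__GNUC__)
# define GCC_AT_LEAST(major, minor) \
    (__GNUC__ > (major) || (__GNUC__ == (major) && __GNUC_MINOR__ >= (minor)))
#else
# define GCC_AT_LEAST(major, minor) 0
#endif

// alloc_align is documented by this patch as introduced in 4.9.0, so older
// compilers get an empty expansion instead of an unknown attribute.
#if GCC_AT_LEAST(4, 9)
# define MY_ALLOC_ALIGN(n) __attribute__((alloc_align(n)))
#else
# define MY_ALLOC_ALIGN(n)
#endif

// Parameter 1 carries the alignment of the returned pointer.
void* my_memalign(std::size_t alignment, std::size_t size) MY_ALLOC_ALIGN(1);

void* my_memalign(std::size_t alignment, std::size_t size)
{
  void* p = nullptr;
  if (posix_memalign(&p, alignment, size) != 0)
    return nullptr;
  return p;
}
```

Without the version information, the threshold in the #if would have to be
found by trial and error or by searching old release notes.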

Signed-off-by: Daniel P. Berrange 
---
 gcc/ChangeLog   |   6 +
 gcc/doc/extend.texi | 614 
 2 files changed, 620 insertions(+)

NB, I have not signed any FSF individual copyright assignment
agreement, as this patch is submitted under Red Hat copyright
ownership

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2acf140..d693787 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-07-06  Daniel P. Berrange  
+
+   * doc/extend.texi (Function Attributes, Type Attributes)
+   Label Attributes, Enumerator Attributes, C++ Attributes): Add
+   version information for each listed attribute.
+
 2017-07-06  Christophe Lyon  
 
* doc/sourcebuild.texi (Test Directives, Variants of
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 03ba8fc..7df85b0 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2354,6 +2354,8 @@ is not defined in the same translation unit.
 This attribute requires assembler and object file support,
 and may not be available on all targets.
 
+Introduced before 4.0.0
+
 @item aligned (@var{alignment})
 @cindex @code{aligned} function attribute
 This attribute specifies a minimum alignment for the function,
@@ -2375,6 +2377,8 @@ further information.
 The @code{aligned} attribute can also be used for variables and fields
 (@pxref{Variable Attributes}.)
 
+Introduced before 4.0.0
+
 @item alloc_align
 @cindex @code{alloc_align} function attribute
 The @code{alloc_align} attribute is used to tell the compiler that the
@@ -2396,6 +2400,8 @@ void* my_memalign(size_t, size_t) 
__attribute__((alloc_align(1)))
 declares that @code{my_memalign} returns memory with minimum alignment
 given by parameter 1.
 
+Introduced in 4.9.0
+
 @item alloc_size
 @cindex @code{alloc_size} function attribute
 The @code{alloc_size} attribute is used to tell the compiler that the
@@ -2421,6 +2427,8 @@ declares that @code{my_calloc} returns memory of the size 
given by
 the product of parameter 1 and 2 and that @code{my_realloc} returns memory
 of the size given by parameter 2.
 
+Introduced in 4.9.0
+
 @item always_inline
 @cindex @code{always_inline} function attribute
 Generally, functions are not inlined unless optimization is specified.
@@ -2431,6 +2439,8 @@ Note that if such a function is called indirectly the 
compiler may
 or may not inline it depending on optimization level and a failure
 to inline an indirect call may or may not be diagnosed.
 
+Introduced before 4.0.0
+
 @item artificial
 @cindex @code{artificial} function attribute
 This attribute is useful for small inline wrappers that if possible
@@ -2439,6 +2449,8 @@ info format it either means marking the function as 
artificial
 or using the caller location for all instructions within the inlined
 body.
 
+Introduced in 4.3.0
+
 @item assume_aligned
 @cindex @code{assume_aligned} function attribute
 The @code{assume_aligned} attribute is used to tell the compiler that the
@@ -2458,12 +2470,16 @@ declares that @code{my_alloc1} returns 16-byte aligned 
pointer and
 that @code{my_alloc2} returns a pointer whose value modulo 32 is equal
 to 8.
 
+Introduced in 4.9.0
+
 @item bnd_instrument
 @cindex @code{bnd_instrument} function attribute
 The @code{bnd_instrument} attribute on functions is used to inform the
 compiler that the function should be instrumented when compiled
 with the @option{-fchkp-instrument-marked-only} option.
 
+Introduced in 5.1.0
+
 @item bnd_legacy
 @cindex @code{bnd_legacy} function attribute
 @cindex Pointer Bounds Checker attributes
@@ -2471,6 +2487,8 @@ The @code{bnd_legacy} attribute on functions is used to 
inform the
 compiler that the function should not be instrumented when compiled
 with the @option{-fcheck-pointer-bounds} option.
 
+Introduced in 5.1.0
+
 @item cold
 @cindex @code{cold} function attr

Re: [PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture

2017-07-06 Thread Richard Earnshaw (lists)
On 06/07/17 13:40, Thomas Preudhomme wrote:
> Please find an updated patch in attachment. ChangeLog entry are now as
> follows:
> 
> *** gcc/ChangeLog ***
> 
> 2017-07-06  Thomas Preud'homme  
> 
> * config/arm/arm-cpus.in (armv8-r): Add new entry.
> * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
> * config/arm/arm-tables.opt: Regenerate.
> * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
> enumerator.
> * doc/invoke.texi: Mention -march=armv8-r and its extensions.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2017-01-31  Thomas Preud'homme  
> 
> * lib/target-supports.exp: Generate
> check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
> and check_effective_target_arm_arch_v8r_multilib.
> 
> *** libgcc/ChangeLog ***
> 
> 2017-01-31  Thomas Preud'homme  
> 
> * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.
> 

OK.


R.

> 
> Tested by building an arm-none-eabi GCC cross-compiler targetting
> ARMv8-R.
> 
> Is this ok for stage1?
> 
> Best regards,
> 
> Thomas
> 
> Best regards,
> 
> Thomas
> 
> On 29/06/17 16:13, Thomas Preudhomme wrote:
>> Please ignore this patch. I'll respin the patch on a more recent GCC.
>>
>> Best regards,
>>
>> Thomas
>>
>> On 29/06/17 14:55, Thomas Preudhomme wrote:
>>> Hi,
>>>
>>> This patch adds support for ARMv8-R architecture [1] which was recently
>>> announced. User level instructions for ARMv8-R are the same as those in
>>> ARMv8-A Aarch32 mode so this patch define ARMv8-R to have the same
>>> features as ARMv8-A in ARM backend.
>>>
>>> [1]
>>> https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile
>>>
>>>
>>> ChangeLog entries are as follow:
>>>
>>> *** gcc/ChangeLog ***
>>>
>>> 2017-01-31  Thomas Preud'homme  
>>>
>>>  * config/arm/arm-cpus.in (armv8-r, armv8-r+rcr): Add new entry.
>>>  * config/arm/arm-cpu-cdata.h: Regenerate.
>>>  * config/arm/arm-cpu-data.h: Regenerate.
>>>  * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
>>>  * config/arm/arm-tables.opt: Regenerate.
>>>  * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
>>>  enumerator.
>>>  * config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARMv8-R and
>>>  ARMv8-R with CRC extensions.
>>>  * doc/invoke.texi: Mention -march=armv8-r and -march=armv8-r+crc
>>>  options.  Document meaning of -march=armv8-r+rcr.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>>
>>> 2017-01-31  Thomas Preud'homme  
>>>
>>>  * lib/target-supports.exp: Generate
>>>  check_effective_target_arm_arch_v8r_ok,
>>> add_options_for_arm_arch_v8r
>>>  and check_effective_target_arm_arch_v8r_multilib.
>>>
>>> *** libgcc/ChangeLog ***
>>>
>>> 2017-01-31  Thomas Preud'homme  
>>>
>>>  * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.
>>>
>>> Tested by building an arm-none-eabi GCC cross-compiler targetting
>>> ARMv8-R.
>>>
>>> Is this ok for stage1?
>>>
>>> Best regards,
>>>
>>> Thomas
> 
> 2_add_armv8r_support.patch
> 
> 
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index 
> 946d543ebb29416da9b4928161607cccacaa78a7..f35128acb7d68c6a0592355b9d3d56ee8f826aca
>  100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -380,6 +380,22 @@ begin arch armv8-m.main
>   option nodsp remove bit_ARMv7em
>  end arch armv8-m.main
>  
> +begin arch armv8-r
> + tune for cortex-r4
> + tune flags CO_PROC
> + base 8R
> + profile R
> + isa ARMv8r
> + option crc add bit_crc32
> +# fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
> +# note: no fp option for fp-armv8 (d16) + double precision at the moment
> + option fp.sp add FP_ARMv8
> + option simd add FP_ARMv8 NEON
> + option crypto add FP_ARMv8 CRYPTO
> + option nocrypto remove ALL_CRYPTO
> + option nofp remove ALL_FP
> +end arch armv8-r
> +
>  begin arch iwmmxt
>   tune for iwmmxt
>   tune flags LDSCHED STRONG XSCALE
> diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
> index 
> c0c2ccee330f2313951e980c5d399ae5d21005d6..0d66a0400c517668db023fc66ff43e26d43add51
>  100644
> --- a/gcc/config/arm/arm-isa.h
> +++ b/gcc/config/arm/arm-isa.h
> @@ -127,6 +127,7 @@ enum isa_feature
>  #define ISA_ARMv8_2a ISA_ARMv8_1a, isa_bit_ARMv8_2
>  #define ISA_ARMv8m_base ISA_ARMv6m, isa_bit_ARMv8, isa_bit_cmse, isa_bit_tdiv
>  #define ISA_ARMv8m_main ISA_ARMv7m, isa_bit_ARMv8, isa_bit_cmse
> +#define ISA_ARMv8r   ISA_ARMv8a
>  
>  /* List of all cryptographic extensions to stripout if crypto is
> disabled.  Currently, that's trivial, but we define it anyway for
> diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
> index 
> 5e2df9dd0716293fb551b6582a8c9c2c46fdaa90..51678c2566e841894c5c0e9c613c8c0f832e9988
>  100644
> --- a/gcc/config/arm/arm-tables.opt
> +++ b/gcc/config/arm/arm-tables.opt
> @@ -455,10 +455,13 @@ EnumValue
>  Enum(arm_a

[PATCH] Fix memory leaks in libstdc++ tests

2017-07-06 Thread Jonathan Wakely

This fixes some more ASan errors in the testsuite. The first one is a
simple leak. The second one is more complicated: the test created
loads of variables to hold either a strdup'd string or "", in order to
restore the original environment. Since the test exits after that
function, we don't care about restoring the environment, so can get
rid of most of the variables.
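
To illustrate the second problem, here is a minimal sketch (not the test
itself). A pointer that is sometimes strdup'd and sometimes a string literal
cannot be passed to free() unconditionally, whereas duplicating
unconditionally gives a pointer that is always safe to free. The helper name
dup_env_or_empty is hypothetical, and strdup is assumed to be available (POSIX).

```cpp
#include <cstdlib>
#include <string.h>  // strdup (POSIX, an assumption of this sketch)

// Problematic shape, as in the old test:
//   const char* v = getenv("LANG") ? strdup(getenv("LANG")) : "";
// Here v may point at a literal, so free() would be invalid in that case.

// Hypothetical helper: always returns malloc'd storage owned by the caller,
// so the result can always be released with free().
char* dup_env_or_empty(const char* name)
{
  const char* value = std::getenv(name);
  return strdup(value ? value : "");
}
```

The caller is then responsible for exactly one free() per call, regardless of
whether the variable was set.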

* testsuite/20_util/specialized_algorithms/memory_management_tools/
1.cc: Free memory.
* testsuite/22_locale/locale/cons/5.cc: Remove redundant restoration
of original environment and free memory.

Tested powerpc64le-linux, committed to trunk.


commit 5d1ce22784966d603d1fb62e641b8f7d30e281d2
Author: Jonathan Wakely 
Date:   Thu Jul 6 13:23:58 2017 +0100

Fix memory leaks in libstdc++ tests

* testsuite/20_util/specialized_algorithms/memory_management_tools/
1.cc: Free memory.
* testsuite/22_locale/locale/cons/5.cc: Remove redundant restoration
of original environment and free memory.

diff --git 
a/libstdc++-v3/testsuite/20_util/specialized_algorithms/memory_management_tools/1.cc
 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/memory_management_tools/1.cc
index d80c125..39d3e76 100644
--- 
a/libstdc++-v3/testsuite/20_util/specialized_algorithms/memory_management_tools/1.cc
+++ 
b/libstdc++-v3/testsuite/20_util/specialized_algorithms/memory_management_tools/1.cc
@@ -180,6 +180,7 @@ void test12()
 std::uninitialized_default_construct(target, target+10);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
@@ -198,6 +199,7 @@ void test13()
 std::uninitialized_value_construct(target, target+10);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
@@ -216,6 +218,7 @@ void test14()
 std::uninitialized_default_construct_n(target, 10);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
@@ -234,6 +237,7 @@ void test15()
 std::uninitialized_value_construct_n(target, 10);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
@@ -254,6 +258,7 @@ void test16()
 std::uninitialized_move(source.begin(), source.end(), target);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
@@ -273,6 +278,7 @@ void test17()
 std::uninitialized_move_n(source.begin(), 10, target);
   } catch (...) {
   }
+  free(target);
   VERIFY(ctor_count == 5);
   VERIFY(del_count == 5);
   throw_after = 0;
diff --git a/libstdc++-v3/testsuite/22_locale/locale/cons/5.cc 
b/libstdc++-v3/testsuite/22_locale/locale/cons/5.cc
index 4c23c7f..c0a18ce 100644
--- a/libstdc++-v3/testsuite/22_locale/locale/cons/5.cc
+++ b/libstdc++-v3/testsuite/22_locale/locale/cons/5.cc
@@ -37,34 +37,7 @@ void test04()
 
 #ifdef _GLIBCXX_HAVE_SETENV
 
-  const char* LANG_orig = getenv("LANG") ? strdup(getenv("LANG")) : "";
-  const char* LC_ALL_orig = getenv("LC_ALL") ? strdup(getenv("LC_ALL")) : "";
-  const char* LC_CTYPE_orig = 
-getenv("LC_CTYPE") ? strdup(getenv("LC_CTYPE")) : "";
-  const char* LC_NUMERIC_orig = 
-getenv("LC_NUMERIC") ? strdup(getenv("LC_NUMERIC")) : "";
-  const char* LC_TIME_orig = 
-getenv("LC_TIME") ? strdup(getenv("LC_TIME")) : "";
-  const char* LC_COLLATE_orig =
-getenv("LC_COLLATE") ? strdup(getenv("LC_COLLATE")) : "";
-  const char* LC_MONETARY_orig = 
-getenv("LC_MONETARY") ? strdup(getenv("LC_MONETARY")) : "";
-  const char* LC_MESSAGES_orig = 
-getenv("LC_MESSAGES") ? strdup(getenv("LC_MESSAGES")) : "";
-#if _GLIBCXX_NUM_CATEGORIES
-  const char* LC_PAPER_orig = 
-getenv("LC_PAPER") ? strdup(getenv("LC_PAPER")) : "";
-  const char* LC_NAME_orig = 
-getenv("LC_NAME") ? strdup(getenv("LC_NAME")) : "";
-  const char* LC_ADDRESS_orig = 
-getenv("LC_ADDRESS") ? strdup(getenv("LC_ADDRESS")) : "";
-  const char* LC_TELEPHONE_orig = 
-getenv("LC_TELEPHONE") ? strdup(getenv("LC_TELEPHONE")) : "";
-  const char* LC_MEASUREMENT_orig = 
-getenv("LC_MEASUREMENT") ? strdup(getenv("LC_MEASUREMENT")) : "";
-  const char* LC_IDENTIFICATION_orig =
-getenv("LC_IDENTIFICATION") ? strdup(getenv("LC_IDENTIFICATION")) : "";
-#endif
+  char* LANG_orig = strdup(getenv("LANG") ? getenv("LANG") : "");
 
   // Check that a "POSIX" LC_ALL is equivalent to "C".
   if (!setenv("LC_ALL", "POSIX", 1))
@@ -91,12 +64,11 @@ void test04()
  VERIFY( loc.name() == "en_PH" );
}
   setenv("LC_ALL", "", 1);
-  setenv("LANG", LANG_orig ? LANG_orig : "", 1);
-  setenv("LC_COLLATE", LC_COLLATE_orig ? LC_COLLATE_orig : "", 1);
+  setenv("LANG", LANG_orig, 1);
 }
 
   // NB: LANG checks all LC_* macro settings. As such, all LC_* macros
-  // must be cleared for these tests, and then restored.
+  // must be cleared for these 

[PATCH] Fix memory leaks in libstdc++ ABI tests

2017-07-06 Thread Jonathan Wakely

These tests don't bother to free memory before exit, but that means we
get ASan errors for them. Fixed like so.

* testsuite/abi/pr42230.cc: Free memory.
* testsuite/util/testsuite_abi.cc (demangle): Return std::string
instead of pointer that might need freeing.
* testsuite/util/testsuite_abi.h (demangle): Likewise.
* testsuite/util/testsuite_hooks.cc (verify_demangle): Free memory.
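
The idea behind the demangle change can be sketched as follows. This is a
simplified stand-in, not the testsuite function: the name demangle_to_string is
hypothetical, and on failure it returns the input unchanged, whereas the real
code reports the status. The point is that the buffer returned by
abi::__cxa_demangle is copied into a std::string and freed immediately, so
callers never own a raw pointer.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled name by value.
std::string demangle_to_string(const std::string& mangled)
{
  // Names that do not start with "_Z" are not mangled C++ symbols.
  if (mangled.compare(0, 2, "_Z") != 0)
    return mangled;

  int status = 0;
  char* ptr = abi::__cxa_demangle(mangled.c_str(), 0, 0, &status);
  if (!ptr)
    return mangled;  // simplification: the real function reports the status

  std::string name(ptr);  // copy into owned storage
  std::free(ptr);         // release the malloc'd buffer right away
  return name;
}
```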

Tested powerpc64le-linux, committed to trunk.


commit 6442c85d107824319d19b36799682335d8133689
Author: Jonathan Wakely 
Date:   Wed Jul 20 22:49:44 2016 +0100

Fix memory leaks in libstdc++ ABI tests

* testsuite/abi/pr42230.cc: Free memory.
* testsuite/util/testsuite_abi.cc (demangle): Return std::string
instead of pointer that might need freeing.
* testsuite/util/testsuite_abi.h (demangle): Likewise.
* testsuite/util/testsuite_hooks.cc (verify_demangle): Free memory.

diff --git a/libstdc++-v3/testsuite/abi/pr42230.cc 
b/libstdc++-v3/testsuite/abi/pr42230.cc
index 2a33899..3b5a1f6 100644
--- a/libstdc++-v3/testsuite/abi/pr42230.cc
+++ b/libstdc++-v3/testsuite/abi/pr42230.cc
@@ -12,5 +12,6 @@ int main()
   char* ret = abi::__cxa_demangle("e", 0, &length, &cc);
 
   assert( (cc < 0 && !ret) || (ret && length) );
+  std::free(ret);
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc 
b/libstdc++-v3/testsuite/util/testsuite_abi.cc
index 4d7f4ca..d18429a 100644
--- a/libstdc++-v3/testsuite/util/testsuite_abi.cc
+++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc
@@ -590,21 +590,26 @@ create_symbols(const char* file)
 }
 
 
-const char*
+std::string
 demangle(const std::string& mangled)
 {
-  const char* name;
+  std::string name;
   if (mangled[0] != '_' || mangled[1] != 'Z')
 {
   // This is not a mangled symbol, thus has "C" linkage.
-  name = mangled.c_str();
+  name = mangled;
 }
   else
 {
   // Use __cxa_demangle to demangle.
   int status = 0;
-  name = abi::__cxa_demangle(mangled.c_str(), 0, 0, &status);
-  if (!name)
+  char* ptr = abi::__cxa_demangle(mangled.c_str(), 0, 0, &status);
+  if (ptr)
+   {
+ name = ptr;
+ free(ptr);
+   }
+  else
{
  switch (status)
{
diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.h 
b/libstdc++-v3/testsuite/util/testsuite_abi.h
index 8275b23..77c5656 100644
--- a/libstdc++-v3/testsuite/util/testsuite_abi.h
+++ b/libstdc++-v3/testsuite/util/testsuite_abi.h
@@ -94,5 +94,5 @@ compare_symbols(const char* baseline_file, const char* 
test_file, bool verb);
 symbols
 create_symbols(const char* file);
 
-const char*
+std::string
 demangle(const std::string& mangled);
diff --git a/libstdc++-v3/testsuite/util/testsuite_hooks.cc 
b/libstdc++-v3/testsuite/util/testsuite_hooks.cc
index d1063e3..74e755d 100644
--- a/libstdc++-v3/testsuite/util/testsuite_hooks.cc
+++ b/libstdc++-v3/testsuite/util/testsuite_hooks.cc
@@ -131,8 +131,11 @@ namespace __gnu_test
   verify_demangle(const char* mangled, const char* wanted)
   {
 int status = 0;
-const char* s = abi::__cxa_demangle(mangled, 0, 0, &status);
-if (!s)
+const char* s = 0;
+char* demangled = abi::__cxa_demangle(mangled, 0, 0, &status);
+if (demangled)
+  s = demangled;
+else
   {
switch (status)
  {
@@ -156,6 +159,7 @@ namespace __gnu_test
 std::string w(wanted);
 if (w != s)
   std::__throw_runtime_error(s);
+free(demangled);
   }
 
   void


Re: [PATCH 2/3, GCC/ARM] Add support for ARMv8-R architecture

2017-07-06 Thread Thomas Preudhomme

Please find an updated patch attached. The ChangeLog entries are now as follows:

*** gcc/ChangeLog ***

2017-07-06  Thomas Preud'homme  

* config/arm/arm-cpus.in (armv8-r): Add new entry.
* config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
enumerator.
* doc/invoke.texi: Mention -march=armv8-r and its extensions.

*** gcc/testsuite/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* lib/target-supports.exp: Generate
check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

* config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.


Tested by building an arm-none-eabi GCC cross-compiler targeting
ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas

On 29/06/17 16:13, Thomas Preudhomme wrote:

Please ignore this patch. I'll respin the patch on a more recent GCC.

Best regards,

Thomas

On 29/06/17 14:55, Thomas Preudhomme wrote:

Hi,

This patch adds support for the ARMv8-R architecture [1], which was recently
announced. User-level instructions for ARMv8-R are the same as those in
ARMv8-A AArch32 mode, so this patch defines ARMv8-R to have the same
features as ARMv8-A in the ARM backend.

[1] 
https://developer.arm.com/products/architecture/r-profile/docs/ddi0568/latest/arm-architecture-reference-manual-supplement-armv8-for-the-armv8-r-aarch32-architecture-profile 



ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

 * config/arm/arm-cpus.in (armv8-r, armv8-r+rcr): Add new entry.
 * config/arm/arm-cpu-cdata.h: Regenerate.
 * config/arm/arm-cpu-data.h: Regenerate.
 * config/arm/arm-isa.h (ISA_ARMv8r): Define macro.
 * config/arm/arm-tables.opt: Regenerate.
 * config/arm/arm.h (enum base_architecture): Add BASE_ARCH_8R
 enumerator.
 * config/arm/bpabi.h (BE8_LINK_SPEC): Add entry for ARMv8-R and
 ARMv8-R with CRC extensions.
 * doc/invoke.texi: Mention -march=armv8-r and -march=armv8-r+crc
 options.  Document meaning of -march=armv8-r+rcr.

*** gcc/testsuite/ChangeLog ***

2017-01-31  Thomas Preud'homme  

 * lib/target-supports.exp: Generate
 check_effective_target_arm_arch_v8r_ok, add_options_for_arm_arch_v8r
 and check_effective_target_arm_arch_v8r_multilib.

*** libgcc/ChangeLog ***

2017-01-31  Thomas Preud'homme  

 * config/arm/lib1funcs.S: Defined __ARM_ARCH__ to 8 for ARMv8-R.

Tested by building an arm-none-eabi GCC cross-compiler targeting
ARMv8-R.

Is this ok for stage1?

Best regards,

Thomas
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 946d543ebb29416da9b4928161607cccacaa78a7..f35128acb7d68c6a0592355b9d3d56ee8f826aca 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -380,6 +380,22 @@ begin arch armv8-m.main
  option nodsp remove bit_ARMv7em
 end arch armv8-m.main
 
+begin arch armv8-r
+ tune for cortex-r4
+ tune flags CO_PROC
+ base 8R
+ profile R
+ isa ARMv8r
+ option crc add bit_crc32
+# fp.sp => fp-armv8 (d16); simd => simd + fp-armv8 + d32 + double precision
+# note: no fp option for fp-armv8 (d16) + double precision at the moment
+ option fp.sp add FP_ARMv8
+ option simd add FP_ARMv8 NEON
+ option crypto add FP_ARMv8 CRYPTO
+ option nocrypto remove ALL_CRYPTO
+ option nofp remove ALL_FP
+end arch armv8-r
+
 begin arch iwmmxt
  tune for iwmmxt
  tune flags LDSCHED STRONG XSCALE
diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
index c0c2ccee330f2313951e980c5d399ae5d21005d6..0d66a0400c517668db023fc66ff43e26d43add51 100644
--- a/gcc/config/arm/arm-isa.h
+++ b/gcc/config/arm/arm-isa.h
@@ -127,6 +127,7 @@ enum isa_feature
 #define ISA_ARMv8_2a	ISA_ARMv8_1a, isa_bit_ARMv8_2
 #define ISA_ARMv8m_base ISA_ARMv6m, isa_bit_ARMv8, isa_bit_cmse, isa_bit_tdiv
 #define ISA_ARMv8m_main ISA_ARMv7m, isa_bit_ARMv8, isa_bit_cmse
+#define ISA_ARMv8r	ISA_ARMv8a
 
 /* List of all cryptographic extensions to stripout if crypto is
disabled.  Currently, that's trivial, but we define it anyway for
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index 5e2df9dd0716293fb551b6582a8c9c2c46fdaa90..51678c2566e841894c5c0e9c613c8c0f832e9988 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -455,10 +455,13 @@ EnumValue
 Enum(arm_arch) String(armv8-m.main) Value(30)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt) Value(31)
+Enum(arm_arch) String(armv8-r) Value(31)
 
 EnumValue
-Enum(arm_arch) String(iwmmxt2) Value(32)
+Enum(arm_arch) String(iwmmxt) Value(32)
+
+EnumValue
+Enum(arm_arch) String(iwmmxt2) Value(33)
 
 Enum
 Name(arm_fpu) Type(enum fpu_type)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index c803d4461c08436ef5f8468f6018e3226ccf33f8..315622212a5ce10d0c771

RE: Add support for use_hazard_barrier_return function attribute

2017-07-06 Thread Maciej W. Rozycki
On Thu, 6 Jul 2017, Matthew Fortune wrote:

> >  Nothing wrong with your proposed change, but overall I wonder if (as a
> > follow-up change) we could find a nonintrusive way to have this pattern
> > (and `clear_hazard_' as well) produce JRS.HB rather than JR.HB in
> > microMIPS compilations, as using the 32-bit delay-slot NOP encoding
> > where the 16-bit one would do is obviously a tiny, but completely
> > unnecessary code space loss (and we do care about code space losses in
> > microMIPS compilations; conserving space is the very purpose of the
> > microMIPS ISA after all).
> > 
> >  Of course it wouldn't do if we rewrote the instruction pattern as
> > "%(jr%!.hb\t$31%/%)" here, because the NOP that follows would have to
> > come from an RTL instruction for `%!' to have any effect.  But perhaps
> > we could emit RTL instead somehow rather than hardcoding the NOP with
> > `%/'?
> 
> I think this case is so specialist we can safely just switch to writing
> out the NOP directly in the output pattern just keeping the %(%) for
> noreorder. This code will have to be reworked with microMIPSR6 when
> submitted so it can be handled then; good spot to use jrs.hb.

 It does not matter for `%!' whether you use `%/' or spell out `nop'
literally.  I was more concerned about getting the instruction count
right, which would be 1.5 for the JRS.HB case; however, I think you can
just set the `length' attribute directly, to 6.

 Still the issue of having separate almost identical patterns remains, as 
barring the use of `%!' I think you'll need to qualify them with 
TARGET_MICROMIPS and !TARGET_MICROMIPS respectively, to have different 
instruction mnemonics.  In this case I think you could write (untested):

(define_insn "mips_hb_return_internal"
  [(return)
   (unspec_volatile [(match_operand 0 "pmode_register_operand" ",")]
   UNSPEC_JRHB)]
  ""
  "@
   %(jrs.hb\t$31%/%)
   %(jr.hb\t$31%/%)"
  [(set_attr "compression" "micromips,*")
   (set_attr "length" "6,8")])

however the equivalent for `clear_hazard_' would be rather horrible 
(OTOH eventually it should use ADDIUPC in its SImode microMIPS variant, so 
perhaps this is acceptable as we'll have multiple different sequences 
anyway).

 For microMIPSr6 we'll then just have another variant with no delay slot.

  Maciej


Re: [PATCH, GCC/ARM] Remove ARMv8-M code for D17-D31

2017-07-06 Thread Thomas Preudhomme

Hi Richard,

On 28/06/17 16:56, Richard Earnshaw (lists) wrote:





This is silently baking in dangerous assumptions about GCC's internal
numbering of the registers.  That's not a good idea from a long-term
portability perspective.

At the very least you need to assert that all the interesting registers
are numbered in the range 0..63; but ideally the code should just handle
pretty much any assignment of internal register numbers.


There is already such an assert in my patch. :-)



Did you consider using sbitmaps rather than doing all the multi-word
stuff by steam?


I have now. Most of it is trivial, but the interaction with compute_not_to_clear_mask
is more verbose because it returns a bitfield, and one assert became quite ugly
and expensive.
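
For readers who have not seen both styles, here is a rough sketch of the
difference. It uses std::bitset purely as a stand-in for GCC's sbitmap, and
kNumRegs and both function names are hypothetical, chosen for illustration.
With a hand-rolled multi-word mask every access has to split the register
number into a word index and a bit index; with a bitmap type that arithmetic is
hidden and no assumption about the numbering fitting in a fixed number of words
leaks into the caller.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical register count for illustration; not GCC's actual numbering.
constexpr std::size_t kNumRegs = 96;

// "By steam": the caller-visible code splits regno into word and bit.
inline void set_reg_words(std::uint64_t (&mask)[2], unsigned regno)
{
  mask[regno / 64] |= (std::uint64_t{1} << (regno % 64));
}

// Bitmap style: a range is set without any word arithmetic at the call site.
inline void set_reg_range(std::bitset<kNumRegs>& bits,
                          unsigned first, unsigned count)
{
  for (unsigned i = 0; i < count; ++i)
    bits.set(first + i);
}
```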


Please find an updated patch attached and judge for yourself.

Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 259597d8890ee84c5bd92b12b6f9f6521c8dcd2e..93e152b1f38d3675e4ada1de7a34c2c209d8db1f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3620,6 +3620,11 @@ arm_option_override (void)
   if (use_cmse && !arm_arch_cmse)
 error ("target CPU does not support ARMv8-M Security Extensions");
 
+  /* We don't clear D16-D31 VFP registers for cmse_nonsecure_call functions
+ and ARMv8-M Baseline and Mainline do not allow such configuration.  */
+  if (use_cmse && LAST_VFP_REGNUM > LAST_LO_VFP_REGNUM)
+error ("ARMv8-M Security Extensions incompatible with selected FPU");
+
   /* Disable scheduling fusion by default if it's not armv7 processor
  or doesn't prefer ldrd/strd.  */
   if (flag_schedule_fusion == 2
@@ -24996,42 +25001,41 @@ thumb1_expand_prologue (void)
 void
 cmse_nonsecure_entry_clear_before_return (void)
 {
-  uint64_t to_clear_mask[2];
+  sbitmap to_clear_bitmap;
   uint32_t padding_bits_to_clear = 0;
   uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
   int regno, maxregno = IP_REGNUM;
   tree result_type;
   rtx result_rtl;
 
-  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
-  to_clear_mask[0] |= (1ULL << IP_REGNUM);
+  to_clear_bitmap = sbitmap_alloc (maxregno + 1);
+  bitmap_clear (to_clear_bitmap);
+  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+  bitmap_set_bit (to_clear_bitmap, IP_REGNUM);
 
   /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
  registers.  We also check that TARGET_HARD_FLOAT and !TARGET_THUMB1 hold
  to make sure the instructions used to clear them are present.  */
   if (TARGET_HARD_FLOAT && !TARGET_THUMB1)
 {
-  uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
+  int float_bits = D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1;
   maxregno = LAST_VFP_REGNUM;
+  to_clear_bitmap = sbitmap_resize (to_clear_bitmap, maxregno, 0);
 
-  float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
-  to_clear_mask[0] |= float_mask;
-
-  float_mask = (1ULL << (maxregno - 63)) - 1;
-  to_clear_mask[1] = float_mask;
+  bitmap_set_range (to_clear_bitmap, FIRST_VFP_REGNUM, float_bits);
 
   /* Make sure we don't clear the two scratch registers used to clear the
 	 relevant FPSCR bits in output_return_instruction.  */
   emit_use (gen_rtx_REG (SImode, IP_REGNUM));
-  to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
+  bitmap_clear_bit (to_clear_bitmap, IP_REGNUM);
   emit_use (gen_rtx_REG (SImode, 4));
-  to_clear_mask[0] &= ~(1ULL << 4);
+  bitmap_clear_bit (to_clear_bitmap, 4);
 }
 
   /* If the user has defined registers to be caller saved, these are no longer
  restored by the function before returning and must thus be cleared for
  security purposes.  */
-  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
+  for (regno = NUM_ARG_REGS; regno <= maxregno; regno++)
 {
   /* We do not touch registers that can be used to pass arguments as per
 	 the AAPCS, since these should never be made callee-saved by user
@@ -25041,29 +25045,50 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (IN_RANGE (regno, IP_REGNUM, PC_REGNUM))
 	continue;
   if (call_used_regs[regno])
-	to_clear_mask[regno / 64] |= (1ULL << (regno % 64));
+	bitmap_set_bit (to_clear_bitmap, regno);
 }
 
   /* Make sure we do not clear the registers used to return the result in.  */
   result_type = TREE_TYPE (DECL_RESULT (current_function_decl));
   if (!VOID_TYPE_P (result_type))
 {
+  unsigned count;
+  uint64_t to_clear_return_mask;
   result_rtl = arm_function_value (result_type, current_function_decl, 0);
 
   /* No need to check that we return in registers, because we don't
 	 support returning on stack yet.  */
-  to_clear_mask[0]
-	&= ~compute_not_to_clear_mask (result_type, result_rtl, 0,
-   padding_bits_to_clear_ptr);
+  gcc_assert (REG_P (result_rtl));
+  to_clear_return_mask
+	= compute_not_to_clear_mask (result_type, result_rtl, 0,
+ padding_bits_to_clear_ptr);
+  if (to_clear_return_mas

[PATCH] Prevent __uses_alloc from holding dangling references

2017-07-06 Thread Jonathan Wakely

The polymorphic_allocator::construct functions create dangling
pointers to rvalues of type memory_resource* and then dereference
them, leading to undefined behaviour.

This fixes those functions to use lvalues, and then adds a deleted
overload of __use_alloc to prevent this happening again.

This means __use_alloc can't be used like:

 f( __use_alloc(c.get_allocator()) );

because get_allocator() returns an rvalue, but it's an internal-only
helper and we can just do this instead:

 auto alloc = c.get_allocator();
 f( __use_alloc(alloc) );
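
The same guard can be shown outside the library. This is only a minimal sketch: alloc_ref, ref_alloc, dummy_alloc, get_allocator and stored_id are hypothetical names invented for illustration, not the libstdc++ internals. A function that stores the address of its argument gets a deleted const-rvalue overload, so passing a temporary fails to compile rather than leaving a dangling pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an internal tag type: it only remembers the
// address of the allocator it was given.
template<typename Alloc>
struct alloc_ref
{
  const Alloc* ptr;
};

// Lvalue overload: the referenced object outlives the call.
template<typename Alloc>
alloc_ref<Alloc> ref_alloc(const Alloc& a)
{
  return alloc_ref<Alloc>{ std::addressof(a) };
}

// Deleted overload: rvalue arguments bind here in preference to the
// const lvalue reference above, so such calls are rejected at compile time.
template<typename Alloc>
void ref_alloc(const Alloc&&) = delete;

// Hypothetical allocator-like type returned by value.
struct dummy_alloc { int id; };

dummy_alloc get_allocator() { return dummy_alloc{42}; }

int stored_id()
{
  // ref_alloc(get_allocator());  // ill-formed: selects the deleted overload
  auto alloc = get_allocator();   // name the temporary first
  auto tag = ref_alloc(alloc);    // OK: alloc is an lvalue
  assert(tag.ptr == &alloc);
  return tag.ptr->id;
}
```

Naming the temporary first, as in stored_id, is the same workaround as the two-line form shown above.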
 


* include/bits/uses_allocator.h (__use_alloc(const _Alloc&&)): Add
deleted overload to prevent dangling references to rvalues.
* include/experimental/memory_resource
(polymorphic_allocator::construct): Do not call __use_alloc with
rvalue arguments.

Tested powerpc64le-linux, committed to trunk.

commit 072946c11b40cb13f709cb31ff5ef4a998d41f96
Author: Jonathan Wakely 
Date:   Thu Jul 6 12:22:49 2017 +0100

Prevent __uses_alloc from holding dangling references

* include/bits/uses_allocator.h (__use_alloc(const _Alloc&&)): Add
deleted overload to prevent dangling references to rvalues.
* include/experimental/memory_resource
(polymorphic_allocator::construct): Do not call __use_alloc with
rvalue arguments.

diff --git a/libstdc++-v3/include/bits/uses_allocator.h 
b/libstdc++-v3/include/bits/uses_allocator.h
index 89d4e43..4d60716 100644
--- a/libstdc++-v3/include/bits/uses_allocator.h
+++ b/libstdc++-v3/include/bits/uses_allocator.h
@@ -109,6 +109,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __ret._M_a = std::__addressof(__a);
   return __ret;
 }
+
+  template
+void
+__use_alloc(const _Alloc&&) = delete;
+
 #if __cplusplus > 201402L
   template 
 inline constexpr bool uses_allocator_v =
diff --git a/libstdc++-v3/include/experimental/memory_resource 
b/libstdc++-v3/include/experimental/memory_resource
index 653189c..99ace7a 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -168,8 +168,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template  //used here
void construct(_Tp1* __p, _Args&&... __args)
{
- auto __use_tag = __use_alloc<_Tp1, memory_resource*,
-  _Args...>(this->resource());
+ memory_resource* const __resource = this->resource();
+ auto __use_tag
+   = __use_alloc<_Tp1, memory_resource*, _Args...>(__resource);
  _M_construct(__use_tag, __p, std::forward<_Args>(__args)...);
}
 
@@ -180,10 +181,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   tuple<_Args1...> __x,
   tuple<_Args2...> __y)
{
+ memory_resource* const __resource = this->resource();
  auto __x_use_tag =
-   __use_alloc<_Tp1, memory_resource*, _Args1...>(this->resource());
+   __use_alloc<_Tp1, memory_resource*, _Args1...>(__resource);
  auto __y_use_tag =
-   __use_alloc<_Tp2, memory_resource*, _Args2...>(this->resource());
+   __use_alloc<_Tp2, memory_resource*, _Args2...>(__resource);
 
  ::new(__p) std::pair<_Tp1, _Tp2>(piecewise_construct,
   _M_construct_p(__x_use_tag, __x),


Re: [PATCH][x86] Add missing intrinsics for VGETMANT[SD,SS] and VGETEXP[SD,SS]

2017-07-06 Thread Kirill Yukhin
On 06 Jul 09:35, Peryt, Sebastian wrote:
> Hi,
> 
> This patch adds missing intrinsics for VGETEXPSD, VGETEXPSS, VGETMANTSD, 
> VGETMANTSS.
> 
> 2017-07-06  Sebastian Peryt  
> 
> gcc/
>   * config/i386/avx512fintrin.h (_mm_mask_getexp_round_ss, 
>   _mm_maskz_getexp_round_ss,  _mm_mask_getexp_round_sd, 
>   _mm_maskz_getexp_round_sd, _mm_mask_getmant_round_sd,
>   _mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss, 
>   _mm_maskz_getmant_round_ss, _mm_mask_getexp_ss, _mm_maskz_getexp_ss, 
>   _mm_mask_getexp_sd, _mm_maskz_getexp_sd, _mm_mask_getmant_sd, 
>   _mm_maskz_getmant_sd, _mm_mask_getmant_ss, 
>   _mm_maskz_getmant_ss): New intrinsics.
>   (__builtin_ia32_getexpss128_mask): Changed to ...
>   __builtin_ia32_getexpss128_round ... this.
>   (__builtin_ia32_getexpsd128_mask): Changed to ...
>   __builtin_ia32_getexpsd128_round ... this.
>   * config/i386/i386-builtin-types.def 
>   ((V2DF, V2DF, V2DF, INT, V2DF, UQI, INT),
>   (V4SF, V4SF, V4SF, INT, V4SF, UQI, INT)): New function type aliases.
>   * config/i386/i386-builtin.def (__builtin_ia32_getexpsd_mask_round, 
>   __builtin_ia32_getexpss_mask_round, 
> __builtin_ia32_getmantsd_mask_round, 
>   __builtin_ia32_getmantss_mask_round): New builtins.
>   * config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT,
>   V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT): Handle new types.
>   (CODE_FOR_avx512f_vgetmantv2df_mask_round, 
>   CODE_FOR_avx512f_vgetmantv4sf_mask_round): New cases.
>   * config/i386/sse.md 
>   (avx512f_sgetexp): Changed to ...
>   avx512f_sgetexp
>... this.
>   (vgetexp\t{%2, %1, %0|
>   %0, %1, %2}): Changed to ...
>   vgetexp
>   \t{%2, %1, %0|
>   %0, %1, %2} ... 
> this.
>   (avx512f_vgetmant): Changed to ...
>   avx512f_vgetmant
>... this.
>   (vgetmant\t{%3, %2, %1, %0|
>   %0, %1, %2, %3}): Changed to ...
>   vgetmant
>   \t{%3, %2, %1, %0|
>   %0, %1, %2
>   , %3} ... this.
>   * config/i386/subst.md (mask_scalar_operand4, 
>   round_saeonly_scalar_mask_operand4, round_saeonly_scalar_mask_op4, 
>   round_saeonly_scalar_nimm_predicate): New subst attributes.
> 
> gcc/testsuite/
>   * gcc.target/i386/avx512f-vgetexpsd-1.c (_mm_mask_getexp_sd, 
>   _mm_maskz_getexp_sd, _mm_mask_getexp_round_sd, 
>   _mm_maskz_getexp_round_sd): Test new intrinsics.
>   * gcc.target/i386/avx512f-vgetexpss-1.c (_mm_mask_getexp_ss, 
>   _mm_maskz_getexp_ss, _mm_mask_getexp_round_ss, 
>   _mm_maskz_getexp_round_ss): Ditto.
>   * gcc.target/i386/avx512f-vgetmantsd-1.c (_mm_mask_getmant_sd, 
>   _mm_maskz_getmant_sd, _mm_mask_getmant_round_sd, 
>   _mm_maskz_getmant_round_sd): Ditto.
>   * gcc.target/i386/avx512f-vgetmantss-1.c (_mm_mask_getmant_ss, 
>   _mm_maskz_getmant_ss, _mm_mask_getmant_round_ss, 
>   _mm_maskz_getmant_round_ss): Ditto.
>   * gcc.target/i386/avx512f-vgetexpsd-2.c (_mm_mask_getexp_sd, 
>   _mm_maskz_getexp_sd, _mm_getexp_round_sd, _mm_mask_getexp_round_sd, 
>   _mm_maskz_getexp_round_sd): New runtime tests.
>   * gcc.target/i386/avx512f-vgetexpss-2.c (_mm_mask_getexp_ss, 
>   _mm_maskz_getexp_ss, _mm_getexp_round_ss, _mm_mask_getexp_round_ss, 
>   _mm_maskz_getexp_round_ss): Ditto.
>   * gcc.target/i386/avx512f-vgetmantsd-2.c (_mm_mask_getmant_sd, 
>   _mm_maskz_getmant_sd, _mm_getmant_round_sd, _mm_mask_getmant_round_sd, 
>   _mm_maskz_getmant_round_sd): Ditto.
>   * gcc.target/i386/avx512f-vgetmantss-2.c (_mm_mask_getmant_ss, 
>   _mm_maskz_getmant_ss, _mm_getmant_round_ss, _mm_mask_getmant_round_ss, 
>   _mm_maskz_getmant_round_ss): Ditto.
>   * gcc.target/i386/avx-1.c (__builtin_ia32_getexpsd_mask_round, 
>   __builtin_ia32_getexpss_mask_round, 
> __builtin_ia32_getmantsd_mask_round, 
>   __builtin_ia32_getmantss_mask_round): Test new builtins.
>   * gcc.target/i386/sse-13.c : Ditto.
>   * gcc.target/i386/sse-23.c: Ditto. 
>   * gcc.target/i386/sse-14.c (_mm_maskz_getexp_round_sd, 
>   _mm_maskz_getexp_round_ss, _mm_mask_getmant_round_sd, 
>   _mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss,
>   _mm_maskz_getmant_round_ss, _mm_mask_getexp_round_sd, 
>   _mm_mask_getexp_round_ss): Test new intrinsics.
>   * gcc.target/i386/testround-1.c: Ditto.
>   * gcc.target/i386/sse-22.c (_mm_maskz_getmant_round_sd, 
>   _mm_maskz_getmant_round_ss, _mm_mask_getmant_round_sd, 
>   _mm_mask_getmant_round_ss): Test new intrinsics 
>   * gcc.target/i386/testimm-10.c (_mm_mask_getmant_sd, 
>   _mm_maskz_getmant_sd, _mm_mask_getmant_ss, 
>   _mm_maskz_getmant_ss): Test new intrinsics.
> 
> Is it ok for trunk?
Your patch is OK for trunk. I've committed it.

--
Thanks, K

PS: In future, could you please remove trailing whitespace in ChangeLog entries?
> 
> Thanks,
> Sebastian




Re: [patch][x86] Remove old rounding code

2017-07-06 Thread Kirill Yukhin
Hello Julia,
On 21 Jun 08:41, Koval, Julia wrote:
> Hi,
> This patch removes old parallel code for avx512er. Parallel in this case 
> can't be generated anymore, because all existing patterns were reworked to 
> unspec in r249423 and r249009. Ok for trunk?
Your patch is OK for trunk. I've committed it.

> Thanks,
> Julia

--
Thanks, K




Re: [Arm] Obsoleting Command line option -mstructure-size-boundary in eabi configurations

2017-07-06 Thread Richard Earnshaw (lists)
On 06/07/17 06:46, Michael Collison wrote:
> NetBSD/Arm requires the structure size boundary to be
> DEFAULT_STRUCTURE_SIZE_BOUNDARY (see config/arm/netbsd-elf.h for
> details). This patch disallows 
> -mstructure-size-boundary on netbsd if the value is not equal to the 
> DEFAULT_STRUCTURE_SIZE_BOUNDARY.
> 
> Okay for trunk?
> 
> 2017-07-05  Michael Collison  
> 
>   * config/arm/arm.c (arm_option_override): Disallow
>   -mstructure-size-boundary on netbsd if value is not
>   DEFAULT_STRUCTURE_SIZE_BOUNDARY.
> 
> 

Frankly, I'd rather we moved towards obsoleting this option entirely.
The origins are from the days of the APCS (note, not AAPCS) when the
default was 32 when most of the world expected 8.

Now that the AAPCS is widely adopted, APCS is obsolete (NetBSD uses
ATPCS) and NetBSD (the only port not based on AAPCS these days) defaults
to 8, I can't see why anybody would now be interested in using a
different value.

So let's just mark this option as deprecated (emit a warning if

global_options_set.x_arm_structure_size_boundary

is ever set by the user, regardless of value).  Then in GCC 9 we can
perhaps remove this code entirely.
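
A minimal, self-contained sketch of that check, for illustration only. The types and functions below (option_values, option_set_flags, deprecation_warnings, warning_count_for) are hypothetical stand-ins and not GCC's real option machinery; the point is only that the warning keys off "was the option given" and not off the value.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the option state: one struct holds the
// option values, the other records which options the user set explicitly.
struct option_values
{
  int structure_size_boundary = 8;
};

struct option_set_flags
{
  bool structure_size_boundary = false;
};

// Collect deprecation diagnostics.  The warning depends only on whether
// the option was given; the value is merely echoed in the message.
std::vector<std::string>
deprecation_warnings (const option_values &values,
                      const option_set_flags &set)
{
  std::vector<std::string> warnings;
  if (set.structure_size_boundary)
    warnings.push_back ("option -mstructure-size-boundary="
                        + std::to_string (values.structure_size_boundary)
                        + " is deprecated");
  return warnings;
}

// Convenience wrapper: how many warnings a given command line produces.
int
warning_count_for (bool user_set, int value)
{
  option_values values;
  values.structure_size_boundary = value;
  option_set_flags set;
  set.structure_size_boundary = user_set;
  return static_cast<int> (deprecation_warnings (values, set).size ());
}
```

With this shape, passing the option with its default value of 8 still warns, while leaving it off the command line does not.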

Documentation and release notes will need corresponding updates as well.

R.

> pr1556.patch
> 
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index bc1e607..911c272 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3471,7 +3471,18 @@ arm_option_override (void)
>  }
>else
>  {
> -  if (arm_structure_size_boundary != 8
> +  /* Do not allow structure size boundary to be overridden for netbsd.  
> */
> +
> +  if ((arm_abi == ARM_ABI_ATPCS)
> +   && (arm_structure_size_boundary != DEFAULT_STRUCTURE_SIZE_BOUNDARY))
> + {
> +   warning (0,
> +"option %<-mstructure-size-boundary%> is deprecated for 
> netbsd; "
> +"defaulting to %d",
> +DEFAULT_STRUCTURE_SIZE_BOUNDARY);
> +   arm_structure_size_boundary = DEFAULT_STRUCTURE_SIZE_BOUNDARY;
> + }
> +  else if (arm_structure_size_boundary != 8
> && arm_structure_size_boundary != 32
> && !(ARM_DOUBLEWORD_ALIGN && arm_structure_size_boundary == 64))
>   {
> 



[arm] Fix warning in parsecpu.awk

2017-07-06 Thread Richard Earnshaw (lists)
In awk, single quotes within a quoted string do not need escaping.  The
existing code causes awk to grumble in the build logs.

* config/arm/parsecpu.awk (gen_comm_data): Do not escape single
quotes in quoted strings.

Committed.

R.
diff --git a/gcc/config/arm/parsecpu.awk b/gcc/config/arm/parsecpu.awk
index d096bca..9d01e2c 100644
--- a/gcc/config/arm/parsecpu.awk
+++ b/gcc/config/arm/parsecpu.awk
@@ -301,7 +301,7 @@ function gen_comm_data () {
 	arch_base[archs[n]] ","
 	# profile letter code, or zero if none.
 	if (archs[n] in arch_prof) {
-	print "\'" arch_prof[archs[n]] "\',"
+	print "'" arch_prof[archs[n]] "',"
 	} else {
 	print "0,"
 	}


[arm] Fix cross-native builds

2017-07-06 Thread Richard Earnshaw (lists)
The patch I committed yesterday to remove some generated headers from
the source tree unfortunately has a dependency missing that is only
revealed when doing a cross-native or full Canadian cross build.  The
gen* programs were missing a dependency on one of the generated headers.

Fixed by adding an explicit dependency rule for GTM_H in the same way as
we do for TM_H.

* config/arm/t-arm (GTM_H): Add arm-cpu.h.

Checked that this restores cross-native building.

Committed.

diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 3877232..16177e0 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -19,6 +19,7 @@
 # .
 
 TM_H += arm-cpu.h
+GTM_H += arm-cpu.h
 
 # All md files - except for arm.md.
 # This list should be kept in alphabetical order and updated whenever an md


[PATCH][x86] Add missing intrinsics for VGETMANT[SD,SS] and VGETEXP[SD,SS]

2017-07-06 Thread Peryt, Sebastian
Hi,

This patch adds missing intrinsics for VGETEXPSD, VGETEXPSS, VGETMANTSD, 
VGETMANTSS.

2017-07-06  Sebastian Peryt  

gcc/
* config/i386/avx512fintrin.h (_mm_mask_getexp_round_ss, 
_mm_maskz_getexp_round_ss,  _mm_mask_getexp_round_sd, 
_mm_maskz_getexp_round_sd, _mm_mask_getmant_round_sd,
_mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss, 
_mm_maskz_getmant_round_ss, _mm_mask_getexp_ss, _mm_maskz_getexp_ss, 
_mm_mask_getexp_sd, _mm_maskz_getexp_sd, _mm_mask_getmant_sd, 
_mm_maskz_getmant_sd, _mm_mask_getmant_ss, 
_mm_maskz_getmant_ss): New intrinsics.
(__builtin_ia32_getexpss128_mask): Changed to ...
__builtin_ia32_getexpss128_round ... this.
(__builtin_ia32_getexpsd128_mask): Changed to ...
__builtin_ia32_getexpsd128_round ... this.
* config/i386/i386-builtin-types.def 
((V2DF, V2DF, V2DF, INT, V2DF, UQI, INT),
(V4SF, V4SF, V4SF, INT, V4SF, UQI, INT)): New function type aliases.
* config/i386/i386-builtin.def (__builtin_ia32_getexpsd_mask_round, 
__builtin_ia32_getexpss_mask_round, 
__builtin_ia32_getmantsd_mask_round, 
__builtin_ia32_getmantss_mask_round): New builtins.
* config/i386/i386.c (V2DF_FTYPE_V2DF_V2DF_INT_V2DF_UQI_INT,
V4SF_FTYPE_V4SF_V4SF_INT_V4SF_UQI_INT): Handle new types.
(CODE_FOR_avx512f_vgetmantv2df_mask_round, 
CODE_FOR_avx512f_vgetmantv4sf_mask_round): New cases.
* config/i386/sse.md 
(avx512f_sgetexp): Changed to ...
avx512f_sgetexp
 ... this.
(vgetexp\t{%2, %1, %0|
%0, %1, %2}): Changed to ...
vgetexp
\t{%2, %1, %0|
%0, %1, %2} ... 
this.
(avx512f_vgetmant): Changed to ...
avx512f_vgetmant
 ... this.
(vgetmant\t{%3, %2, %1, %0|
%0, %1, %2, %3}): Changed to ...
vgetmant
\t{%3, %2, %1, %0|
%0, %1, %2
, %3} ... this.
* config/i386/subst.md (mask_scalar_operand4, 
round_saeonly_scalar_mask_operand4, round_saeonly_scalar_mask_op4, 
round_saeonly_scalar_nimm_predicate): New subst attributes.

gcc/testsuite/
* gcc.target/i386/avx512f-vgetexpsd-1.c (_mm_mask_getexp_sd, 
_mm_maskz_getexp_sd, _mm_mask_getexp_round_sd, 
_mm_maskz_getexp_round_sd): Test new intrinsics.
* gcc.target/i386/avx512f-vgetexpss-1.c (_mm_mask_getexp_ss, 
_mm_maskz_getexp_ss, _mm_mask_getexp_round_ss, 
_mm_maskz_getexp_round_ss): Ditto.
* gcc.target/i386/avx512f-vgetmantsd-1.c (_mm_mask_getmant_sd, 
_mm_maskz_getmant_sd, _mm_mask_getmant_round_sd, 
_mm_maskz_getmant_round_sd): Ditto.
* gcc.target/i386/avx512f-vgetmantss-1.c (_mm_mask_getmant_ss, 
_mm_maskz_getmant_ss, _mm_mask_getmant_round_ss, 
_mm_maskz_getmant_round_ss): Ditto.
* gcc.target/i386/avx512f-vgetexpsd-2.c (_mm_mask_getexp_sd, 
_mm_maskz_getexp_sd, _mm_getexp_round_sd, _mm_mask_getexp_round_sd, 
_mm_maskz_getexp_round_sd): New runtime tests.
* gcc.target/i386/avx512f-vgetexpss-2.c (_mm_mask_getexp_ss, 
_mm_maskz_getexp_ss, _mm_getexp_round_ss, _mm_mask_getexp_round_ss, 
_mm_maskz_getexp_round_ss): Ditto.
* gcc.target/i386/avx512f-vgetmantsd-2.c (_mm_mask_getmant_sd, 
_mm_maskz_getmant_sd, _mm_getmant_round_sd, _mm_mask_getmant_round_sd, 
_mm_maskz_getmant_round_sd): Ditto.
* gcc.target/i386/avx512f-vgetmantss-2.c (_mm_mask_getmant_ss, 
_mm_maskz_getmant_ss, _mm_getmant_round_ss, _mm_mask_getmant_round_ss, 
_mm_maskz_getmant_round_ss): Ditto.
* gcc.target/i386/avx-1.c (__builtin_ia32_getexpsd_mask_round, 
__builtin_ia32_getexpss_mask_round, 
__builtin_ia32_getmantsd_mask_round, 
__builtin_ia32_getmantss_mask_round): Test new builtins.
* gcc.target/i386/sse-13.c : Ditto.
* gcc.target/i386/sse-23.c: Ditto. 
* gcc.target/i386/sse-14.c (_mm_maskz_getexp_round_sd, 
_mm_maskz_getexp_round_ss, _mm_mask_getmant_round_sd, 
_mm_maskz_getmant_round_sd, _mm_mask_getmant_round_ss,
_mm_maskz_getmant_round_ss, _mm_mask_getexp_round_sd, 
_mm_mask_getexp_round_ss): Test new intrinsics.
* gcc.target/i386/testround-1.c: Ditto.
* gcc.target/i386/sse-22.c (_mm_maskz_getmant_round_sd, 
_mm_maskz_getmant_round_ss, _mm_mask_getmant_round_sd, 
_mm_mask_getmant_round_ss): Test new intrinsics 
* gcc.target/i386/testimm-10.c (_mm_mask_getmant_sd, 
_mm_maskz_getmant_sd, _mm_mask_getmant_ss, 
_mm_maskz_getmant_ss): Test new intrinsics.

Is it ok for trunk?

Thanks,
Sebastian


Missing_GETEXP_GETMANT.patch
Description: Missing_GETEXP_GETMANT.patch


Re: [PATCH] [AArch64] Fix PR71112

2017-07-06 Thread Hurugalawadi, Naveen
Hi Ramana,

>> PR71112 is still open - should this be backported to GCC-6 ?

Ported the patch to gcc-6-branch and committed as:-
https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=250014

Bootstrapped and Regression Tested gcc-6-branch for AArch64
on aarch64-thunder-linux.

Thanks,
Naveen


Re: [Arm] Obsoleting Command line option -mstructure-size-boundary in eabi configurations

2017-07-06 Thread Kyrill Tkachov

Hi Michael,

On 06/07/17 06:46, Michael Collison wrote:

NetBSD/Arm requires the structure size boundary to be
DEFAULT_STRUCTURE_SIZE_BOUNDARY (see config/arm/netbsd-elf.h for
details). This patch disallows 
-mstructure-size-boundary on netbsd if the value is not equal to the 
DEFAULT_STRUCTURE_SIZE_BOUNDARY.

Okay for trunk?

2017-07-05  Michael Collison  

* config/arm/arm.c (arm_option_override): Disallow
-mstructure-size-boundary on netbsd if value is not
DEFAULT_STRUCTURE_SIZE_BOUNDARY.


I don't have any experience with this option so I'll let Richard or Ramana 
comment on whether this
is appropriate, but if you go ahead with this you'll also need to update the 
documentation for this
option in doc/invoke.texi.

If this goes in we'll also need an entry for the changes.html page for GCC 8.

Kyrill


RE: Add support for use_hazard_barrier_return function attribute

2017-07-06 Thread Matthew Fortune
Maciej Rozycki  writes:
> On Fri, 23 Jun 2017, Prachi Godbole wrote:
> 
> > Index: gcc/config/mips/mips.md
> > ===
> > --- gcc/config/mips/mips.md (revision 246899)
> > +++ gcc/config/mips/mips.md (working copy)
> > @@ -6578,6 +6581,20 @@
> >[(set_attr "type""jump")
> > (set_attr "mode""none")])
> >
> > +;; Insn to clear execution and instruction hazards while returning.
> > +;; However, it doesn't clear hazards created by the insn in its delay
> slot.
> > +;; Thus, explicitly place a nop in its delay slot.
> > +
> > +(define_insn "mips_hb_return_internal"
> > +  [(return)
> > +   (unspec_volatile [(match_operand 0 "pmode_register_operand" "")]
> > +   UNSPEC_JRHB)]
> > +  ""
> > +  {
> > +return "%(jr.hb\t$31%/%)";
> > +  }
> > +  [(set_attr "insn_count" "2")])
> > +
> >  ;; Normal return.
> >
> >  (define_insn "_internal"
> 
>  Nothing wrong with your proposed change, but overall I wonder if (as a
> follow-up change) we could find a nonintrusive way to have this pattern
> (and `clear_hazard_' as well) produce JRS.HB rather than JR.HB in
> microMIPS compilations, as using the 32-bit delay-slot NOP encoding
> where the 16-bit one would do is obviously a tiny, but completely
> unnecessary code space loss (and we do care about code space losses in
> microMIPS compilations; conserving space is the very purpose of the
> microMIPS ISA after all).
> 
>  Of course it wouldn't do if we rewrote the instruction pattern as
> "%(jr%!.hb\t$31%/%)" here, because the NOP that follows would have to
> come from an RTL instruction for `%!' to have any effect.  But perhaps
> we could emit RTL instead somehow rather than hardcoding the NOP with
> `%/'?

I think this case is so specialist we can safely just switch to writing
out the NOP directly in the output pattern just keeping the %(%) for
noreorder. This code will have to be reworked with microMIPSR6 when
submitted so it can be handled then; good spot to use jrs.hb.

Thanks,
Matthew


RE: Add support for use_hazard_barrier_return function attribute

2017-07-06 Thread Matthew Fortune
Prachi Godbole  writes:
> Please find the updated patch below. I hope I've covered everything.
> I've added the test for inline restriction, could you check if I got all
> the options correct?

I think the test is probably good enough. It is a little too forgiving due
to handling the indirect call case to foo which could just detect an
indirect call from foo to bar (the placement of a scan-assembler in the
.c file has no impact on where in the generated output it will match in
the corresponding .s). Given that the test would fail appropriately on
a bare metal configuration (which is where this is likely to be most
useful) then I think that is sufficient.

Watch out for the long lines in comments. There is one that is hitting
80cols noted below to tweak before committing.

> Changelog:
> 
> 2017-06-23  Prachi Godbole  
> 
> gcc/
>   * config/mips/mips.h (machine_function): New variable
>   use_hazard_barrier_return_p.
>   * config/mips/mips.md (UNSPEC_JRHB): New unspec.
>   (mips_hb_return_internal): New insn pattern.
>   * config/mips/mips.c (mips_attribute_table): Add attribute
>   use_hazard_barrier_return.
>   (mips_use_hazard_barrier_return_p): New static function.
>   (mips_function_attr_inlinable_p): Likewise.
>   (mips_compute_frame_info): Set use_hazard_barrier_return_p.  Emit error
>   for unsupported architecture choice.
>   (mips_function_ok_for_sibcall, mips_can_use_return_insn): Return false
>   for use_hazard_barrier_return.
>   (mips_expand_epilogue): Emit hazard barrier return.
>   * doc/extend.texi: Document use_hazard_barrier_return.
> 
> gcc/testsuite/
>   * gcc.target/mips/hazard-barrier-return-attribute.c: New tests.

OK to commit.

> ===
> --- gcc/config/mips/mips.c(revision 246899)
> +++ gcc/config/mips/mips.c(working copy)
> @@ -7863,6 +7889,17 @@ mips_function_ok_for_sibcall (tree decl, tree exp
>&& !targetm.binds_local_p (decl))
>  return false;
> +  /* Functions that need to return with a hazard barrier cannot sibcall 
> because:

Long line for a comment above.

> +
> + 1) Hazard barriers are not possible for direct jumps
> +
> + 2) Despite an indirect jump with hazard barrier being possible we do
> + not use it so that the logic for generating a hazard barrier jump
> + can be contained within the epilogue handling.  */
> +
> +  if (mips_use_hazard_barrier_return_p (current_function_decl))
> +return false;
> +
>/* Otherwise OK.  */
>return true;
>  }

Thanks for the new feature!

Matthew


Re: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-07-06 Thread Richard Earnshaw (lists)
On 06/07/17 08:29, Michael Collison wrote:
> Richard,
> 
> Can you explain "Use of ne is wrong here.  The condition register should
> be set to the result of a compare rtl construct.  The same applies
> elsewhere within this patch.  NE is then used on the result of the
> comparison.  The mode of the compare then indicates what might or might
> not be valid in the way the comparison is finally constructed."?
> 
> Why is "ne" wrong? I don't doubt you are correct, but I see nothing in
> the internals manual that forbids it. I want to understand what issues
> this exposes.
> 

Because the idiomatic form on a machine with a flags register is

CCreg:mode = COMPARE:mode (A, B)

which is then used with

 cond-op (CCreg:mode, 0)

where cond-op is NE, EQ, GE, ... as appropriate.


> As you indicate I used this idiom in the arm port when I added the
> overflow operations there as well. Additionally other targets seem to
> use the comparison operators this way (i386 for the umulv).

Some targets really have boolean predicate operations that set results
explicitly in GP registers as the truth of A < B, etc.  On those
machines using

 pred-reg = cond-op (A, B)

makes sense, but not on ARM or AArch64.
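
To make the two-step idiom concrete, here is a small model in plain C++. It is a sketch under assumptions, not RTL and not backend code: the flags struct and the compare and cond_* functions are hypothetical names. The compare step writes a flags value, and each condition is then a predicate on those flags alone, never directly on A and B.

```cpp
#include <cstdint>

// Hypothetical model of a four-bit condition-flags register (N, Z, C, V).
struct flags
{
  bool n, z, c, v;
};

// Step 1: the compare sets the flags register from A - B.
flags
compare (std::int32_t a, std::int32_t b)
{
  const std::uint32_t ua = static_cast<std::uint32_t> (a);
  const std::uint32_t ub = static_cast<std::uint32_t> (b);
  const std::uint32_t r = ua - ub;            // wraps modulo 2^32
  flags f;
  f.n = (r >> 31) != 0;                       // result is negative
  f.z = r == 0;                               // result is zero
  f.c = ua >= ub;                             // no unsigned borrow
  f.v = (((ua ^ ub) & (ua ^ r)) >> 31) != 0;  // signed overflow
  return f;
}

// Step 2: a condition is a predicate on the flags alone.
bool cond_ne (flags f) { return !f.z; }
bool cond_ge (flags f) { return f.n == f.v; }
bool cond_lt (flags f) { return f.n != f.v; }
```

In this model cond_lt (compare (a, b)) plays the role of LT applied to (CCreg, 0): the predicate never sees the original operands, only what the compare left in the flags.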

R.

> 
> Regards,
> 
> Michael Collison
> 
> -Original Message-
> From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com]
> Sent: Wednesday, July 5, 2017 2:38 AM
> To: Michael Collison ; Christophe Lyon
> 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub
> operations
> 
> On 19/05/17 22:11, Michael Collison wrote:
>> Christophe,
>> 
>> I had a typo in the two test cases: "addcs" should have been "adcs". I 
>> caught this previously but submitted the previous patch incorrectly. Updated 
>> patch attached.
>> 
>> Okay for trunk?
>> 
> 
> Apologies for the delay responding, I've been procrastinating over this
> one.   In part it's due to the size of the patch with very little
> top-level description of what's the motivation and overall approach to
> the problem.
> 
> It would really help review if this could be split into multiple patches
> with a description of what each stage achieves.
> 
> Anyway, there are a couple of obvious formatting issues to deal with
> first, before we get into the details of the patch.
> 
>> -Original Message-
>> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
>> Sent: Friday, May 19, 2017 3:59 AM
>> To: Michael Collison 
>> Cc: gcc-patches@gcc.gnu.org; nd 
>> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
>> operations
>> 
>> Hi Michael,
>> 
>> 
>> On 19 May 2017 at 07:12, Michael Collison  wrote:
>>> Hi,
>>>
>>> This patch improves code generation for builtin arithmetic overflow 
>>> operations for the aarch64 backend. As an example, for a simple test case 
>>> such as:
>>>
>>> int
>>> f (int x, int y, int *ovf)
>>> {
>>>   int res;
>>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>>   return res;
>>> }
>>>
>>> Current trunk at -O2 generates
>>>
>>> f:
>>> mov w3, w0
>>> mov w4, 0
>>> add w0, w0, w1
>>> tbnz w1, #31, .L4
>>> cmp w0, w3
>>> blt .L3
>>> .L2:
>>> str w4, [x2]
>>> ret
>>> .p2align 3
>>> .L4:
>>> cmp w0, w3
>>> ble .L2
>>> .L3:
>>> mov w4, 1
>>> b   .L2
>>>
>>>
>>> With the patch this now generates:
>>>
>>> f:
>>> adds w0, w0, w1
>>> cset w1, vs
>>> str w1, [x2]
>>> ret
>>>
>>>
>>> Original patch from Richard Henderson:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>>
>>>
>>> Okay for trunk?
>>>
>>> 2017-05-17  Michael Collison  
>>> Richard Henderson 
>>>
>>> * config/aarch64/aarch64-modes.def (CC_V): New.
>>> * config/aarch64/aarch64-protos.h
>>> (aarch64_add_128bit_scratch_regs): Declare
>>> (aarch64_add_128bit_scratch_regs): Declare.
>>> (aarch64_expand_subvti): Declare.
>>> (aarch64_gen_unlikely_cbranch): Declare
>>> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
>>> for signed overflow using CC_Vmode.
>>> (aarch64_get_condition_code_1): Handle CC_Vmode.
>>> (aarch64_gen_unlikely_cbranch): New function.
>>> (aarch64_add_128bit_scratch_regs): New function.
>>> (aarch64_subv_128bit_scratch_regs): New function.
>>> (aarch64_expand_subvti): New function.
>>> * config/aarch64/aarch64.md (addv4, uaddv4): New.
>>> (addti3): Create simpler code if low part is already known to be 0.
>>> (addvti4, uaddvti4): New.
>>> (*add3_compareC_cconly_imm): New.
>>> (*add3_compareC_cconly): New.
>>> (*add3_compareC_imm): New.
>>> (*add3_compareC): Rename from add3_compare1; do not
>>> handle constants within this pattern.
>>> (*add3_compareV_cconly_imm)

RE: [PATCH][Aarch64] Add support for overflow add and sub operations

2017-07-06 Thread Michael Collison
Richard,

Can you explain "Use of ne is wrong here.  The condition register should be set 
to the result of a compare rtl construct.  The same applies elsewhere within 
this patch.  NE is then used on the result of the comparison.  The mode of the 
compare then indicates what might or might not be valid in the way the 
comparison is finally constructed."?

Why is "ne" wrong? I don't doubt you are correct, but I see nothing in the 
internals manual that forbids it. I want to understand what issues this exposes.

As you indicate I used this idiom in the arm port when I added the overflow 
operations there as well. Additionally other targets seem to use the comparison 
operators this way (i386 for the umulv).

Regards,

Michael Collison

-Original Message-
From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com] 
Sent: Wednesday, July 5, 2017 2:38 AM
To: Michael Collison ; Christophe Lyon 

Cc: gcc-patches@gcc.gnu.org; nd 
Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub operations

On 19/05/17 22:11, Michael Collison wrote:
> Christophe,
> 
> I had a typo in the two test cases: "addcs" should have been "adcs". I caught 
> this previously but submitted the previous patch incorrectly. Updated patch 
> attached.
> 
> Okay for trunk?
> 

Apologies for the delay responding, I've been procrastinating over this
one.   In part it's due to the size of the patch with very little
top-level description of what's the motivation and overall approach to the 
problem.

It would really help review if this could be split into multiple patches with a 
description of what each stage achieves.

Anyway, there are a couple of obvious formatting issues to deal with first, 
before we get into the details of the patch.

> -Original Message-
> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
> Sent: Friday, May 19, 2017 3:59 AM
> To: Michael Collison 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [PATCH][Aarch64] Add support for overflow add and sub 
> operations
> 
> Hi Michael,
> 
> 
> On 19 May 2017 at 07:12, Michael Collison  wrote:
>> Hi,
>>
>> This patch improves code generation for builtin arithmetic overflow 
>> operations for the aarch64 backend. As an example, for a simple test case 
>> such as:
>>
>> int
>> f (int x, int y, int *ovf)
>> {
>>   int res;
>>   *ovf = __builtin_sadd_overflow (x, y, &res);
>>   return res;
>> }
>>
>> Current trunk at -O2 generates
>>
>> f:
>> mov w3, w0
>> mov w4, 0
>> add w0, w0, w1
>> tbnz w1, #31, .L4
>> cmp w0, w3
>> blt .L3
>> .L2:
>> str w4, [x2]
>> ret
>> .p2align 3
>> .L4:
>> cmp w0, w3
>> ble .L2
>> .L3:
>> mov w4, 1
>> b   .L2
>>
>>
>> With the patch this now generates:
>>
>> f:
>> adds w0, w0, w1
>> cset w1, vs
>> str w1, [x2]
>> ret
>>
>>
>> Original patch from Richard Henderson:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01903.html
>>
>>
>> Okay for trunk?
>>
>> 2017-05-17  Michael Collison  
>> Richard Henderson 
>>
>> * config/aarch64/aarch64-modes.def (CC_V): New.
>> * config/aarch64/aarch64-protos.h
>> (aarch64_add_128bit_scratch_regs): Declare
>> (aarch64_add_128bit_scratch_regs): Declare.
>> (aarch64_expand_subvti): Declare.
>> (aarch64_gen_unlikely_cbranch): Declare
>> * config/aarch64/aarch64.c (aarch64_select_cc_mode): Test
>> for signed overflow using CC_Vmode.
>> (aarch64_get_condition_code_1): Handle CC_Vmode.
>> (aarch64_gen_unlikely_cbranch): New function.
>> (aarch64_add_128bit_scratch_regs): New function.
>> (aarch64_subv_128bit_scratch_regs): New function.
>> (aarch64_expand_subvti): New function.
>> * config/aarch64/aarch64.md (addv4, uaddv4): New.
>> (addti3): Create simpler code if low part is already known to be 0.
>> (addvti4, uaddvti4): New.
>> (*add3_compareC_cconly_imm): New.
>> (*add3_compareC_cconly): New.
>> (*add3_compareC_imm): New.
>> (*add3_compareC): Rename from add3_compare1; do not
>> handle constants within this pattern.
>> (*add3_compareV_cconly_imm): New.
>> (*add3_compareV_cconly): New.
>> (*add3_compareV_imm): New.
>> (add3_compareV): New.
>> (add3_carryinC, add3_carryinV): New.
>> (*add3_carryinC_zero, *add3_carryinV_zero): New.
>> (*add3_carryinC, *add3_carryinV): New.
>> (subv4, usubv4): New.
>> (subti): Handle op1 zero.
>> (subvti4, usub4ti4): New.
>> (*sub3_compare1_imm): New.
>> (sub3_carryinCV): New.
>> (*sub3_carryinCV_z1_z2, *sub3_carryinCV_z1): New.
>> (*sub3_carryinCV_z2, *sub3_carryinCV): New.
>> * testsuite/gcc.target/arm/builtin_sadd_128.c: New 

Re: [gomp4, nvptx, committed] Fix assert in nvptx_propagate_unified

2017-07-06 Thread Thomas Schwinge
Hi Tom!

On Fri, 30 Jun 2017 17:15:24 +0200, Tom de Vries  wrote:
> with the openacc test-case in attached patch, I ran into an assert here:

Using your test case, in my build with
"--enable-checking=yes,df,fold,rtl", I already earlier run into an ICE...

> static void
> nvptx_propagate_unified (rtx_insn *unified)
> {
> rtx_insn *probe = unified;
> rtx cond_reg = SET_DEST (PATTERN (unified));
> rtx pat;
> 
> /* Find the comparison.  (We could skip this and simply scan to he
>blocks' terminating branch, if we didn't care for self
>checking.)  */
> for (;;)
>   {
> probe = NEXT_INSN (probe);
> pat = PATTERN (probe);

... here:


[...]/libgomp/testsuite/libgomp.oacc-c/../libgomp.oacc-c-c++-common/reduction-cplx-flt-2.c:19:9:
 internal compiler error: RTL check: expected elt 3 type 'e' or 'u', have '0' (rtx note) in PATTERN, at rtl.h:1440

Breakpoint 2, internal_error (gmsgid=0x10cc840 "RTL check: expected elt %d 
type '%c' or '%c', have '%c' (rtx %s) in %s, at %s:%d") at 
[...]/gcc/diagnostic.c:1251
1251  {
(gdb) bt
#0  internal_error (gmsgid=0x10cc840 "RTL check: expected elt %d type '%c' 
or '%c', have '%c' (rtx %s) in %s, at %s:%d") at [...]/gcc/diagnostic.c:1251
#1  0x009bd2c7 in rtl_check_failed_type2 (r=0x7688cd40, 
n=, c1=, c2=, file=, line=, func=0x106ac48 
<_ZZ7PATTERNP7rtx_defE12__FUNCTION__> "PATTERN") at [...]/gcc/rtl.c:802
#2  0x00529ef3 in PATTERN (insn=) at 
[...]/gcc/rtl.h:1440
#3  0x005e5a2b in PATTERN (insn=) at 
[...]/gcc/rtl.h:1440
#4  0x00d08b96 in nvptx_propagate_unified (unified=0x7688ccc0) 
at [...]/gcc/config/nvptx/nvptx.c:2299
#5  0x00d093e7 in nvptx_split_blocks (map=map@entry=0x7fffcc40) 
at [...]/gcc/config/nvptx/nvptx.c:2428
#6  0x00d0d08b in nvptx_reorg () at 
[...]/gcc/config/nvptx/nvptx.c:3840
#7  0x009bb0ea in (anonymous 
namespace)::pass_machine_reorg::execute (this=) at 
[...]/gcc/reorg.c:3952
[...]
(gdb) frame 4
#4  0x00d08b96 in nvptx_propagate_unified (unified=0x7688ccc0) 
at [...]/gcc/config/nvptx/nvptx.c:2299
2299  pat = PATTERN (probe);
(gdb) print probe
$1 = (rtx_insn *) 0x7688cd40
(gdb) call debug_rtx(probe)
(note 56 54 57 3 NOTE_INSN_DELETED)
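
To illustrate why the RTL check fires on the note, here is a minimal sketch in plain C.  It is a toy model, not GCC's rtl.h: the struct, the function names and the two format strings are assumptions chosen only to match the diagnostic above (slot 3 is 'e' for an insn and '0' for a note).  With RTL checking enabled, an accessor like PATTERN refuses to read slot 3 unless that slot is marked 'e' or 'u'.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy rtx: a code name plus a format string with one character per
   operand slot.  Hypothetical; GCC's real rtx_def is far richer.  */
struct toy_rtx
{
  const char *name;
  const char *format;
};

/* Return 1 if slot N of X may be read as an expression ('e') or as an
   insn reference ('u'), 0 otherwise (including out-of-range slots).  */
static int
toy_slot_is_e_or_u (const struct toy_rtx *x, size_t n)
{
  assert (x != NULL && x->format != NULL);
  if (n >= strlen (x->format))
    return 0;
  return x->format[n] == 'e' || x->format[n] == 'u';
}

/* Toy counterpart of the check behind PATTERN: the pattern lives in
   slot 3, so that slot must be 'e' or 'u'.  Where this returns 0, a
   checking build would report an internal compiler error.  */
static int
toy_pattern_ok (const struct toy_rtx *x)
{
  return toy_slot_is_e_or_u (x, 3);
}
```

In this model, an object with format "uuBeiie" passes the slot-3 check while one with format "uuB0ni" fails it, which is the shape of the "expected elt 3 type 'e' or 'u', have '0'" message for the note.  A build without RTL checking would skip this test, so the note would only be noticed later, at the gcc_assert.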

> 
>       if (GET_CODE (pat) == SET
>           && GET_RTX_CLASS (GET_CODE (SET_SRC (pat))) == RTX_COMPARE
>           && XEXP (SET_SRC (pat), 0) == cond_reg)
>         break;
>       gcc_assert (NONJUMP_INSN_P (probe));
>     }
> ...
> 
> The assert happens when processing insn 56:
> ...
> (insn 54 53 56 3 (set (reg:SI 47 [ _71 ])
>(unspec:SI [
>(reg:SI 36 [ _58 ])
>] UNSPEC_BR_UNIFIED)) 108 {cond_uni}
> (nil))
> (note 56 54 57 3 NOTE_INSN_DELETED)
> (insn 57 56 58 3 (set (reg:BI 68)
>(gt:BI (reg:SI 47 [ _71 ])
>(const_int 1 [0x1]))) 99 {*cmpsi}
> (expr_list:REG_DEAD (reg:SI 47 [ _71 ])
>(nil)))
> ...
> The insn 56 was originally a '(set (reg x) (const_int 1))', but that one 
> has been combined into insn 57 and replaced with a 'NOTE_INSN_DELETED'. 
> So it seems reasonable for the loop to skip over this note.
> 
> Fixed by making the assert condition less strict.
> 
> Built on x86_64 with nvptx accelerator.
> 
> Tested test-case included in the patch.
> 
> Committed as trivial.

> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -2300,7 +2300,7 @@ nvptx_propagate_unified (rtx_insn *unified)
> && GET_RTX_CLASS (GET_CODE (SET_SRC (pat))) == RTX_COMPARE
> && XEXP (SET_SRC (pat), 0) == cond_reg)
>   break;
> -  gcc_assert (NONJUMP_INSN_P (probe));
> +  gcc_assert (NONJUMP_INSN_P (probe) || !INSN_P (probe));
>  }
>rtx pred_reg = SET_DEST (pat);

These problems (both yours and mine) do not reproduce on trunk, right?
But I suppose they are still latent there, just waiting for a different test
case?  Maybe this is a case for writing an RTL-level test case?  (Unless the
fix is deemed too trivial to warrant spending time on this.)

Anyway, I don't know a lot about RTL, but the following patch does cure
this test case (now running other testing).  Would you please check that,
and also whether nvptx_propagate_unified then still works as expected?
Is this patch OK (both for gomp-4_0-branch, and also for trunk?), or
should this rather use something like:

-if (!INSN_P (probe))
+if (NOTE_P (probe) && NOTE_KIND (probe) == NOTE_INSN_DELETED)
   continue;

..., or something yet different?
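
To make the difference between the two candidate conditions concrete, here is a small self-contained sketch in plain C.  It is a toy model and not the nvptx.c code: the enum, the struct and all function names are hypothetical stand-ins, and "any non-insn" is simplified to "any note".  The scan skips entries before looking at their pattern, and reports failure (NULL) where the real loop would hit its gcc_assert.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for insn kinds; these are NOT GCC's rtx types.  */
enum toy_kind { TOY_INSN, TOY_NOTE_DELETED, TOY_NOTE_OTHER, TOY_JUMP };

struct toy_insn
{
  enum toy_kind kind;
  int is_compare_of_cond;	/* Only meaningful for TOY_INSN.  */
  struct toy_insn *next;
};

/* Variant A, modelled on "!INSN_P (probe)": skip every note.  */
static int
skip_any_non_insn (const struct toy_insn *p)
{
  return p->kind == TOY_NOTE_DELETED || p->kind == TOY_NOTE_OTHER;
}

/* Variant B, modelled on the NOTE_INSN_DELETED test: skip only
   notes left behind by deleted insns.  */
static int
skip_deleted_note_only (const struct toy_insn *p)
{
  return p->kind == TOY_NOTE_DELETED;
}

/* Walk forward from START to the comparison of the condition register.
   Returns NULL where the real loop would trip its assert.  */
static const struct toy_insn *
find_compare (const struct toy_insn *start,
	      int (*skip) (const struct toy_insn *))
{
  assert (start != NULL && skip != NULL);
  for (const struct toy_insn *probe = start->next; probe; probe = probe->next)
    {
      if (skip (probe))
	continue;		/* Never inspect the pattern of a note.  */
      if (probe->kind != TOY_INSN)
	return NULL;		/* Jump, or a note this variant keeps.  */
      if (probe->is_compare_of_cond)
	return probe;
    }
  return NULL;
}
```

For a chain shaped like insns 54, 56 and 57 above (plain insn, deleted note, comparison), both variants find the comparison.  They differ only if some other kind of note sits in between: variant A skips it, variant B stops.  Which behaviour is wanted depends on whether other notes can legitimately appear there, which this sketch does not answer.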

--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -2286,7 +2286,7 @@ nvptx_propagate_unified (rtx_insn *unified)
 {
   rtx_insn *probe = unified;
   rtx cond_reg = SET_DEST (PATTERN (unified));
-  rtx pat;
+  rtx pat = NULL_RTX;
 
   /* Find the comparison.  (We could skip this and simply scan to he
  blocks' terminating branch, if we didn'