Re: [PATCH] Add -static-pie to GCC driver to create static PIE

2017-08-08 Thread Richard Biener
On August 9, 2017 12:18:41 AM GMT+02:00, "H.J. Lu"  wrote:
>This patch adds -static-pie to GCC driver to create static PIE.  A
>static
>position independent executable (PIE) is similar to static executable,
>but can be loaded at any address without a dynamic linker.  All linker
>input files must be compiled with -fpie or -fPIE and linker must
>support
>--no-dynamic-linker to avoid linking with dynamic linker.  "-z text" is
>also needed to prevent dynamic relocations in read-only segments.
>
>OK for trunk?

What's wrong with -static -pie?

>H.J.
>---
>   PR driver/81498
>   * common.opt (-static-pie): New alias.
>   (shared): Negate static-pie.
>   (static-pie): New option.
>   * config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Add
>   -static-pie support.
>   (GNU_USER_TARGET_ENDFILE_SPEC): Likewise.
>   (LINK_EH_SPEC): Likewise.
>   (LINK_GCC_C_SEQUENCE_SPEC): Likewise.
>   * config/i386/gnu-user.h (GNU_USER_TARGET_LINK_SPEC): Likewise.
>   * config/i386/gnu-user64.h (GNU_USER_TARGET_LINK_SPEC): Likewise.
>   * gcc.c (LINK_COMMAND_SPEC): Likewise.
>   (init_gcc_specs): Likewise.
>   (init_spec): Likewise.
>   * doc/invoke.texi: Document -static-pie.
>---
> gcc/common.opt   |  9 -
> gcc/config/gnu-user.h| 15 ---
> gcc/config/i386/gnu-user.h   |  7 ---
> gcc/config/i386/gnu-user64.h | 11 ++-
> gcc/doc/invoke.texi  | 11 ++-
> gcc/gcc.c| 17 ++---
> 6 files changed, 46 insertions(+), 24 deletions(-)
>
>diff --git a/gcc/common.opt b/gcc/common.opt
>index 1cb1c83d306..246566168cc 100644
>--- a/gcc/common.opt
>+++ b/gcc/common.opt
>@@ -353,6 +353,9 @@ Common Alias(pedantic-errors)
> -pie
> Driver Alias(pie)
> 
>+-static-pie
>+Driver Alias(static-pie)
>+
> -pipe
> Driver Alias(pipe)
> 
>@@ -3062,7 +3065,7 @@ x
> Driver Joined Separate
> 
> shared
>-Driver RejectNegative Negative(pie)
>+Driver RejectNegative Negative(static-pie)
> Create a shared library.
> 
> shared-libgcc
>@@ -3114,6 +3117,10 @@ pie
> Driver RejectNegative Negative(no-pie)
> Create a position independent executable.
> 
>+static-pie
>+Driver RejectNegative Negative(pie)
>+Create a static position independent executable.
>+
> z
> Driver Joined Separate
> 
>diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
>index de605b0c466..a967b69a350 100644
>--- a/gcc/config/gnu-user.h
>+++ b/gcc/config/gnu-user.h
>@@ -53,11 +53,11 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
>   "%{shared:; \
>  pg|p|profile:gcrt1.o%s; \
>  static:crt1.o%s; \
>- " PIE_SPEC ":Scrt1.o%s; \
>+ static-pie|" PIE_SPEC ":Scrt1.o%s; \
>  :crt1.o%s} \
>crti.o%s \
>%{static:crtbeginT.o%s; \
>- shared|" PIE_SPEC ":crtbeginS.o%s; \
>+ shared|static-pie|" PIE_SPEC ":crtbeginS.o%s; \
>  :crtbegin.o%s} \
>%{fvtable-verify=none:%s; \
>  fvtable-verify=preinit:vtv_start_preinit.o%s; \
>@@ -70,7 +70,7 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
>  :crt1.o%s} \
>crti.o%s \
>%{static:crtbeginT.o%s; \
>- shared|pie:crtbeginS.o%s; \
>+ shared|pie|static-pie:crtbeginS.o%s; \
>  :crtbegin.o%s} \
>%{fvtable-verify=none:%s; \
>  fvtable-verify=preinit:vtv_start_preinit.o%s; \
>@@ -92,7 +92,7 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
>  fvtable-verify=preinit:vtv_end_preinit.o%s; \
>  fvtable-verify=std:vtv_end.o%s} \
>%{static:crtend.o%s; \
>- shared|" PIE_SPEC ":crtendS.o%s; \
>+ shared|static-pie|" PIE_SPEC ":crtendS.o%s; \
>  :crtend.o%s} \
>crtn.o%s \
>" CRTOFFLOADEND
>@@ -102,7 +102,7 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
>  fvtable-verify=preinit:vtv_end_preinit.o%s; \
>  fvtable-verify=std:vtv_end.o%s} \
>%{static:crtend.o%s; \
>- shared|pie:crtendS.o%s; \
>+ shared|pie|static-pie:crtendS.o%s; \
>  :crtend.o%s} \
>crtn.o%s \
>" CRTOFFLOADEND
>@@ -132,12 +132,13 @@ see the files COPYING3 and COPYING.RUNTIME
>respectively.  If not, see
> #define LIB_SPEC GNU_USER_TARGET_LIB_SPEC
> 
> #if defined(HAVE_LD_EH_FRAME_HDR)
>-#define LINK_EH_SPEC "%{!static:--eh-frame-hdr} "
>+#define LINK_EH_SPEC "%{!static|static-pie:--eh-frame-hdr} "
> #endif
> 
> #undef LINK_GCC_C_SEQUENCE_SPEC
> #define LINK_GCC_C_SEQUENCE_SPEC \
>-  "%{static:--start-group} %G %L %{static:--end-group}%{!static:%G}"
>+  "%{static|static-pie:--start-group} %G %L \
>+   %{static|static-pie:--end-group}%{!static:%{!static-pie:%G}}"
> 
> /* Use --as-needed -lgcc_s for eh support.  */
> #ifdef HAVE_LD_AS_NEEDED
>diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
>index a4c88f1a848..8983dc9ecd7 100644
>--- a/gcc/config/i386/gnu-user.h
>+++ b/gcc/config/i386/gnu-user.h
>@@ -77,9 +77,10 @@ along with GCC; see the file COPYING3.  If not see
>#define 

RE: [PATCH][Arm] Test suite failures resulting from deprecation of -mstructure-size-boundary

2017-08-08 Thread Michael Collison
Because the comment (for example) in g+=.dg/ext/packed8.C says

 // NOTE: This test assumes packed structure layout differs from unpacked
//   structure layout.  This isn't true, e.g., with the default
//   arm-none-elf options.

If we could just delete the -mstructure-size-boundary=8 option why was it added 
in the first place?

-Original Message-
From: Richard Earnshaw (lists) [mailto:richard.earns...@arm.com] 
Sent: Monday, August 7, 2017 5:32 AM
To: Michael Collison ; gcc-patches@gcc.gnu.org
Cc: nd 
Subject: Re: [PATCH][Arm] Test suite failures resulting from deprecation of 
-mstructure-size-boundary

On 06/08/17 00:25, Michael Collison wrote:
> This patch fixes test case failures on arm targets due to 
> '-mstructure-size-boundary' being deprecated. The test cases were failing 
> because a warning was being issued and due to the fact that the size of 
> packed and unpacked structures is the same after deprecating 
> '-mstructure-size-boundary'
> 
> Okay for trunk?
> 
> 2017-08-04  Michael Collison  
> 
>   * testsuite/g++.dg/ext/packed8.C: Skip test for arm_eabi.
>   * testsuite/g++.dg/init/array16.C: Skip test for arm_eabi.
>   * testsuite/g++.dg/other/crash-4.C: Skip test for arm_eabi.
>   * testsuite/gcc.dg/builtin-stringop-chk-1.c: Skip test for arm_eabi.
> 

Why would we want to skip the test?  If you delete the 
-mstructure-size-boundary option then the test should execute correctly on all 
arm-eabi based platforms and most other ARM platforms where 8 was the default 
anyway.

R.

> 
> pr7519v1.patch
> 
> 
> diff --git a/gcc/testsuite/g++.dg/ext/packed8.C 
> b/gcc/testsuite/g++.dg/ext/packed8.C
> index 91ee8b3..4f38670 100644
> --- a/gcc/testsuite/g++.dg/ext/packed8.C
> +++ b/gcc/testsuite/g++.dg/ext/packed8.C
> @@ -2,7 +2,7 @@
>  // NOTE: This test assumes packed structure layout differs from unpacked
>  //   structure layout.  This isn't true, e.g., with the default
>  //   arm-none-elf options.
> -// { dg-options "-mstructure-size-boundary=8" { target arm*-*-* } }
> +// { dg-skip-if "packed structure layout does not differ from 
> +unpacked layout" { { arm*-*-* } && { arm_eabi } } }
>  
>  class A
>  {
> diff --git a/gcc/testsuite/g++.dg/init/array16.C 
> b/gcc/testsuite/g++.dg/init/array16.C
> index 188d1a8..3334e25 100644
> --- a/gcc/testsuite/g++.dg/init/array16.C
> +++ b/gcc/testsuite/g++.dg/init/array16.C
> @@ -1,7 +1,7 @@
>  // Causes timeout for the MMIX simulator on a 3GHz P4 and we can't  
> // have "compile" for some targets and "run" for others.
>  // { dg-do run { target { ! mmix-*-* } } } -// { dg-options 
> "-mstructure-size-boundary=8" { target arm*-*-* } }
> +// { dg-skip-if "packed structure layout does not differ from 
> +unpacked layout" { { arm*-*-* } && { arm_eabi } } }
>  
>  // Copyright (C) 2004 Free Software Foundation, Inc.
>  // Contributed by Nathan Sidwell 8 Dec 2004  
> diff --git a/gcc/testsuite/g++.dg/other/crash-4.C 
> b/gcc/testsuite/g++.dg/other/crash-4.C
> index a77fe05..8530f44 100644
> --- a/gcc/testsuite/g++.dg/other/crash-4.C
> +++ b/gcc/testsuite/g++.dg/other/crash-4.C
> @@ -7,7 +7,7 @@
>  // NOTE: This test assumes packed structure layout differs from unpacked
>  //   structure layout.  This isn't true, e.g., with the default
>  //   arm-none-elf options.
> -// { dg-options "-mstructure-size-boundary=8" { target arm*-*-* } }
> +// { dg-skip-if "packed structure layout does not differ" { { 
> +arm*-*-* } && { arm_eabi } } }
>  
>  struct a
>  {
> diff --git a/gcc/testsuite/gcc.dg/builtin-stringop-chk-1.c 
> b/gcc/testsuite/gcc.dg/builtin-stringop-chk-1.c
> index e265578..d839097 100644
> --- a/gcc/testsuite/gcc.dg/builtin-stringop-chk-1.c
> +++ b/gcc/testsuite/gcc.dg/builtin-stringop-chk-1.c
> @@ -2,7 +2,7 @@
> are emitted properly.  */
>  /* { dg-do compile } */
>  /* { dg-options "-O2 -Wno-format -std=gnu99 
> -ftrack-macro-expansion=0" } */
> -/* { dg-additional-options "-mstructure-size-boundary=8" { target 
> arm*-*-* } } */
> +// { dg-skip-if "packed structure layout does not differ from 
> +unpacked layout" { { arm*-*-* } && { arm_eabi } } }
>  // { dg-skip-if "packed attribute missing for t" { "epiphany-*-*" } }
>  
>  extern void abort (void);
> 



Re: [PATCH, rs6000] testcase coverage for vec_pack builtin

2017-08-08 Thread Segher Boessenkool
On Tue, Aug 08, 2017 at 04:24:26PM -0500, Will Schmidt wrote:
>   * fold-vec-pack-double.c: New.
>   * fold-vec-pack-int.c: New.
>   * fold-vec-pack-longlong.c: New.
>   * fold-vec-pack-short.c: New.

Okay for trunk (with the changelog fixed).  Thanks,


Segher


Re: [PATCH, rs6000] Add testcase coverage for vec_msum built-in

2017-08-08 Thread Segher Boessenkool
Hi,

On Tue, Aug 08, 2017 at 04:22:40PM -0500, Will Schmidt wrote:
>   * fold-vec-msum-char.c: New.
>   * fold-vec-msum-short.c: New.

Okay for trunk with the paths in the changelog fixed.  Thanks!


Segher


Re: [PATCH, rs6000] testcase coverage for vec_madd built-in functions

2017-08-08 Thread Segher Boessenkool
Hi!

On Tue, Aug 08, 2017 at 04:20:14PM -0500, Will Schmidt wrote:
>   * fold-vec-madd-double.c: New.
>   * fold-vec-madd-float.c: New.
>   * fold-vec-madd-short.c: New.

Same as before, it needs the path in here (the path is relative to the
directory the changelog is in).

The actual tests look fine, please commit.  Thanks!


Segher


Re: [PATCH, rs6000] testcase coverage for vec_cntlz

2017-08-08 Thread Segher Boessenkool
Hi Will,

>   * fold-vec-cntlz-int.c: New.

* gcc.target/powerpc/fold-vec-cntlz-int.c: New.

Okay for trunk with the changelog fixed like that.  Thanks,


Segher


Re: [PATCH, rs6000] enable early debug and disable switch for gimple folding

2017-08-08 Thread Segher Boessenkool
Hi!

On Tue, Aug 08, 2017 at 04:14:56PM -0500, Will Schmidt wrote:
>   * config/rs6000/rs6000.c: rs6000_option_override_internal() Add blurb
>   to indicate when early gimple folding has been disabled.

* config/rs6000/rs6000.c (rs6000_option_override_internal): ...


>   rs6000_gimple_fold_builtin(): Add debug content.

(rs6000_gimple_fold_builtin): ...

> @@ -16157,10 +16161,26 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator 
> *gsi)
>gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == 
> BUILT_IN_MD);
>enum rs6000_builtins fn_code
>  = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
>tree arg0, arg1, lhs;
>  
> +  size_t uns_fncode = (size_t)fn_code;

Space after cast.  More of this in the rest of the patch.

> +  if (TARGET_DEBUG_BUILTIN)
> +  {
> +  fprintf (stderr, "rs6000_gimple_fold_builtin %d %s %s \n",
> +fn_code,fn_name1,fn_name2);

No space before \n, space after commas.

> +  if (rs6000_gimple_folding_disable)
> + return false;

One space indented too many there.

> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index e94aa07..4372b00 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -146,10 +146,14 @@ Generate AltiVec instructions using little-endian 
> element order.
>  
>  maltivec=be
>  Target Report RejectNegative Var(rs6000_altivec_element_order, 2)
>  Generate AltiVec instructions using big-endian element order.
>  
> +mgimple-folding=off
> +Target Report RejectNegative Var(rs6000_gimple_folding_disable, 1)
> +Disable early gimple folding of builtins.

Please use -mgimple-folding instead, or better, -mfold-gimple, along
with its no- variant?  So no RejectNegative.

It's probably easier to read if the internal var is the positive
version, too?


Segher


[PATCH] Add -static-pie to GCC driver to create static PIE

2017-08-08 Thread H.J. Lu
This patch adds -static-pie to GCC driver to create static PIE.  A static
position independent executable (PIE) is similar to static executable,
but can be loaded at any address without a dynamic linker.  All linker
input files must be compiled with -fpie or -fPIE and linker must support
--no-dynamic-linker to avoid linking with dynamic linker.  "-z text" is
also needed to prevent dynamic relocations in read-only segments.

OK for trunk?

H.J.
---
PR driver/81498
* common.opt (-static-pie): New alias.
(shared): Negate static-pie.
(static-pie): New option.
* config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Add
-static-pie support.
(GNU_USER_TARGET_ENDFILE_SPEC): Likewise.
(LINK_EH_SPEC): Likewise.
(LINK_GCC_C_SEQUENCE_SPEC): Likewise.
* config/i386/gnu-user.h (GNU_USER_TARGET_LINK_SPEC): Likewise.
* config/i386/gnu-user64.h (GNU_USER_TARGET_LINK_SPEC): Likewise.
* gcc.c (LINK_COMMAND_SPEC): Likewise.
(init_gcc_specs): Likewise.
(init_spec): Likewise.
* doc/invoke.texi: Document -static-pie.
---
 gcc/common.opt   |  9 -
 gcc/config/gnu-user.h| 15 ---
 gcc/config/i386/gnu-user.h   |  7 ---
 gcc/config/i386/gnu-user64.h | 11 ++-
 gcc/doc/invoke.texi  | 11 ++-
 gcc/gcc.c| 17 ++---
 6 files changed, 46 insertions(+), 24 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 1cb1c83d306..246566168cc 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -353,6 +353,9 @@ Common Alias(pedantic-errors)
 -pie
 Driver Alias(pie)
 
+-static-pie
+Driver Alias(static-pie)
+
 -pipe
 Driver Alias(pipe)
 
@@ -3062,7 +3065,7 @@ x
 Driver Joined Separate
 
 shared
-Driver RejectNegative Negative(pie)
+Driver RejectNegative Negative(static-pie)
 Create a shared library.
 
 shared-libgcc
@@ -3114,6 +3117,10 @@ pie
 Driver RejectNegative Negative(no-pie)
 Create a position independent executable.
 
+static-pie
+Driver RejectNegative Negative(pie)
+Create a static position independent executable.
+
 z
 Driver Joined Separate
 
diff --git a/gcc/config/gnu-user.h b/gcc/config/gnu-user.h
index de605b0c466..a967b69a350 100644
--- a/gcc/config/gnu-user.h
+++ b/gcc/config/gnu-user.h
@@ -53,11 +53,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
   "%{shared:; \
  pg|p|profile:gcrt1.o%s; \
  static:crt1.o%s; \
- " PIE_SPEC ":Scrt1.o%s; \
+ static-pie|" PIE_SPEC ":Scrt1.o%s; \
  :crt1.o%s} \
crti.o%s \
%{static:crtbeginT.o%s; \
- shared|" PIE_SPEC ":crtbeginS.o%s; \
+ shared|static-pie|" PIE_SPEC ":crtbeginS.o%s; \
  :crtbegin.o%s} \
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
@@ -70,7 +70,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
  :crt1.o%s} \
crti.o%s \
%{static:crtbeginT.o%s; \
- shared|pie:crtbeginS.o%s; \
+ shared|pie|static-pie:crtbeginS.o%s; \
  :crtbegin.o%s} \
%{fvtable-verify=none:%s; \
  fvtable-verify=preinit:vtv_start_preinit.o%s; \
@@ -92,7 +92,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
  fvtable-verify=preinit:vtv_end_preinit.o%s; \
  fvtable-verify=std:vtv_end.o%s} \
%{static:crtend.o%s; \
- shared|" PIE_SPEC ":crtendS.o%s; \
+ shared|static-pie|" PIE_SPEC ":crtendS.o%s; \
  :crtend.o%s} \
crtn.o%s \
" CRTOFFLOADEND
@@ -102,7 +102,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
  fvtable-verify=preinit:vtv_end_preinit.o%s; \
  fvtable-verify=std:vtv_end.o%s} \
%{static:crtend.o%s; \
- shared|pie:crtendS.o%s; \
+ shared|pie|static-pie:crtendS.o%s; \
  :crtend.o%s} \
crtn.o%s \
" CRTOFFLOADEND
@@ -132,12 +132,13 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #define LIB_SPEC GNU_USER_TARGET_LIB_SPEC
 
 #if defined(HAVE_LD_EH_FRAME_HDR)
-#define LINK_EH_SPEC "%{!static:--eh-frame-hdr} "
+#define LINK_EH_SPEC "%{!static|static-pie:--eh-frame-hdr} "
 #endif
 
 #undef LINK_GCC_C_SEQUENCE_SPEC
 #define LINK_GCC_C_SEQUENCE_SPEC \
-  "%{static:--start-group} %G %L %{static:--end-group}%{!static:%G}"
+  "%{static|static-pie:--start-group} %G %L \
+   %{static|static-pie:--end-group}%{!static:%{!static-pie:%G}}"
 
 /* Use --as-needed -lgcc_s for eh support.  */
 #ifdef HAVE_LD_AS_NEEDED
diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
index a4c88f1a848..8983dc9ecd7 100644
--- a/gcc/config/i386/gnu-user.h
+++ b/gcc/config/i386/gnu-user.h
@@ -77,9 +77,10 @@ along with GCC; see the file COPYING3.  If not see
 #define GNU_USER_TARGET_LINK_SPEC "-m %(link_emulation) %{shared:-shared} \
   %{!shared: \
 %{!static: \
-  %{rdynamic:-export-dynamic} \
-  -dynamic-linker %(dynamic_linker)} \
-  %{static:-static}}"
+  %{!static-pie: \
+   

Re: C PATCH to display types when printing a conversion warning (PR c/81233)

2017-08-08 Thread Joseph Myers
On Thu, 13 Jul 2017, Marek Polacek wrote:

> Bootstrapped/regtested on x86_64-linux and powerpc64le-unknown-linux-gnu,
> ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-08-08 Thread Joseph Myers
On Wed, 2 Aug 2017, Marek Polacek wrote:

> Hmm, how about this, then?
> 
> "operand of ?: changes signedness from %qT to %qT due to unsignedness of 
> other operand"
> 
> I couldn't come up with anything more brief yet conveying all the information.
> I don't like adding "second"/"third"/... very much; we should offer a good
> location already.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: PING^2: [PATCH] PR driver/81523: Make -static override -pie

2017-08-08 Thread Joseph Myers
On Mon, 7 Aug 2017, H.J. Lu wrote:

> Hi Joseph,
> 
> Can you take a look at this patch:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00011.html

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH, rs6000] testcase coverage for vec_pack builtin

2017-08-08 Thread Will Schmidt
Hi, 

[PATCH, rs6000] testcase coverage for vec_pack builtin

Add some testcase coverage for the vec_pack built-in.

Tested across power platforms (p6 and newer).
OK for trunk?

Thanks,
-Will

[gcc/testsuite]

|---2017-08-08  Will Schmidt  

* fold-vec-pack-double.c: New.
* fold-vec-pack-int.c: New.
* fold-vec-pack-longlong.c: New.
* fold-vec-pack-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-double.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-double.c
new file mode 100644
index 000..29d049a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-double.c
@@ -0,0 +1,18 @@
+/* Verify that overloaded built-ins for vec_pack with
+   double inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mvsx -mpower8-vector -O2" } */
+
+#include 
+
+// vector float vec_pack (vector double, vector double);
+
+vector float
+test_pack (vector double vd2, vector double vd3)
+{
+  return vec_pack (vd2, vd3);
+}
+
+/* { dg-final { scan-assembler-times "vpkudum" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-int.c
new file mode 100644
index 000..940faf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-int.c
@@ -0,0 +1,28 @@
+/* Verify that overloaded built-ins for vec_pack with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector bool short
+testbi_h (vector bool int vbi2, vector bool int vbi3)
+{
+  return vec_pack (vbi2, vbi3);
+}
+
+vector signed short
+testsi_h (vector signed int vsi2, vector signed int vsi3)
+{
+  return vec_pack (vsi2, vsi3);
+}
+
+vector unsigned short
+testui_h (vector unsigned int vui2, vector unsigned int vui3)
+{
+  return vec_pack (vui2, vui3);
+}
+
+/* { dg-final { scan-assembler-times "vpkuwum" 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-longlong.c
new file mode 100644
index 000..d8acc3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-longlong.c
@@ -0,0 +1,28 @@
+/* Verify that overloaded built-ins for vec_pack with long long
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mvsx -mpower8-vector -O2" } */
+
+#include 
+
+vector bool int
+testbl_h (vector bool long long vbl2, vector bool long long vbl3)
+{
+  return vec_pack (vbl2, vbl3);
+}
+
+vector signed int
+testsl_h (vector signed long long vsl2, vector signed long long vsl3)
+{
+  return vec_pack (vsl2, vsl3);
+}
+
+vector unsigned int
+testul_h (vector unsigned long vul2, vector unsigned long vul3)
+{
+  return vec_pack (vul2, vul3);
+}
+
+/* { dg-final { scan-assembler-times "vpkudum" 3 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-short.c
new file mode 100644
index 000..37cd191
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-pack-short.c
@@ -0,0 +1,28 @@
+/* Verify that overloaded built-ins for vec_pack with short
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector bool char
+testbi_eh (vector bool short vbs2, vector bool short vbs3)
+{
+  return vec_pack (vbs2, vbs3);
+}
+
+vector signed char
+testsi_eh (vector signed short vss2, vector signed short vss3)
+{
+  return vec_pack (vss2, vss3);
+}
+
+vector unsigned char
+testui_eh (vector unsigned short vus2, vector unsigned short vus3)
+{
+  return vec_pack (vus2, vus3);
+}
+
+/* { dg-final { scan-assembler-times "vpkuhum" 3 } } */




[PATCH, rs6000] Add testcase coverage for vec_msum built-in

2017-08-08 Thread Will Schmidt
Hi, 

[PATCH, rs6000] Add testcase coverage for vec_msum built-in

Add testcase coverage for vec_msum built-in.

Tested across power platforms (p6 and newer).  
OK for trunk?

Thanks,
-Will

[gcc/testsuite]

2017-08-08  Will Schmidt  

* fold-vec-msum-char.c: New.
* fold-vec-msum-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-char.c
new file mode 100644
index 000..53519d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-char.c
@@ -0,0 +1,25 @@
+/* Verify that overloaded built-ins for vec_msum() with char inputs
+   produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector unsigned int
+test_msum_ui_uc_uc_ui (vector unsigned char vuc2, vector unsigned char vuc3,
+  vector unsigned int vui2)
+{
+  return vec_msum (vuc2, vuc3, vui2);
+}
+
+vector signed int
+test_msum_si_sc_uc_si (vector signed char vsc2, vector unsigned char vuc3,
+  vector signed int vsi2)
+{
+  return vec_msum (vsc2, vuc3, vsi2);
+}
+
+/* { dg-final { scan-assembler-times "vmsumubm" 1 } } */
+/* { dg-final { scan-assembler-times "vmsummbm" 1 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-short.c
new file mode 100644
index 000..61e1d35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-msum-short.c
@@ -0,0 +1,25 @@
+/* Verify that overloaded built-ins for vec_msum with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector signed int
+test_msum_si (vector signed short vss2, vector signed short vss3,
+  vector signed int vsi2)
+{
+  return vec_msum (vss2, vss3, vsi2);
+}
+
+vector unsigned int
+test_msum)ui (vector unsigned short vus2, vector unsigned short vus3,
+  vector unsigned int vui2)
+{
+  return vec_msum (vus2, vus3, vui2);
+}
+
+/* { dg-final { scan-assembler-times "vmsumshm" 1 } } */
+/* { dg-final { scan-assembler-times "vmsumuhm" 1 } } */




[PATCH, rs6000] testcase coverage for vec_madd built-in functions

2017-08-08 Thread Will Schmidt
Hi, 

[Patch, rs6000] testcase coverage for vec_madd built-in function

Add testcase coverage for vec_madd built-in function.

Tested across power platforms (p6 and newer).  
OK for trunk?

Thanks, 
-Will

[testsuite]

2017-08-08  Will Schmidt  

* fold-vec-madd-double.c: New.
* fold-vec-madd-float.c: New.
* fold-vec-madd-short.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-double.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-double.c
new file mode 100644
index 000..0fe7824
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-double.c
@@ -0,0 +1,17 @@
+/* Verify that overloaded built-ins for vec_madd with
+   double inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include 
+
+vector double
+testd_l (vector double vd2, vector double vd3, vector double vd4)
+{
+  return vec_madd (vd2, vd3, vd4);
+}
+
+/* { dg-final { scan-assembler-times "xvmaddmdp|xvmaddadp" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-float.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-float.c
new file mode 100644
index 000..fcfe0c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-float.c
@@ -0,0 +1,17 @@
+/* Verify that overloaded built-ins for vec_madd with float
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include 
+
+vector float
+testf_l (vector float vf2, vector float vf3, vector float vf4)
+{
+  return vec_madd (vf2, vf3, vf4);
+}
+
+/* { dg-final { scan-assembler-times "xvmaddmsp|xvmaddasp" 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-short.c
new file mode 100644
index 000..0e78f35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-madd-short.c
@@ -0,0 +1,38 @@
+/* Verify that overloaded built-ins for vec_madd with short
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec -O2" } */
+
+#include 
+
+vector signed short
+test_mad_sss (vector signed short vss2, vector signed short vss3,
+ vector signed short vss4)
+{
+  return vec_madd (vss2, vss3, vss4);
+}
+
+vector signed short
+test_mad_suu (vector signed short vss2, vector unsigned short vus3,
+   vector unsigned short vus4)
+{
+  return vec_madd (vss2, vus3, vus4);
+}
+
+vector signed short
+test_mad_uss (vector unsigned short vus2, vector signed short vss3,
+ vector signed short vss4)
+{
+  return vec_madd (vus2, vss3, vss4);
+}
+
+vector unsigned short
+test_mad_uuu (vector unsigned short vus2, vector unsigned short vus3,
+   vector unsigned short vus4)
+{
+  return vec_madd (vus2, vus3, vus4);
+}
+
+/* { dg-final { scan-assembler-times "vmladduhm" 4 } } */




[PATCH, rs6000] testcase coverage for vec_cntlz

2017-08-08 Thread Will Schmidt
Hi, 

[PATCH, rs6000] testcase coverage for vec_cntlz

Add testcase coverage for vec_cntlz built-in functions.

Tested across power platforms (p6 and newer).  
OK for trunk?

Thanks,
-Will

[gcc/testsuite]

2017-08-08  Will Schmidt  

* fold-vec-cntlz-int.c: New.
* fold-vec-cntlz-char.c: New.
* fold-vec-cntlz-short.c: New.
* fold-vec-cntlz-longlong.c: New.

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-char.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-char.c
new file mode 100644
index 000..61dfbcc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-char.c
@@ -0,0 +1,22 @@
+/* Verify that overloaded built-ins for vec_cntlz with char
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -mpower8-vector -O2" } */
+
+#include 
+
+vector signed char
+testsc_h (vector signed char vsc2)
+{
+  return vec_cntlz (vsc2);
+}
+
+vector unsigned char
+testuc_h (vector unsigned char vuc2)
+{
+  return vec_cntlz (vuc2);
+}
+
+/* { dg-final { scan-assembler-times "vclzb" 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-int.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-int.c
new file mode 100644
index 000..ae4dd57
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-int.c
@@ -0,0 +1,22 @@
+/* Verify that overloaded built-ins for vec_cntlz with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -mpower8-vector -O2" } */
+
+#include 
+
+vector signed int
+testsi (vector signed int vsi2)
+{
+  return vec_cntlz (vsi2);
+}
+
+vector unsigned int
+testui (vector unsigned int vui2)
+{
+  return vec_cntlz (vui2);
+}
+
+/* { dg-final { scan-assembler-times "vclzw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-longlong.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-longlong.c
new file mode 100644
index 000..1a72a2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-longlong.c
@@ -0,0 +1,22 @@
+/* Verify that overloaded built-ins for vec_cntlz with long long
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mvsx -mpower8-vector -O2" } */
+
+#include 
+
+vector signed long long
+testsl (vector signed long long vsl2)
+{
+  return vec_cntlz (vsl2);
+}
+
+vector unsigned long long
+testul (vector unsigned long long vul2)
+{
+  return vec_cntlz (vul2);
+}
+
+/* { dg-final { scan-assembler-times "vclzd" 2 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-short.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-short.c
new file mode 100644
index 000..0f05cac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-cntlz-short.c
@@ -0,0 +1,22 @@
+/* Verify that overloaded built-ins for vec_cntlz with int
+   inputs produce the right results.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -mpower8-vector -O2" } */
+
+#include 
+
+vector signed short
+testsi (vector signed short vss2)
+{
+  return vec_cntlz (vss2);
+}
+
+vector unsigned short
+testui (vector unsigned short vus2)
+{
+  return vec_cntlz (vus2);
+}
+
+/* { dg-final { scan-assembler-times "vclzh" 2 } } */




[PATCH, rs6000] enable early debug and disable switch for gimple folding

2017-08-08 Thread Will Schmidt
Hi, 

[PATCH, rs6000] enable early debug and disable switch for gimple folding

Enable debug options related to gimple folding for the rs6000 target.
Adds some output to the existing -mdebug=builtin option
Add a -mgimple-folding=off option to disable the early rs6000 gimple folding.

OK for trunk?

Thanks,
-Will


[gcc]

2017-08-08  Will Schmidt  

* config/rs6000/rs6000.c: rs6000_option_override_internal() Add blurb
to indicate when early gimple folding has been disabled.
rs6000_gimple_fold_builtin(): Add debug content.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1fb9861..0466fd0 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4180,10 +4180,14 @@ rs6000_option_override_internal (bool global_init_p)
 {
   warning (0, N_("-maltivec=le not allowed for big-endian targets"));
   rs6000_altivec_element_order = 0;
 }
 
+  if (rs6000_gimple_folding_disable)
+ fprintf (stderr,
+ "gimple folding of rs6000 builtins has been disabled.\n");
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
 {
   const char *msg = NULL;
   if (!TARGET_HARD_FLOAT || !TARGET_SINGLE_FLOAT || !TARGET_DOUBLE_FLOAT)
@@ -16157,10 +16161,26 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
   gcc_checking_assert (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD);
   enum rs6000_builtins fn_code
 = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
   tree arg0, arg1, lhs;
 
+  size_t uns_fncode = (size_t)fn_code;
+  enum insn_code icode = rs6000_builtin_info[uns_fncode].icode;
+  const char *fn_name1 = rs6000_builtin_info[uns_fncode].name;
+  const char *fn_name2 = ((icode != CODE_FOR_nothing)
+ ? get_insn_name ((int)icode)
+ : "nothing");
+
+  if (TARGET_DEBUG_BUILTIN)
+  {
+  fprintf (stderr, "rs6000_gimple_fold_builtin %d %s %s \n",
+  fn_code,fn_name1,fn_name2);
+  }
+
+  if (rs6000_gimple_folding_disable)
+ return false;
+
   /* Generic solution to prevent gimple folding of code without a LHS.  */
   if (!gimple_call_lhs (stmt))
 return false;
 
   switch (fn_code)
@@ -16516,10 +16536,13 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
update_call_from_tree (gsi, res);
return true;
   }
 default:
+   if (TARGET_DEBUG_BUILTIN)
+  fprintf (stderr, "gimple builtin intrinsic not matched:%d %s %s \n",
+  fn_code,fn_name1,fn_name2);
   break;
 }
 
   return false;
 }
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index e94aa07..4372b00 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -146,10 +146,14 @@ Generate AltiVec instructions using little-endian element 
order.
 
 maltivec=be
 Target Report RejectNegative Var(rs6000_altivec_element_order, 2)
 Generate AltiVec instructions using big-endian element order.
 
+mgimple-folding=off
+Target Report RejectNegative Var(rs6000_gimple_folding_disable, 1)
+Disable early gimple folding of builtins.
+
 mhard-dfp
 Target Report Mask(DFP) Var(rs6000_isa_flags)
 Use decimal floating point instructions.
 
 mmulhw




[PATCH] Changes for v3 of the C++ patch

2017-08-08 Thread David Malcolm
For reference, here's the diff of the v3 C++ patch to the v2 patch,
in case it's useful:

gcc/cp/ChangeLog:
* parser.c (token_pair::require_open): Use tabs rather than spaces.
(token_pair::peek_open): Delete.
(token_pair::require_close): Use tabs rather than spaces.
(cp_parser_compound_literal_p): Remove consumption of opening
paren.
(cp_parser_postfix_expression): Add matching_parens instance.  Use
it to consume the opening paren previously consumed by
cp_parser_compound_literal_p.  Convert call to cp_parser_require
to parens.require_close.
(cp_parser_sizeof_operand): Convert call to parens.peek_open to
call to consume_open to consume the opening paren previously
consumed by cp_parser_compound_literal_p.
---
 gcc/cp/parser.c | 32 +++-
 1 file changed, 11 insertions(+), 21 deletions(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 3037ac7..0da92ab 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -4527,7 +4527,7 @@ class token_pair
   {
 m_open_loc = cp_lexer_peek_token (parser->lexer)->location;
 return cp_parser_require (parser, traits_t::open_token_type,
-  traits_t::required_token_open);
+ traits_t::required_token_open);
   }
 
   /* Consume the next token from PARSER, recording its location as
@@ -4541,16 +4541,6 @@ class token_pair
 return tok;
   }
 
-  /* Peek the next token from PARSER, recording its location as
- that of the opening token within the pair.  */
-
-  void peek_open (cp_parser *parser)
-  {
-cp_token *tok = cp_lexer_peek_token (parser->lexer);
-gcc_assert (tok->type == traits_t::open_token_type);
-m_open_loc = tok->location;
-  }
-
   /* If the next token is the closing symbol for this pair, consume it
  and return it.
  Otherwise, issue an error, highlighting the location of the
@@ -4559,8 +4549,8 @@ class token_pair
   cp_token *require_close (cp_parser *parser) const
   {
 return cp_parser_require (parser, traits_t::close_token_type,
-  traits_t::required_token_close,
-  m_open_loc);
+ traits_t::required_token_close,
+ m_open_loc);
   }
 
  private:
@@ -6443,9 +6433,6 @@ cp_parser_qualifying_entity (cp_parser *parser,
 static bool
 cp_parser_compound_literal_p (cp_parser *parser)
 {
-  /* Consume the `('.  */
-  cp_lexer_consume_token (parser->lexer);
-
   cp_lexer_save_tokens (parser->lexer);
 
   /* Skip tokens until the next token is a closing parenthesis.
@@ -6857,6 +6844,9 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
 
cp_parser_parse_tentatively (parser);
 
+   matching_parens parens;
+   parens.consume_open (parser);
+
/* Avoid calling cp_parser_type_id pointlessly, see comment
   in cp_parser_cast_expression about c++/29234.  */
if (!cp_parser_compound_literal_p (parser))
@@ -6868,8 +6858,7 @@ cp_parser_postfix_expression (cp_parser *parser, bool 
address_p, bool cast_p,
parser->in_type_id_in_expr_p = true;
type = cp_parser_type_id (parser);
parser->in_type_id_in_expr_p = saved_in_type_id_in_expr_p;
-   /* Look for the `)'.  */
-   cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN);
+   parens.require_close (parser);
  }
 
/* If things aren't going well, there's no need to
@@ -27773,12 +27762,13 @@ cp_parser_sizeof_operand (cp_parser* parser, enum rid 
keyword)
 {
   tree type = NULL_TREE;
 
-  matching_parens parens;
-  parens.peek_open (parser);
-
   /* We can't be sure yet whether we're looking at a type-id or an
 expression.  */
   cp_parser_parse_tentatively (parser);
+
+  matching_parens parens;
+  parens.consume_open (parser);
+
   /* Note: as a GNU Extension, compound literals are considered
 postfix-expressions as they are in C99, so they are valid
 arguments to sizeof.  See comment in cp_parser_cast_expression
-- 
1.8.5.3



Re: [PATCH 2/3] Matching tokens: C parts (v2)

2017-08-08 Thread David Malcolm
On Fri, 2017-08-04 at 12:09 -0600, Jeff Law wrote:
> On 08/04/2017 08:32 AM, David Malcolm wrote:
> > On Thu, 2017-08-03 at 11:34 -0600, Jeff Law wrote:
> > > On 08/01/2017 02:21 PM, David Malcolm wrote:
> > > > Changed in v2:
> > > > 
> > > > * Renamed template argument to traits_t; eliminated subclasses,
> > > > just
> > > >   using traits struct.
> > > > * Moved enum constants into struct bodies (string constants
> > > > can't
> > > > be
> > > >   without constexpr, which isn't available in C++98).
> > > > * Fixed typo.
> > > > 
> > > > OK for trunk?
> > > > 
> > > > gcc/c/ChangeLog:
> > > > * c-parser.c (c_parser_error): Rename to...
> > > > (c_parser_error_richloc): ...this, making static, and
> > > > adding
> > > > "richloc" parameter, passing it to the c_parse_error
> > > > call,
> > > > rather than calling
> > > > c_parser_set_source_position_from_token.
> > > > (c_parser_error): Reintroduce, reimplementing in terms
> > > > of the
> > > > above, converting return type from void to bool.
> > > > (class token_pair): New class.
> > > > (struct matching_paren_traits): New struct.
> > > > (matching_parens): New typedef.
> > > > (struct matching_brace_traits): New struct.
> > > > (matching_braces): New typedef.
> > > > (get_matching_symbol): New function.
> > > > (c_parser_require): Add param MATCHING_LOCATION, using
> > > > it to
> > > > highlight matching "opening" tokens for missing
> > > > "closing"
> > > > tokens.
> > > > (c_parser_skip_until_found): Likewise.
> > > > (c_parser_static_assert_declaration_no_semi): Convert
> > > > explicit
> > > > parsing of CPP_OPEN_PAREN and CPP_CLOSE_PAREN to use of
> > > > class matching_parens, so that the pertinent open
> > > > parenthesis
> > > > is
> > > > highlighted when there are problems locating the close
> > > > parenthesis.
> > > > (c_parser_struct_or_union_specifier): Likewise.
> > > > (c_parser_typeof_specifier): Likewise.
> > > > (c_parser_alignas_specifier): Likewise.
> > > > (c_parser_simple_asm_expr): Likewise.
> > > > (c_parser_braced_init): Likewise, for matching_braces.
> > > > (c_parser_paren_condition): Likewise, for
> > > > matching_parens.
> > > > (c_parser_switch_statement): Likewise.
> > > > (c_parser_for_statement): Likewise.
> > > > (c_parser_asm_statement): Likewise.
> > > > (c_parser_asm_operands): Likewise.
> > > > (c_parser_cast_expression): Likewise.
> > > > (c_parser_sizeof_expression): Likewise.
> > > > (c_parser_alignof_expression): Likewise.
> > > > (c_parser_generic_selection): Likewise.
> > > > (c_parser_postfix_expression): Likewise for cases
> > > > RID_VA_ARG,
> > > > RID_OFFSETOF, RID_TYPES_COMPATIBLE_P, RID_AT_SELECTOR,
> > > > RID_AT_PROTOCOL, RID_AT_ENCODE, reindenting as
> > > > necessary.
> > > > In case CPP_OPEN_PAREN, pass loc_open_paren to the
> > > > c_parser_skip_until_found call.
> > > > (c_parser_objc_class_definition): Use class
> > > > matching_parens as
> > > > above.
> > > > (c_parser_objc_method_decl): Likewise.
> > > > (c_parser_objc_try_catch_finally_statement): Likewise.
> > > > (c_parser_objc_synchronized_statement): Likewise.
> > > > (c_parser_objc_at_property_declaration): Likewise.
> > > > (c_parser_oacc_wait_list): Likewise.
> > > > (c_parser_omp_var_list_parens): Likewise.
> > > > (c_parser_omp_clause_collapse): Likewise.
> > > > (c_parser_omp_clause_default): Likewise.
> > > > (c_parser_omp_clause_if): Likewise.
> > > > (c_parser_omp_clause_num_threads): Likewise.
> > > > (c_parser_omp_clause_num_tasks): Likewise.
> > > > (c_parser_omp_clause_grainsize): Likewise.
> > > > (c_parser_omp_clause_priority): Likewise.
> > > > (c_parser_omp_clause_hint): Likewise.
> > > > (c_parser_omp_clause_defaultmap): Likewise.
> > > > (c_parser_oacc_single_int_clause): Likewise.
> > > > (c_parser_omp_clause_ordered): Likewise.
> > > > (c_parser_omp_clause_reduction): Likewise.
> > > > (c_parser_omp_clause_schedule): Likewise.
> > > > (c_parser_omp_clause_num_teams): Likewise.
> > > > (c_parser_omp_clause_thread_limit): Likewise.
> > > > (c_parser_omp_clause_aligned): Likewise.
> > > > (c_parser_omp_clause_linear): Likewise.
> > > > (c_parser_omp_clause_safelen): Likewise.
> > > > (c_parser_omp_clause_simdlen): Likewise.
> > > > (c_parser_omp_clause_depend): Likewise.
> > > > (c_parser_omp_clause_map): Likewise.
> > > > (c_parser_omp_clause_device): Likewise.
> > > > (c_parser_omp_clause_dist_schedule): Likewise.
> > > > (c_parser_omp_clause_proc_bind): 

RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Michael Collison
Correct.

-Original Message-
From: Richard Kenner [mailto:ken...@vlsi1.ultra.nyu.edu] 
Sent: Tuesday, August 8, 2017 1:20 PM
To: Michael Collison 
Cc: gcc-patches@gcc.gnu.org; nd ; pins...@gmail.com
Subject: RE: [PATCH] [Aarch64] Optimize subtract in shift counts

> Correct. It is truncated for integer shift, but not simd shift 
> instructions. We generate a pattern in the split that only generates 
> the integer shift instructions.

That's unfortunate, because it would be nice to do this in simplify_rtx, since 
it's machine-independent, but that has to be conditioned on 
SHIFT_COUNT_TRUNCATED, so you wouldn't get the benefit of it.


[PATCH] matching tokens: C++ parts (v3)

2017-08-08 Thread David Malcolm
On Mon, 2017-08-07 at 14:25 -0400, Jason Merrill wrote:

Thanks for looking at this.

> On 08/01/2017 04:21 PM, David Malcolm wrote:
> > @@ -27632,6 +27769,9 @@ cp_parser_sizeof_operand (cp_parser*
> > parser, enum rid keyword)
> >  {
> >tree type = NULL_TREE;
> >
> > +  matching_parens parens;
> > +  parens.peek_open (parser);
>
> I was puzzled by this until I found that
> cp_parser_compound_literal_p
> consumes the open paren.  Let's remove that in favor of calling
> consume_open here, so we don't need peek_open anymore.

Done.

> About passing parser in or not, I'm happy with the current approach;
> adding things to the stack isn't free in a highly recursive program
> like
> GCC.

Thanks; I'll keep "parser" out of the new classes then.

Here's an updated "v3" patch.

Successfully bootstrapped on x86_64-pc-linux-gnu in conjunction
with the other patches (1 and 2 of the v2 kit).

OK for trunk, assuming the other patches are approved? (patch 2 in the kit,
for the C frontend, still needs approval).

Changes in v3:

Here's a ChangeLog for the change relative to the previous v2 patch:

gcc/cp/ChangeLog:
* parser.c (token_pair::require_open): Use tabs rather than spaces.
(token_pair::peek_open): Delete.
(token_pair::require_close): Use tabs rather than spaces.
(cp_parser_compound_literal_p): Remove consumption of opening
paren.
(cp_parser_postfix_expression): Add matching_parens instance.  Use
it to consume the opening paren previously consumed by
cp_parser_compound_literal_p.  Convert call to cp_parser_require
to parens.require_close.
(cp_parser_sizeof_operand): Convert call to parens.peek_open to
call to consume_open to consume the opening paren previously
consumed by cp_parser_compound_literal_p.

Here's an integrated ChangeLog for the patch as a whole:

gcc/cp/ChangeLog:
* parser.c (cp_parser_error): Update for new param to
c_parse_error.
(class token_pair): New class.
(struct matching_paren_traits): New struct.
(matching_parens): New typedef.
(struct matching_brace_traits): New struct.
(matching_braces): New typedef.
(cp_parser_statement_expr): Convert explicit parsing of
CPP_OPEN_PAREN and CPP_CLOSE_PAREN to use of
class matching_parens, so that the pertinent open parenthesis is
highlighted when there are problems locating the close
parenthesis.
(cp_parser_primary_expression): Likewise.
(cp_parser_compound_literal_p): Remove consumption of opening
paren.
(cp_parser_postfix_expression): Convert explicit parsing of
CPP_OPEN_PAREN and CPP_CLOSE_PAREN to use matching parens, as
above.  Use it to consume the opening paren previously consumed by
cp_parser_compound_literal_p.
(cp_parser_parenthesized_expression_list): Likewise.
(cp_parser_unary_expression): Likewise.
(cp_parser_new_expression): Likewise.
(cp_parser_cast_expression): Likewise.
(cp_parser_builtin_offsetof): Likewise.
(cp_parser_trait_expr): Likewise.
(cp_parser_lambda_declarator_opt): Likewise.
(cp_parser_lambda_body): Likewise, for matching_braces.
(cp_parser_compound_statement): Likewise.
(cp_parser_selection_statement): Likewise, for matching_parens.
(cp_parser_iteration_statement): Likewise.
(cp_parser_already_scoped_statement): Likewise, for
matching_braces.
(cp_parser_linkage_specification): Likewise.
(cp_parser_static_assert): Likewise, for matching_parens.
(cp_parser_decltype): Likewise.
(cp_parser_operator): Likewise.
(cp_parser_enum_specifier): Likewise.
(cp_parser_namespace_definition): Likewise.
(cp_parser_direct_declarator): Likewise.
(cp_parser_braced_list): Likewise.
(cp_parser_class_specifier_1): Likewise, for matching_braces.
(cp_parser_constant_initializer): Likewise.
(cp_parser_noexcept_specification_opt): Likewise, for
matching_parens.
(cp_parser_exception_specification_opt): Likewise.
(cp_parser_handler): Likewise.
(cp_parser_asm_specification_opt): Likewise.
(cp_parser_asm_operand_list): Likewise.
(cp_parser_gnu_attributes_opt): Likewise.
(cp_parser_std_attribute_spec): Likewise.
(cp_parser_requirement_parameter_list): Likewise.
(cp_parser_requirement_body): Likewise, for matching_braces.
(cp_parser_compound_requirement): Likewise.
(cp_parser_template_introduction): Likewise.
(cp_parser_sizeof_pack): Likewise, for matching_parens.
(cp_parser_sizeof_operand): Likewise; use it to consume the
opening paren previously consumed by cp_parser_compound_literal_p.
(get_matching_symbol): New function.
(cp_parser_required_error): Add param 

RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Richard Kenner
> Correct. It is truncated for integer shift, but not simd shift
> instructions. We generate a pattern in the split that only generates
> the integer shift instructions.

That's unfortunate, because it would be nice to do this in simplify_rtx,
since it's machine-independent, but that has to be conditioned on
SHIFT_COUNT_TRUNCATED, so you wouldn't get the benefit of it.


RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Michael Collison
Correct. It is truncated for integer shift, but not simd shift instructions. We 
generate a pattern in the split that only generates the integer shift 
instructions.

-Original Message-
From: Richard Kenner [mailto:ken...@vlsi1.ultra.nyu.edu] 
Sent: Tuesday, August 8, 2017 1:04 PM
To: Michael Collison 
Cc: gcc-patches@gcc.gnu.org; nd ; pins...@gmail.com
Subject: RE: [PATCH] [Aarch64] Optimize subtract in shift counts

> Because for integer shift instructions the shift count is truncated. 
> We ensure that we only use integer shift instructions by emitting a 
> shift with a mask. This only matches integer shift instructions in the 
> md file.

That's why I asked about SHIFT_COUNT_TRUNCATED.  So it's truncated for some 
shifts, but not all?


RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Richard Kenner
> Because for integer shift instructions the shift count is
> truncated. We ensure that we only use integer shift instructions by
> emitting a shift with a mask. This only matches integer shift
> instructions in the md file.

That's why I asked about SHIFT_COUNT_TRUNCATED.  So it's truncated for
some shifts, but not all?


RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Michael Collison
Because for integer shift instructions the shift count is truncated. We ensure 
that we only use integer shift instructions by emitting a shift with a mask. 
This only matches integer shift instructions in the md file.

-Original Message-
From: Richard Kenner [mailto:ken...@vlsi1.ultra.nyu.edu] 
Sent: Tuesday, August 8, 2017 12:52 PM
To: Michael Collison 
Cc: gcc-patches@gcc.gnu.org; nd ; pins...@gmail.com
Subject: RE: [PATCH] [Aarch64] Optimize subtract in shift counts

> This case is covered by Wilco's previous reply:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00575.html

Which I don't understand:

> No it's perfectly safe - it becomes an integer-only shift after the 
> split since it keeps the masking as part of the pattern.

Let say we have your first example:

long f1(long x, int i)
{
  return x >> (64 - i);
}

If "i" is -2, this should be a shift of 66 (which is indeed, technically 
undefined), but becomes a shift of 62.  What am I missing?


RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Richard Kenner
> This case is covered by Wilco's previous reply:
> 
> https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00575.html

Which I don't understand:

> No it's perfectly safe - it becomes an integer-only shift after the
> split since it keeps the masking as part of the pattern.

Let say we have your first example:

long f1(long x, int i)
{
  return x >> (64 - i);
}

If "i" is -2, this should be a shift of 66 (which is indeed, technically
undefined), but becomes a shift of 62.  What am I missing?


RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Michael Collison
This case is covered by Wilco's previous reply:

https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00575.html

-Original Message-
From: Richard Kenner [mailto:ken...@vlsi1.ultra.nyu.edu] 
Sent: Tuesday, August 8, 2017 5:13 AM
To: Michael Collison 
Cc: gcc-patches@gcc.gnu.org; nd ; pins...@gmail.com
Subject: RE: [PATCH] [Aarch64] Optimize subtract in shift counts

> The pattern will only be matched if the value is positive. More 
> specifically if the constant value is 32 (SImode) or 64 (DImode).

I don't mean the constant, but the value subtracted from it.
If that's negative, then we have a shift count larger than the wordsize.



PING Re: [PATCH 0/6] [i386] PR80969 Fix ICE with -mabi=ms -mavx512f

2017-08-08 Thread Daniel Santos
Original message: https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02005.html

Patches 2 and 3 have been committed and I have corrected the error in
patch 5.  I configuring with --enable-checking=yes,rtl
--enable-languages=all and retested with
RUNTESTFLAGS="--target_board=unix/\{,-m32\}"  The updated patches fix an
error when using mov instead of push and add documentation for changes
to target-supports.exp.  I have included modified ChangeLogs.

In addition to to fixing the ICE, this patch set makes more efficient
use of stack space in some cases the outgoing stack boundary is > 16
bytes and realignment is necessary.  This adds new tests, some of which
require avx512f (gcc/testsuite/gcc.target/i386/pr80969-4*.c) -- these I
have only tested these using Intel SDE.

Below is an updated list of the patches.

1. https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02006.html
2. Committed.
3. Committed.
4. https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02009.html
5. v2 -- https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00249.html
6. v2 -- https://gcc.gnu.org/ml/gcc-patches/2017-08/msg00618.html

Thanks,
Daniel
2017-08-08  Daniel Santos  

* config/i386/i386.h (ix86_frame::stack_realign_allocate_offset):
Remove
(ix86_frame::stack_realign_allocate): New field.
(struct machine_frame_state): Modify comments.
(machine_frame_state::sp_realigned_fp_end): New field.
* config/i386/i386.c (ix86_compute_frame_layout): Modify.
(sp_valid_at): Likewise.
(fp_valid_at): Likewise.
(choose_baseaddr): Modify comments.
(ix86_emit_outlined_ms2sysv_save): Modify.
(ix86_expand_prologue): Likewise.
* doc/sourcebuild.texi (avx2, avx2_runtime): Add missing items to
effective-targets.
(avx512f, avx512f_runtime): Add new items to effective-tarets.
2017-08-08  Daniel Santos  

* lib/target-supports.exp (check_avx512_os_support_available): New
Procedure.
(check_avx2_hw_available): Modify.
(check_avx512f_hw_available): New Procedure.
(check_effective_target_avx512f_runtime): Likewise.
* gcc.target/i386/pr80969-1.c: New testcase.
* gcc.target/i386/pr80969-2a.c: Likewise.
* gcc.target/i386/pr80969-2.c: Likewise.
* gcc.target/i386/pr80969-3.c: Likewise.
* gcc.target/i386/pr80969-4a.c: Likewise.
* gcc.target/i386/pr80969-4b.c: Likewise.
* gcc.target/i386/pr80969-4.c: Likewise.


[PATCH 6/6 v2] [i386, testsuite] Add tests, fix bug in check_avx2_hw_available

2017-08-08 Thread Daniel Santos
This update adds documentation for the new effective taregts in addition to a
few existing effective targets that were undocumented.

Changes to lib/target-supports.exp and documentation:
* Add effective-targets avx512f and avx512f_runtime (needed for new
  tests).
* Corrects bug in check_avx2_hw_available.
* Adds documentation for effective-targets avx2, avx2_runtime (both
  missing), avx512f and avx512f_runtime.

The following tests are added.  The testcase in the PR is used as a base
and relevant variants are added to test other factors affected by the
patch set.

pr80969-1.c   Base test case.
pr80969-2.c   With ms to sysv call.
pr80969-2a.c  With ms to sysv call using stubs.
pr80969-3.c   With alloca (for DRAP test).
pr80969-4.c   With va_args passed via va_list
pr80969-4a.c  With va_args passed via va_list and ms to sysv call.
pr80969-4b.c  With va_args passed via va_list and ms to sysv call using
  stubs.

Signed-off-by: Daniel Santos 
---
 gcc/doc/sourcebuild.texi   |  12 +++
 gcc/testsuite/gcc.target/i386/pr80969-1.c  |  16 
 gcc/testsuite/gcc.target/i386/pr80969-2.c  |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-2a.c |  26 ++
 gcc/testsuite/gcc.target/i386/pr80969-3.c  |  31 
 gcc/testsuite/gcc.target/i386/pr80969-4.c  | 123 
 gcc/testsuite/gcc.target/i386/pr80969-4a.c | 124 +
 gcc/testsuite/gcc.target/i386/pr80969-4b.c | 124 +
 gcc/testsuite/lib/target-supports.exp  |  66 +++
 9 files changed, 548 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr80969-4b.c

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 85af8778167..66f040f212d 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1852,6 +1852,18 @@ Target supports compiling @code{avx} instructions.
 @item avx_runtime
 Target supports the execution of @code{avx} instructions.
 
+@item avx2
+Target supports compiling @code{avx2} instructions.
+
+@item avx2_runtime
+Target supports the execution of @code{avx2} instructions.
+
+@item avx512f
+Target supports compiling @code{avx512f} instructions.
+
+@item avx512f_runtime
+Target supports the execution of @code{avx512f} instructions.
+
 @item cell_hw
 Test system can execute AltiVec and Cell PPU instructions.
 
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-1.c 
b/gcc/testsuite/gcc.target/i386/pr80969-1.c
new file mode 100644
index 000..eb8d767a778
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-1.c
@@ -0,0 +1,16 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+int a[56];
+int b;
+int main (int argc, char *argv[]) {
+  int c;
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2.c
new file mode 100644
index 000..e868d6c7e5c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-2a.c 
b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
new file mode 100644
index 000..071a90534a4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-2a.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-options "-Ofast -mabi=ms -mavx512f -mcall-ms2sysv-xlogues" } */
+/* { dg-require-effective-target avx512f } */
+
+/* Test when calling a sysv func using save/restore stubs.  */
+
+int a[56];
+int b;
+
+static void __attribute__((sysv_abi)) sysv ()
+{
+}
+
+void __attribute__((sysv_abi)) (*volatile const sysv_noinfo)() = sysv;
+
+int main (int argc, char *argv[]) {
+  int c;
+  sysv_noinfo ();
+  for (; b; b++) {
+c = b;
+if (b & 1)
+  c = 2;
+a[b] = c;
+  }
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr80969-3.c 
b/gcc/testsuite/gcc.target/i386/pr80969-3.c
new file mode 100644
index 000..5982981b55c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr80969-3.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */

Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread H.J. Lu
On Tue, Aug 8, 2017 at 11:00 AM, Richard Biener
 wrote:
> On August 8, 2017 7:36:35 PM GMT+02:00, Richard Sandiford 
>  wrote:
>>Richard Sandiford  writes:
>>> Richard Biener  writes:
 On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"
>> wrote:
>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
> wrote:
>> Arjan van de Ven  writes:
>>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
 On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
> When Linux/x86-64 kernel is compiled with
>>-fno-omit-frame-pointer.
> this optimization removes more than 730
>
> pushq %rbp
> movq %rsp, %rbp
> popq %rbp

 If you don't want the frame pointer, why are you compiling with
 -fno-omit-frame-pointer?  Are you going to add
 -fforce-no-omit-frame-pointer or something similar so that
>>people
>can
 actually get what they are asking for?  This doesn't really make
>sense.
 It is perfectly fine to omit frame pointer by default, when it
>isn't
 required for something, but if the user asks for it, we
>>shouldn't
>ignore his
 request.

>>>
>>>
>>> wanting a framepointer is very nice and desired...  ... but if
>>the
>>> optimizer/ins scheduler moves instructions outside of the frame'd
>>> portion, (it does it for cases like below as well), the value is
>>> already negative for these functions that don't have stack use.
>>>
>>> :
>>> movall_schedules@@Base-0x38460,%rax
>>> push   %rbp
>>> mov%rsp,%rbp
>>> pop%rbp
>>> cmpq   $0x0,(%rax)
>>> setne  %al
>>> movzbl %al,%eax
>>> retq
>>
>> Yeah, and it could be even weirder for big single-block functions.
>> I think GCC has been doing this kind of scheduling of prologue and
>> epilogue instructions for a while, so there hasn*t really been a
>> guarantee which parts of the function will have a new FP and which
>> will still have the old one.
>>
>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
>shrink-wrapping
>> kicks in when the following is compiled with -O3
>-fno-omit-frame-pointer:
>>
>> void f (int *);
>> void
>> g (int *x)
>> {
>>   for (int i = 0; i < 1000; ++i)
>> x[i] += 1;
>>   if (x[0])
>> {
>>   int temp;
>>   f ();
>> }
>> }
>>
>> so only the block with the call to f sets up FP.  The relatively
>> long-running loop runs with the caller's FP.
>>
>> I hope we can go for a target-independent position that what HJ*s
>> patch does is OK...
>>
>
>In light of this,  I am resubmitting my patch.  I added 3 more
>testcases
>and also handle:
>
>typedef int v8si __attribute__ ((vector_size (32)));
>
>void
>foo (v8si *out_start, v8si *out_end, v8si *regions)
>{
>v8si base = regions[3];
>*out_start = base;
>*out_end = base;
>}
>
>OK for trunk?

 The invoker specified -fno-omit-frame-pointer, why did you eliminate
>>it?
 I'd argue it's OK when neither -f nor -fno- is explicitly specified
 irrespective of the default in case we document the change but an
 explicit -fno- is pretty clear.
>>>
>>> I don't buy that we're ignoring the user.  -fomit-frame-pointer says
>>> that, when you're creating a frame, it's OK not to set up the frame
>>> pointer.  Forcing it off means that if you create a frame, you need
>>> to set up the frame pointer too.  But it doesn't say anything about
>>> whether the frame itself is needed.  I.e. it's
>>-fno-omit-frame*-pointer*
>>> rather than -fno-omit-frame.
>
> Isn't that a bit splitting hairs if you look at (past) history?
>
> You could also interpret -fno-omit-frame-pointer as obviously forcing a frame 
> as otherwise there's nothing to omit...
>
>>> It seems like the responses have been treating it more like
>>> a combination of:
>>>
>>> -fno-shrink-wrapping
>>> -fno-omit-frame-pointer
>>> the equivalent of the old textual prologues and epilogues
>>>
>>> but the positive option -fomit-frame-pointer doesn't have any effect
>>> on the last two.
>>
>>er, you know what I mean :-)  It doesn't have any effect on
>>-fshrink-wrapping or the textual-style prologues and epilogues.
>
> True.  But I think people do not appreciate new options too much if existing 
> ones worked in the past...
>

Should we also disable LTO and function inlining with -fno-omit-frame-pointer?
Both of them may eliminate frame pointer.

-- 
H.J.


Re: [PATCH][v2] Fix target attribute handling (PR c++/81355).

2017-08-08 Thread Martin Sebor

On 08/07/2017 10:59 PM, Martin Liška wrote:

On 08/02/2017 09:56 PM, Martin Sebor wrote:

On 08/02/2017 01:04 PM, Jeff Law wrote:

On 07/28/2017 05:13 AM, Martin Liška wrote:

Hello.

Following patch skips empty strings in 'target' attribute.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-07-13  Martin Liska  

PR c++/81355
* attribs.c (sorted_attr_string): Skip empty strings.

gcc/testsuite/ChangeLog:

2017-07-13  Martin Liska  

PR c++/81355
* g++.dg/other/pr81355.C: New test.

OK.  THough one could legitimately ask if this ought to be a parsing error.


I would suggest to at least issue a warning.  It seems that
the empty string must almost certainly be a bug here, say due
to unintended macro expansion.

Otherwise, if it should remain silently (and intentionally)
accepted, I recommend to document it.  As it is, the manual
says that the "string" argument is equivalent to compiling
with -mstring which for "" would be rejected by the driver.

Martin


Thanks you both for feedback. I decided to come up with an error message for 
that.

Feedback appreciated.


My only comment is on the text of the error message.  I think
the %' directive is supposed to be used instead of a bare
apostrophe.  But rather than using the somewhat informal "can't"
I would suggest to follow other similar diagnostics that might
print something like:

  empty string in attribute %

(analogous to "empty precision in %s format" or "empty scalar
initializer" etc. in gcc.pot)

or

  attribute % argument may not be empty

(similar to "output filename may not be empty").

Martin


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread Richard Biener
On August 8, 2017 7:36:35 PM GMT+02:00, Richard Sandiford 
 wrote:
>Richard Sandiford  writes:
>> Richard Biener  writes:
>>> On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"
> wrote:
On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
 wrote:
> Arjan van de Ven  writes:
>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
>>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
 When Linux/x86-64 kernel is compiled with
>-fno-omit-frame-pointer.
 this optimization removes more than 730

 pushq %rbp
 movq %rsp, %rbp
 popq %rbp
>>>
>>> If you don't want the frame pointer, why are you compiling with
>>> -fno-omit-frame-pointer?  Are you going to add
>>> -fforce-no-omit-frame-pointer or something similar so that
>people
can
>>> actually get what they are asking for?  This doesn't really make
sense.
>>> It is perfectly fine to omit frame pointer by default, when it
isn't
>>> required for something, but if the user asks for it, we
>shouldn't
ignore his
>>> request.
>>>
>>
>>
>> wanting a framepointer is very nice and desired...  ... but if
>the
>> optimizer/ins scheduler moves instructions outside of the frame'd
>> portion, (it does it for cases like below as well), the value is
>> already negative for these functions that don't have stack use.
>>
>> :
>> movall_schedules@@Base-0x38460,%rax
>> push   %rbp
>> mov%rsp,%rbp
>> pop%rbp
>> cmpq   $0x0,(%rax)
>> setne  %al
>> movzbl %al,%eax
>> retq
>
> Yeah, and it could be even weirder for big single-block functions.
> I think GCC has been doing this kind of scheduling of prologue and
> epilogue instructions for a while, so there hasn*t really been a
> guarantee which parts of the function will have a new FP and which
> will still have the old one.
>
> Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
shrink-wrapping
> kicks in when the following is compiled with -O3
-fno-omit-frame-pointer:
>
> void f (int *);
> void
> g (int *x)
> {
>   for (int i = 0; i < 1000; ++i)
> x[i] += 1;
>   if (x[0])
> {
>   int temp;
>   f ();
> }
> }
>
> so only the block with the call to f sets up FP.  The relatively
> long-running loop runs with the caller's FP.
>
> I hope we can go for a target-independent position that what HJ*s
> patch does is OK...
>

In light of this,  I am resubmitting my patch.  I added 3 more
testcases
and also handle:

typedef int v8si __attribute__ ((vector_size (32)));

void
foo (v8si *out_start, v8si *out_end, v8si *regions)
{
v8si base = regions[3];
*out_start = base;
*out_end = base;
}

OK for trunk?
>>>
>>> The invoker specified -fno-omit-frame-pointer, why did you eliminate
>it?
>>> I'd argue it's OK when neither -f nor -fno- is explicitly specified
>>> irrespective of the default in case we document the change but an
>>> explicit -fno- is pretty clear.
>>
>> I don't buy that we're ignoring the user.  -fomit-frame-pointer says
>> that, when you're creating a frame, it's OK not to set up the frame
>> pointer.  Forcing it off means that if you create a frame, you need
>> to set up the frame pointer too.  But it doesn't say anything about
>> whether the frame itself is needed.  I.e. it's
>-fno-omit-frame*-pointer*
>> rather than -fno-omit-frame.

Isn't that a bit splitting hairs if you look at (past) history?

You could also interpret -fno-omit-frame-pointer as obviously forcing a frame 
as otherwise there's nothing to omit...

>> It seems like the responses have been treating it more like
>> a combination of:
>>
>> -fno-shrink-wrapping
>> -fno-omit-frame-pointer
>> the equivalent of the old textual prologues and epilogues
>>
>> but the positive option -fomit-frame-pointer doesn't have any effect
>> on the last two.
>
>er, you know what I mean :-)  It doesn't have any effect on
>-fshrink-wrapping or the textual-style prologues and epilogues.

True.  But I think people do not appreciate new options too much if existing 
ones worked in the past...

Richard.


Re: [PATCH 00/22] RFC: integrated 3rd-party static analysis support

2017-08-08 Thread Richard Sandiford
David Malcolm  writes:
> This patch kit clearly isn't ready yet as-is (see e.g. the
> "known unknowns" below), but I'm posting it now in the hope of
> getting early feedback.

Looks really useful.  One thing I wasn't sure of from a quick scan was:
what would happen if the input had something like a syntax error?  Would
it suppress the output from the analysers and just print GCC's output,
or would it still print both?

Very minor question in the grand scheme of things, sorry.

Richard


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread Richard Sandiford
Richard Sandiford  writes:
> Richard Biener  writes:
>> On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"  
>> wrote:
>>>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
>>> wrote:
 Arjan van de Ven  writes:
> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
>>> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
>>> this optimization removes more than 730
>>>
>>> pushq %rbp
>>> movq %rsp, %rbp
>>> popq %rbp
>>
>> If you don't want the frame pointer, why are you compiling with
>> -fno-omit-frame-pointer?  Are you going to add
>> -fforce-no-omit-frame-pointer or something similar so that people
>>>can
>> actually get what they are asking for?  This doesn't really make
>>>sense.
>> It is perfectly fine to omit frame pointer by default, when it
>>>isn't
>> required for something, but if the user asks for it, we shouldn't
>>>ignore his
>> request.
>>
>
>
> wanting a framepointer is very nice and desired...  ... but if the
> optimizer/ins scheduler moves instructions outside of the frame'd
> portion, (it does it for cases like below as well), the value is
> already negative for these functions that don't have stack use.
>
> :
> movall_schedules@@Base-0x38460,%rax
> push   %rbp
> mov%rsp,%rbp
> pop%rbp
> cmpq   $0x0,(%rax)
> setne  %al
> movzbl %al,%eax
> retq

 Yeah, and it could be even weirder for big single-block functions.
 I think GCC has been doing this kind of scheduling of prologue and
 epilogue instructions for a while, so there hasn*t really been a
 guarantee which parts of the function will have a new FP and which
 will still have the old one.

 Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
>>>shrink-wrapping
 kicks in when the following is compiled with -O3
>>>-fno-omit-frame-pointer:

 void f (int *);
 void
 g (int *x)
 {
   for (int i = 0; i < 1000; ++i)
 x[i] += 1;
   if (x[0])
 {
   int temp;
   f ();
 }
 }

 so only the block with the call to f sets up FP.  The relatively
 long-running loop runs with the caller's FP.

 I hope we can go for a target-independent position that what HJ*s
 patch does is OK...

>>>
>>>In light of this,  I am resubmitting my patch.  I added 3 more
>>>testcases
>>>and also handle:
>>>
>>>typedef int v8si __attribute__ ((vector_size (32)));
>>>
>>>void
>>>foo (v8si *out_start, v8si *out_end, v8si *regions)
>>>{
>>>v8si base = regions[3];
>>>*out_start = base;
>>>*out_end = base;
>>>}
>>>
>>>OK for trunk?
>>
>> The invoker specified -fno-omit-frame-pointer, why did you eliminate it?
>> I'd argue it's OK when neither -f nor -fno- is explicitly specified
>> irrespective of the default in case we document the change but an
>> explicit -fno- is pretty clear.
>
> I don't buy that we're ignoring the user.  -fomit-frame-pointer says
> that, when you're creating a frame, it's OK not to set up the frame
> pointer.  Forcing it off means that if you create a frame, you need
> to set up the frame pointer too.  But it doesn't say anything about
> whether the frame itself is needed.  I.e. it's -fno-omit-frame*-pointer*
> rather than -fno-omit-frame.
>
> It seems like the responses have been treating it more like
> a combination of:
>
> -fno-shrink-wrapping
> -fno-omit-frame-pointer
> the equivalent of the old textual prologues and epilogues
>
> but the positive option -fomit-frame-pointer doesn't have any effect
> on the last two.

er, you know what I mean :-)  It doesn't have any effect on
-fshrink-wrapping or the textual-style prologues and epilogues.


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread Richard Sandiford
Richard Biener  writes:
> On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"  wrote:
>>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
>> wrote:
>>> Arjan van de Ven  writes:
 On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
>> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
>> this optimization removes more than 730
>>
>> pushq %rbp
>> movq %rsp, %rbp
>> popq %rbp
>
> If you don't want the frame pointer, why are you compiling with
> -fno-omit-frame-pointer?  Are you going to add
> -fforce-no-omit-frame-pointer or something similar so that people
>>can
> actually get what they are asking for?  This doesn't really make
>>sense.
> It is perfectly fine to omit frame pointer by default, when it
>>isn't
> required for something, but if the user asks for it, we shouldn't
>>ignore his
> request.
>


 wanting a framepointer is very nice and desired...  ... but if the
 optimizer/ins scheduler moves instructions outside of the frame'd
 portion, (it does it for cases like below as well), the value is
 already negative for these functions that don't have stack use.

 :
 movall_schedules@@Base-0x38460,%rax
 push   %rbp
 mov%rsp,%rbp
 pop%rbp
 cmpq   $0x0,(%rax)
 setne  %al
 movzbl %al,%eax
 retq
>>>
>>> Yeah, and it could be even weirder for big single-block functions.
>>> I think GCC has been doing this kind of scheduling of prologue and
>>> epilogue instructions for a while, so there hasn*t really been a
>>> guarantee which parts of the function will have a new FP and which
>>> will still have the old one.
>>>
>>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
>>shrink-wrapping
>>> kicks in when the following is compiled with -O3
>>-fno-omit-frame-pointer:
>>>
>>> void f (int *);
>>> void
>>> g (int *x)
>>> {
>>>   for (int i = 0; i < 1000; ++i)
>>> x[i] += 1;
>>>   if (x[0])
>>> {
>>>   int temp;
>>>   f ();
>>> }
>>> }
>>>
>>> so only the block with the call to f sets up FP.  The relatively
>>> long-running loop runs with the caller's FP.
>>>
>>> I hope we can go for a target-independent position that what HJ*s
>>> patch does is OK...
>>>
>>
>>In light of this,  I am resubmitting my patch.  I added 3 more
>>testcases
>>and also handle:
>>
>>typedef int v8si __attribute__ ((vector_size (32)));
>>
>>void
>>foo (v8si *out_start, v8si *out_end, v8si *regions)
>>{
>>v8si base = regions[3];
>>*out_start = base;
>>*out_end = base;
>>}
>>
>>OK for trunk?
>
> The invoker specified -fno-omit-frame-pointer, why did you eliminate it?
> I'd argue it's OK when neither -f nor -fno- is explicitly specified
> irrespective of the default in case we document the change but an
> explicit -fno- is pretty clear.

I don't buy that we're ignoring the user.  -fomit-frame-pointer says
that, when you're creating a frame, it's OK not to set up the frame
pointer.  Forcing it off means that if you create a frame, you need
to set up the frame pointer too.  But it doesn't say anything about
whether the frame itself is needed.  I.e. it's -fno-omit-frame*-pointer*
rather than -fno-omit-frame.

It seems like the responses have been treating it more like
a combination of:

-fno-shrink-wrapping
-fno-omit-frame-pointer
the equivalent of the old textual prologues and epilogues

but the positive option -fomit-frame-pointer doesn't have any effect
on the last two.

Thanks,
Richard


Re: [PATCH] Fix middle-end/81737

2017-08-08 Thread Marek Polacek
On Mon, Aug 07, 2017 at 04:07:49PM +0200, Richard Biener wrote:
> On August 7, 2017 11:09:59 AM GMT+02:00, Marek Polacek  
> wrote:
> >On Mon, Aug 07, 2017 at 10:58:09AM +0200, Jakub Jelinek wrote:
> >> On Mon, Aug 07, 2017 at 10:47:51AM +0200, Marek Polacek wrote:
> >> > In my recent change I failed to check whether the type domain
> >> > of a type is non-NULL and this goof causes crashing on this
> >> > testcase.
> >> > 
> >> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> >> > 
> >> > 2017-08-07  Marek Polacek  
> >> > 
> >> >  PR middle-end/81737
> >> >  * fold-const.c (fold_indirect_ref_1): Check type_domain.
> >> > 
> >> >  * gcc.dg/pr81737.c: New test.
> >> 
> >> The old code was assuming size_zero_node if type_domain is NULL
> >> or TYPE_MIN_VALUE is NULL, which is reasonable for C/C++, but indeed
> >might
> >> be wrong perhaps for Fortran or Ada.
> 
> It's how the middle-end defines it, so please restore this behavior (sorry 
> for missing this in the review).  IIRC there are at least one other 'copy' of 
> the folding somewhere.

Sure, this patch should do it:

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-08-08  Marek Polacek  

PR middle/81695
* fold-const.c (fold_indirect_ref_1): Restore original behavior
regarding size_zero_node.

diff --git gcc/fold-const.c gcc/fold-const.c
index 5a118ca50a1..2c47d1d16a4 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -14109,22 +14109,21 @@ fold_indirect_ref_1 (location_t loc, tree type, tree 
op0)
   && type == TREE_TYPE (op00type))
{
  tree type_domain = TYPE_DOMAIN (op00type);
- tree min;
+ tree min = size_zero_node;
  if (type_domain != NULL_TREE
- && (min = TYPE_MIN_VALUE (type_domain))
+ && TYPE_MIN_VALUE (type_domain)
  && TREE_CODE (min) == INTEGER_CST)
+   min = TYPE_MIN_VALUE (type_domain);
+ offset_int off = wi::to_offset (op01);
+ offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (type));
+ offset_int remainder;
+ off = wi::divmod_trunc (off, el_sz, SIGNED, );
+ if (remainder == 0)
{
- offset_int off = wi::to_offset (op01);
- offset_int el_sz = wi::to_offset (TYPE_SIZE_UNIT (type));
- offset_int remainder;
- off = wi::divmod_trunc (off, el_sz, SIGNED, );
- if (remainder == 0)
-   {
- off = off + wi::to_offset (min);
- op01 = wide_int_to_tree (sizetype, off);
- return build4_loc (loc, ARRAY_REF, type, op00, op01,
-NULL_TREE, NULL_TREE);
-   }
+ off = off + wi::to_offset (min);
+ op01 = wide_int_to_tree (sizetype, off);
+ return build4_loc (loc, ARRAY_REF, type, op00, op01,
+NULL_TREE, NULL_TREE);
}
}
}

Marek


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread Uros Bizjak
On Tue, Aug 8, 2017 at 6:38 PM, H.J. Lu  wrote:

> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
> this optimization removes more than 730
>
> pushq %rbp
> movq %rsp, %rbp
> popq %rbp

 If you don't want the frame pointer, why are you compiling with
 -fno-omit-frame-pointer?  Are you going to add
 -fforce-no-omit-frame-pointer or something similar so that people can
 actually get what they are asking for?  This doesn't really make sense.
 It is perfectly fine to omit frame pointer by default, when it isn't
 required for something, but if the user asks for it, we shouldn't ignore 
 his
 request.

>>>
>>>
>>> wanting a framepointer is very nice and desired...  ... but if the
>>> optimizer/ins scheduler moves instructions outside of the frame'd
>>> portion, (it does it for cases like below as well), the value is
>>> already negative for these functions that don't have stack use.
>>>
>>> :
>>> movall_schedules@@Base-0x38460,%rax
>>> push   %rbp
>>> mov%rsp,%rbp
>>> pop%rbp
>>> cmpq   $0x0,(%rax)
>>> setne  %al
>>> movzbl %al,%eax
>>> retq
>>
>> Yeah, and it could be even weirder for big single-block functions.
>> I think GCC has been doing this kind of scheduling of prologue and
>> epilogue instructions for a while, so there hasn*t really been a
>> guarantee which parts of the function will have a new FP and which
>> will still have the old one.
>>
>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1), shrink-wrapping
>> kicks in when the following is compiled with -O3 -fno-omit-frame-pointer:
>>
>> void f (int *);
>> void
>> g (int *x)
>> {
>>   for (int i = 0; i < 1000; ++i)
>> x[i] += 1;
>>   if (x[0])
>> {
>>   int temp;
>>   f ();
>> }
>> }
>>
>> so only the block with the call to f sets up FP.  The relatively
>> long-running loop runs with the caller's FP.
>>
>> I hope we can go for a target-independent position that what HJ*s
>> patch does is OK...
>>
>
> In light of this,  I am resubmitting my patch.  I added 3 more testcases
> and also handle:
>
> typedef int v8si __attribute__ ((vector_size (32)));
>
> void
> foo (v8si *out_start, v8si *out_end, v8si *regions)
> {
> v8si base = regions[3];
> *out_start = base;
> *out_end = base;
> }
>
> OK for trunk?

I think that the patch doesn't worsen the situation with FP debugging,
a couple of cases were presented where function operates on the caller
frame. Let's wait a bit for a counter-examples, where the patch hurts
debugging. IMO, the patch is the way to go, as shrink-wrapping is more
toxic than presented patch.

Uros.


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread Richard Biener
On August 8, 2017 6:38:30 PM GMT+02:00, "H.J. Lu"  wrote:
>On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
> wrote:
>> Arjan van de Ven  writes:
>>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
 On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
> When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
> this optimization removes more than 730
>
> pushq %rbp
> movq %rsp, %rbp
> popq %rbp

 If you don't want the frame pointer, why are you compiling with
 -fno-omit-frame-pointer?  Are you going to add
 -fforce-no-omit-frame-pointer or something similar so that people
>can
 actually get what they are asking for?  This doesn't really make
>sense.
 It is perfectly fine to omit frame pointer by default, when it
>isn't
 required for something, but if the user asks for it, we shouldn't
>ignore his
 request.

>>>
>>>
>>> wanting a framepointer is very nice and desired...  ... but if the
>>> optimizer/ins scheduler moves instructions outside of the frame'd
>>> portion, (it does it for cases like below as well), the value is
>>> already negative for these functions that don't have stack use.
>>>
>>> :
>>> movall_schedules@@Base-0x38460,%rax
>>> push   %rbp
>>> mov%rsp,%rbp
>>> pop%rbp
>>> cmpq   $0x0,(%rax)
>>> setne  %al
>>> movzbl %al,%eax
>>> retq
>>
>> Yeah, and it could be even weirder for big single-block functions.
>> I think GCC has been doing this kind of scheduling of prologue and
>> epilogue instructions for a while, so there hasn*t really been a
>> guarantee which parts of the function will have a new FP and which
>> will still have the old one.
>>
>> Also, with an arbitrarily-picked host compiler (GCC 6.3.1),
>shrink-wrapping
>> kicks in when the following is compiled with -O3
>-fno-omit-frame-pointer:
>>
>> void f (int *);
>> void
>> g (int *x)
>> {
>>   for (int i = 0; i < 1000; ++i)
>> x[i] += 1;
>>   if (x[0])
>> {
>>   int temp;
>>   f ();
>> }
>> }
>>
>> so only the block with the call to f sets up FP.  The relatively
>> long-running loop runs with the caller's FP.
>>
>> I hope we can go for a target-independent position that what HJ*s
>> patch does is OK...
>>
>
>In light of this,  I am resubmitting my patch.  I added 3 more
>testcases
>and also handle:
>
>typedef int v8si __attribute__ ((vector_size (32)));
>
>void
>foo (v8si *out_start, v8si *out_end, v8si *regions)
>{
>v8si base = regions[3];
>*out_start = base;
>*out_end = base;
>}
>
>OK for trunk?

The invoker specified -fno-omit-frame-pointer, why did you eliminate it?  I'd 
argue it's OK when neither -f nor -fno- is explicitly specified irrespective of 
the default in case we document the change but an explicit -fno- is pretty 
clear.

And yes, unfortunate placement of frame pointer init/de-init should be fixed.

Richard.

>
>Thanks.



Re: [PATCH 0/5] RFC, WIP: RTL cost improvements

2017-08-08 Thread Richard Sandiford
Jeff Law  writes:
> On 08/05/2017 11:15 AM, Segher Boessenkool wrote:
>> On Sat, Aug 05, 2017 at 11:07:20AM +0100, Richard Sandiford wrote:
>>> Segher Boessenkool  writes:
 This series creates pattern_cost and insn_cost functions that together
 replace the existing insn_rtx_cost function.

 pattern_cost is like the old insn_rtx_cost function; insn_cost takes
 an actual rtx_insn * as input, not just a pattern.

 Also a targetm.insn_cost is added, which targets can use to implement
 a more exact cost more easily.

 The combine patch is pretty gross (but functional), it needs some
 refactoring (to not call recog so often).  The rs6000 patch is very
 much a work in progress.

 How does this look?  Is this the right direction?
>>>
>>> Seems good to me FWIW.  Since this hook is entirely new, would it
>>> be worth standardising on attribute names for size and speed costs,
>>> a bit like "length" and "enabled"?  I think otherwise the target hooks
>>> are going to end up with similar boilerplate.
>> 
>> For size cost I currently use just "length", but I haven't looked at
>> size cost much at all yet.
> I think that's fine.  "length" is pretty standardized at this point and
> it's the right metric.  For ports that don't bother defining a length
> attribute, punt in some reasonable manner.
>
>
>
>> 
>> For speed cost I primarily use "type", modified by the number of machine
>> insns a pattern generates (quite a few are split); and I get the number
>> of machine insns just from "length" again, which for rs6000 is easy and
>> correct in most cases.  Some other targets may need something else.
>
>> 
>> I also have an attribute "cost" that can be used to override the
>> default calculation; that seems useful to standardise on.  I've pondered
>> a "cost_adjust" that will be added to the calculated cost instead, but
>> it hasn't been useful so far.
> Let's go ahead and "reserve" cost and cost_adjust for this purpose.  If
> we find that cost_adjust isn't actually necessary, then so be it, it's
> not a big deal to me.

I was thinking we should have separate attributes for size and speed
from the outset.   How about size_cost and speed_cost?  It'd be up
to the target to decide whether to define them as the sums of various
sub-attributes (like it's the target's decision how to break "enabled"
down).

The advantage of doing it all in attributes is that the generator might
be able to help optimise the process of checking for costs.  No promises
though :-)

TBH I think we should start with the attributes as well-defined names
and only add the hook if we find we still need it.  I can't really see
what a C function would give over keeping the costs with the instruction
definitions.

Thanks,
Richard


[PATCH, i386]: Make stack canary location customizable (PR target/81708)

2017-08-08 Thread Uros Bizjak
Hello!

Attached patch introduces -mstack-protector-guard-reg=  and
-mstack-protector-guard-offset= options to make stack canary location
customizable. These are the same options powerpc has.

2017-08-08  Uros Bizjak  

PR target/81708
* config/i386/i386.opt (mstack-protector-guard-reg=): New option
(mstack-protector-guard-offset=): Ditto.
* config/i386/i386.c (ix86_option_override): Handle
-mstack-protector-guard-reg= and -mstack-protector-guard-offset=
options.
(ix86_stack_protect_guard): Use ix86_stack_protect_guard_reg and
ix86_stack_protect_guard_offset variables.
(TARGET_STACK_PROTECT_GUARD): Always define.
* doc/invoke.texi (x86 Options): Document -mstack-protector-guard-reg=
and -mstack-protector-guard-offset= options.

testsuite/ChangeLog:

2017-08-08  Uros Bizjak  

PR target/81708
* gcc.target/i386/stack-prot-guard.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN. I will mention new options in gcc-8 changes
webpage later.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 250960)
+++ config/i386/i386.c  (working copy)
@@ -6662,6 +6662,69 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_ix86_stack_protector_guard
   = TARGET_HAS_BIONIC ? SSP_GLOBAL : SSP_TLS;
 
+#ifdef TARGET_THREAD_SSP_OFFSET
+  ix86_stack_protector_guard_offset = TARGET_THREAD_SSP_OFFSET;
+#endif
+
+  if (global_options_set.x_ix86_stack_protector_guard_offset_str)
+{
+  char *endp;
+  const char *str = ix86_stack_protector_guard_offset_str;
+
+  errno = 0;
+  int64_t offset;
+
+#if defined(INT64_T_IS_LONG)
+  offset = strtol (str, , 0);
+#else
+  offset = strtoll (str, , 0);
+#endif
+
+  if (!*str || *endp || errno)
+   error ("%qs is not a valid number "
+  "in -mstack-protector-guard-offset=", str);
+
+  if (!IN_RANGE (offset, HOST_WIDE_INT_C (-0x8000),
+HOST_WIDE_INT_C (0x7fff)))
+   error ("%qs is not a valid offset "
+  "in -mstack-protector-guard-offset=", str);
+
+  ix86_stack_protector_guard_offset = offset;
+}
+
+  ix86_stack_protector_guard_reg = DEFAULT_TLS_SEG_REG;
+
+  /* The kernel uses a different segment register for performance
+ reasons; a system call would not have to trash the userspace
+ segment register, which would be expensive.  */
+  if (ix86_cmodel == CM_KERNEL)
+ix86_stack_protector_guard_reg = ADDR_SPACE_SEG_GS;
+
+  if (global_options_set.x_ix86_stack_protector_guard_reg_str)
+{
+  const char *str = ix86_stack_protector_guard_reg_str;
+  addr_space_t seg = ADDR_SPACE_GENERIC;
+
+  /* Discard optional register prefix.  */
+  if (str[0] == '%')
+   str++;
+
+  if (strlen (str) == 2 && str[1] == 's')
+   {
+ if (str[0] == 'f')
+   seg = ADDR_SPACE_SEG_FS;
+ else if (str[0] == 'g')
+   seg = ADDR_SPACE_SEG_GS;
+   }
+
+  if (seg == ADDR_SPACE_GENERIC)
+   error ("%qs is not a valid base register "
+  "in -mstack-protector-guard-reg=",
+  ix86_stack_protector_guard_reg_str);
+
+  ix86_stack_protector_guard_reg = seg;
+}
+
   /* Handle -mmemcpy-strategy= and -mmemset-strategy=  */
   if (opts->x_ix86_tune_memcpy_strategy)
 {
@@ -45795,7 +45858,6 @@ ix86_mangle_type (const_tree type)
 }
 }
 
-#ifdef TARGET_THREAD_SSP_OFFSET
 static tree
 ix86_stack_protect_guard (void)
 {
@@ -45802,20 +45864,13 @@ ix86_stack_protect_guard (void)
   if (TARGET_SSP_TLS_GUARD)
 {
   tree type_node = lang_hooks.types.type_for_mode (ptr_mode, 1);
-  addr_space_t as = DEFAULT_TLS_SEG_REG;
 
-  /* The kernel uses a different segment register for performance
-reasons; a system call would not have to trash the userspace
-segment register, which would be expensive.  */
-  if (ix86_cmodel == CM_KERNEL)
-   as = ADDR_SPACE_SEG_GS;
+  int qual = ENCODE_QUAL_ADDR_SPACE (ix86_stack_protector_guard_reg);
 
-  int qual = ENCODE_QUAL_ADDR_SPACE (as);
-
   tree type = build_qualified_type (type_node, qual);
   tree asptrtype = build_pointer_type (type);
-  tree sspoff = build_int_cst (asptrtype, TARGET_THREAD_SSP_OFFSET);
-
+  tree sspoff = build_int_cst (asptrtype,
+  ix86_stack_protector_guard_offset);
   tree t = build2 (MEM_REF, asptrtype, sspoff,
   build_int_cst (asptrtype, 0));
   return t;
@@ -45823,7 +45878,6 @@ ix86_stack_protect_guard (void)
 
   return default_stack_protect_guard ();
 }
-#endif
 
 /* For 32-bit code we can save PIC register setup by using
__stack_chk_fail_local hidden function instead of calling
@@ -52831,10 +52885,8 @@ ix86_run_selftests (void)
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE 

Re: [PATCH 0/5] RFC, WIP: RTL cost improvements

2017-08-08 Thread Jeff Law
On 08/05/2017 11:15 AM, Segher Boessenkool wrote:
> On Sat, Aug 05, 2017 at 11:07:20AM +0100, Richard Sandiford wrote:
>> Segher Boessenkool  writes:
>>> This series creates pattern_cost and insn_cost functions that together
>>> replace the existing insn_rtx_cost function.
>>>
>>> pattern_cost is like the old insn_rtx_cost function; insn_cost takes
>>> an actual rtx_insn * as input, not just a pattern.
>>>
>>> Also a targetm.insn_cost is added, which targets can use to implement
>>> a more exact cost more easily.
>>>
>>> The combine patch is pretty gross (but functional), it needs some
>>> refactoring (to not call recog so often).  The rs6000 patch is very
>>> much a work in progress.
>>>
>>> How does this look?  Is this the right direction?
>>
>> Seems good to me FWIW.  Since this hook is entirely new, would it
>> be worth standardising on attribute names for size and speed costs,
>> a bit like "length" and "enabled"?  I think otherwise the target hooks
>> are going to end up with similar boilerplate.
> 
> For size cost I currently use just "length", but I haven't looked at
> size cost much at all yet.
I think that's fine.  "length" is pretty standardized at this point and
it's the right metric.  For ports that don't bother defining a length
attribute, punt in some reasonable manner.



> 
> For speed cost I primarily use "type", modified by the number of machine
> insns a pattern generates (quite a few are split); and I get the number
> of machine insns just from "length" again, which for rs6000 is easy and
> correct in most cases.  Some other targets may need something else.

> 
> I also have an attribute "cost" that can be used to override the
> default calculation; that seems useful to standardise on.  I've pondered
> a "cost_adjust" that will be added to the calculated cost instead, but
> it hasn't been useful so far.
Let's go ahead and "reserve" cost and cost_adjust for this purpose.  If
we find that cost_adjust isn't actually necessary, then so be it, it's
not a big deal to me.

Jeff


Re: [PATCH] i386: Don't use frame pointer without stack access

2017-08-08 Thread H.J. Lu
On Mon, Aug 7, 2017 at 1:05 PM, Richard Sandiford
 wrote:
> Arjan van de Ven  writes:
>> On 8/7/2017 8:43 AM, Jakub Jelinek wrote:
>>> On Mon, Aug 07, 2017 at 08:39:24AM -0700, H.J. Lu wrote:
 When Linux/x86-64 kernel is compiled with -fno-omit-frame-pointer.
 this optimization removes more than 730

 pushq %rbp
 movq %rsp, %rbp
 popq %rbp
>>>
>>> If you don't want the frame pointer, why are you compiling with
>>> -fno-omit-frame-pointer?  Are you going to add
>>> -fforce-no-omit-frame-pointer or something similar so that people can
>>> actually get what they are asking for?  This doesn't really make sense.
>>> It is perfectly fine to omit frame pointer by default, when it isn't
>>> required for something, but if the user asks for it, we shouldn't ignore his
>>> request.
>>>
>>
>>
>> wanting a framepointer is very nice and desired...  ... but if the
>> optimizer/ins scheduler moves instructions outside of the frame'd
>> portion, (it does it for cases like below as well), the value is
>> already negative for these functions that don't have stack use.
>>
>> :
>> movall_schedules@@Base-0x38460,%rax
>> push   %rbp
>> mov%rsp,%rbp
>> pop%rbp
>> cmpq   $0x0,(%rax)
>> setne  %al
>> movzbl %al,%eax
>> retq
>
> Yeah, and it could be even weirder for big single-block functions.
> I think GCC has been doing this kind of scheduling of prologue and
> epilogue instructions for a while, so there hasn*t really been a
> guarantee which parts of the function will have a new FP and which
> will still have the old one.
>
> Also, with an arbitrarily-picked host compiler (GCC 6.3.1), shrink-wrapping
> kicks in when the following is compiled with -O3 -fno-omit-frame-pointer:
>
> void f (int *);
> void
> g (int *x)
> {
>   for (int i = 0; i < 1000; ++i)
> x[i] += 1;
>   if (x[0])
> {
>   int temp;
>   f ();
> }
> }
>
> so only the block with the call to f sets up FP.  The relatively
> long-running loop runs with the caller's FP.
>
> I hope we can go for a target-independent position that what HJ*s
> patch does is OK...
>

In light of this,  I am resubmitting my patch.  I added 3 more testcases
and also handle:

typedef int v8si __attribute__ ((vector_size (32)));

void
foo (v8si *out_start, v8si *out_end, v8si *regions)
{
v8si base = regions[3];
*out_start = base;
*out_end = base;
}

OK for trunk?

Thanks.

-- 
H.J.
From d18008a8052973ab8582a1662f42669a9d318d0d Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Sun, 6 Aug 2017 06:24:36 -0700
Subject: [PATCH] i386: Don't use frame pointer without stack access

When there is no stack access, there is no need to use frame pointer
even if -fno-omit-frame-pointer is used.

gcc/

	PR target/81736
	* config/i386/i386.c (ix86_finalize_stack_realign_flags): Renamed
	to ...
	(ix86_finalize_stack_frame_flags): This.  Also clear
	frame_pointer_needed if -fno-omit-frame-pointer is used without
	stack access.
	(ix86_expand_prologue): Replace ix86_finalize_stack_realign_flags
	with ix86_finalize_stack_frame_flags.
	(ix86_expand_epilogue): Likewise.
	(ix86_expand_split_stack_prologue): Likewise.

gcc/testsuite/

	PR target/81736
	* gcc.target/i386/pr81736-1.c: New test.
	* gcc.target/i386/pr81736-2.c: Likewise.
	* gcc.target/i386/pr81736-3.c: Likewise.
	* gcc.target/i386/pr81736-4.c: Likewise.
	* gcc.target/i386/pr81736-5.c: Likewise.
	* gcc.target/i386/pr81736-6.c: Likewise.
	* gcc.target/i386/pr81736-7.c: Likewise.
---
 gcc/config/i386/i386.c| 23 ---
 gcc/testsuite/gcc.target/i386/pr81736-1.c | 13 +
 gcc/testsuite/gcc.target/i386/pr81736-2.c | 14 ++
 gcc/testsuite/gcc.target/i386/pr81736-3.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr81736-4.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr81736-5.c | 20 
 gcc/testsuite/gcc.target/i386/pr81736-6.c | 16 
 gcc/testsuite/gcc.target/i386/pr81736-7.c | 13 +
 8 files changed, 110 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr81736-7.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index dfef996e36c..ed51595298d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -14116,10 +14116,11 @@ output_probe_stack_range (rtx reg, rtx end)
   return "";
 }
 
-/* Finalize stack_realign_needed flag, which will guide prologue/epilogue
-   to be generated in correct form.  */
+/* Finalize 

Re: [PATCH][v2] Fix target attribute handling (PR c++/81355).

2017-08-08 Thread Jason Merrill
On Tue, Aug 8, 2017 at 12:59 AM, Martin Liška  wrote:
> On 08/02/2017 09:56 PM, Martin Sebor wrote:
>> On 08/02/2017 01:04 PM, Jeff Law wrote:
>>> On 07/28/2017 05:13 AM, Martin Liška wrote:
 Hello.

 Following patch skips empty strings in 'target' attribute.
 Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

 Ready to be installed?
 Martin

 gcc/ChangeLog:

 2017-07-13  Martin Liska  

 PR c++/81355
 * attribs.c (sorted_attr_string): Skip empty strings.

 gcc/testsuite/ChangeLog:

 2017-07-13  Martin Liska  

 PR c++/81355
 * g++.dg/other/pr81355.C: New test.
>>> OK.  THough one could legitimately ask if this ought to be a parsing error.
>>
>> I would suggest to at least issue a warning.  It seems that
>> the empty string must almost certainly be a bug here, say due
>> to unintended macro expansion.
>>
>> Otherwise, if it should remain silently (and intentionally)
>> accepted, I recommend to document it.  As it is, the manual
>> says that the "string" argument is equivalent to compiling
>> with -mstring which for "" would be rejected by the driver.
>>
>> Martin
>
> Thanks you both for feedback. I decided to come up with an error message for 
> that.
>
> Feedback appreciated.

I'd think to check it in handle_target_argument rather than require
all targets to check it.

Jason


Re: [PATCH 1/2] simplify-rtx: The truncation of an IOR can have all bits set (PR81423)

2017-08-08 Thread Jeff Law
On 08/07/2017 04:33 PM, Segher Boessenkool wrote:
> On Tue, Jul 25, 2017 at 06:25:49AM -0500, Segher Boessenkool wrote:
>> On Mon, Jul 24, 2017 at 04:06:39PM -0600, Jeff Law wrote:
 2017-07-24  Segher Boessenkool  

 gcc/testsuite/
PR rtl-optimization/81423
* gcc.c-torture/execute/pr81423.c: New testcase.
>>> I think int32plus just indicates ints are at least 32 bits. But a long
>>> or long long could still be just 32 bits.  so int32plus && long_neq_int,
>>> to ensure that long/long long are 64 bits?
>>
>> Well, long long is required to be 64 bits or more by the C standard.
>> But some GCC targets do not follow that, with certain options at least.
>>
>> It looks like that test actually requires long long to be *exactly*
>> 64 bits.  I'll modify the test to test for that.
> 
> So I came up with the following.  Is this okay for trunk?  (Tested on
> powerpc64-linux and x86_64-linux, both with both -m32 and -m64, and
> tested it does fail on x86 without the patches to fix the bug).
> 
> 
> Segher
> 
> 
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr81423.c 
> b/gcc/testsuite/gcc.c-torture/execute/pr81423.c
> new file mode 100644
> index 000..731aa8f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr81423.c
> @@ -0,0 +1,36 @@
> +extern void abort (void);
> +
> +unsigned long long int ll = 0;
> +unsigned long long int ull1 = 1ULL;
> +unsigned long long int ull2 = 12008284144813806346ULL;
> +unsigned long long int ull3;
> +
> +unsigned long long int __attribute__ ((noinline))
> +foo (void)
> +{
> +  ll = -5597998501375493990LL;
> +
> +  ll = (5677365550390624949L - ll) - (ull1 > 0);
> +  unsigned long long int ull3;
> +  ull3 = (unsigned int)
> +(2067854353L <<
> + (((ll + -2129105131L) ^ 10280750144413668236ULL) -
> +  10280750143997242009ULL)) >> ((2873442921854271231ULL | ull2)
> + - 12098357307243495419ULL);
> +
> +  return ull3;
> +}
> +
> +int
> +main (void)
> +{
> +  /* We need a long long of exactly 64 bits for this test.  */
> +  ll--;
> +  if (ll != 0xULL)
> +return 0;
I think we've used sizeof to check this in the past.  But I'm not wed to
that approach.  If it's working for you, then let's go with it.

jeff



[PATCH 0/3] improve detection of attribute conflicts (PR 81544)

2017-08-08 Thread Martin Sebor

This patch series improves the attribute checking infrastructure
to make it easy to express mutual exclusivity between attributes,
detect and drop conflicting attributes as they are being added
to declarations, and help find sources of conflicts when they
involve two distinct declarations of the same symbols.

The details are below.

Thanks
Martin

The existing attribute handling infrastructure doesn't make
it possible for an attribute handler to diagnose conflicts
between attributes specified on distinct declarations of
the same function or object.  As a result, the handlers that
make an effort to detect conflicts only diagnose them when
they occur on the same declaration.  Such conflicts are easy
to spot during code review and so the diagnostics are of only
marginal help.  The hard-to-spot conflicts are those that are
between attributes on distinct declarations, and those for
the most part aren't detected.

Marek made a nice improvement in this area last year in r236129
by detecting a small subset of these problems in c-common.c (now
in c-warn.c).  The diagnose_mismatched_attributes function detects
conflicts between attribute always_inline and noinline.  However,
the function issues warnings only after the conflicting attributes
have been applied to the declaration without removing any, so code
that then makes use of the attributes and isn't prepared to deal
with the conflict may work in surprising ways.

To help solve the problem the patch adds to each attribute
an optional array of "exclusions" or names of other attributes
that a given attribute is mutually exclusive with.  It then
modifies the decl_attributes function to traverse the array and
diagnose (and drop) each attribute on a newly seen declaration
that is in conflict with an attribute already applied to it.
When a conflict involves two declarations of the same symbols
decl_attributes also points to the last known declarations to
help find the source of the conflict.

The patch only adds exclusions to the C and C++ front ends, but
other than initializing the new exclusion pointer to NULL doesn't
take advantage of this new feature in the back ends.  I'd like
to go and add exclusions to a small number of back ends as
examples for maintainers to follow after this patch has been
reviewed and approved.  My hope is that the exclusions array
will serve as a reminder to consider potential conflicts when
new attributes are being added, and provide an easier mechanism
to reject them than having to handcode it in each new handler.


[PATCH 3/3] diagnose attribute aligned conflicts (PR 81566)

2017-08-08 Thread Martin Sebor

Patch 3 in the series restores the diagnostics for conflicting
attribute aligned on the same function (this regressed in r192199
in GCC 4.9), and also makes use of the enhanced infrastructure to
enhance the detection of the same conflicts on distinct declarations
of the same function.

Rather than issuing an error for the confluct as was done in GCC 4.8
I made it a -Wattributes warning.  That seems more in line with how
otherwise syntactically valid attributes are handled elsewhere, and
it also lets code developed since the regression was introduced to
temporarily suppress the warning until the problem is conflict is
corrected.

Martin
PR c/81566 - invalid attribute aligned accepted on functions

gcc/c-family/ChangeLog:

	PR c/81566
	c-attribs.c (handle_aligned_attribute): Diagnose conflicting
	attribute specifications.

gcc/testsuite/ChangeLog:

	PR c/8166
	* c-c++-common/Wattributes-2.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index ac704bc..5a37dc2 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -1769,14 +1769,18 @@ check_cxx_fundamental_alignment_constraints (tree node,
struct attribute_spec.handler.  */
 
 static tree
-handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
+handle_aligned_attribute (tree *node, tree name, tree args,
 			  int flags, bool *no_add_attrs)
 {
   tree decl = NULL_TREE;
   tree *type = NULL;
-  int is_type = 0;
+  bool is_type = false;
   tree align_expr;
-  int i;
+
+  /* The last (already pushed) declaration with all validated attributes
+ merged in or the current about-to-be-pushed one if one hassn't been
+ yet.  */
+  tree last_decl = node[1] ? node[1] : *node;
 
   if (args)
 {
@@ -1795,10 +1799,21 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   is_type = TREE_CODE (*node) == TYPE_DECL;
 }
   else if (TYPE_P (*node))
-type = node, is_type = 1;
+type = node, is_type = true;
+
+  /* Log2 of specified alignment.  */
+  int pow2align = check_user_alignment (align_expr, true);
+
+  /* The alignment in bits corresponding to the specified alignment.  */
+  unsigned bitalign = (1U << pow2align) * BITS_PER_UNIT;
 
-  if ((i = check_user_alignment (align_expr, true)) == -1
-  || !check_cxx_fundamental_alignment_constraints (*node, i, flags))
+  /* The alignment of the current declaration and that of the last
+ pushed declaration, determined on demand below.  */
+  unsigned curalign = 0;
+  unsigned lastalign = 0;
+
+  if (pow2align == -1
+  || !check_cxx_fundamental_alignment_constraints (*node, pow2align, flags))
 *no_add_attrs = true;
   else if (is_type)
 {
@@ -1819,7 +1834,7 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   else
 	*type = build_variant_type_copy (*type);
 
-  SET_TYPE_ALIGN (*type, (1U << i) * BITS_PER_UNIT);
+  SET_TYPE_ALIGN (*type, bitalign);
   TYPE_USER_ALIGN (*type) = 1;
 }
   else if (! VAR_OR_FUNCTION_DECL_P (decl)
@@ -1828,8 +1843,34 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   error ("alignment may not be specified for %q+D", decl);
   *no_add_attrs = true;
 }
+  else if (TREE_CODE (decl) == FUNCTION_DECL
+	   && ((curalign = DECL_ALIGN (decl)) > bitalign
+	   || ((lastalign = DECL_ALIGN (last_decl)) > bitalign)))
+{
+  /* Either a prior attribute on the same declaration or one
+	 on a prior declaration of the same function specifies
+	 stricter alignment than this attribute.  */
+  bool note = lastalign != 0;
+  if (lastalign)
+	curalign = lastalign;
+
+  curalign /= BITS_PER_UNIT;
+  bitalign /= BITS_PER_UNIT;
+
+  if (DECL_USER_ALIGN (decl) || DECL_USER_ALIGN (last_decl))
+	warning (OPT_Wattributes,
+		 "ignoring attribute %<%E (%u)%> because it conflicts with "
+		 "attribute %<%E (%u)%>", name, bitalign, name, curalign);
+  else
+	error ("alignment for %q+D must be at least %d", decl, curalign);
+
+  if (note)
+	inform (DECL_SOURCE_LOCATION (last_decl), "previous declaration here");
+
+  *no_add_attrs = true;
+}
   else if (DECL_USER_ALIGN (decl)
-	   && DECL_ALIGN (decl) > (1U << i) * BITS_PER_UNIT)
+	   && DECL_ALIGN (decl) > bitalign)
 /* C++-11 [dcl.align/4]:
 
 	   When multiple alignment-specifiers are specified for an
@@ -1839,21 +1880,9 @@ handle_aligned_attribute (tree *node, tree ARG_UNUSED (name), tree args,
   This formally comes from the c++11 specification but we are
   doing it for the GNU attribute syntax as well.  */
 *no_add_attrs = true;
-  else if (TREE_CODE (decl) == FUNCTION_DECL
-	   && DECL_ALIGN (decl) > (1U << i) * BITS_PER_UNIT)
-{
-  if (DECL_USER_ALIGN (decl))
-	error ("alignment for %q+D was previously specified as %d "
-	   "and may not be decreased", decl,
-	   DECL_ALIGN (decl) / BITS_PER_UNIT);
-  else
-	error ("alignment for %q+D must be at least %d", decl,
-	   

[PATCH 2/3] improve detection of attribute conflicts (PR 81544)

2017-08-08 Thread Martin Sebor

Part 2 of the series contains the mechanical changes to all
the back ends and to front ends other than C and C++.  There
are no functional changes here.

Martin

PR c/81544 - attribute noreturn and warn_unused_result on the same function accepted

gcc/ChangeLog:

	PR c/81544
	* config/alpha/alpha.c (vms_attribute_table): Initialize new member
	of struct attribute_spec.
	* config/arm/arm.c (arm_attribute_table): Same.
	* config/avr/avr.c ( avr_attribute_table): Same.
	* config/bfin/bfin.c (bfin_attribute_table): Same.
	* config/cr16/cr16.c (cr16_attribute_table): Same.
	* config/epiphany/epiphany.c (epiphany_attribute_table): Same.
	* config/h8300/h8300.c (h8300_attribute_table): Same.
	* config/i386/i386.c (ix86_attribute_table): Same.
	* config/ia64/ia64.c (ia64_attribute_table): Same.
	* config/m32c/m32c.c (m32c_attribute_table): Same.
	* config/m32r/m32r.c (m32r_attribute_table): Same.
	* config/m68k/m68k.c (m68k_attribute_table): Same.
	* config/mcore/mcore.c (mcore_attribute_table): Same.
	* config/mips/mips.c (mips_attribute_table): Same.
	* config/nds32/nds32.c (nds32_attribute_table): Same.
	* config/nvptx/nvptx.c (nvptx_attribute_table): Same.
	* config/powerpcspe/powerpcspe.c (rs6000_attribute_table): Same.
	* config/rs6000/rs6000.c (rs6000_attribute_table): Same.
	* config/s390/s390.c (s390_handle_vectorbool_attribute): Same.
	* config/sh/sh.c (sh_attribute_table): Same.
	* config/sparc/sparc.c (sparc_attribute_table): Same.
	* config/spu/spu.c (spu_attribute_table): Same.
	* config/stormy16/stormy16.c (xstormy16_attribute_table): Same.
	* config/v850/v850.c (v850_attribute_table): Same.
	* config/visium/visium.c (visium_attribute_table): Same.

gcc/ada/ChangeLog:

	PR c/81544
	* gcc-interface/utils.c (gnat_internal_attribute_table): Initialize
	new member of struct attribute_spec.

gcc/fortran/ChangeLog:

	PR c/81544
	* f95-lang.c (gfc_attribute_table): Initialize new member of struct
	attribute_spec.

gcc/lto/ChangeLog:

	PR c/81544
	* lto-lang.c (lto_attribute_table): Initialize new member of struct
	attribute_spec.

diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 341ec20..06147e6 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -7488,9 +7488,10 @@ common_object_handler (tree *node, tree name ATTRIBUTE_UNUSED,
 static const struct attribute_spec vms_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
-   affects_type_identity } */
-  { COMMON_OBJECT,   0, 1, true,  false, false, common_object_handler, false },
-  { NULL,0, 0, false, false, false, NULL, false }
+   affects_type_identity, exclusions } */
+  { COMMON_OBJECT,   0, 1, true,  false, false, common_object_handler, false,
+NULL },
+  { NULL,0, 0, false, false, false, NULL, false, NULL }
 };
 
 void
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 25677d1..f5d30bd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -317,24 +317,24 @@ static machine_mode arm_floatn_mode (int, bool);
 static const struct attribute_spec arm_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
-   affects_type_identity } */
+   affects_type_identity, exclusions } */
   /* Function calls made to this symbol must be done indirectly, because
  it may lie outside of the 26 bit addressing range of a normal function
  call.  */
-  { "long_call",0, 0, false, true,  true,  NULL, false },
+  { "long_call",0, 0, false, true,  true,  NULL, false, NULL },
   /* Whereas these functions are always known to reside within the 26 bit
  addressing range.  */
-  { "short_call",   0, 0, false, true,  true,  NULL, false },
+  { "short_call",   0, 0, false, true,  true,  NULL, false, NULL },
   /* Specify the procedure call conventions for a function.  */
   { "pcs",  1, 1, false, true,  true,  arm_handle_pcs_attribute,
-false },
+false, NULL },
   /* Interrupt Service Routines have special prologue and epilogue requirements.  */
   { "isr",  0, 1, false, false, false, arm_handle_isr_attribute,
-false },
+false, NULL },
   { "interrupt",0, 1, false, false, false, arm_handle_isr_attribute,
-false },
+false, NULL },
   { "naked",0, 0, true,  false, false, arm_handle_fndecl_attribute,
-false },
+false, NULL },
 #ifdef ARM_PE
   /* ARM/PE has three new attributes:
  interfacearm - ?
@@ -345,22 +345,24 @@ static const struct attribute_spec arm_attribute_table[] =
  them with spaces.  We do NOT support this.  Instead, use __declspec
  multiple times.
   */
-  { "dllimport",0, 0, true,  false, false, NULL, false },
-  { "dllexport",0, 0, true,  false, false, NULL, false },
+  { "dllimport",0, 0, true,  false, false, NULL, false, NULL },
+  { "dllexport",0, 0, true,  false, false, NULL, false, NULL },
   { "interfacearm", 0, 0, true,  false, false, arm_handle_fndecl_attribute,

[PATCH 1/3] improve detection of attribute conflicts (PR 81544)

2017-08-08 Thread Martin Sebor

Part 1 of the patch series is the core of the enhancement without
the (trivial) changes to the back ends and front ends other than
C and C++.

Marek, I tried to follow what we discussed in IRC.  Please let
me know if I missed something.

Joseph, I read your comments on Marek's patch for the same bug:
  https://gcc.gnu.org/ml/gcc-patches/2017-07/msg02077.html
I had to make some changes to meet that expectation, in part
because decl_attributes reorders attributes, but with those
I believe it does what you expect.

Jason, I'd appreciate your review of the C++ bits.  I couldn't
find a better way of finding an existing declaration of a function
given one that's about to be introduced, but it feels like there
ought to be an existing API somewhere that I missed.

Martin

PR c/81544 - attribute noreturn and warn_unused_result on the same function accepted

gcc/c/ChangeLog:

	PR c/81544
	* c-decl.c (c_decl_attributes): Look up existing declaration and
	pass it to decl_attributes.

gcc/c-family/ChangeLog:

	PR c/81544
	* c-attribs.c (attr_aligned_exclusions): New array.
	(attr_alloc_exclusions, attr_cold_hot_exclusions): Same.
	(attr_common_exclusions, attr_const_pure_exclusions): Same.
	(attr_gnu_inline_exclusions, attr_inline_exclusions): Same.
	(attr_noreturn_exclusions, attr_returns_twice_exclusions): Same.
	(attr_warn_unused_result_exclusions): Same.
	(handle_hot_attribute, handle_cold_attribute): Simplify.
	(handle_const_attribute): Warn on function returning void.
	(handle_pure_attribute): Same.
	(handle_aligned_attribute): Warn on conflicting specifications.
	* c-warn.c (diagnose_mismatched_attributes): Simplify.

gcc/cp/ChangeLog:

	PR c/81544
	* decl2.c (cplus_decl_attributes): Look up existing declaration and
	pass it to decl_attributes.
	* name-lookup.c (find_decl): New function.
	* name-lookup.h (find_decl): Declare it.
	* tree.c (cxx_attribute_table): Initialize new member of struct
	attribute_spec.

gcc/testsuite/ChangeLog:

	PR c/81544
	* c-c++-common/Wattributes-2.c: New test.
	* c-c++-common/Wattributes.c: New test.
	* c-c++-common/attributes-3.c: Adjust.
	* g++.dg/ext/mv11.s: New test.
	* gcc.dg/attr-noinline.c: Adjust.
	* gcc.dg/pr44964.c: Same.
	* gcc.dg/torture/pr42363.c: Same.
	* gcc.dg/tree-ssa/ssa-ccp-2.c: Same.

libstdc++-v3/ChangeLog:

	PR c/81544
	* include/ext/mt_allocator.h (_M_destroy_thread_key): Remove
	pointless attribute const.

gcc/ChangeLog:

	PR c/81544
	* attribs.c (empty_attribute_table): Initialize new member of
	struct attribute_spec.
	(decl_attributes): Add argument.  Handle mutually exclusive
	combinations of attributes.
	* attribs.h (decl_attributes): Add default argument.
	* tree-core.h (attribute_spec::exclusions, exclude): New type and
	member.

 /* Associates a GNAT tree node to a GCC tree node. It is used in
diff --git a/gcc/attribs.c b/gcc/attribs.c
index 05fa8ef..df3104b 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -94,7 +94,7 @@ static bool attributes_initialized = false;
 
 static const struct attribute_spec empty_attribute_table[] =
 {
-  { NULL, 0, 0, false, false, false, NULL, false }
+  { NULL, 0, 0, false, false, false, NULL, false, NULL }
 };
 
 /* Return base name of the attribute.  Ie '__attr__' is turned into 'attr'.
@@ -343,6 +343,97 @@ get_attribute_namespace (const_tree attr)
   return get_identifier ("gnu");
 }
 
+/* Check LAST_DECL and NODE of the same symbol for attributes that are
+   recorded in EXCL to be mutually exclusive with ATTRNAME, diagnose
+   them, and return true if any have been found.  NODE can be a DECL
+   or a TYPE.  */
+
+static bool
+diag_attr_exclusions (tree last_decl, tree node, tree attrname,
+		  const attribute_spec *spec)
+{
+  const attribute_spec::exclusions *excl = spec->exclude;
+
+  tree_code code = TREE_CODE (node);
+
+  if ((code == FUNCTION_DECL && !excl->function
+   && (!excl->type || !spec->affects_type_identity))
+  || (code == VAR_DECL && !excl->variable
+	  && (!excl->type || !spec->affects_type_identity))
+  || (((code == TYPE_DECL || RECORD_OR_UNION_TYPE_P (node)) && !excl->type)))
+return false;
+
+  /* True if an attribute that's mutually exclusive with ATTRNAME
+ has been found.  */
+  bool found = false;
+
+  if (last_decl && last_decl != node && TREE_TYPE (last_decl) != node)
+{
+  /* Check both the last DECL and its type for conflicts with
+	 the attribute being added to the current decl or type.  */
+  found |= diag_attr_exclusions (last_decl, last_decl, attrname, spec);
+  tree decl_type = TREE_TYPE (last_decl);
+  found |= diag_attr_exclusions (last_decl, decl_type, attrname, spec);
+}
+
+  /* NODE is either the current DECL to which the attribute is being
+ applied or its TYPE.  For the former, consider the attributes on
+ both the DECL and its type.  */
+  tree attrs[2];
+
+  if (DECL_P (node))
+{
+  attrs[0] = DECL_ATTRIBUTES (node);
+  attrs[1] = TYPE_ATTRIBUTES (TREE_TYPE (node));
+}
+  else
+{
+  attrs[0] 

[PATCH GCC][OBVIOUS]Handle boundary case for last iv candidate

2017-08-08 Thread Bin Cheng
Hi,
When investigate issues, I ran into this obvious issue that the last candidate 
is not related to compare type iv_use.
This patch fixes it.  Will apply later.

Thanks,
bin
2017-08-08  Bin Cheng  

* tree-ssa-loop-ivopts.c (relate_compare_use_with_all_cands): Handle
boundary case for the last candidate.diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 1cbff04..b65cd96 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -5284,13 +5284,13 @@ set_autoinc_for_original_candidates (struct ivopts_data 
*data)
 static void
 relate_compare_use_with_all_cands (struct ivopts_data *data)
 {
-  unsigned i, max_id = data->vcands.length () - 1;
+  unsigned i, count = data->vcands.length ();
   for (i = 0; i < data->vgroups.length (); i++)
 {
   struct iv_group *group = data->vgroups[i];
 
   if (group->type == USE_COMPARE)
-   bitmap_set_range (group->related_cands, 0, max_id);
+   bitmap_set_range (group->related_cands, 0, count);
 }
 }
 


Re: [PATCH] Convert character arrays to string csts

2017-08-08 Thread Richard Biener
On Tue, Aug 8, 2017 at 1:47 PM, Martin Liška  wrote:
> On 11/09/2016 11:22 AM, Richard Biener wrote:
>> On Fri, Nov 4, 2016 at 2:33 PM, Martin Liška  wrote:
>>> On 11/03/2016 02:00 PM, Jan Hubicka wrote:
> On 11/03/2016 01:12 PM, Martin Liška wrote:
>> +  tree init = DECL_INITIAL (decl);
>> +  if (init
>> +  && TREE_READONLY (decl)
>> +  && can_convert_ctor_to_string_cst (init))
>> +DECL_INITIAL (decl) = build_string_cst_from_ctor (init);
>
> I'd merge these two new functions since they're only ever called
> together. We'd then have something like
>
> if (init && TREE_READONLY (decl))
>   init = convert_ctor_to_string_cst (init);
> if (init)
>   DECL_INITIAL (decl) = init;
>>>
>>> Done.
>>>
>
> I'll defer to Jan on whether finalize_decl seems like a good place
> to do this.

 I think finalize_decl may be bit too early because frontends may expects 
 the
 ctors to be in a way they produced them.  We only want to convert those 
 arrays
 seen by middle-end so I would move the logic to varpool_node::analyze

 Otherwise the patch seems fine to me.

 Honza
>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>> index 283bd1c..b2d1fd5 100644
>> --- a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
>> @@ -4,12 +4,15 @@
>> char *buffer1;
>> char *buffer2;
>>
>> +const char global[] = {'a', 'b', 'c', 'd', '\0'};
>> +
>> #define SIZE 1000
>>
>> int
>> main (void)
>> {
>>   const char* const foo1 = "hello world";
>> +  const char local[] = "abcd";
>>
>>   buffer1 = __builtin_malloc (SIZE);
>>   __builtin_strcpy (buffer1, foo1);
>> @@ -45,6 +48,10 @@ main (void)
>> __builtin_abort ();
>>   if (__builtin_memchr (foo1, null, 12) != foo1 + 11)
>> __builtin_abort ();
>> +  if (__builtin_memchr (global, null, 5) == 0)
>> +__builtin_abort ();
>> +  if (__builtin_memchr (local, null, 5) == 0)
>> +__builtin_abort ();
>
> How is that a meaningful test? This seems to work even with an
> unpatched gcc. I'd have expected something that shows a benefit for
> doing this conversion, and maybe also a test that shows it isn't
> done in cases where it's not allowed.
>>>
>>> It's meaningful as it scans that there's no __builtin_memchr in optimized 
>>> dump.
>>> I'm adding new tests that does the opposite test.
>>>
>
>> tree
>> -build_string_literal (int len, const char *str)
>> +build_string_literal (int len, const char *str, bool build_addr_expr)
>
> New arguments should be documented in the function comment.
>>>
>>> Yep, improved.
>>>
>
>> +/* Return TRUE when CTOR can be converted to a string constant.  */
>
> "if", not "when".
>>>
>>> Done.
>>>
>
>> +  unsigned HOST_WIDE_INT elements = CONSTRUCTOR_NELTS (ctor);
>> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value)
>> +{
>> +  if (key == NULL_TREE
>> + || TREE_CODE (key) != INTEGER_CST
>> + || !tree_fits_uhwi_p (value)
>> + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node))
>> +   return false;
>
> Shouldn't all elements have the same type, or do you really have to
> call useless_type_conversion in a loop?
>
>> +  /* Allow zero character just at the end of a string.  */
>> +  if (integer_zerop (value))
>> +   return idx == elements - 1;
>
> Don't you also have to explicitly check it's there?
>
>
> Bernd
>>>
>>>
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> I'm curious about the
>>
>> @@ -1495,7 +1495,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
>> {
>>   if (!TREE_STATIC (decl))
>> {
>> - DECL_INITIAL (decl) = NULL_TREE;
>> + if (!TREE_READONLY (decl) || TREE_CODE (init) != STRING_CST)
>> +   DECL_INITIAL (decl) = NULL_TREE;
>>   init = build2 (INIT_EXPR, void_type_node, decl, init);
>>   gimplify_and_add (init, seq_p);
>>   ggc_free (init);
>>
>> change.  Why keep DECL_INITIAL if you build a INIT_EXPR anyway?
>>
>> @@ -4438,6 +4439,19 @@ gimplify_init_constructor (tree *expr_p,
>> gimple_seq *pre_p, gimple_seq *post_p,
>> break;
>>   }
>>
>> +   /* Replace a ctor with a string constant with possible.  */
>> +   if (TREE_READONLY (object)
>> +   && VAR_P (object))
>> + {
>> +   tree string_ctor = convert_ctor_to_string_cst (ctor);
>> +   if (string_ctor)
>> + {
>> +   TREE_OPERAND (*expr_p, 1) = string_ctor;
>> 

Re: [PATCH GCC][5/5]Enable tree loop distribution at -O3 and above optimization levels.

2017-08-08 Thread Richard Biener
On Mon, Aug 7, 2017 at 11:10 AM, Bin.Cheng  wrote:
> On Fri, Jun 23, 2017 at 12:04 PM, Richard Biener
>  wrote:
>> On Fri, Jun 23, 2017 at 10:47 AM, Bin.Cheng  wrote:
>>> On Fri, Jun 23, 2017 at 6:04 AM, Jeff Law  wrote:
 On 06/07/2017 02:07 AM, Bin.Cheng wrote:
> On Tue, Jun 6, 2017 at 6:47 PM, Jeff Law  wrote:
>> On 06/02/2017 05:52 AM, Bin Cheng wrote:
>>> Hi,
>>> This patch enables -ftree-loop-distribution by default at -O3 and above 
>>> optimization levels.
>>> Bootstrap and test at O2/O3 on x86_64 and AArch64.  is it OK?
>>>
>>> Note I don't have strong opinion here and am fine with either it's 
>>> accepted or rejected.
>>>
>>> Thanks,
>>> bin
>>> 2017-05-31  Bin Cheng  
>>>
>>>   * opts.c (default_options_table): Enable 
>>> OPT_ftree_loop_distribution
>>>   for -O3 and above levels.
>> I think the question is how does this generally impact the performance
>> of the generated code and to a lesser degree compile-time.
>>
>> Do you have any performance data?
> Hi Jeff,
> At this stage of the patch, only hmmer is impacted and improved
> obviously in my local run of spec2006 for x86_64 and AArch64.  In long
> term, loop distribution is also one prerequisite transformation to
> handle bwaves (at least).  For these two impacted cases, it helps to
> resolve the gap against ICC.  I didn't check compilation time slow
> down, we can restrict it to problem with small partition number if
> that's a problem.
 Just a note. I know you've iterated further with Richi -- I'm not
 objecting to the patch, nor was I ready to approve.

 Are you and Richi happy with this as-is or are you looking to submit
 something newer based on the conversation the two of you have had?
>>> Hi Jeff,
>>> The patch series is updated in various ways according to review
>>> comments, for example, it restricts compilation time by checking
>>> number of data references against MAX_DATAREFS_FOR_DATADEPS as well as
>>> restores data dependence cache.  There are still two missing parts I'd
>>> like to do as followup patches: one is loop nest distribution and the
>>> other is a data-locality cost model (at least) for small cases.  Now
>>> Richi approved most patches except the last major one, but I still
>>> need another iterate for some (approved) patches in order to fix
>>> mistake/typo introduced when I separating the patch.
>>
>> The patch is ok after the approved parts of the ldist series has been 
>> committed.
>> Note your patch lacks updates to invoke.texi (what options are enabled at 
>> -O3).
>> Please adjust that before committing.
> Hi All,
> Given the loop distribution patches have been merged for a while and
> couple of issues fixed.  I am submitting updated patch to enable the
> pass by default at O3/above levels.
> Bootstrap and test on x86_64 and AArch64 ongoing.  Hmmer still can be
> improved.  Is it OK if no failure?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-08-07  Bin Cheng  
>
> * doc/invoke.texi: Document -ftree-loop-distribution for O3.
> * opts.c (default_options_table): Add OPT_ftree_loop_distribution.


Re: [PING #2] [PATCH] enhance -Wrestrict to handle string built-ins (PR 78918)

2017-08-08 Thread Richard Biener
On Mon, Aug 7, 2017 at 1:08 AM, Martin Sebor  wrote:
> On 08/03/2017 02:45 AM, Richard Biener wrote:
>>
>> On Wed, Aug 2, 2017 at 7:10 PM, Jeff Law  wrote:
>>>
>>> On 08/01/2017 03:25 AM, Richard Biener wrote:

 On Tue, Aug 1, 2017 at 11:23 AM, Richard Biener
  wrote:
>
> On Tue, Aug 1, 2017 at 4:27 AM, Martin Sebor  wrote:
>>
>> Richard,
>>
>> in discussing this work Jeff mentioned that your comments on
>> the tree-ssa-alias.c parts would be helpful.  When you have
>> a chance could you please give it a once over and let me know
>> if you have any suggestions or concerns?  There are no visible
>> changes to existing clients of the pass, just extensions that
>> are relied on only by the new diagnostics.
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01264.html
>>
>> I expect to revisit the sprintf %s patch mentioned below and
>> see how to simplify it by taking advantage of the changes
>> implemented here.
>
>
> Your ptr_deref_alias_decl_p returns true when must_alias is true
> but there is no must-alias relationship.
>
> The ao_ref_init_from_ptr_and_size doesn't belong there.
>
> I dislike all of the changes related to returning an alias "size".
>
> Please keep away from the alias machinery.
>
> Warning during folding is similarly bad design.  Please don't add
> more such things.
>
> Thanks for putting this on my radar though.
> Richard.


 Oh, for a constructive comment this all feels like sth for either
 sanitizers or a proper static analysis tool.  As I outlined repeatedly
 I belive GCC could host a static analysis tool but it surely should
 not be interwinded into it's optimization passes or tools.
>>>
>>> I disagree strongly here.
>>>
>>> Sanitiers catch problems after the fact and only if you've compiled your
>>> code with sanitization and manage to find a way to trigger the problem
>>> path.
>>>
>>> Other static analysis tools are useful, but they aren't an integral part
>>> of the edit, compile, debug cycle and due to their cost are often run
>>> long after code was committed.
>>>
>>> Finding useful warnings that can be issued as part of the compile, edit,
>>> debug cycle is, IMHO, far more useful than sanitizers or independent
>>> static checkers.
>>>
>>> --
>>>
>>>
>>> I think finding a way to exploit information that our various analyzers
>>> can provide to give precise warnings is a good thing.  So it seemed
>>> natural that if we wanted to ask a must-alias question that we should be
>>> querying the alias oracle rather than implementing that kind of query
>>> within the sprintf warnings.  I'm not sure why you'd say "Please keep
>>> away from the alias machinery".
>>>
>>> I'm a little confused here...
>>
>>
>> Well, simply because the way as implemented isn't a must-alias query
>> but maybe one that's good enough for warnings (reduces false positives
>> but surely doesn't eliminate them).
>
>
> I'm very interested in reducing the rate of false positives in
> these and all other warnings.  As I mentioned in my comments,
> I did my best to weed them out of the implementation by building
> GDB, Glibc, Busybox, and the Linux kernel.  That of course isn't
> a guarantee that there aren't any.  But the first implementation
> of any non-trivial feature is never perfect, and hardly any
> warning of sufficient complexity is free of false positives, no
> matter here it's implemented (the middle-end, front-end, or
> a standalone static analysis tool).  If you spotted some cases
> I had missed I'd certainly be grateful for examples.  Otherwise,
> they will undoubtedly be reported as more software is exposed
> to the warning and, if possible, fixed, as happens with all
> other warnings.
>
>> There's a must alias query already, in stmt_kills_ref_p.  It's a matter
>> of refactoring to make a refs_must_alias_p.
>>
>> Then propose that "overlap range" stuff separately.
>
>
> I appreciate constructive suggestions for improvements  and I will
> look into the stmt_kills_ref_p suggestion.  But since my work
> elicited such an strong response from you I feel I should explain:
> I tried to make my changes as unintrusive as possible.  The alias
> oracle is a new area for me and I didn't want to make mistakes in
> the process of making overly extensive modifications to it.
>
>> But I'm against lumping this all in as an innocent change suggesting
>> the machinery can do sth (must alias) when it really can't.  I also
>> do not like adding a "must alias" bool to the may-alias APIs as
>> the implementation is fundamentally may-alias, must-alias really
>> is very different.
>
>
> I certainly want to do the right thing and implement the warning
> in a way that makes the most sense.  As I said, I'll look into
> the refactoring, but since my testing shows the current code to

Re: [PATCH] Fix PR81719

2017-08-08 Thread Richard Biener
On Tue, 8 Aug 2017, Bin.Cheng wrote:

> On Tue, Aug 8, 2017 at 1:20 PM, Richard Biener  wrote:
> >
> > The following improves niter analysis for range-based for loops
> > by handling ADDR_EXPR in expand_simple_operations.
> >
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >
> > Richard.
> >
> > 2017-08-08  Richard Biener  
> >
> > PR middle-end/81719
> > * tree-ssa-loop-niter.c: Include tree-dfa.h.
> > (expand_simple_operations): Also look through ADDR_EXPRs with
> > MEM_REF bases treating them as POINTER_PLUS_EXPR.
> >
> > * g++.dg/tree-ssa/pr81719.C: New testcase.
> >
> > Index: gcc/tree-ssa-loop-niter.c
> > ===
> > *** gcc/tree-ssa-loop-niter.c   (revision 250813)
> > --- gcc/tree-ssa-loop-niter.c   (working copy)
> > *** along with GCC; see the file COPYING3.
> > *** 42,47 
> > --- 42,48 
> >   #include "tree-chrec.h"
> >   #include "tree-scalar-evolution.h"
> >   #include "params.h"
> > + #include "tree-dfa.h"
> >
> >
> >   /* The maximum number of dominator BBs we search for conditions
> > *** expand_simple_operations (tree expr, tre
> > *** 1980,1985 
> > --- 1981,2001 
> >
> > if (code == SSA_NAME)
> > return expand_simple_operations (e, stop);
> > +   else if (code == ADDR_EXPR)
> > +   {
> > + HOST_WIDE_INT offset;
> > + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (e, 0),
> > +);
> > + if (base
> > + && TREE_CODE (base) == MEM_REF)
> > +   {
> > + ee = expand_simple_operations (TREE_OPERAND (base, 0), stop);
> > + return fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (expr), ee,
> > + wide_int_to_tree (sizetype,
> > +   mem_ref_offset (base)
> > +   + offset));
> > +   }
> > +   }
> >
> > return expr;
> >   }
> There are various places we need to look into ADDR_EXPR (MEM_REF), is
> it possible/beneficial to generally simplify _REF[base + offset]?

Not sure - we already do this to some extent in pass_laddress, but
only for the non-constant offset case.

Note for the testcase we have [ptr + 8] and >foo.bar.

Richard.

> Thanks,
> bin
> > Index: gcc/testsuite/g++.dg/tree-ssa/pr81719.C
> > ===
> > *** gcc/testsuite/g++.dg/tree-ssa/pr81719.C (nonexistent)
> > --- gcc/testsuite/g++.dg/tree-ssa/pr81719.C (working copy)
> > ***
> > *** 0 
> > --- 1,24 
> > + /* { dg-do compile { target c++11 } } */
> > + /* { dg-options "-O3 -fdump-tree-optimized" } */
> > +
> > + typedef int Items[2];
> > +
> > + struct ItemArray
> > + {
> > +   Items items;
> > +   int sum_x2() const;
> > + };
> > +
> > + int ItemArray::sum_x2() const
> > + {
> > +   int total = 0;
> > +   for (int item : items)
> > + {
> > +   total += item;
> > + }
> > +   return total;
> > + }
> > +
> > + /* We should be able to compute the number of iterations to two, unroll
> > +the loop and end up with a single basic-block in sum_x2.  */
> > + /* { dg-final { scan-tree-dump-times "bb" 1 "optimized" } } */
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH, rs6000] Clean up capitalized diagnostic messages

2017-08-08 Thread Bill Schmidt
Fixed with r250952.

Bill

> On Aug 6, 2017, at 3:08 AM, Andreas Schwab  wrote:
> 
> FAIL: gcc.target/powerpc/bfp/scalar-extract-exp-2.c  (test for errors, line 
> 18)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-exp-2.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-exp-5.c  (test for errors, line 
> 18)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-exp-5.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-sig-2.c  (test for errors, line 
> 16)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-sig-2.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-sig-5.c  (test for errors, line 
> 16)
> FAIL: gcc.target/powerpc/bfp/scalar-extract-sig-5.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-11.c  (test for errors, line 
> 20)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-11.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-2.c  (test for errors, line 20)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-2.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-5.c  (test for errors, line 20)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-5.c (test for excess errors)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-8.c  (test for errors, line 20)
> FAIL: gcc.target/powerpc/bfp/scalar-insert-exp-8.c (test for excess errors)
> FAIL: gcc.target/powerpc/cmpb-3.c  (test for errors, line 12)
> FAIL: gcc.target/powerpc/cmpb-3.c (test for excess errors)
> FAIL: gcc.target/powerpc/byte-in-set-2.c  (test for errors, line 15)
> FAIL: gcc.target/powerpc/byte-in-set-2.c (test for excess errors)
> FAIL: gcc.target/powerpc/vsu/vec-xl-len-13.c  (test for errors, line 17)
> FAIL: gcc.target/powerpc/vsu/vec-xl-len-13.c (test for excess errors)
> FAIL: gcc.target/powerpc/vsu/vec-xst-len-13.c  (test for errors, line 18)
> FAIL: gcc.target/powerpc/vsu/vec-xst-len-13.c (test for excess errors)
> 
> Andreas.
> 
> -- 
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."
> 



Re: [PATCH] Fix PR81719

2017-08-08 Thread Bin.Cheng
On Tue, Aug 8, 2017 at 1:20 PM, Richard Biener  wrote:
>
> The following improves niter analysis for range-based for loops
> by handling ADDR_EXPR in expand_simple_operations.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Richard.
>
> 2017-08-08  Richard Biener  
>
> PR middle-end/81719
> * tree-ssa-loop-niter.c: Include tree-dfa.h.
> (expand_simple_operations): Also look through ADDR_EXPRs with
> MEM_REF bases treating them as POINTER_PLUS_EXPR.
>
> * g++.dg/tree-ssa/pr81719.C: New testcase.
>
> Index: gcc/tree-ssa-loop-niter.c
> ===
> *** gcc/tree-ssa-loop-niter.c   (revision 250813)
> --- gcc/tree-ssa-loop-niter.c   (working copy)
> *** along with GCC; see the file COPYING3.
> *** 42,47 
> --- 42,48 
>   #include "tree-chrec.h"
>   #include "tree-scalar-evolution.h"
>   #include "params.h"
> + #include "tree-dfa.h"
>
>
>   /* The maximum number of dominator BBs we search for conditions
> *** expand_simple_operations (tree expr, tre
> *** 1980,1985 
> --- 1981,2001 
>
> if (code == SSA_NAME)
> return expand_simple_operations (e, stop);
> +   else if (code == ADDR_EXPR)
> +   {
> + HOST_WIDE_INT offset;
> + tree base = get_addr_base_and_unit_offset (TREE_OPERAND (e, 0),
> +);
> + if (base
> + && TREE_CODE (base) == MEM_REF)
> +   {
> + ee = expand_simple_operations (TREE_OPERAND (base, 0), stop);
> + return fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (expr), ee,
> + wide_int_to_tree (sizetype,
> +   mem_ref_offset (base)
> +   + offset));
> +   }
> +   }
>
> return expr;
>   }
There are various places we need to look into ADDR_EXPR (MEM_REF), is
it possible/beneficial to generally simplify _REF[base + offset]?

Thanks,
bin
> Index: gcc/testsuite/g++.dg/tree-ssa/pr81719.C
> ===
> *** gcc/testsuite/g++.dg/tree-ssa/pr81719.C (nonexistent)
> --- gcc/testsuite/g++.dg/tree-ssa/pr81719.C (working copy)
> ***
> *** 0 
> --- 1,24 
> + /* { dg-do compile { target c++11 } } */
> + /* { dg-options "-O3 -fdump-tree-optimized" } */
> +
> + typedef int Items[2];
> +
> + struct ItemArray
> + {
> +   Items items;
> +   int sum_x2() const;
> + };
> +
> + int ItemArray::sum_x2() const
> + {
> +   int total = 0;
> +   for (int item : items)
> + {
> +   total += item;
> + }
> +   return total;
> + }
> +
> + /* We should be able to compute the number of iterations to two, unroll
> +the loop and end up with a single basic-block in sum_x2.  */
> + /* { dg-final { scan-tree-dump-times "bb" 1 "optimized" } } */


[PATCH] Fix PR81719

2017-08-08 Thread Richard Biener

The following improves niter analysis for range-based for loops
by handling ADDR_EXPR in expand_simple_operations.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-08-08  Richard Biener  

PR middle-end/81719
* tree-ssa-loop-niter.c: Include tree-dfa.h.
(expand_simple_operations): Also look through ADDR_EXPRs with
MEM_REF bases treating them as POINTER_PLUS_EXPR.

* g++.dg/tree-ssa/pr81719.C: New testcase.

Index: gcc/tree-ssa-loop-niter.c
===
*** gcc/tree-ssa-loop-niter.c   (revision 250813)
--- gcc/tree-ssa-loop-niter.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 42,47 
--- 42,48 
  #include "tree-chrec.h"
  #include "tree-scalar-evolution.h"
  #include "params.h"
+ #include "tree-dfa.h"
  
  
  /* The maximum number of dominator BBs we search for conditions
*** expand_simple_operations (tree expr, tre
*** 1980,1985 
--- 1981,2001 
  
if (code == SSA_NAME)
return expand_simple_operations (e, stop);
+   else if (code == ADDR_EXPR)
+   {
+ HOST_WIDE_INT offset;
+ tree base = get_addr_base_and_unit_offset (TREE_OPERAND (e, 0),
+);
+ if (base
+ && TREE_CODE (base) == MEM_REF)
+   {
+ ee = expand_simple_operations (TREE_OPERAND (base, 0), stop);
+ return fold_build2 (POINTER_PLUS_EXPR, TREE_TYPE (expr), ee,
+ wide_int_to_tree (sizetype,
+   mem_ref_offset (base)
+   + offset));
+   }
+   }
  
return expr;
  }
Index: gcc/testsuite/g++.dg/tree-ssa/pr81719.C
===
*** gcc/testsuite/g++.dg/tree-ssa/pr81719.C (nonexistent)
--- gcc/testsuite/g++.dg/tree-ssa/pr81719.C (working copy)
***
*** 0 
--- 1,24 
+ /* { dg-do compile { target c++11 } } */
+ /* { dg-options "-O3 -fdump-tree-optimized" } */
+ 
+ typedef int Items[2];
+ 
+ struct ItemArray
+ {
+   Items items;
+   int sum_x2() const;
+ };
+ 
+ int ItemArray::sum_x2() const
+ {
+   int total = 0;
+   for (int item : items)
+ {
+   total += item;
+ }
+   return total;
+ }
+ 
+ /* We should be able to compute the number of iterations to two, unroll
+the loop and end up with a single basic-block in sum_x2.  */
+ /* { dg-final { scan-tree-dump-times "bb" 1 "optimized" } } */


[PATCH] Fix PR81723

2017-08-08 Thread Richard Biener

I am testing the following patch to reduce SLP vectorization SLP build
time which can be exponential in some cases due to us now aggressively
allowing build-from-scalars as fallback.  The patch fixes this by
avoiding repeated (failing and thus not accounted with max_tree_size)
SLP builds on the same stmt when the build fails.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2017-08-08  Richard Biener  

PR tree-optimization/81723
* tree-vect-slp.c (struct bst_traits): New hash traits.
(bst_fail): New global.
(vect_build_slp_tree_2): New worker, split out from ...
(vect_build_slp_tree): ... this now wrapping it with using
bst_fail set to cache SLP tree build fails.  Properly handle
max_tree_size.
(vect_analyze_slp_instance): Allocate and free bst_fail.

* gfortran.dg/pr81723.f: New testcase.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 250947)
+++ gcc/tree-vect-slp.c (working copy)
@@ -905,12 +905,49 @@ vect_build_slp_tree_1 (vec_info *vinfo,
   return true;
 }
 
-/* Recursively build an SLP tree starting from NODE.
-   Fail (and return a value not equal to zero) if def-stmts are not
-   isomorphic, require data permutation or are of unsupported types of
-   operation.  Otherwise, return 0.
-   The value returned is the depth in the SLP tree where a mismatch
-   was found.  */
+/* Traits for the hash_set to record failed SLP builds for a stmt set.
+   Note we never remove apart from at destruction time so we do not
+   need a special value for deleted that differs from empty.  */
+struct bst_traits
+{
+  typedef vec  value_type;
+  typedef vec  compare_type;
+  static inline hashval_t hash (value_type);
+  static inline bool equal (value_type existing, value_type candidate);
+  static inline bool is_empty (value_type x) { return !x.exists (); }
+  static inline bool is_deleted (value_type x) { return !x.exists (); }
+  static inline void mark_empty (value_type ) { x.release (); }
+  static inline void mark_deleted (value_type ) { x.release (); }
+  static inline void remove (value_type ) { x.release (); }
+};
+inline hashval_t
+bst_traits::hash (value_type x)
+{
+  inchash::hash h;
+  for (unsigned i = 0; i < x.length (); ++i)
+h.add_int (gimple_uid (x[i]));
+  return h.end ();
+}
+inline bool
+bst_traits::equal (value_type existing, value_type candidate)
+{
+  if (existing.length () != candidate.length ())
+return false;
+  for (unsigned i = 0; i < existing.length (); ++i)
+if (existing[i] != candidate[i])
+  return false;
+  return true;
+}
+
+static hash_set , bst_traits> *bst_fail;
+
+static slp_tree
+vect_build_slp_tree_2 (vec_info *vinfo,
+  vec stmts, unsigned int group_size,
+  unsigned int *max_nunits,
+  vec *loads,
+  bool *matches, unsigned *npermutes, unsigned *tree_size,
+  unsigned max_tree_size);
 
 static slp_tree
 vect_build_slp_tree (vec_info *vinfo,
@@ -920,6 +957,39 @@ vect_build_slp_tree (vec_info *vinfo,
 bool *matches, unsigned *npermutes, unsigned *tree_size,
 unsigned max_tree_size)
 {
+  if (bst_fail->contains (stmts))
+return NULL;
+  slp_tree res = vect_build_slp_tree_2 (vinfo, stmts, group_size, max_nunits,
+   loads, matches, npermutes, tree_size,
+   max_tree_size);
+  /* When SLP build fails for stmts record this, otherwise SLP build
+ can be exponential in time when we allow to construct parts from
+ scalars, see PR81723.  */
+  if (! res)
+{
+  vec  x;
+  x.create (stmts.length ());
+  x.splice (stmts);
+  bst_fail->add (x);
+}
+  return res;
+}
+
+/* Recursively build an SLP tree starting from NODE.
+   Fail (and return a value not equal to zero) if def-stmts are not
+   isomorphic, require data permutation or are of unsupported types of
+   operation.  Otherwise, return 0.
+   The value returned is the depth in the SLP tree where a mismatch
+   was found.  */
+
+static slp_tree
+vect_build_slp_tree_2 (vec_info *vinfo,
+  vec stmts, unsigned int group_size,
+  unsigned int *max_nunits,
+  vec *loads,
+  bool *matches, unsigned *npermutes, unsigned *tree_size,
+  unsigned max_tree_size)
+{
   unsigned nops, i, this_tree_size = 0, this_max_nunits = *max_nunits;
   gimple *stmt;
   slp_tree node;
@@ -979,6 +1049,9 @@ vect_build_slp_tree (vec_info *vinfo,
 
   stmt = stmts[0];
 
+  if (tree_size)
+max_tree_size -= *tree_size;
+
   /* Create SLP_TREE nodes for the definition node/s.  */
   FOR_EACH_VEC_ELT (oprnds_info, i, oprnd_info)
 {
@@ -1874,9 +1947,11 @@ vect_analyze_slp_instance (vec_info *vin
   /* Build 

RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Richard Kenner
> The pattern will only be matched if the value is positive. More
> specifically if the constant value is 32 (SImode) or 64 (DImode).

I don't mean the constant, but the value subtracted from it.
If that's negative, then we have a shift count larger than the
wordsize.



[PATCH][RFC] Patch candidate for other/39851

2017-08-08 Thread Martin Liška
Hi.

As mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39851#c0 we need
to call targetm.target_option.override () in order to properly report which
ISA options are enabled for a -march/-mtune. Currently, opts.c uses just
#include "common/common-target.h" we are unable to call the function direct.
One solution might be to put the hook to gcc/common/common-target.def, but
that would require huge refactoring of i386.c and i386-common.c files.

Thus I came with a small hook that lives in cl_option_handler_func.
With that I see proper results for test-case mentioned in the PR.

Thoughts?
Martin

>From 1bd4916c0c5594162d6a34c9eb4c7202931260df Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 8 Aug 2017 13:09:12 +0200
Subject: [PATCH] Patch candidate.

---
 gcc/c-family/c-common.c |  2 +-
 gcc/c-family/c-pragma.c |  2 +-
 gcc/gcc.c   |  3 ++-
 gcc/opts-common.c   |  3 ++-
 gcc/opts-global.c   | 12 
 gcc/opts.c  | 14 ++
 gcc/opts.h  | 18 +-
 gcc/toplev.c|  3 ++-
 8 files changed, 39 insertions(+), 18 deletions(-)

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index feb0904bcbf..a2469471fe5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5496,7 +5496,7 @@ parse_optimize_options (tree args, bool attr_p)
   /* And apply them.  */
   decode_options (_options, _options_set,
 		  decoded_options, decoded_options_count,
-		  input_location, global_dc);
+		  input_location, global_dc, NULL);
 
   targetm.override_options_after_change();
 
diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index 48b02b88bb5..3b49aefc6ff 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -815,7 +815,7 @@ handle_pragma_diagnostic(cpp_reader *ARG_UNUSED(dummy))
 }
 
   struct cl_option_handlers handlers;
-  set_default_handlers ();
+  set_default_handlers (, NULL);
   const char *arg = NULL;
   if (cl_options[option_index].flags & CL_JOINED)
 arg = option_string + 1 + cl_options[option_index].opt_len;
diff --git a/gcc/gcc.c b/gcc/gcc.c
index d8c5260e36b..fde68809d68 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -3745,7 +3745,8 @@ driver_handle_option (struct gcc_options *opts,
 		  unsigned int lang_mask ATTRIBUTE_UNUSED, int kind,
 		  location_t loc,
 		  const struct cl_option_handlers *handlers ATTRIBUTE_UNUSED,
-		  diagnostic_context *dc)
+		  diagnostic_context *dc,
+		  void (*) (void))
 {
   size_t opt_index = decoded->opt_index;
   const char *arg = decoded->arg;
diff --git a/gcc/opts-common.c b/gcc/opts-common.c
index 0cab42a021c..d7568145768 100644
--- a/gcc/opts-common.c
+++ b/gcc/opts-common.c
@@ -993,7 +993,8 @@ handle_option (struct gcc_options *opts,
   {
 	if (!handlers->handlers[i].handler (opts, opts_set, decoded,
 	lang_mask, kind, loc,
-	handlers, dc))
+	handlers, dc,
+	handlers->target_option_override_hook))
 	  return false;
   }
   
diff --git a/gcc/opts-global.c b/gcc/opts-global.c
index fc55512e554..343dbd3ac2c 100644
--- a/gcc/opts-global.c
+++ b/gcc/opts-global.c
@@ -169,7 +169,8 @@ lang_handle_option (struct gcc_options *opts,
 		unsigned int lang_mask ATTRIBUTE_UNUSED, int kind,
 		location_t loc,
 		const struct cl_option_handlers *handlers,
-		diagnostic_context *dc)
+		diagnostic_context *dc,
+		void (*) (void))
 {
   gcc_assert (opts == _options);
   gcc_assert (opts_set == _options_set);
@@ -269,10 +270,12 @@ decode_cmdline_options_to_array_default_mask (unsigned int argc,
 /* Set *HANDLERS to the default set of option handlers for use in the
compilers proper (not the driver).  */
 void
-set_default_handlers (struct cl_option_handlers *handlers)
+set_default_handlers (struct cl_option_handlers *handlers,
+		  void (*target_option_override_hook) (void))
 {
   handlers->unknown_option_callback = unknown_option_callback;
   handlers->wrong_lang_callback = complain_wrong_lang;
+  handlers->target_option_override_hook = target_option_override_hook;
   handlers->num_handlers = 3;
   handlers->handlers[0].handler = lang_handle_option;
   handlers->handlers[0].mask = initial_lang_mask;
@@ -290,7 +293,8 @@ void
 decode_options (struct gcc_options *opts, struct gcc_options *opts_set,
 		struct cl_decoded_option *decoded_options,
 		unsigned int decoded_options_count,
-		location_t loc, diagnostic_context *dc)
+		location_t loc, diagnostic_context *dc,
+		void (*target_option_override_hook) (void))
 {
   struct cl_option_handlers handlers;
 
@@ -298,7 +302,7 @@ decode_options (struct gcc_options *opts, struct gcc_options *opts_set,
 
   lang_mask = initial_lang_mask;
 
-  set_default_handlers ();
+  set_default_handlers (, target_option_override_hook);
 
   default_options_optimization (opts, opts_set,
 decoded_options, decoded_options_count,
diff --git a/gcc/opts.c b/gcc/opts.c
index 989cc6b6dec..5bc17be14b8 100644
--- 

Re: [PATCH] ICF: properly handle LABEL_DECLs (PR tree-opt/81696).

2017-08-08 Thread Martin Liška
On 08/08/2017 01:17 PM, Richard Biener wrote:
> On Tue, Aug 8, 2017 at 1:11 PM, Martin Liška  wrote:
>> Hello.
>>
>> As LABEL_DECL can point to another function (non-local goto), we must 
>> properly
>> compare them.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Ready to be installed ?
> 
> Ok.

Thanks.

> 
> So with bb1 == bb2 == NULL could you equate them in case the DECL_CONTEXT
> functions were merged by ICF (and the labels)?  (I guess it's not
> worth the trouble)

No, that would be very rare case.

Martin

> 
> Thanks,
> Richard.
> 
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-08-08  Martin Liska  
>>
>> PR tree-opt/81696
>> * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Consider
>> LABEL_DECLs that can be from a different function.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-08-08  Martin Liska  
>>
>> PR tree-opt/81696
>> * gcc.dg/ipa/pr81696.c: New test.
>> ---
>>  gcc/ipa-icf-gimple.c   |  6 +-
>>  gcc/testsuite/gcc.dg/ipa/pr81696.c | 26 ++
>>  2 files changed, 31 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr81696.c
>>
>>



Re: [PATCH] Convert character arrays to string csts

2017-08-08 Thread Martin Liška
On 11/09/2016 11:22 AM, Richard Biener wrote:
> On Fri, Nov 4, 2016 at 2:33 PM, Martin Liška  wrote:
>> On 11/03/2016 02:00 PM, Jan Hubicka wrote:
 On 11/03/2016 01:12 PM, Martin Liška wrote:
> +  tree init = DECL_INITIAL (decl);
> +  if (init
> +  && TREE_READONLY (decl)
> +  && can_convert_ctor_to_string_cst (init))
> +DECL_INITIAL (decl) = build_string_cst_from_ctor (init);

 I'd merge these two new functions since they're only ever called
 together. We'd then have something like

 if (init && TREE_READONLY (decl))
   init = convert_ctor_to_string_cst (init);
 if (init)
   DECL_INITIAL (decl) = init;
>>
>> Done.
>>

 I'll defer to Jan on whether finalize_decl seems like a good place
 to do this.
>>>
>>> I think finalize_decl may be bit too early because frontends may expects the
>>> ctors to be in a way they produced them.  We only want to convert those 
>>> arrays
>>> seen by middle-end so I would move the logic to varpool_node::analyze
>>>
>>> Otherwise the patch seems fine to me.
>>>
>>> Honza

> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
> index 283bd1c..b2d1fd5 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/builtins-folding-gimple.c
> @@ -4,12 +4,15 @@
> char *buffer1;
> char *buffer2;
>
> +const char global[] = {'a', 'b', 'c', 'd', '\0'};
> +
> #define SIZE 1000
>
> int
> main (void)
> {
>   const char* const foo1 = "hello world";
> +  const char local[] = "abcd";
>
>   buffer1 = __builtin_malloc (SIZE);
>   __builtin_strcpy (buffer1, foo1);
> @@ -45,6 +48,10 @@ main (void)
> __builtin_abort ();
>   if (__builtin_memchr (foo1, null, 12) != foo1 + 11)
> __builtin_abort ();
> +  if (__builtin_memchr (global, null, 5) == 0)
> +__builtin_abort ();
> +  if (__builtin_memchr (local, null, 5) == 0)
> +__builtin_abort ();

 How is that a meaningful test? This seems to work even with an
 unpatched gcc. I'd have expected something that shows a benefit for
 doing this conversion, and maybe also a test that shows it isn't
 done in cases where it's not allowed.
>>
>> It's meaningful as it scans that there's no __builtin_memchr in optimized 
>> dump.
>> I'm adding new tests that does the opposite test.
>>

> tree
> -build_string_literal (int len, const char *str)
> +build_string_literal (int len, const char *str, bool build_addr_expr)

 New arguments should be documented in the function comment.
>>
>> Yep, improved.
>>

> +/* Return TRUE when CTOR can be converted to a string constant.  */

 "if", not "when".
>>
>> Done.
>>

> +  unsigned HOST_WIDE_INT elements = CONSTRUCTOR_NELTS (ctor);
> +  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (ctor), idx, key, value)
> +{
> +  if (key == NULL_TREE
> + || TREE_CODE (key) != INTEGER_CST
> + || !tree_fits_uhwi_p (value)
> + || !useless_type_conversion_p (TREE_TYPE (value), char_type_node))
> +   return false;

 Shouldn't all elements have the same type, or do you really have to
 call useless_type_conversion in a loop?

> +  /* Allow zero character just at the end of a string.  */
> +  if (integer_zerop (value))
> +   return idx == elements - 1;

 Don't you also have to explicitly check it's there?


 Bernd
>>
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> I'm curious about the
> 
> @@ -1495,7 +1495,8 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p)
> {
>   if (!TREE_STATIC (decl))
> {
> - DECL_INITIAL (decl) = NULL_TREE;
> + if (!TREE_READONLY (decl) || TREE_CODE (init) != STRING_CST)
> +   DECL_INITIAL (decl) = NULL_TREE;
>   init = build2 (INIT_EXPR, void_type_node, decl, init);
>   gimplify_and_add (init, seq_p);
>   ggc_free (init);
> 
> change.  Why keep DECL_INITIAL if you build a INIT_EXPR anyway?
> 
> @@ -4438,6 +4439,19 @@ gimplify_init_constructor (tree *expr_p,
> gimple_seq *pre_p, gimple_seq *post_p,
> break;
>   }
> 
> +   /* Replace a ctor with a string constant with possible.  */
> +   if (TREE_READONLY (object)
> +   && VAR_P (object))
> + {
> +   tree string_ctor = convert_ctor_to_string_cst (ctor);
> +   if (string_ctor)
> + {
> +   TREE_OPERAND (*expr_p, 1) = string_ctor;
> +   DECL_INITIAL (object) = string_ctor;
> +   break;
> + }
> + }
> +
> /* Fetch information about the constructor to direct later processing.
>We 

[PATCH] Fix PR81766

2017-08-08 Thread Richard Biener

The following patch is a partial reversal of r250815 (on the branch)
which restores behavior of always calling find_many_sub_basic_blocks
on prologue inserted code.  Doing that probably masks an issue with
either scheduling or the way x86 sets up the PIC register.

The patch allows grub2 to successfully test the availability of
-mcmodel=large.

Bootstrap / regtest running on x86_64-unknown-linux-gnu, ok for
the branch?

I'd appreciate more analysis for a trunk fix from someone who knows RTL
constraints better than me.  This patch looks better than full reversal
of the patch.

I'll do RC2 when this is on the branch, the 7.2 release will be delayed
until next week (I'm on PTO this week apart from today).

Thanks,
Richard.

2017-08-08  Richard Biener  

PR middle-end/81766
* function.c (thread_prologue_and_epilogue_insns): Restore
behavior of always calling find_many_sub_basic_blocks on
the inserted prologue.

* gcc.target/i386/pr81766.c: New testcase.

Index: gcc/function.c
===
--- gcc/function.c  (revision 250947)
+++ gcc/function.c  (working copy)
@@ -6082,17 +6082,16 @@ thread_prologue_and_epilogue_insns (void
   if (prologue_insn
  && BLOCK_FOR_INSN (prologue_insn) == NULL)
prologue_insn = NULL;
-  if (split_prologue_insn || prologue_insn)
-   {
- auto_sbitmap blocks (last_basic_block_for_fn (cfun));
- bitmap_clear (blocks);
- if (split_prologue_insn)
-   bitmap_set_bit (blocks,
-   BLOCK_FOR_INSN (split_prologue_insn)->index);
- if (prologue_insn)
-   bitmap_set_bit (blocks, BLOCK_FOR_INSN (prologue_insn)->index);
- find_many_sub_basic_blocks (blocks);
-   }
+  auto_sbitmap blocks (last_basic_block_for_fn (cfun));
+  bitmap_clear (blocks);
+  if (split_prologue_insn)
+   bitmap_set_bit (blocks,
+   BLOCK_FOR_INSN (split_prologue_insn)->index);
+  if (prologue_insn)
+   bitmap_set_bit (blocks, BLOCK_FOR_INSN (prologue_insn)->index);
+  bitmap_set_bit (blocks, entry_edge->dest->index);
+  bitmap_set_bit (blocks, orig_entry_edge->dest->index);
+  find_many_sub_basic_blocks (blocks);
 }
 
   default_rtl_profile ();
Index: gcc/testsuite/gcc.target/i386/pr81766.c
===
--- gcc/testsuite/gcc.target/i386/pr81766.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/pr81766.c (working copy)
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2 -fPIE -mcmodel=large" } */
+
+int main() { return 0; }


Re: C PATCH to display types when printing a conversion warning (PR c/81233)

2017-08-08 Thread Marek Polacek
Ping.

On Mon, Jul 31, 2017 at 01:27:01PM +0200, Marek Polacek wrote:
> Ping.
> 
> On Thu, Jul 20, 2017 at 12:53:10PM +0200, Marek Polacek wrote:
> > On Wed, Jul 19, 2017 at 10:51:33AM -0400, David Malcolm wrote:
> > > The changes to diagnostic-core.h and diagnostic.c are OK.
> > 
> > Thanks.
> >  
> > > > Also,
> > > > PEDWARN_FOR_ASSIGNMENT didn't work with the addition of printing TYPE
> > > > and
> > > > RHSTYPE so I just decided to unroll the macro instead of making it
> > > > even more
> > > > ugly.
> > > > This patch is long but it's mainly because of the testsuite fallout.
> > > 
> > > The comment by PEDWARN_FOR_ASSIGNMENT says:
> > > 
> > > 
> > >   /* This macro is used to emit diagnostics to ensure that all format
> > >  strings are complete sentences, visible to gettext and checked
> > > at
> > >  compile time.  */
> > > 
> > > I wonder if it's possible to convert it to an inline function to get
> > > the same test coverage, without unrolling the macro?
> > 
> > Yeah, I tried, but the resulting inline function would have to have 12
> > parameters if I count well and that didn't seem like a win.  Perhaps
> > splitting convert_for_assignment would make sense, but likely not as
> > part of this patch.

Marek


Re: C PATCH to further improve diagnostic for -Wsign-compare (PR c/81417)

2017-08-08 Thread Marek Polacek
Ping.

The make_location change should allow us to do really cool things, if we
pass c_exprs to build_binary_loc, btw.  Like all the ranges for diagnostic
and similar.

On Wed, Aug 02, 2017 at 02:43:39PM +0200, Marek Polacek wrote:
> On Tue, Aug 01, 2017 at 04:48:01PM -0400, David Malcolm wrote:
> > I'm wondering if the messages could use a slight rewording, to give a
> > clue to the user about the reason *why* the expression has changed
> > signedness.  The old message "signed and unsigned type in conditional
> > expression" gave the clue (but failed to underline the subexpression
> > changing sign, and tell what the old/new types were).
> > 
> > A horribly verbose way to put it would be something like:
> > 
> > "operand of conditional expression with mixed signedness changes
> > signedness from %qT to %qT due to promotion to unsigned to match
> > unsignedness of other operand" (ugh)
> > 
> > (assuming I'm understanding the logic correctly)
> > 
> > or something like:
> > 
> > "operand of conditional expression changes signedness from %qT to %qT
> > due to unsignedness of other operand"
> > 
> > or somesuch (am not 100% happy with that either).
> 
> Hmm, how about this, then?
> 
> "operand of ?: changes signedness from %qT to %qT due to unsignedness of 
> other operand"
> 
> I couldn't come up with anything more brief yet conveying all the information.
> I don't like adding "second"/"third"/... very much; we should offer a good
> location already.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-08-02  Marek Polacek  
> 
>   PR c/81417
>   * c-array-notation.c (fix_builtin_array_notation_fn): Update calls to
>   build_conditional_expr. 
>   * c-parser.c (c_parser_conditional_expression): Create locations for
>   EXP1 and EXP2 from their source ranges.  Pass the locations down to
>   build_conditional_expr.
>   * c-tree.h (build_conditional_expr): Update declaration.
>   * c-typeck.c (build_conditional_expr): Add location_t parameters.
>   For -Wsign-compare, also print the types.
> 
>   * input.c (make_location): New overload.
>   * input.h (make_location): Declare.
> 
>   * objc-next-runtime-abi-02.c (build_v2_build_objc_method_call): Update
>   a call to build_conditional_expr.
> 
>   * Wsign-compare-1.c: New test.
>   * gcc.dg/compare1.c: Adjust dg-bogus.
>   * gcc.dg/compare2.c: Likewise.
>   * gcc.dg/compare3.c: Likewise.
>   * gcc.dg/compare7.c: Likewise.
>   * gcc.dg/compare8.c: Likewise.
>   * gcc.dg/compare9.c: Likewise.
>   * gcc.dg/pr11492.c: Likewise.
> 
> diff --git gcc/c/c-array-notation.c gcc/c/c-array-notation.c
> index e430f5c681b..40f1cfdabb8 100644
> --- gcc/c/c-array-notation.c
> +++ gcc/c/c-array-notation.c
> @@ -355,8 +355,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var)
>new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), func_parm,
> build_zero_cst (TREE_TYPE (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ALL_NONZERO:
>new_var_init = build_modify_expr
> @@ -375,8 +376,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var)
>new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
> build_zero_cst (TREE_TYPE (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_ZERO:
>new_var_init = build_modify_expr
> @@ -394,8 +396,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var)
>new_cond_expr = build2 (EQ_EXPR, TREE_TYPE (func_parm), func_parm,
> build_zero_cst (TREE_TYPE (func_parm)));
>new_expr = build_conditional_expr
> - (location, new_cond_expr, false, new_yes_expr,
> -  TREE_TYPE (new_yes_expr), new_no_expr, TREE_TYPE (new_no_expr));   
> + (location, new_cond_expr, false,
> +  new_yes_expr, TREE_TYPE (new_yes_expr), location,
> +  new_no_expr, TREE_TYPE (new_no_expr), location);
>break;
>  case BUILT_IN_CILKPLUS_SEC_REDUCE_ANY_NONZERO:
>new_var_init = build_modify_expr
> @@ -413,8 +416,9 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree 
> *new_var)
>new_cond_expr = build2 (NE_EXPR, TREE_TYPE (func_parm), 

RE: [GCC][PATCH][mid-end][Version 3] Optimize x * copysign (1.0, y) [Patch (1/2)]

2017-08-08 Thread Richard Biener
On Mon, 7 Aug 2017, Tamar Christina wrote:

> Hi Richard,
> 
> >   switch (code)
> > {
> > case MULT_EXPR:
> >   if (!convert_mult_to_widen (stmt, )
> >   && !convert_expand_mult_copysign (stmt, )
> >   && convert_mult_to_fma (stmt,
> >   gimple_assign_rhs1 (stmt),
> >   gimple_assign_rhs2 (stmt)))
> > 
> > given you likely do not want x * copysign (1, y) + z to be transformed into
> > FMA but still use xorsign?
> 
> Ah yes, that's correct, thanks!
> 
> > 
> > I am also eventually missing a single-use check on the copysign result.  If 
> > the
> > result of copysign is used in multiple places do we want to keep the 
> > multiply
> > or is the xorsign much cheaper?
> > The only eventual disadvantage would be higher register pressure if y and
> > copysing (1, y) were not live at the same time before.
> 
> No the transformation should be limited for single use, I've updated the 
> patch accordingly.

+static bool
+is_copysign_call_with_1 (gimple *call)
+{
+  gcall *c = dyn_cast  (call);
+  if (! c)
+return false;
+
+  enum combined_fn code = gimple_call_combined_fn (call);

use c instead of call.

Ok with that change.

Thanks,
Richard.

> Thanks,
> Tamar
> 
> > 
> > 
> > > gcc/
> > > 2017-08-03  Tamar Christina  
> > >   Andrew Pinski 
> > >
> > >   PR middle-end/19706
> > >   * internal-fn.def (XORSIGN): New.
> > >   * optabs.def (xorsign_optab): New.
> > >   * tree-ssa-math-opts.c (is_copysign_call_with_1): New.
> > >   (convert_expand_mult_copysign): New.
> > >   (pass_optimize_widening_mul::execute): Call
> > convert_expand_mult_copysign.
> > >
> > >
> > 
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH] ICF: properly handle LABEL_DECLs (PR tree-opt/81696).

2017-08-08 Thread Richard Biener
On Tue, Aug 8, 2017 at 1:11 PM, Martin Liška  wrote:
> Hello.
>
> As LABEL_DECL can point to another function (non-local goto), we must properly
> compare them.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed ?

Ok.

So with bb1 == bb2 == NULL could you equate them in case the DECL_CONTEXT
functions were merged by ICF (and the labels)?  (I guess it's not
worth the trouble)

Thanks,
Richard.

> Martin
>
> gcc/ChangeLog:
>
> 2017-08-08  Martin Liska  
>
> PR tree-opt/81696
> * ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Consider
> LABEL_DECLs that can be from a different function.
>
> gcc/testsuite/ChangeLog:
>
> 2017-08-08  Martin Liska  
>
> PR tree-opt/81696
> * gcc.dg/ipa/pr81696.c: New test.
> ---
>  gcc/ipa-icf-gimple.c   |  6 +-
>  gcc/testsuite/gcc.dg/ipa/pr81696.c | 26 ++
>  2 files changed, 31 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr81696.c
>
>


Re: [PATCH PR81744]Fix ICE by deep copying expression of loop's number of iterations

2017-08-08 Thread Richard Biener
On Tue, Aug 8, 2017 at 12:35 PM, Bin Cheng  wrote:
> Hi,
> This is an obvious patch.  It fixes ICE in PR81744 by deep copying expression
> of loop's number of iterations.
> Test result checked.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
> 2017-08-07  Bin Cheng  
>
> PR tree-optimization/81744
> * tree-predcom.c (prepare_finalizers_chain): Deep copy expr of
> loop's number of iterations.
>
> gcc/testsuite/ChangeLog
> 2017-08-07  Bin Cheng  
>
> PR tree-optimization/81744
> * gcc.dg/tree-ssa/pr81744.c: New.


[PATCH] ICF: properly handle LABEL_DECLs (PR tree-opt/81696).

2017-08-08 Thread Martin Liška
Hello.

As LABEL_DECL can point to another function (non-local goto), we must properly
compare them.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed ?
Martin

gcc/ChangeLog:

2017-08-08  Martin Liska  

PR tree-opt/81696
* ipa-icf-gimple.c (func_checker::compare_cst_or_decl): Consider
LABEL_DECLs that can be from a different function.

gcc/testsuite/ChangeLog:

2017-08-08  Martin Liska  

PR tree-opt/81696
* gcc.dg/ipa/pr81696.c: New test.
---
 gcc/ipa-icf-gimple.c   |  6 +-
 gcc/testsuite/gcc.dg/ipa/pr81696.c | 26 ++
 2 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr81696.c


diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 08dd980fdf3..f44a995f580 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -361,10 +361,14 @@ func_checker::compare_cst_or_decl (tree t1, tree t2)
   }
 case LABEL_DECL:
   {
+	if (t1 == t2)
+	  return true;
+
 	int *bb1 = m_label_bb_map.get (t1);
 	int *bb2 = m_label_bb_map.get (t2);
 
-	return return_with_debug (*bb1 == *bb2);
+	/* Labels can point to another function (non-local GOTOs).  */
+	return return_with_debug (bb1 != NULL && bb2 != NULL && *bb1 == *bb2);
   }
 case PARM_DECL:
 case RESULT_DECL:
diff --git a/gcc/testsuite/gcc.dg/ipa/pr81696.c b/gcc/testsuite/gcc.dg/ipa/pr81696.c
new file mode 100644
index 000..2d3d63ff0bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr81696.c
@@ -0,0 +1,26 @@
+/* { dg-options "-O2 -fdump-ipa-icf-details"  } */
+
+int
+main (int argc, char **argv)
+{
+  __label__ lab4, lab5, lab6;
+
+  void foo (void) { goto lab4; }
+  void foo2 (void) { goto lab4; }
+  void bar (void) { goto lab5; }
+  void baz (void) { goto lab6; }
+
+  if (argc)
+foo ();
+  else
+foo2 ();
+
+ lab4:;
+  bar ();
+ lab5:;
+  baz ();
+ lab6:;
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump "Equal symbols: 1" "icf"  } } */



Re: [PATCH] Fix PR81354 (SLSR, insert on edges)

2017-08-08 Thread Richard Biener
On Mon, Aug 7, 2017 at 8:51 PM, Bill Schmidt
 wrote:
> Hi,
>
> PR81354 describes an ICE in SLSR that occurs when a GIMPLE PHI statement 
> changes
> address during the SLSR pass.  SLSR contains a mapping table from GIMPLE 
> statements
> to entries in the candidate table that relies on this address remaining 
> constant.
>
> The address change occurs during gimple_split_edge (), as the PHIs in the dest
> block of the edge first have their number of predecessors grow by 1 and then
> shrink back to their original size.  The growth causes the PHIs to usually be
> reallocated.  We've discussed some attempts to fix this in gimple_split_edge,
> but the simplest solution is to fix this directly in SLSR by deferring the 
> splitting
> of the edge until the end of the pass, using gsi_insert_on_edge and
> gsi_commit_edge_inserts.  That's what this patch does.
>
> Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.  Is 
> this ok
> for trunk, and for backports to GCC 5, 6, and 7 after burn-in?

Ok.

Thanks,
Richard.

> Thanks,
> Bill
>
>
> [gcc]
>
> 2017-08-07  Bill Schmidt  
>
> PR tree-optimization/81354
> * gimple-ssa-strength-reduction.c (create_add_on_incoming_edge):
> Insert on edges rather than explicitly creating landing pads.
> (analyze_candidates_and_replace): Commit edge inserts.
>
>
> [gcc/testsuite]
>
> 2017-08-07  Bill Schmidt  
>
> PR tree-optimization/81354
> * g++.dg/torture/pr81354.C: New file.
>
>
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 250791)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -2224,8 +2224,6 @@ create_add_on_incoming_edge (slsr_cand_t c, tree b
>  widest_int increment, edge e, location_t loc,
>  bool known_stride)
>  {
> -  basic_block insert_bb;
> -  gimple_stmt_iterator gsi;
>tree lhs, basis_type;
>gassign *new_stmt, *cast_stmt = NULL;
>
> @@ -2294,39 +2292,25 @@ create_add_on_incoming_edge (slsr_cand_t c, tree b
>}
>  }
>
> -  insert_bb = single_succ_p (e->src) ? e->src : split_edge (e);
> -  gsi = gsi_last_bb (insert_bb);
> -
> -  if (!gsi_end_p (gsi) && stmt_ends_bb_p (gsi_stmt (gsi)))
> +  if (cast_stmt)
>  {
> -  gsi_insert_before (, new_stmt, GSI_SAME_STMT);
> -  if (cast_stmt)
> -   {
> - gsi_insert_before (, cast_stmt, GSI_SAME_STMT);
> - gimple_set_location (cast_stmt, loc);
> -   }
> +  gimple_set_location (cast_stmt, loc);
> +  gsi_insert_on_edge (e, cast_stmt);
>  }
> -  else
> -{
> -  if (cast_stmt)
> -   {
> - gsi_insert_after (, cast_stmt, GSI_NEW_STMT);
> - gimple_set_location (cast_stmt, loc);
> -   }
> -  gsi_insert_after (, new_stmt, GSI_NEW_STMT);
> -}
>
>gimple_set_location (new_stmt, loc);
> +  gsi_insert_on_edge (e, new_stmt);
>
>if (dump_file && (dump_flags & TDF_DETAILS))
>  {
>if (cast_stmt)
> {
> - fprintf (dump_file, "Inserting cast in block %d: ",
> -  insert_bb->index);
> + fprintf (dump_file, "Inserting cast on edge %d->%d: ",
> +  e->src->index, e->dest->index);
>   print_gimple_stmt (dump_file, cast_stmt, 0);
> }
> -  fprintf (dump_file, "Inserting in block %d: ", insert_bb->index);
> +  fprintf (dump_file, "Inserting on edge %d->%d: ", e->src->index,
> +  e->dest->index);
>print_gimple_stmt (dump_file, new_stmt, 0);
>  }
>
> @@ -3770,6 +3754,10 @@ analyze_candidates_and_replace (void)
>   free (incr_vec);
> }
>  }
> +
> +  /* For conditional candidates, we may have uncommitted insertions
> + on edges to clean up.  */
> +  gsi_commit_edge_inserts ();
>  }
>
>  namespace {
> Index: gcc/testsuite/g++.dg/torture/pr81354.C
> ===
> --- gcc/testsuite/g++.dg/torture/pr81354.C  (nonexistent)
> +++ gcc/testsuite/g++.dg/torture/pr81354.C  (working copy)
> @@ -0,0 +1,24 @@
> +// PR81354 reported this test as crashing in a limited range of revisions.
> +// { dg-do compile }
> +
> +struct T { double a; double b; };
> +
> +void foo(T Ad[], int As[2])
> +{
> +  int j;
> +  int i;
> +  int Bs[2] = {0,0};
> +  T Bd[16];
> +
> +  for (j = 0; j < 4; j++) {
> +for (i = 0; i + 1 <= j + 1; i++) {
> +  Ad[i + As[0] * j] = Bd[i + Bs[0] * j];
> +}
> +
> +i = j + 1;  // <- comment out this line and it does not crash
> +for (; i + 1 < 5; i++) {
> +  Ad[i + As[0] * j].a = 0.0;
> +  Ad[i + As[0] * j].b = 0.0;
> +}
> +  }
> +}
>


[PATCH PR81744]Fix ICE by deep copying expression of loop's number of iterations

2017-08-08 Thread Bin Cheng
Hi,
This is an obvious patch.  It fixes ICE in PR81744 by deep copying expression
of loop's number of iterations.
Test result checked.  Is it OK?

Thanks,
bin
2017-08-07  Bin Cheng  

PR tree-optimization/81744
* tree-predcom.c (prepare_finalizers_chain): Deep copy expr of
loop's number of iterations.

gcc/testsuite/ChangeLog
2017-08-07  Bin Cheng  

PR tree-optimization/81744
* gcc.dg/tree-ssa/pr81744.c: New.diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr81744.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr81744.c
new file mode 100644
index 000..b0f5d38f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr81744.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-tree-loop-vectorize -fno-tree-slp-vectorize 
-fno-inline -fdump-tree-pcom-details" } */
+
+typedef struct {
+  int a, b;
+} CompandSegment;
+int a;
+CompandSegment *b;
+void fn1() {
+  for (; a; a++)
+b[a].a = b[a].b = b[a - 1].a = b[a - 1].b = 0;
+}
+/* { dg-final { scan-tree-dump-times "Store-stores chain" 2 "pcom"} } */
diff --git a/gcc/tree-predcom.c b/gcc/tree-predcom.c
index 4538773..e7b10cb 100644
--- a/gcc/tree-predcom.c
+++ b/gcc/tree-predcom.c
@@ -2940,7 +2940,7 @@ prepare_finalizers_chain (struct loop *loop, chain_p 
chain)
 
   if (TREE_CODE (niters) != INTEGER_CST && TREE_CODE (niters) != SSA_NAME)
{
- niters = copy_node (niters);
+ niters = unshare_expr (niters);
  niters = force_gimple_operand (niters, , true, NULL);
  if (stmts)
{


Re: [PATCH] Fix OpenMP/OpenACC/Cilk+ issues with forced/non-local labels (PR c/81687)

2017-08-08 Thread Richard Biener
On Fri, 4 Aug 2017, Jakub Jelinek wrote:

> Hi!
> 
> If there are forced or non-local labels in OpenMP parallel/task/target
> regions (or OpenACC or Cilk+), then we often fail to link as the following
> testcases show - the labels aren't emitted anywhere, just references to
> them.
> 
> While it is true that for forced/non-local labels we can't clone the basic
> blocks containing them, for OpenMP we actually aren't cloning it, but
> outlining the SESE region to a new function, i.e. moving the blocks.
> As such, we can move the label too.  It is UB if a goto to such a label
> (computed or non-local) is performed across boundaries of the region
> (from outside of the region into it or from inside of it to outside),
> but it should be ok to use it within the same region, and if just printing
> address or something similar we can even reference label in one function
> from another one (where one of them is outlined from the other or similar).
> 
> The following patch arranges for this - does not copy such labels during
> lowering, and ensures that their DECL_CONTEXT will be the function that
> contains the label (rather than just any reference to it).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2017-08-04  Jakub Jelinek  
> 
>   PR c/81687
>   * omp-low.c (omp_copy_decl): Don't remap FORCED_LABEL or DECL_NONLOCAL
>   LABEL_DECLs.
>   * tree-cfg.c (move_stmt_op): Don't adjust DECL_CONTEXT of FORCED_LABEL
>   or DECL_NONLOCAL labels.
>   (move_stmt_r) : Adjust DECL_CONTEXT of FORCED_LABEL
>   or DECL_NONLOCAL labels here.
> 
>   * testsuite/libgomp.c/pr81687-1.c: New test.
>   * testsuite/libgomp.c/pr81687-2.c: New test.
> 
> --- gcc/omp-low.c.jj  2017-08-03 10:33:41.0 +0200
> +++ gcc/omp-low.c 2017-08-03 15:37:52.998873566 +0200
> @@ -796,6 +796,8 @@ omp_copy_decl (tree var, copy_body_data
>  
>if (TREE_CODE (var) == LABEL_DECL)
>  {
> +  if (FORCED_LABEL (var) || DECL_NONLOCAL (var))
> + return var;
>new_var = create_artificial_label (DECL_SOURCE_LOCATION (var));
>DECL_CONTEXT (new_var) = current_function_decl;
>insert_decl_map (>cb, var, new_var);
> --- gcc/tree-cfg.c.jj 2017-07-29 09:48:43.0 +0200
> +++ gcc/tree-cfg.c2017-08-03 15:37:40.395021370 +0200
> @@ -6718,7 +6718,15 @@ move_stmt_op (tree *tp, int *walk_subtre
>   *tp = t = out->to;
>   }
>  
> -   DECL_CONTEXT (t) = p->to_context;
> +   /* For FORCED_LABELs we can end up with references from other
> +  functions if some SESE regions are outlined.  It is UB to
> +  jump in between them, but they could be used just for printing
> +  addresses etc.  In that case, DECL_CONTEXT on the label should
> +  be the function containing the glabel stmt with that LABEL_DECL,
> +  rather than whatever function a reference to the label was seen
> +  last time.  */
> +   if (!FORCED_LABEL (t) && !DECL_NONLOCAL (t))
> + DECL_CONTEXT (t) = p->to_context;
>   }
>else if (p->remap_decls_p)
>   {
> @@ -6836,6 +6844,21 @@ move_stmt_r (gimple_stmt_iterator *gsi_p
>  case GIMPLE_OMP_RETURN:
>  case GIMPLE_OMP_CONTINUE:
>break;
> +
> +case GIMPLE_LABEL:
> +  {
> + /* For FORCED_LABEL, move_stmt_op doesn't adjust DECL_CONTEXT,
> +so that such labels can be referenced from other regions.
> +Make sure to update it when seeing a GIMPLE_LABEL though,
> +that is the owner of the label.  */
> + walk_gimple_op (stmt, move_stmt_op, wi);
> + *handled_ops_p = true;
> + tree label = gimple_label_label (as_a  (stmt));
> + if (FORCED_LABEL (label) || DECL_NONLOCAL (label))
> +   DECL_CONTEXT (label) = p->to_context;
> +  }
> +  break;
> +
>  default:
>if (is_gimple_omp (stmt))
>   {
> --- libgomp/testsuite/libgomp.c/pr81687-1.c.jj2017-08-03 
> 15:43:32.832888380 +0200
> +++ libgomp/testsuite/libgomp.c/pr81687-1.c   2017-08-03 15:43:06.0 
> +0200
> @@ -0,0 +1,23 @@
> +/* PR c/81687 */
> +/* { dg-do link } */
> +/* { dg-additional-options "-O2" } */
> +
> +extern int printf (const char *, ...);
> +
> +int
> +main ()
> +{
> +  #pragma omp parallel
> +  {
> +   lab1:
> +printf ("lab1=%p\n", (void *)(&));
> +  }
> + lab2:
> +  #pragma omp parallel
> +  {
> +   lab3:
> +printf ("lab2=%p\n", (void *)(&));
> +  }
> +  printf ("lab3=%p\n", (void *)(&));
> +  return 0;
> +}
> --- libgomp/testsuite/libgomp.c/pr81687-2.c.jj2017-08-03 
> 15:43:36.229848545 +0200
> +++ libgomp/testsuite/libgomp.c/pr81687-2.c   2017-08-03 15:43:15.0 
> +0200
> @@ -0,0 +1,27 @@
> +/* PR c/81687 */
> +/* { dg-do link } */
> +/* { dg-additional-options "-O2" } */
> +
> +int
> +main ()
> +{
> +  __label__ lab4, lab5, lab6;
> +  volatile int l = 0;
> +  int m = l;
> +  void foo (int x) { if (x == 1) goto lab4; }
> +  

RE: [PATCH] [Aarch64] Optimize subtract in shift counts

2017-08-08 Thread Wilco Dijkstra
Richard Kenner wrote:
>Michael Collison wrote:
> > On Aarc64 SHIFT_COUNT_TRUNCATED is only true if SIMD code generation
> > is disabled. This is because the simd instructions can be used for
> > shifting but they do not truncate the shift count.
>
> In that case, the change isn't safe!  Consider if the value was
> negative, for example.  Yes, it's technically undefined, but I'm not
> sure I'd want to rely on that.

No it's perfectly safe - it becomes an integer-only shift after the split since 
it
keeps the masking as part of the pattern.

But generally the SHIFT_COUNT_TRUNCATED is a mess, and so are other ways
of doing this - note the extremely complex subregs in the patch, none of that 
should
be required as there are no QI registers on AArch64! So it would be great if 
there
was a better way to describe the number of bits used by a particular shift 
alternative.

Wilco

Re: [RFC][PATCH] Do refactoring of attribute functions and move them to attribs.[hc].

2017-08-08 Thread Tom de Vries

On 08/08/2017 06:36 AM, Martin Liška wrote:

Hello.

I'm sending final version that I'm going to install. Compared to the previous 
version I tested
all targets in contrib/config-list.mk.



I ran into a build breaker for an nvptx offloading compiler (where the 
ACCEL_COMPILER-guarded code is enabled).


Retrying with patch below.

Thanks,
- Tom

diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
index a3b4d13..31d1488 100644
--- a/gcc/lto/lto.c
+++ b/gcc/lto/lto.c
@@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "lto-symtab.h"
 #include "stringpool.h"
 #include "fold-const.h"
+#include "attribs.h"


 /* Number of parallel tasks to run, -1 if we want to use GNU Make 
jobserver.  */


Re: [committed, nvptx] Add nvptx_override_options_after_change

2017-08-08 Thread Tom de Vries

On 08/08/2017 10:45 AM, Thomas Schwinge wrote:

Hi Tom!

On Fri, 21 Jul 2017 11:49:11 +0200, Tom de Vries  wrote:

this patch adds nvptx_override_options_after_change, containing a
workaround for PR81430.



--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -207,6 +207,17 @@ nvptx_option_override (void)
  target_flags |= MASK_SOFT_STACK | MASK_UNIFORM_SIMT;
  }
  
+/* Implement TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE.  */

+
+static void
+nvptx_override_options_after_change (void)
+{
+  /* This is a workaround for PR81430 - nvptx acceleration compilation broken
+ because of running pass_partition_blocks.  This should be dealt with in 
the
+ common code, not in the target.  */
+  flag_reorder_blocks_and_partition = 0;
+}


Should this be reverted now that r250852 "Apply finish_options on
DECL_FUNCTION_SPECIFIC_OPTIMIZATION for ACCEL_COMPILER" has been
committed?  (I'm confirming this works fine, thanks.)


Hi,

Indeed, it can be reverted.

Thanks,
- Tom


Re: [committed, nvptx] Add nvptx_override_options_after_change

2017-08-08 Thread Thomas Schwinge
Hi Tom!

On Fri, 21 Jul 2017 11:49:11 +0200, Tom de Vries  wrote:
> this patch adds nvptx_override_options_after_change, containing a 
> workaround for PR81430.

> --- a/gcc/config/nvptx/nvptx.c
> +++ b/gcc/config/nvptx/nvptx.c
> @@ -207,6 +207,17 @@ nvptx_option_override (void)
>  target_flags |= MASK_SOFT_STACK | MASK_UNIFORM_SIMT;
>  }
>  
> +/* Implement TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE.  */
> +
> +static void
> +nvptx_override_options_after_change (void)
> +{
> +  /* This is a workaround for PR81430 - nvptx acceleration compilation broken
> + because of running pass_partition_blocks.  This should be dealt with in 
> the
> + common code, not in the target.  */
> +  flag_reorder_blocks_and_partition = 0;
> +}

Should this be reverted now that r250852 "Apply finish_options on
DECL_FUNCTION_SPECIFIC_OPTIMIZATION for ACCEL_COMPILER" has been
committed?  (I'm confirming this works fine, thanks.)


Grüße
 Thomas


Re: [PATCH] Forbid section anchors for ASan build (PR sanitizer/81697)

2017-08-08 Thread Vyacheslav Barinov
Hello,

Andrew Pinski  writes:

> On Mon, Aug 7, 2017 at 11:15 PM, Slava Barinov  wrote:
>>gcc/
>>* varasm.c (use_object_blocks_p): Forbid section anchors for ASan
>>
>>gcc/testsuite/
>>* g++.dg/asan/global-alignment.cc: New test to test global
>>variables alignment.
>
>
> Can you describe this a little bit more?  What is going wrong here?
> Is it because there is no red zone between the variables?
I've described situation in bugzilla PR 81697 with compiled files dumps, but
briefly yes, red zone size is incorrect between global variables.

On certain platforms (we checked at least arm and ppc) the flow is following:
make_decl_rtl() calls create_block_symbol() on decl tree with ASan global
variable describing a string. But then get_variable_section() returns wrong
section (.rodata.str1.4 in my case) because check in asan_protect_global()
returns false due to !DECL_RTL_SET_P check.

When variable is placed not into .rodata, but into .rodata.str ld treats it as
string and just shrinks the large part of red zone silently (gold at least
prints warning about wrong string alignment). But in run time libasan expects
that red zone is still there and reports false positives.

In order to prevent the setting of RTL before ASan handling I tried to
reproduce the x86_64 behavior and found that use_object_blocks_p() returns
false for that platform. Accordingly to the function comment it reports if
'current compilation mode benefits from grouping', and just switching it off
for any ASan builds resolves the issue.

> Also I noticed you are using .cc as the testcase file name, why don't
> you use .C instead and then you won't need the other patch which you
> just posted.
Okay, attaching rebased version.

> Thanks,
> Andrew Pinski
>

Best Regards,
Vyacheslav Barinov
>From 88b27d5f39a5671a429db52750564245a3b35141 Mon Sep 17 00:00:00 2001
From: Slava Barinov 
Date: Thu, 3 Aug 2017 16:14:59 +0300
Subject: [PATCH] Forbid section anchors for ASan build (PR sanitizer/81697)

   gcc/
   * varasm.c (use_object_blocks_p): Forbid section anchors for ASan

   gcc/testsuite/
   * g++.dg/asan/global-alignment.C: New test to test global
   variables alignment.

Signed-off-by: Slava Barinov 
---
 gcc/ChangeLog|  6 ++
 gcc/testsuite/ChangeLog  |  6 ++
 gcc/testsuite/g++.dg/asan/global-alignment.C | 17 +
 gcc/varasm.c |  3 ++-
 4 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/asan/global-alignment.C

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index dde91ceea5b..d840825e7c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-08-08  Vyacheslav Barinov  
+
+	PR sanitizer/81697
+	* varasm.c (use_object_blocks_p): Forbid section anchors for ASan
+	build.
+
 2017-08-08  Martin Liska  
 
 	* asan.c: Include header files.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c2119f478ba..e0d65103d30 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2017-08-08  Vyacheslav Barinov  
+
+	PR sanitizer/81697
+	* g++.dg/asan/global-alignment.C: New test to test global
+	variables alignment.
+
 2017-08-07  Michael Meissner  
 
 	PR target/81593
diff --git a/gcc/testsuite/g++.dg/asan/global-alignment.C b/gcc/testsuite/g++.dg/asan/global-alignment.C
new file mode 100644
index 000..c011c703ea6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/global-alignment.C
@@ -0,0 +1,17 @@
+/* { dg-options "-fmerge-all-constants" } */
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+
+#include 
+#include 
+
+const char kRecoveryInstallString[] = "NEW";
+const char kRecoveryUpdateString[] = "UPDATE";
+const char kRecoveryUninstallationString[] = "UNINSTALL";
+
+const std::map kStringToRequestMap = {
+  {kRecoveryInstallString, 0},
+  {kRecoveryUpdateString, 0},
+  {kRecoveryUninstallationString, 0},
+};
+/* { dg-final { scan-assembler-times {\.section\s+\.rodata\n(?:(?!\.section).)*\.(string|ascii|asciz)\s+"NEW} 1 } } */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index e0834a1ff3b..dbeb8d9331e 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -346,7 +346,8 @@ get_section (const char *name, unsigned int flags, tree decl)
 static bool
 use_object_blocks_p (void)
 {
-  return flag_section_anchors;
+  return (flag_section_anchors
+	  && !(flag_sanitize & SANITIZE_ADDRESS));
 }
 
 /* Return the object_block structure for section SECT.  Create a new
-- 
2.13.3



Re: [PATCH] Forbid section anchors for ASan build (PR sanitizer/81697)

2017-08-08 Thread Andrew Pinski
On Mon, Aug 7, 2017 at 11:15 PM, Slava Barinov  wrote:
>gcc/
>* varasm.c (use_object_blocks_p): Forbid section anchors for ASan
>
>gcc/testsuite/
>* g++.dg/asan/global-alignment.cc: New test to test global
>variables alignment.


Can you describe this a little bit more?  What is going wrong here?
Is it because there is no red zone between the variables?
Also I noticed you are using .cc as the testcase file name, why don't
you use .C instead and then you won't need the other patch which you
just posted.

Thanks,
Andrew Pinski

>
> Signed-off-by: Slava Barinov 
> ---
>  gcc/ChangeLog |  6 ++
>  gcc/testsuite/ChangeLog   |  3 +++
>  gcc/testsuite/g++.dg/asan/global-alignment.cc | 17 +
>  gcc/varasm.c  |  3 ++-
>  4 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/asan/global-alignment.cc
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index dde91ceea5b..d840825e7c8 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,9 @@
> +2017-08-08  Vyacheslav Barinov  
> +
> +   PR sanitizer/81697
> +   * varasm.c (use_object_blocks_p): Forbid section anchors for ASan
> +   build.
> +
>  2017-08-08  Martin Liska  
>
> * asan.c: Include header files.
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 315af8361df..0a0a5850c74 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,5 +1,8 @@
>  2017-08-08  Vyacheslav Barinov  
>
> +   PR sanitizer/81697
> +   * g++.dg/asan/global-alignment.cc: New test to test global
> +   variables alignment.
> * g++.dg/asan/asan.exp: Switch on *.cc tests.
>
>  2017-08-07  Michael Meissner  
> diff --git a/gcc/testsuite/g++.dg/asan/global-alignment.cc 
> b/gcc/testsuite/g++.dg/asan/global-alignment.cc
> new file mode 100644
> index 000..c011c703ea6
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/asan/global-alignment.cc
> @@ -0,0 +1,17 @@
> +/* { dg-options "-fmerge-all-constants" } */
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
> +
> +#include 
> +#include 
> +
> +const char kRecoveryInstallString[] = "NEW";
> +const char kRecoveryUpdateString[] = "UPDATE";
> +const char kRecoveryUninstallationString[] = "UNINSTALL";
> +
> +const std::map kStringToRequestMap = {
> +  {kRecoveryInstallString, 0},
> +  {kRecoveryUpdateString, 0},
> +  {kRecoveryUninstallationString, 0},
> +};
> +/* { dg-final { scan-assembler-times 
> {\.section\s+\.rodata\n(?:(?!\.section).)*\.(string|ascii|asciz)\s+"NEW} 1 } 
> } */
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index e0834a1ff3b..dbeb8d9331e 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -346,7 +346,8 @@ get_section (const char *name, unsigned int flags, tree 
> decl)
>  static bool
>  use_object_blocks_p (void)
>  {
> -  return flag_section_anchors;
> +  return (flag_section_anchors
> + && !(flag_sanitize & SANITIZE_ADDRESS));
>  }
>
>  /* Return the object_block structure for section SECT.  Create a new
> --
> 2.13.3
>


[PATCH] Forbid section anchors for ASan build (PR sanitizer/81697)

2017-08-08 Thread Slava Barinov
   gcc/
   * varasm.c (use_object_blocks_p): Forbid section anchors for ASan

   gcc/testsuite/
   * g++.dg/asan/global-alignment.cc: New test to test global
   variables alignment.

Signed-off-by: Slava Barinov 
---
 gcc/ChangeLog |  6 ++
 gcc/testsuite/ChangeLog   |  3 +++
 gcc/testsuite/g++.dg/asan/global-alignment.cc | 17 +
 gcc/varasm.c  |  3 ++-
 4 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/asan/global-alignment.cc

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index dde91ceea5b..d840825e7c8 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-08-08  Vyacheslav Barinov  
+
+   PR sanitizer/81697
+   * varasm.c (use_object_blocks_p): Forbid section anchors for ASan
+   build.
+
 2017-08-08  Martin Liska  
 
* asan.c: Include header files.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 315af8361df..0a0a5850c74 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,5 +1,8 @@
 2017-08-08  Vyacheslav Barinov  
 
+   PR sanitizer/81697
+   * g++.dg/asan/global-alignment.cc: New test to test global
+   variables alignment.
* g++.dg/asan/asan.exp: Switch on *.cc tests.
 
 2017-08-07  Michael Meissner  
diff --git a/gcc/testsuite/g++.dg/asan/global-alignment.cc 
b/gcc/testsuite/g++.dg/asan/global-alignment.cc
new file mode 100644
index 000..c011c703ea6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/global-alignment.cc
@@ -0,0 +1,17 @@
+/* { dg-options "-fmerge-all-constants" } */
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" } { "" } } */
+
+#include 
+#include 
+
+const char kRecoveryInstallString[] = "NEW";
+const char kRecoveryUpdateString[] = "UPDATE";
+const char kRecoveryUninstallationString[] = "UNINSTALL";
+
+const std::map kStringToRequestMap = {
+  {kRecoveryInstallString, 0},
+  {kRecoveryUpdateString, 0},
+  {kRecoveryUninstallationString, 0},
+};
+/* { dg-final { scan-assembler-times 
{\.section\s+\.rodata\n(?:(?!\.section).)*\.(string|ascii|asciz)\s+"NEW} 1 } } 
*/
diff --git a/gcc/varasm.c b/gcc/varasm.c
index e0834a1ff3b..dbeb8d9331e 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -346,7 +346,8 @@ get_section (const char *name, unsigned int flags, tree 
decl)
 static bool
 use_object_blocks_p (void)
 {
-  return flag_section_anchors;
+  return (flag_section_anchors
+ && !(flag_sanitize & SANITIZE_ADDRESS));
 }
 
 /* Return the object_block structure for section SECT.  Create a new
-- 
2.13.3



[PATCH] Switch on *.cc tests for g++ ASan

2017-08-08 Thread Slava Barinov
* g++.dg/asan/asan.exp: Switch on *.cc tests.

Signed-off-by: Slava Barinov 
---
 gcc/testsuite/ChangeLog| 4 
 gcc/testsuite/g++.dg/asan/asan.exp | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c2119f478ba..315af8361df 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2017-08-08  Vyacheslav Barinov  
+
+   * g++.dg/asan/asan.exp: Switch on *.cc tests.
+
 2017-08-07  Michael Meissner  
 
PR target/81593
diff --git a/gcc/testsuite/g++.dg/asan/asan.exp 
b/gcc/testsuite/g++.dg/asan/asan.exp
index 124c44e9e9a..620071ba7dc 100644
--- a/gcc/testsuite/g++.dg/asan/asan.exp
+++ b/gcc/testsuite/g++.dg/asan/asan.exp
@@ -26,7 +26,7 @@ asan_init
 
 # Main loop.
 if [check_effective_target_fsanitize_address] {
-  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C 
$srcdir/c-c++-common/asan/*.c]] "" ""
+  gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.C 
$srcdir/$subdir/*.cc $srcdir/c-c++-common/asan/*.c]] "" ""
 }
 
 # All done.
-- 
2.13.3