Re: [PATCH v2 1/3] or1k: libgcc: initial support for openrisc

2018-10-04 Thread Stafford Horne
On Thu, Oct 04, 2018 at 05:36:58PM +, Joseph Myers wrote:
> On Thu, 4 Oct 2018, Stafford Horne wrote:
> 
> > +or1k-*-linux*)
> > +   tmake_file="$tmake_file or1k/t-or1k"
> > +   tmake_file="$tmake_file t-softfp-sfdf t-softfp-excl t-softfp"
> > +   md_unwind_header=or1k/linux-unwind.h
> > +   ;;
> > +or1k-*-*)
> > +   tmake_file="$tmake_file or1k/t-or1k"
> > +   tmake_file="$tmake_file t-softfp-sfdf t-softfp-excl t-softfp"
> > +   extra_parts="$extra_parts crti.o crtn.o"
> > +   ;;
> 
> Could you clarify why you are using t-softfp-excl?
> 
> In general, for soft-float configurations it's best not to use 
> t-softfp-excl - meaning the floating-point operations from libgcc2.c get 
> implemented directly in soft-fp, rather than through libgcc2.c wrapping 
> other soft-fp operations.  While for hard-float configurations it's best 
> not to use soft-fp at all.
> 
> If you have hard-float configurations using a soft-float ABI, that thus 
> need to provide all the functions in question so soft-float objects / 
> programs can be linked with hard-float libgcc, t-hardfp should be used 
> instead when building a hard-float multilib.  (It's possible to have more 
> complicated mixtures in cases where some operations or types come from 
> hardfp and others from softfp; some mips and powerpcspe configurations 
> do.)

Hello,

I don't know of a reason for using t-softfp-excl, I think Richard or I was just
copying from what we saw similar targets do.  I just removed it and reran some
of the tests and don't see any regressions.

We will be patching in hardfp post upstreaming, so the notes above are very
helpful.

-Stafford


introduce --enable-mingw-full32 to default to --large-address-aware

2018-10-04 Thread Alexandre Oliva
Add a configure knob for mingw32 and 64 toolchains to default passing
--large-address-aware to the linker, when creating 32-bit binaries.
-Wl,--disable-large-address-aware can still reverse its effects.

I've tested this with cross i686-pc-mingw32-gcc and
x86_64-w64-mingw64-gcc (is this the usual triplet name?), observing the
flags passed by gcc to the linker when asked to create an executable
program or a dynamic library, in 32- or, with the latter compiler, in
64-bit mode.


I wonder if it makes any sense to extend/rename the configure flag to
apply to cygwin as well, though it should default to enabled for that
platform.

I also wonder if it makes sense, at this point, for mingw to default to
--large-address-aware (I guess not, but it doesn't hurt to ask, does it?
:-)

Yet another idea that comes to mind is to introduce gcc flags, say
-m32full and -m31, to imply -m32 and also pass either
--large-address-aware or --disable-large-address-aware, respectively, to
the linker.

I suppose it might also make sense to approach this issue from the
linker, rather than from GCC, enabling its default to be configured.
Would that be preferred?  I thought tweaking GCC would be better, for
the flag would be visible with -v, both the one passed to the linker and
the one passed to GCC configure.  It wouldn't be quite as visible as a
linker configuration knob.


Given all this, is this patch below ok to install, or should I make
changes.  I've included the configure and config.in changes because
they're small enough.

Below this first patch, I enclose another patch for cygming.h.


for  gcc/ChangeLog

* configure.ac: Introduce --enable-mingw-full32 to define
MINGW_DEFAULT_LARGE_ADDR_AWARE.
* configure, config.in: Rebuilt.
* config/i386/mingw32.h (LINK_SPEC_LARGE_ADDR_AWARE): Define,
based on MINGW_DEFAULT_LARGE_ADDR_AWARE.
(LINK_SPEC): Insert it.
* config/i386/mingw-264.h: Likewise.
---
 gcc/config.in   |6 ++
 gcc/config/i386/mingw-w64.h |9 +
 gcc/config/i386/mingw32.h   |8 
 gcc/configure   |   26 --
 gcc/configure.ac|7 +++
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/gcc/config.in b/gcc/config.in
index 2856e72d627df..05aa7e94c296d 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -2040,6 +2040,12 @@
 #endif
 
 
+/* Define if we should link with --large-address-aware by default */
+#ifndef USED_FOR_TARGET
+#undef MINGW_DEFAULT_LARGE_ADDR_AWARE
+#endif
+
+
 /* Value to set mingw's _dowildcard to. */
 #ifndef USED_FOR_TARGET
 #undef MINGW_DOWILDCARD
diff --git a/gcc/config/i386/mingw-w64.h b/gcc/config/i386/mingw-w64.h
index 484dc7a9e9f27..00b3f042a36ca 100644
--- a/gcc/config/i386/mingw-w64.h
+++ b/gcc/config/i386/mingw-w64.h
@@ -81,6 +81,14 @@ along with GCC; see the file COPYING3.  If not see
 #define MULTILIB_DEFAULTS { "m32" }
 #endif
 
+#undef LINK_SPEC_LARGE_ADDR_AWARE
+#if MINGW_DEFAULT_LARGE_ADDR_AWARE
+# define LINK_SPEC_LARGE_ADDR_AWARE \
+  "%{!shared:%{!mdll:%{" SPEC_32 ":--large-address-aware}}}"
+#else
+# define LINK_SPEC_LARGE_ADDR_AWARE ""
+#endif
+
 #undef LINK_SPEC
 #define LINK_SPEC SUB_LINK_SPEC " %{mwindows:--subsystem windows} \
   %{mconsole:--subsystem console} \
@@ -88,4 +96,5 @@ along with GCC; see the file COPYING3.  If not see
   %{shared: --shared} %{mdll:--dll} \
   %{static:-Bstatic} %{!static:-Bdynamic} \
   %{shared|mdll: " SUB_LINK_ENTRY " --enable-auto-image-base} \
+  " LINK_SPEC_LARGE_ADDR_AWARE "\
   %(shared_libgcc_undefs)"
diff --git a/gcc/config/i386/mingw32.h b/gcc/config/i386/mingw32.h
index a66010894208b..c9d8a7a31f30e 100644
--- a/gcc/config/i386/mingw32.h
+++ b/gcc/config/i386/mingw32.h
@@ -114,12 +114,20 @@ along with GCC; see the file COPYING3.  If not see
 #define SUBTARGET_EXTRA_SPECS  \
   { "shared_libgcc_undefs", SHARED_LIBGCC_UNDEFS_SPEC }
 
+#if MINGW_DEFAULT_LARGE_ADDR_AWARE
+# define LINK_SPEC_LARGE_ADDR_AWARE \
+  "%{!shared:%{!mdll:--large-address-aware}}"
+#else
+# define LINK_SPEC_LARGE_ADDR_AWARE ""
+#endif
+
 #define LINK_SPEC "%{mwindows:--subsystem windows} \
   %{mconsole:--subsystem console} \
   %{shared: %{mdll: %eshared and mdll are not compatible}} \
   %{shared: --shared} %{mdll:--dll} \
   %{static:-Bstatic} %{!static:-Bdynamic} \
   %{shared|mdll: " SUB_LINK_ENTRY " --enable-auto-image-base} \
+  " LINK_SPEC_LARGE_ADDR_AWARE "\
   %(shared_libgcc_undefs)"
 
 /* Include in the mingw32 libraries with libgcc */
diff --git a/gcc/configure b/gcc/configure
index b7a8e3643778b..21fbda80a3ba7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -927,6 +927,7 @@ enable_sjlj_exceptions
 with_gcc_major_version_only
 enable_secureplt
 enable_mingw_wildcard
+enable_mingw_full32
 enable_leading_mingw64_underscores
 enable_cld
 enable_frame_pointer
@@ -1644,6 +1645,8 @@ Optional Features:
   --enable-secureplt  enable -msecure-plt by default for 

[PATCH] i386: Don't pass -msse2avx to assembler for -mavx

2018-10-04 Thread H.J. Lu


With

gcc -O2 -fPIC -flto -g -c -o a.o a.c
gcc -O2 -fPIC -flto -g -mavx   -c -o b.o b.c
gcc -shared -O2 -fPIC -flto -g -o lib1.so a.o b.o

LTO correctly generates AVX for b.o and SSE for a.o.  But the GCC driver
passes -msse2avx to assembler, which encodes SSE instructions as AVX
instructions.  We shouldn't pass -msse2avx to assembler for -mavx.

Tested on x86-64.  OK for trunk?

Thanks.

H.J.
---
PR target/87522
* config/i386/gnu-user.h (ASM_SPEC): Don't pass -msse2avx to
assembler for -mavx.
* config/i386/gnu-user64.h (ASM_SPEC): Likewise.
---
 gcc/config/i386/gnu-user.h   | 2 +-
 gcc/config/i386/gnu-user64.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/gnu-user.h b/gcc/config/i386/gnu-user.h
index a922c9b93fa..baed87aa54f 100644
--- a/gcc/config/i386/gnu-user.h
+++ b/gcc/config/i386/gnu-user.h
@@ -67,7 +67,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #undef  ASM_SPEC
 #define ASM_SPEC \
-  "--32 %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}"
+  "--32 %{msse2avx:%{!mavx:-msse2avx}}"
 
 #undef  SUBTARGET_EXTRA_SPECS
 #define SUBTARGET_EXTRA_SPECS \
diff --git a/gcc/config/i386/gnu-user64.h b/gcc/config/i386/gnu-user64.h
index f7a68fdecf0..09141ce3508 100644
--- a/gcc/config/i386/gnu-user64.h
+++ b/gcc/config/i386/gnu-user64.h
@@ -50,7 +50,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #define ASM_SPEC "%{" SPEC_32 ":--32} \
  %{" SPEC_64 ":--64} \
  %{" SPEC_X32 ":--x32} \
- %{!mno-sse2avx:%{mavx:-msse2avx}} %{msse2avx:%{!mavx:-msse2avx}}"
+ %{msse2avx:%{!mavx:-msse2avx}}"
 
 #define GNU_USER_TARGET_LINK_SPEC \
   "%{" SPEC_64 ":-m " GNU_USER_LINK_EMULATION64 "} \
-- 
2.17.1



Re: [PATCH 2/4] Remove unused functions and fields.

2018-10-04 Thread Bernhard Reutner-Fischer
Hi!

So i just added archive handling to ease looking at more than just the
plain frontends, applied as r264856.

You can now use the exact files passed to the driver when linking e.g. cc1.
We link libcommon.a twice? Didn't look.

e.g.:
me@there:.../gcc$ /scratch/src/gcc-trunk/contrib/unused_functions.py
c/c-lang.o c-family/stub-objc.o attribs.o c/c-errors.o c/c-decl.o
c/c-typeck.o c/c-convert.o c/c-aux-info.o c/c-objc-common.o
c/c-parser.o c/c-fold.o c/gimple-parser.o c-family/c-common.o
c-family/c-cppbuiltin.o c-family/c-dump.o c-family/c-format.o
c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o
c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o
c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o
c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-ubsan.o
c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o
c-family/c-spellcheck.o i386-c.o glibc-c.o   cc1-checksum.o
libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a
../libdecnumber/libdecnumber.a libcommon.a ../libcpp/libcpp.a
../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a
../libdecnumber/libdecnumber.a

results in the attached output.

This properly flags functions like e.g.: init_branch_prob (dead code),
bitwise_mode_for_mode in stor-layout.c (should be static).

Of course it also complains about cases like supports_one_only() in cc1
where that is only used in cc1plus.
Likewise constant_pool_empty_p() on i386 which is only used by ppc and spe.

HTH,


cc1-unused_funcs.txt.xz
Description: application/xz


Re: [Patch, Aarch64] Fix testsuite regressions related to PR tree-optimization/71625

2018-10-04 Thread Steve Ellcey
Ping.  If I don't hear any objections I will consider this patch as
obvious since the only change is to add noinline attributes to the
tests.

Steve Ellcey
sell...@cavium.com


On Wed, 2018-09-26 at 09:21 -0700, Steve Ellcey wrote:
> A patch for PR tree-optimized/71625 caused regressions in the
> gcc.target/aarch64/vclz.c and gcc.target/aarch64/vneg_s.c tests
> because a couple of routines that were not getting inlined before
> started getting inlined.  The inlining is not a bug, the 
> generated code is now smaller so some functions that were previously
> not being inlined due to being too large are now getting inlined.
> Because we also generate out-of-line code the scan-assembler-times
> checks are failing.  Since inlining or not inlining is not the 
> point of this test I added the noinline attribute to all the test_*
> functions and this fixed the test regressions.
> 
> Tested on aarch64, OK for checkin?
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 2018-09-26  Steve Ellcey  
> 
>   PR tree-optimization/71625
>   * /gcc.target/aarch64/vclz.c (test_vclz_s8): Add noinline
> attribute.
>   (test_vclz_s16): Ditto.
>   (test_vclz_s32): Ditto.
>   (test_vclzq_s8): Ditto.
>   (test_vclzq_s16): Ditto.
>   (test_vclzq_s32): Ditto.
>   (test_vclz_u8): Ditto.
>   (test_vclz_u16): Ditto.
>   (test_vclz_u32): Ditto.
>   (test_vclzq_u8): Ditto.
>   (test_vclzq_u16): Ditto.
>   (test_vclzq_u32): Ditto.
>   * gcc.target/aarch64/vneg_s.c (test_vneg_s8): Ditto.
>   (test_vneg_s16): Ditto.
>   (test_vneg_s32): Ditto.
>   (test_vneg_s64): Ditto.
>   (test_vnegd_s64): Ditto.
>   (test_vnegq_s8): Ditto.
>   (test_vnegq_s16): Ditto.
>   (test_vnegq_s32): Ditto.
>   (test_vnegq_s64): Ditto.
> 
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/vclz.c
> b/gcc/testsuite/gcc.target/aarch64/vclz.c
> index 60494a8..a36ee44 100644
> --- a/gcc/testsuite/gcc.target/aarch64/vclz.c
> +++ b/gcc/testsuite/gcc.target/aarch64/vclz.c
> @@ -75,7 +75,7 @@ extern void abort (void);
>  if (a [i] != b [i])  
>   \
>    return 1;
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_s8 ()
>  {
>    int i;
> @@ -107,7 +107,7 @@ test_vclz_s8 ()
>  /* Double scan-assembler-times to take account of unsigned
> functions.  */
>  /* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8b, v\[0-
> 9\]+\.8b" 4 } } */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_s16 ()
>  {
>    int i;
> @@ -138,7 +138,7 @@ test_vclz_s16 ()
>  /* Double scan-assembler-times to take account of unsigned
> functions.  */
>  /* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.4h, v\[0-
> 9\]+\.4h" 10} } */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_s32 ()
>  {
>    int i;
> @@ -205,7 +205,7 @@ test_vclz_s32 ()
>  /* Double scan-assembler-times to take account of unsigned
> functions.  */
>  /* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.2s, v\[0-
> 9\]+\.2s" 34 } } */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_s8 ()
>  {
>    int i;
> @@ -226,7 +226,7 @@ test_vclzq_s8 ()
>  /* Double scan-assembler-times to take account of unsigned
> functions.  */
>  /* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.16b, v\[0-
> 9\]+\.16b" 2 } } */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_s16 ()
>  {
>    int i;
> @@ -262,7 +262,7 @@ test_vclzq_s16 ()
>  /* Double scan-assembler-times to take account of unsigned
> functions.  */
>  /* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8h, v\[0-
> 9\]+\.8h" 6 } } */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_s32 ()
>  {
>    int i;
> @@ -303,7 +303,7 @@ test_vclzq_s32 ()
>  
>  /* Unsigned versions.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_u8 ()
>  {
>    int i;
> @@ -331,7 +331,7 @@ test_vclz_u8 ()
>  
>  /* ASM scan near test for signed version.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_u16 ()
>  {
>    int i;
> @@ -361,7 +361,7 @@ test_vclz_u16 ()
>  
>  /* ASM scan near test for signed version.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclz_u32 ()
>  {
>    int i;
> @@ -427,7 +427,7 @@ test_vclz_u32 ()
>  
>  /* ASM scan near test for signed version.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_u8 ()
>  {
>    int i;
> @@ -448,7 +448,7 @@ test_vclzq_u8 ()
>  
>  /* ASM scan near test for signed version.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_u16 ()
>  {
>    int i;
> @@ -485,7 +485,7 @@ test_vclzq_u16 ()
>  
>  /* ASM scan near test for signed version.  */
>  
> -int
> +int __attribute__ ((noinline))
>  test_vclzq_u32 ()
>  {
>    int i;
> diff --git a/gcc/testsuite/gcc.target/aarch64/vneg_s.c
> b/gcc/testsuite/gcc.target/aarch64/vneg_s.c
> index e7f20f2..6947526 100644
> --- a/gcc/testsuite/gcc.target/aarch64/vneg_s.c
> +++ b/gcc/testsuite/gcc.target/aarch64/vneg_s.c
> @@ -87,7 +87,7 @@ extern void 

Re: [PATCH 15/25] Don't double-count early-clobber matches.

2018-10-04 Thread Richard Sandiford
Andrew Stubbs  writes:
> On 17/09/18 10:18, Richard Sandiford wrote:
>> The idea looks good to me FWIW, but you can't use curr_static_id for
>> the state, since that's a static description of the .md pattern rather
>> than data about this particular instance.
>
> I clearly misunderstood what that was for.
>
> This patch does the same thing, but uses a local variable to store the 
> state. That probably means it does it more correctly, too.
>
> OK?
>
> Andrew
>
> Don't double-count early-clobber matches.
>
> Given a pattern with a number of operands:
>
> (match_operand 0 "" "=")
> (match_operand 1 "" " v0")
> (match_operand 2 "" " v0")
> (match_operand 3 "" " v0")
>
> GCC will currently increment "reject" once, for operand 0, and then decrement
> it once for each of the other operands, ending with reject == -2 and an
> assertion failure.  If there's a conflict then it might try to decrement 
> reject
> yet again.
>
> Incidentally, what these patterns are trying to achieve is an allocation in
> which operand 0 may match one of the other operands, but may not partially
> overlap any of them.  Ideally there'd be a better way to do this.
>
> In any case, it will affect any pattern in which multiple operands may (or
> must) match an early-clobber operand.
>
> The patch only allows a reject-- when one has not already occurred, for that
> operand.
>
> 2018-09-27  Andrew Stubbs  
>
>   gcc/
>   * lra-constraints.c (process_alt_operands): Check
>   matching_early_clobber before decrementing reject, and set
>   matching_early_clobber after.
>   * lra-int.h (struct lra_operand_data): Add matching_early_clobber.
>   * lra.c (setup_operand_alternative): Initialize matching_early_clobber.
>
> diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c
> index 774d1ff..e1d1688 100644
> --- a/gcc/lra-constraints.c
> +++ b/gcc/lra-constraints.c
> @@ -1969,6 +1969,7 @@ process_alt_operands (int only_alternative)
>if (!TEST_BIT (preferred, nalt))
>   continue;
>  
> +  bool matching_early_clobber[MAX_RECOG_OPERANDS] = {};

This is potentially expensive, since MAX_RECOG_OPERANDS >= 30 and
most instructions have operand counts in the low single digits.
(And this is a very compile-time sensitive function -- it often
shows up at the top or near the top of a "cc1 -O0" profile.)

How about clearing it in this loop:

  curr_small_class_check++;
  overall = losers = addr_losers = 0;
  static_reject = reject = reload_nregs = reload_sum = 0;
  for (nop = 0; nop < n_operands; nop++)
{
  ...
}

OK with that change if it works, thanks.

Sorry for the slow reply...

Richard


Re: [PATCH 2/2 v3][IRA,LRA] Fix PR86939, IRA incorrectly creates an interference between a pseudo register and a hard register

2018-10-04 Thread Vladimir Makarov




On 10/03/2018 10:09 AM, Peter Bergner wrote:

Here is another updated PATCH 2 that is the same as the previous version,
but includes the modification to the expected output of i386/pr49095.c
test case, as H.J. has confirmed the code gen changes we are seeing on
are a good thing.

This patch completed bootstrap and regtesting on powerpc64le-linux,
x86_64-linux and i686-linux (Thanks H.J.!) with no regressions.
Ok for trunk?


The patch is ok for the trunk.  Only very minor comments.

IMHO, the name copy_insn_p is too common and confusing (we already have 
functions copy_insn and copy_insn_1 in GCC).  The name does not reflect 
its result meaning.  I would call it something like 
non_conflict_copy_source_reg although it is long.


Also I would rename last_regno to bound_regno because it is better 
reflect variable value meaning or at least to end_regno as it is a value 
of END_REGNO macro.


But you can ignore these insignificant comments.

Peter, thank you for working on the issue.  The patches may be solve 
more PRs you mentioned.



gcc/
PR rtl-optimization/86939
PR rtl-optimization/87479
* ira.h (copy_insn_p): New prototype.
* ira-lives.c (ignore_reg_for_conflicts): New static variable.
(make_hard_regno_dead): Don't add conflicts for register
ignore_reg_for_conflicts.
(make_object_dead): Likewise.
(copy_insn_p): New function.
(process_bb_node_lives): Set ignore_reg_for_conflicts for copies.
Remove special conflict handling of REAL_PIC_OFFSET_TABLE_REGNUM.
* lra-lives.c (ignore_reg_for_conflicts): New static variable.
(make_hard_regno_dead): Don't add conflicts for register
ignore_reg_for_conflicts.  Remove special conflict handling of
REAL_PIC_OFFSET_TABLE_REGNUM.  Remove now unused argument
check_pic_pseudo_p and update callers.
(mark_pseudo_dead): Don't add conflicts for register
ignore_reg_for_conflicts.
(process_bb_lives): Set ignore_reg_for_conflicts for copies.

gcc/testsuite/
* gcc.target/powerpc/pr86939.c: New test.
* gcc/testsuite/gcc.target/i386/pr49095.c: Fix expected results.





[PATCH, i386]: Macroize a couple of x87 fop patterns

2018-10-04 Thread Uros Bizjak
... and reorder insn patterns a bit.

2018-10-04  Uros Bizjak  

* config/i386/i386.md (*fop__2_i387): Macroize insn
from *fop__2_i387 and *fop_xf_2_i387 using
X87MODEF mode iterator.
(*fop__3_i387): Macroize insn from
*fop__3_i387 and *fop_xf_3_i387 using
X87MODEF mode iterator.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 39b220e9a00f..122e57f98cc4 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -14732,6 +14732,20 @@
 ;; Gcc is slightly more smart about handling normal two address instructions
 ;; so use special patterns for add and mull.
 
+(define_insn "*fop_xf_comm_i387"
+  [(set (match_operand:XF 0 "register_operand" "=f")
+   (match_operator:XF 3 "binary_fp_operator"
+   [(match_operand:XF 1 "register_operand" "%0")
+(match_operand:XF 2 "register_operand" "f")]))]
+  "TARGET_80387
+   && COMMUTATIVE_ARITH_P (operands[3])"
+  "* return output_387_binary_op (insn, operands);"
+  [(set (attr "type")
+   (if_then_else (match_operand:XF 3 "mult_operator")
+  (const_string "fmul")
+  (const_string "fop")))
+   (set_attr "mode" "XF")])
+
 (define_insn "*fop__comm"
   [(set (match_operand:MODEF 0 "register_operand" "=f,x,v")
(match_operator:MODEF 3 "binary_fp_operator"
@@ -14780,6 +14794,20 @@
(set_attr "prefix" "maybe_vex")
(set_attr "mode" "SF")])
 
+(define_insn "*fop_xf_1_i387"
+  [(set (match_operand:XF 0 "register_operand" "=f,f")
+   (match_operator:XF 3 "binary_fp_operator"
+   [(match_operand:XF 1 "register_operand" "0,f")
+(match_operand:XF 2 "register_operand" "f,0")]))]
+  "TARGET_80387
+   && !COMMUTATIVE_ARITH_P (operands[3])"
+  "* return output_387_binary_op (insn, operands);"
+  [(set (attr "type")
+   (if_then_else (match_operand:XF 3 "div_operator")
+  (const_string "fdiv")
+  (const_string "fop")))
+   (set_attr "mode" "XF")])
+
 (define_insn "*fop__1"
   [(set (match_operand:MODEF 0 "register_operand" "=f,f,x,v")
(match_operator:MODEF 3 "binary_fp_operator"
@@ -14816,49 +14844,65 @@
 (symbol_ref "true")
 (symbol_ref "false"])
 
-;; ??? Add SSE splitters for these!
-(define_insn "*fop__2_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f")
-   (match_operator:MODEF 3 "binary_fp_operator"
- [(float:MODEF
+(define_insn "*fop__2_i387"
+  [(set (match_operand:X87MODEF 0 "register_operand" "=f")
+   (match_operator:X87MODEF 3 "binary_fp_operator"
+ [(float:X87MODEF
 (match_operand:SWI24 1 "nonimmediate_operand" "m"))
-  (match_operand:MODEF 2 "register_operand" "0")]))]
-  "TARGET_80387 && X87_ENABLE_FLOAT (mode, mode)
-   && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+  (match_operand:X87MODEF 2 "register_operand" "0")]))]
+  "TARGET_80387 && X87_ENABLE_FLOAT (mode, mode)
+   && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
&& (TARGET_USE_MODE_FIOP
|| optimize_function_for_size_p (cfun))"
   "* return output_387_binary_op (insn, operands);"
   [(set (attr "type")
-(cond [(match_operand:MODEF 3 "mult_operator")
- (const_string "fmul")
-   (match_operand:MODEF 3 "div_operator")
- (const_string "fdiv")
-  ]
-  (const_string "fop")))
+   (cond [(match_operand:X87MODEF 3 "mult_operator")
+(const_string "fmul")
+  (match_operand:X87MODEF 3 "div_operator")
+(const_string "fdiv")
+ ]
+ (const_string "fop")))
(set_attr "fp_int_src" "true")
(set_attr "mode" "")])
 
-(define_insn "*fop__3_i387"
-  [(set (match_operand:MODEF 0 "register_operand" "=f")
-   (match_operator:MODEF 3 "binary_fp_operator"
- [(match_operand:MODEF 1 "register_operand" "0")
-  (float:MODEF
+(define_insn "*fop__3_i387"
+  [(set (match_operand:X87MODEF 0 "register_operand" "=f")
+   (match_operator:X87MODEF 3 "binary_fp_operator"
+ [(match_operand:X87MODEF 1 "register_operand" "0")
+  (float:X87MODEF
 (match_operand:SWI24 2 "nonimmediate_operand" "m"))]))]
-  "TARGET_80387 && X87_ENABLE_FLOAT (mode, mode)
-   && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
+  "TARGET_80387 && X87_ENABLE_FLOAT (mode, mode)
+   && !(SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
&& (TARGET_USE_MODE_FIOP
|| optimize_function_for_size_p (cfun))"
   "* return output_387_binary_op (insn, operands);"
   [(set (attr "type")
-(cond [(match_operand:MODEF 3 "mult_operator")
- (const_string "fmul")
-   (match_operand:MODEF 3 "div_operator")
- (const_string "fdiv")
-  ]
-  (const_string "fop")))
+   (cond [(match_operand:X87MODEF 3 "mult_operator")

Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Joseph Myers
On Thu, 4 Oct 2018, Jeff Law wrote:

> > I doubt you could prove that without LTO of the whole program because an 
> > errno value set by a libm function call could always be checked in the 
> > caller of whatever function is being compiled.
> Right, but if you have a series of calls like
> 
> 
>   t1 = sqrt (x);
>   t2 = sqrt (y);
>   t3 = sqrt (z);
>   return t1 + t2 + t3;
> 
> You know that errno from the first two isn't examined and you could emit
> the hardware insn for sqrt in those cases.  You couldn't do so for the
> 3rd because you don't necessarily have the caller context.

No, you don't know that error from the first two isn't examined.  sqrt 
doesn't set errno on success.  A caller could set errno to 0 before that 
code, then check for errno == EDOM afterwards, meaning that any of the 
sqrt arguments were negative.  (The caller could also check whether the 
result is a NaN and so not need errno, of course.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-10-04 Thread Joseph Myers
On Thu, 4 Oct 2018, Jeff Law wrote:

> With the natural tensions around warning early vs warning late the
> warning analysis phase may choose to mark statements in various ways
> rather than issuing the warning at that time.  ie, it might choose to
> mark the statement with the set of potential issues as well as marking
> those which were proven safe.  A later pass could then refine things and
> issue the actual warning.  Or something like that.

Note that could include front ends marking expressions for warning issues 
as well.  (E.g. the signed/unsigned warnings, which fold and do some crude 
checks for cases where they can prove the signed value is never negative.)  
To avoid that front-end folding you want later code outside the front end 
to have the information that there was a (comparison / implicit conversion 
from signed to unsigned) that should be warned about if the later code 
can't prove the conversion was value-preserving.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2 2/3] or1k: testsuite: initial support for openrisc

2018-10-04 Thread Mike Stump
On Oct 3, 2018, at 8:48 PM, Stafford Horne  wrote:
> 
> -mm-dd  Stafford Horne  
>   Richard Henderson  
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.c-torture/execute/20101011-1.c: Adjust for OpenRISC.
>   * gcc.dg/20020312-2.c: Likewise.
>   * gcc.dg/attr-alloc_size-11.c: Likewise.
>   * gcc.dg/builtin-apply2.c: Likewise.
>   * gcc.dg/nop.h: Likewise.
>   * gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
>   * gcc.dg/tree-ssa/20040204-1.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-33.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-34.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-35.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-36.c: Likewise.
>   * lib/target-supports.exp
>   (check_effective_target_logical_op_short_circuit): Add or1k*-*-*.
>   * gcc.target/or1k/*: New.

Not sure this needs any extra review, but I did, so, Ok.



Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Jeff Law
On 10/4/18 11:29 AM, Joseph Myers wrote:
> On Wed, 3 Oct 2018, Jeff Law wrote:
> 
>> That led me to wonder if we could prove that for the majority of FP
>> intensive codes that even if the library set errno that nobody could
>> possibly be reading it, then we could in large part treat those builtin
>> calls as -fno-math-errno (we'd mark them somehow and check the mark at
>> GIMPLE->RTL expansion time (and possibly other points) to allow us to
>> generate machine instructions rather than library calls).  That would
>> get most of the benefit without the possibility to breaking user code.
> 
> I doubt you could prove that without LTO of the whole program because an 
> errno value set by a libm function call could always be checked in the 
> caller of whatever function is being compiled.
Right, but if you have a series of calls like


  t1 = sqrt (x);
  t2 = sqrt (y);
  t3 = sqrt (z);
  return t1 + t2 + t3;

You know that errno from the first two isn't examined and you could emit
the hardware insn for sqrt in those cases.  You couldn't do so for the
3rd because you don't necessarily have the caller context.

But as Richi reminded us it can be exceedingly painful to find reads of
errno because of the way it's defined by glibc (presumably for thread
safety).  So this may be academic in practice.
Jeff


Re: [PATCH 04/25] SPECIAL_REGNO_P

2018-10-04 Thread Jeff Law
On 9/13/18 4:01 AM, Andrew Stubbs wrote:
> 
> The register that find_rename_reg is considering is SCC, which is one of
> the "special" registers.  There is a short-cut in rename_chains for
> fixed registers, global registers, and frame pointers.  It does not
> check HARD_REGNO_RENAME_OK.
I wonder if it expects the caller to have avoided putting these
registers in the chain.


> 
> The assert is caused because the def-use chains indicate that SCC
> conflicts with itself. I suppose the question is why is it doing that,
> but it's probably do do with that being a special register that gets
> used in split2 (particularly by the addptrdi3 pattern). Although, those
> patterns are careful to save SCC to one side and then restore it again
> after, so I'd have thought the DF analysis would work out?
If you have SCC before its first set, then DF is going to think the SCC
register is live at function entry.

Jeff


Re: Add new warning flag "warn_prio_ctor_dtor"

2018-10-04 Thread Jeff Law
On 9/3/18 8:48 AM, Vinay Kumar wrote:
> Hi Joseph,
> 
>>> The documentation is of a new option, not of -Wreturn-type 
> Done
> 
>>> you should list the option as -Wno-prio-ctor-dtor when documenting 
> Done
> 
>>> You're not meant to have the literal text ", no ()" in the manual 
> Sorry for the confusion. Modified it accordingly
> 
>>> I don't see why a g++.dg one is needed as well 
> Removed.
> 
> Thanks for reviewing the patch and your suggestions.
> Please find attached the modified patch as per your review comments.
> Please review the patch and let me know if its okay?
> 
> Thanks,
> Vinay
> 
> 
> Wprio-ctor-dtor.patch
[ ... ]
THanks.  I've installed this on the trunk.

jeff


-fopt-info-inline and class opt_problem

2018-10-04 Thread David Malcolm
On Thu, 2018-10-04 at 12:52 +0200, Richard Biener wrote:
> On Fri, 28 Sep 2018, David Malcolm wrote:

[...snip...]

> > 
> > OK for trunk?
> 
> OK.  My question on 2/3 leaves Richard time to comment.
> 
> Maybe you can tackle the issue that -fopt-info-inline does
> nothing at the moment and see if opt-problems are a good fit
> there as well (reporting the CIF strings).

Thanks.  I've committed the patch kit.

[CCing Honza]

I'm looking at -fopt-info-inline now (PR ipa/86395).  I might look at
loop optimizations also (there are some fprintfs there also).

Presumably the user-experience for -fopt-info-inline should be
something like:

  : missed: can't inline foo into bar
  : missed: because of 

or:

  : missed: not inlining foo into bar
  : missed: because of 

and:

  : optimized: inlined foo into bar

(gathering all pertinent data into the -fsave-optimization-record
output).

I suspect that the opt_problem class from the above patch kit [1] isn't
a good fit for inlining: the loop vectorizer runs one outer loop at a
time, with a deep callstack, needing to bubble up a single "problem" at
at time.

In contrast, assuming I'm reading the code right, the inliner stores a
status in each cgraph_edge, and walks the cgraph to decide what to do
with all of the edges, with a fairly shallow callstack: I believe
decisions made for one edge can affect other edges etc.

It might still be possible to store and/or emit a bit more per-edge
detail other than the plain enum cgraph_inline_failed_t, if that's
desirable.

Brainstorming:

(a) replace the "inline_failed" enum with a pointer to a class that can
contain more details (perhaps an opt_problem subclass): in a normal
dump-disabled setting, these would point to global constant singletons
(so no allocation is needed); in a dump-enabled setting, these could be
allocated objects containing additional details describing *exactly*
why an edge wasn't inlined (but this adds lots of indirections, ugh)

(b) add a map of cgraph_edge to opt_problem instances; this would only
be populated if dumping is enabled.

(c) if dumping is enabled, generate an opt_problem when an edge
transitions to CIF_FINAL_ERROR, and emit it... somewhere.

I'm not sure if the above are good ideas or not.  (b) and (c) feel more
credible than (a).


Dave

[1] https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01747.html


Re: [PATCH 09/25] Elide repeated RTL elements.

2018-10-04 Thread Jeff Law
On 9/20/18 4:52 AM, Andrew Stubbs wrote:
> On 19/09/18 17:38, Andrew Stubbs wrote:
>> Here's an updated patch incorporating the RTL front-end changes. I had
>> to change from "repeated 2x" to "repeated x2" because the former is
>> not a valid C token, and apparently that's important.
> 
> Here's a patch with self tests added, for both reading and writing.
> 
> It also fixes a bug when the repeat was the last item in a list.
> 
> OK?
> 
> Andrew
> 
> 180920-elide-repeated-RTL-elements.patch
> 
> Elide repeated RTL elements.
> 
> GCN's 64-lane vectors tend to make RTL dumps very long.  This patch makes them
> far more bearable by eliding long sequences of the same element into 
> "repeated"
> messages.
> 
> This also takes care of reading repeated sequences in the RTL front-end.
> 
> There are self tests for both reading and writing.
> 
> 2018-09-20  Andrew Stubbs  
>   Jan Hubicka  
>   Martin Jambor  
> 
>   gcc/
>   * print-rtl.c (print_rtx_operand_codes_E_and_V): Print how many times
>   the same elements are repeated rather than printing all of them.
>   * read-rtl.c (rtx_reader::read_rtx_operand): Recognize and expand
>   "repeated" elements.
>   * read-rtl-function.c (test_loading_repeat): New function.
>   (read_rtl_function_c_tests): Call test_loading_repeat.
>   * rtl-tests.c (test_dumping_repeat): New function.
>   (rtl_tests_c_tests): Call test_dumping_repeat.
> 
>   gcc/testsuite/
>   * selftests/repeat.rtl: New file.
OK.  Thanks for fixing the reader and adding selftests.

Jeff


Re: [PATCH] handle a null data.decl consistently in strnlen (PR 87490)

2018-10-04 Thread Jeff Law
On 10/3/18 11:58 AM, Martin Sebor wrote:
> The recent strnlen changes to detect reading past unterminated
> arrays introduced a couple of bugs:
> 
> 1) ICE due to assuming that the strnlen argument necessarily
>    refers to a known declaration under some conditions.
> 2) Failing to diagnose uses of unterminated arrays in calls
>    with a non-constant bound known to be in excess of the size
>    of the array.
> 
> The attached patch tested on x86_64-linux fixes both of these
> problems.
> 
> Martin
> 
> gcc-87490.diff
> 
> PR tree-optimization/87490 - ICE in expand_builtin_strnlen with a constant 
> argument and non-constant bound
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/87490
>   * builtins.c (expand_builtin_strnlen): Handle a null data.decl
>   consistently.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/87490
>   * gcc.dg/pr87490.c: New test.
>   * gcc.dg/warn-strnlen-no-nul-2.c: Same.
OK.
jeff




Re: [PATCH v2 1/3] or1k: libgcc: initial support for openrisc

2018-10-04 Thread Joseph Myers
On Thu, 4 Oct 2018, Stafford Horne wrote:

> +or1k-*-linux*)
> + tmake_file="$tmake_file or1k/t-or1k"
> + tmake_file="$tmake_file t-softfp-sfdf t-softfp-excl t-softfp"
> + md_unwind_header=or1k/linux-unwind.h
> + ;;
> +or1k-*-*)
> + tmake_file="$tmake_file or1k/t-or1k"
> + tmake_file="$tmake_file t-softfp-sfdf t-softfp-excl t-softfp"
> + extra_parts="$extra_parts crti.o crtn.o"
> + ;;

Could you clarify why you are using t-softfp-excl?

In general, for soft-float configurations it's best not to use 
t-softfp-excl - meaning the floating-point operations from libgcc2.c get 
implemented directly in soft-fp, rather than through libgcc2.c wrapping 
other soft-fp operations.  While for hard-float configurations it's best 
not to use soft-fp at all.

If you have hard-float configurations using a soft-float ABI, that thus 
need to provide all the functions in question so soft-float objects / 
programs can be linked with hard-float libgcc, t-hardfp should be used 
instead when building a hard-float multilib.  (It's possible to have more 
complicated mixtures in cases where some operations or types come from 
hardfp and others from softfp; some mips and powerpcspe configurations 
do.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Joseph Myers
On Wed, 3 Oct 2018, Jeff Law wrote:

> That led me to wonder if we could prove that for the majority of FP
> intensive codes that even if the library set errno that nobody could
> possibly be reading it, then we could in large part treat those builtin
> calls as -fno-math-errno (we'd mark them somehow and check the mark at
> GIMPLE->RTL expansion time (and possibly other points) to allow us to
> generate machine instructions rather than library calls).  That would
> get most of the benefit without the possibility to breaking user code.

I doubt you could prove that without LTO of the whole program because an 
errno value set by a libm function call could always be checked in the 
caller of whatever function is being compiled.

I think it's reasonable to have a default (outside C90 mode) of not 
requiring libm functions to set errno - but as discussed that's different 
from the present effects of -fno-math-errno.  There seem to be three 
reasonable modes here:

(a) Assume errno setting from libm functions in error cases may be 
required by the user; avoid inlining them in ways that would lose such 
errno setting.  This is -fmath-errno.

(b) Assume libm functions may set errno in error cases but this is not 
required by the user, so such functions may be inlined in ways that lose 
errno setting.  This might be the new default outside C90 mode.  
Furthermore, for complex.h functions this can be the default even with 
-fmath-errno; complex.h functions are never required to set errno, 
regardless of the value of math_errhandling.

(c) Optimize on the basis that libm functions do not set errno and thus it 
is OK to move what might be loads and stores of errno across calls to libm 
functions.  Maybe -fno-math-errno does this in fact, but it's not the 
documented semantics, and it breaks valid C code in some cases.  This is 
OK as the default on platforms where libm functions never set errno 
(Darwin?).

(None of these modes should have anything to do with whether other 
standard library functions such as malloc set errno; if optimizations for 
that are useful, they should be controlled by an option whose name does 
not involve "math".)

If we distinguish (b) and (c) in order to change the default, we also need 
a new option name for whichever of them doesn't get the name 
-fno-math-errno.  Both options should result in __NO_MATH_ERRNO__ being 
defined.


In general, libm functions are "const/pure with exceptions".  Maybe there 
should be attributes to describe such a function property - of course, 
none of this is needed to change the math-errno default, it's simply a way 
in which more precise optimization in this area might be allowed.  If so, 
it's critical that no warnings (like those for declaring a function with 
both const and pure) should result from libraries declaring a function 
with properties stricter than those required by the standard, but still 
permitted by the standard semantics, when the built-in function is 
declared only with the properties required by the standard.  That is, it 
must be OK for the built-in function to declare that it may set errno but 
is otherwise pure, and for the library, based on headers seeing 
__NO_MATH_ERRNO__ defined, to select a variant that it declares never sets 
errno.

Here are some of the ways in which a function can be const/pure with 
exceptions (which may be converted by the compiler into being entirely 
const or pure based on command-line options).

* Function depends, or does not depend, on the rounding mode (but does not 
read other global state).  What this means in terms of being const versus 
pure depends on -frounding-math (which has no corresponding preprocessor 
macro, so it's not possible for library headers to choose a different 
attribute depending on -frounding-math).

* Function may set exceptions for no inputs (e.g. fabs, copysign), only 
for signaling NaNs (e.g. trunc) or for inputs other than signaling NaNs.  
There's no preprocessor macro for -f(no-)trapping-math.

* Function may set errno in error cases, or is guaranteed not to do so.

Apart from lgamma functions setting signgam, I think all math.h / 
complex.h functions without pointer arguments should be const apart from 
rounding mode, exceptions and errno.  (For this purpose I'm ignoring other 
error-handling mechanisms like SVID matherr / _LIB_VERSION that involve 
other global state - obsoleted in glibc 2.27 so that with glibc you can no 
longer link a new program or library to use that mechanism.  I think if 
you're using such mechanisms it's reasonable to require you to use 
-fno-builtin-* to avoid problematic optimizations, similar to if you're 
using printf hooks or interposed malloc that access global state visible 
in the caller - all of these things are going well outside the standard.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 3/3] v3: Report vectorization problems via a new opt_problem class

2018-10-04 Thread Richard Sandiford
Richard Biener  writes:
> On Fri, 28 Sep 2018, David Malcolm wrote:
>> This is v3 of the patch; previous versions were:
>>   v2: https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00446.html
>>   v1: https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01462.html
>> 
>> This patch introduces a class opt_problem, along with wrapper
>> classes for bool (opt_result) and for pointers (e.g. opt_loop_vec_info
>> for loop_vec_info).
>> 
>> opt_problem instances are created when an optimization problem
>> is encountered, but only if dump_enabled_p.  They are manually
>> propagated up the callstack, and are manually reported at the
>> "top level" of an optimization if dumping is enabled, to give the user
>> a concise summary of the problem *after* the failure is reported.
>> In particular, the location of the problematic statement is
>> captured and emitted, rather than just the loop's location.
>> 
>> For example:
>> 
>> no-vfa-vect-102.c:24:3: missed: couldn't vectorize loop
>> no-vfa-vect-102.c:27:7: missed: statement clobbers memory: __asm__
> __volatile__("" : : : "memory");
>> 
>> Changed in v3:
>> * This version bootstraps and passes regression testing (on
>>   x86_64-pc-linux-gnu).
>> * added selftests, to exercise the opt_problem machinery
>> * removed the "bool to opt_result" ctor, so that attempts to
>>   use e.g. return a bool from an opt_result-returning function
>>   will fail at compile time
>> * use formatted printing within opt_problem ctor to replace the
>>   various dump_printf_loc calls
>> * dropped i18n
>> * changed the sense of vect_analyze_data_ref_dependence's return
>>   value (see the ChangeLog)
>> * add MSG_PRIORITY_REEMITTED, so that -fopt-info can show the
>>   messages, without them messing up the counts in scan-tree-dump-times
>>   in DejaGnu tests
>> 
>> Re Richard Sandiford's feedback on the v2 patch:
>>   https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00560.html
>> > Since the creation of the opt_problem depends on dump_enabled_p, would
>> > it make sense for the dump_printf_loc to happen automatically on
>> > opt_result::failure, rather than have both?
>> 
>> Yes; this v3 patch does that: opt_result::failure_at is passed a format
>> string with variadic args.  If dumping is enabled, it performs the
>> equivalent of dump_printf_loc in a form that will reach dumpfiles
>> (and -fopt-info-internals), stashing the dumped items in the opt_problem.
>> When the opt_problem is emitted at the top-level, the message is re-emitted
>> (but only for -fopt-info, not for dumpfiles, to avoid duplicates that mess
>> up scan-tree-dump-times in DejaGnu tests)
>> 
>> > I guess this is bike-shedding, but personally I'd prefer an explicit
>> > test for success rather than operator bool, so that:
>> >
>> >opt_result foo = ...;
>> >bool bar = foo;
>> >
>> > is ill-formed.  The danger otherwise might be that we drop a useful
>> > opt_problem and replace it with something more generic.  E.g. the
>> > opt_result form of:
>> >   if (!ok)
>> > {
>> >   if (dump_enabled_p ())
>> > {
>> >   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
>> >"not vectorized: relevant stmt not ");
>> >   dump_printf (MSG_MISSED_OPTIMIZATION, "supported: ");
>> >   dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt, 0);
>> > }
>> >
>> >   return false;
>> > }
>> >
>> > in vect_analyze_stmt could silently drop the information provided by
>> > the subroutine if we forgot to change "ok" from "bool" to "opt_result".
>> 
>> I didn't make that change in v3: if the function returns an opt_result, then
>> the "return false;" will be a compile-time failure, alerting us to the
>> problem.
>> 
>> I guess this is a matter of style, whether explicit is better than
>> implicit.  Dropping the operator bool would require an explicit approach,
>> with something like:
>> 
>> // Explicit style:
>> opt_result res = ...;
>> if (res.failed_p ())
>>return res;
>> 
>> and:
>> 
>> // Explicit style:
>> // It's often currently called "ok":
>> opt_result ok = ...;
>> if (ok.failed_p ())
>>return ok;
>> 
>> as opposed to:
>> 
>> // Implicit style:
>> opt_result res = ...;
>> if (!res)
>>return res;
>> 
>> and:
>> 
>> // Implicit style:
>> opt_result ok = ...;
>> if (!ok)
>>return ok;
>> 
>> I think I went with the implicit style to minimize the lines touched by
>> the patch, but I'm happy with either approach.  [If we're bikeshedding,
>> would renaming those "ok" to "res" be acceptable also?  "ok" reads to
>> me like a "success" value for a status variable, rather than the status
>> variable itself; it's presumably meant to be pronounced with a rising
>> interrogative as it were a question - "ok?" - but that's not visible in
>> the code when reading the usage sites].
>> 
>> Similarly, the pointer wrappers use an implicit style:
>> 
>>   // Implicit style:
>> 
>>   opt_loop_vec_info 

Re: [PATCH 2/3] Add -fopt-info-internals

2018-10-04 Thread Richard Biener
On October 4, 2018 6:35:57 PM GMT+02:00, David Malcolm  
wrote:
>On Thu, 2018-10-04 at 12:45 +0200, Richard Biener wrote:
>> On Fri, 28 Sep 2018, David Malcolm wrote:
>> 
>> > This patch introduces a verbosity level to dump messages:
>> > "user-facing" vs "internals".
>> > 
>> > By default, messages at the top-level dump scope are "user-facing",
>> > whereas those that are in nested scopes are implicitly "internals",
>> > and are filtered out by -fopt-info unless a new "-internals" sub-
>> > option
>> > of "-fopt-info" is supplied (intended purely for use by GCC
>> > developers).
>> > Dumpfiles are unaffected by the change.
>> > 
>> > Given that the vectorizer is the only subsystem using
>> > AUTO_DUMP_SCOPE
>> > (via DUMP_VECT_SCOPE), this only affects the vectorizer.
>> > 
>> > Filtering out these implementation-detail messages goes a long way
>> > towards making -fopt-info-vec-all more accessible to advanced end-
>> > users;
>> > the follow-up patch restores the most pertinent missing details.
>> > 
>> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>> > 
>> > OK for trunk?
>> 
>> What happens for flags & MSG_ALL_PRIORITIES == 0?  That is,
>> do we require two bits for the priority at all?
>
>I didn't want to go through every dump_* call, adding explicit
>priorities, hence I went with the implicit priority scheme.
>In the patch,
>  flags & MSG_ALL_PRIORITIES == 0
>is the case for almost all dump_* calls, and indicates "use implicit
>priority", making use of the nesting.  This is implemented in the patch
>in dump_context::apply_dump_filter_p.
>
>In a few places, some dump_* calls *do* have an explicit priority, in
>which case dump_context::apply_dump_filter_p uses it.
>
>Hence in some ways it's a tri-state:
>
>(a) implicit
>  (flags & MSG_ALL_PRIORITIES == 0)
>
>(b) explicitly "user-facing"
>  (flags & MSG_ALL_PRIORITIES == MSG_PRIORITY_USER_FACING)
>
>(b) explicitly "internals"
>  (flags & MSG_ALL_PRIORITIES == MSG_PRIORITY_INTERNALS)
>
>...and so needs 2 bits.  This scheme also supports messages that are
>both "user-facing" *and* "internals", though I don't know if that's
>useful. selftest::test_capture_of_dump_calls has test coverage for all
>4 combinations of the bits.
>
>I went with the bits scheme for similarity with the existing MSG_ kinds
>(MSG_OPTIMIZED_LOCATIONS, MSG_MISSED_OPTIMIZATION, MSG_NOTE), and the
>way it allows filtering based on masking.

Ah, OK, that makes sense. 

Patch is OK. 

Richard. 

>Dave
>
>> It's an implementation detail of course, generally I like this
>> (maybe not so much conflating with nesting, but ...).
>
>
>> Richard.
>> 
>> > gcc/ChangeLog:
>> >* doc/invoke.texi (-fopt-info): Document new "internals"
>> >sub-option.
>> >* dump-context.h (dump_context::apply_dump_filter_p): New decl.
>> >* dumpfile.c (dump_options): Update for renaming of MSG_ALL to
>> >MSG_ALL_KINDS.
>> >(optinfo_verbosity_options): Add "internals".
>> >(kind_as_string): Update for renaming of MSG_ALL to
>> > MSG_ALL_KINDS.
>> >(dump_context::apply_dump_filter_p): New member function.
>> >(dump_context::dump_loc): Use apply_dump_filter_p rather than
>> >explicitly masking the dump_kind.
>> >(dump_context::begin_scope): Increment the scope depth
>> > first.  Use
>> >apply_dump_filter_p rather than explicitly masking the
>> > dump_kind.
>> >(dump_context::emit_item): Use apply_dump_filter_p rather than
>> >explicitly masking the dump_kind.
>> >(dump_dec): Likewise.
>> >(dump_hex): Likewise.
>> >(dump_switch_p_1): Default to MSG_ALL_PRIORITIES.
>> >(opt_info_switch_p_1): Default to MSG_PRIORITY_USER_FACING.
>> >(opt_info_switch_p): Update handling of default
>> >MSG_OPTIMIZED_LOCATIONS to cope with default of
>> >MSG_PRIORITY_USER_FACING.
>> >(dump_basic_block): Use apply_dump_filter_p rather than
>> > explicitly
>> >masking the dump_kind.
>> >(selftest::test_capture_of_dump_calls): Update
>> > test_dump_context
>> >instances to use MSG_ALL_KINDS | MSG_PRIORITY_USER_FACING
>> > rather
>> >than MSG_ALL.  Generalize scope test to be run at all four
>> >combinations of with/without MSG_PRIORITY_USER_FACING and
>> >MSG_PRIORITY_INTERNALS, adding examples of explicit priority
>> >for each of the two values.
>> >* dumpfile.h (enum dump_flag): Add comment about the MSG_*
>> > flags.
>> >Rename MSG_ALL to MSG_ALL_KINDS.  Add MSG_PRIORITY_USER_FACING,
>> >MSG_PRIORITY_INTERNALS, and MSG_ALL_PRIORITIES, updating the
>> >values for TDF_COMPARE_DEBUG and TDF_ALL_VALUES.
>> >(AUTO_DUMP_SCOPE): Add a note to the comment about the
>> > interaction
>> >with MSG_PRIORITY_*.
>> >* tree-vect-loop-manip.c (vect_loop_versioning): Mark
>> > versioning
>> >dump messages as MSG_PRIORITY_USER_FACING.
>> >* tree-vectorizer.h (DUMP_VECT_SCOPE): Add a note to the
>> > comment
>> >about the interaction with MSG_PRIORITY_*.
>> > 
>> > 

Re: [PATCH 2/3] Add -fopt-info-internals

2018-10-04 Thread David Malcolm
On Thu, 2018-10-04 at 12:45 +0200, Richard Biener wrote:
> On Fri, 28 Sep 2018, David Malcolm wrote:
> 
> > This patch introduces a verbosity level to dump messages:
> > "user-facing" vs "internals".
> > 
> > By default, messages at the top-level dump scope are "user-facing",
> > whereas those that are in nested scopes are implicitly "internals",
> > and are filtered out by -fopt-info unless a new "-internals" sub-
> > option
> > of "-fopt-info" is supplied (intended purely for use by GCC
> > developers).
> > Dumpfiles are unaffected by the change.
> > 
> > Given that the vectorizer is the only subsystem using
> > AUTO_DUMP_SCOPE
> > (via DUMP_VECT_SCOPE), this only affects the vectorizer.
> > 
> > Filtering out these implementation-detail messages goes a long way
> > towards making -fopt-info-vec-all more accessible to advanced end-
> > users;
> > the follow-up patch restores the most pertinent missing details.
> > 
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> > 
> > OK for trunk?
> 
> What happens for flags & MSG_ALL_PRIORITIES == 0?  That is,
> do we require two bits for the priority at all?

I didn't want to go through every dump_* call, adding explicit
priorities, hence I went with the implicit priority scheme.
In the patch,
  flags & MSG_ALL_PRIORITIES == 0
is the case for almost all dump_* calls, and indicates "use implicit
priority", making use of the nesting.  This is implemented in the patch
in dump_context::apply_dump_filter_p.

In a few places, some dump_* calls *do* have an explicit priority, in
which case dump_context::apply_dump_filter_p uses it.

Hence in some ways it's a tri-state:

(a) implicit
  (flags & MSG_ALL_PRIORITIES == 0)

(b) explicitly "user-facing"
  (flags & MSG_ALL_PRIORITIES == MSG_PRIORITY_USER_FACING)

(b) explicitly "internals"
  (flags & MSG_ALL_PRIORITIES == MSG_PRIORITY_INTERNALS)

...and so needs 2 bits.  This scheme also supports messages that are
both "user-facing" *and* "internals", though I don't know if that's
useful. selftest::test_capture_of_dump_calls has test coverage for all
4 combinations of the bits.

I went with the bits scheme for similarity with the existing MSG_ kinds
(MSG_OPTIMIZED_LOCATIONS, MSG_MISSED_OPTIMIZATION, MSG_NOTE), and the
way it allows filtering based on masking.

Dave

> It's an implementation detail of course, generally I like this
> (maybe not so much conflating with nesting, but ...).


> Richard.
> 
> > gcc/ChangeLog:
> > * doc/invoke.texi (-fopt-info): Document new "internals"
> > sub-option.
> > * dump-context.h (dump_context::apply_dump_filter_p): New decl.
> > * dumpfile.c (dump_options): Update for renaming of MSG_ALL to
> > MSG_ALL_KINDS.
> > (optinfo_verbosity_options): Add "internals".
> > (kind_as_string): Update for renaming of MSG_ALL to
> > MSG_ALL_KINDS.
> > (dump_context::apply_dump_filter_p): New member function.
> > (dump_context::dump_loc): Use apply_dump_filter_p rather than
> > explicitly masking the dump_kind.
> > (dump_context::begin_scope): Increment the scope depth
> > first.  Use
> > apply_dump_filter_p rather than explicitly masking the
> > dump_kind.
> > (dump_context::emit_item): Use apply_dump_filter_p rather than
> > explicitly masking the dump_kind.
> > (dump_dec): Likewise.
> > (dump_hex): Likewise.
> > (dump_switch_p_1): Default to MSG_ALL_PRIORITIES.
> > (opt_info_switch_p_1): Default to MSG_PRIORITY_USER_FACING.
> > (opt_info_switch_p): Update handling of default
> > MSG_OPTIMIZED_LOCATIONS to cope with default of
> > MSG_PRIORITY_USER_FACING.
> > (dump_basic_block): Use apply_dump_filter_p rather than
> > explicitly
> > masking the dump_kind.
> > (selftest::test_capture_of_dump_calls): Update
> > test_dump_context
> > instances to use MSG_ALL_KINDS | MSG_PRIORITY_USER_FACING
> > rather
> > than MSG_ALL.  Generalize scope test to be run at all four
> > combinations of with/without MSG_PRIORITY_USER_FACING and
> > MSG_PRIORITY_INTERNALS, adding examples of explicit priority
> > for each of the two values.
> > * dumpfile.h (enum dump_flag): Add comment about the MSG_*
> > flags.
> > Rename MSG_ALL to MSG_ALL_KINDS.  Add MSG_PRIORITY_USER_FACING,
> > MSG_PRIORITY_INTERNALS, and MSG_ALL_PRIORITIES, updating the
> > values for TDF_COMPARE_DEBUG and TDF_ALL_VALUES.
> > (AUTO_DUMP_SCOPE): Add a note to the comment about the
> > interaction
> > with MSG_PRIORITY_*.
> > * tree-vect-loop-manip.c (vect_loop_versioning): Mark
> > versioning
> > dump messages as MSG_PRIORITY_USER_FACING.
> > * tree-vectorizer.h (DUMP_VECT_SCOPE): Add a note to the
> > comment
> > about the interaction with MSG_PRIORITY_*.
> > 
> > gcc/testsuite/ChangeLog:
> > * gcc.dg/plugin/dump-1.c: Update expected output for
> > test_scopes
> > due to "-internals" not being selected.
> > * gcc.dg/plugin/dump-2.c: New test, based on dump-1.c, 

sched2 priorities and replacements

2018-10-04 Thread Robin Dapp
Hi,

I'm working on some insn latency changes in the s390 backend and noticed
a regression in the SPEC2006 bzip2 test case that was due to some insns
being scheduled differently.

The sequence in short form before my change is

;;  | insn | prio |
;;  |  823 |1 | %r1=%r1+0x1nothing
;;  |  420 |1 | %r3=zxn([%r1]) nothing
;;  |  422 |1 | [%r1]=%r7  nothing
;;  |  421 |0 | %r4=%r3nothing
;;  |  423 |0 | %r7=%r3nothing
;;  |  428 |0 | {pc={(%r6!=%r3)?L424:pc};clobber %cc;} nothing

sched2 detects a memory inc/ref that can be changed in order to get rid
of the dependency 823 -> 420.  After that insn 420 is selected and the
same happens for insn 422 before insn 823 is scheduled at third
position.  Actually, what promotes 420 to the first position in the
ready queue is not the priority but some of the tie breakers.

Now, after my change the situation looks like:

;;  | insn | prio |
;;  |  825 |6 | %r1=%r1+0x1nothing
;;  |  420 |4 | %r3=zxn([%r1]) nothing
;;  |  422 |3 | [%r1]=%r5  nothing
;;  |  421 |0 | %r4=%r3nothing
;;  |  423 |0 | %r5=%r3nothing
;;  |  428 |0 | {pc={(%r7!=%r3)?L424:pc};clobber %cc;} nothing
;;   --- forward dependences: 

The same inc/ref as before is detected but insn 825 still has the
highest priority and will be selected in the next cycle.  Subsequently,
the replacement is reverted and insn 420, 422 are scheduled.  This
introduces a data dependency 825 -> 420 that wasn't there before.

In haifa-sched.c:apply_replacement (), after applying the replacement,
the dependent insn (420) is updated and its scheduling properties are
reset.  Nothing happens with insn 825 however.  I would have expected
that its priority should be recomputed after breaking a dependency since
it is recursively determined by the priority of the dependent insns.

As a quick hack, I did

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 1fdc9df9fb2..1340f34ed9f 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -4713,7 +4713,18 @@ apply_replacement (dep_t dep, bool immediately)
   success = validate_change (desc->insn, desc->loc, desc->newval, 0);
   gcc_assert (success);

+  rtx_insn *insn = DEP_PRO (dep);
+  INSN_PRIORITY_STATUS (insn) = -1;
+  sd_iterator_def sd_it;
+  sd_it = sd_iterator_start (insn, SD_LIST_FORW);
+  while (sd_iterator_cond (_it, ))
+   {
+ DEP_COST (dep) = UNKNOWN_DEP_COST;
+ sd_iterator_next (_it);
+   }
+  priority (insn);
   update_insn_after_change (desc->insn);
+
   if ((TODO_SPEC (desc->insn) & (HARD_DEP | DEP_POSTPONED)) == 0)
fix_tick_ready (desc->insn);

which prompts a recomputation of 825's priority and helps with my
regression (but doesn't get rid of it completely).  I haven't checked
any other test case but wanted to hear some opinions first.

This looks really basic and I'm rather sure I missed some other
scheduling part that could deal with this kind of situation.  Would
somebody mind elucidating?

Regards
 Robin



Re: Thomas Schwinge appointed OpenACC Maintainer

2018-10-04 Thread Thomas Schwinge
Hi!

On Thu, 20 Sep 2018 10:11:36 -0400, David Edelsohn  wrote:
>   I am pleased to announce that the GCC Steering Committee has
> appointed Thomas Schwinge as OpenACC Maintainer.

Thanks for your trust, after all these years.  I'm happy to accept, and
will of course (continue to) work closely and cooperatively with the
other maintainers and generally contributors.


I'm just returning from a multi-month parental leave :-) so I'll still
need a bit of time to catch up with everything, before I can start to
maintain, and review proposed changes.


Committed to trunk in r264848:

commit 042674a54a7346611e789789cdd4bb3194c3211d
Author: tschwinge 
Date:   Thu Oct 4 15:50:34 2018 +

List myself as "libgomp (OpenACC)" and "OpenACC" maintainer

* MAINTAINERS: List myself as "libgomp (OpenACC)" and "OpenACC"
maintainer.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@264848 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 ChangeLog   | 5 +
 MAINTAINERS | 2 ++
 2 files changed, 7 insertions(+)

diff --git ChangeLog ChangeLog
index 97fd95897c12..2623abd4a9f3 100644
--- ChangeLog
+++ ChangeLog
@@ -1,3 +1,8 @@
+2018-10-04  Thomas Schwinge  
+
+   * MAINTAINERS: List myself as "libgomp (OpenACC)" and "OpenACC"
+   maintainer.
+
 2018-09-13  Matthew Malcomson  
 
* MAINTAINERS (Write After Approval): Add self.
diff --git MAINTAINERS MAINTAINERS
index 4c0c30a743cb..e478d3b2dba1 100644
--- MAINTAINERS
+++ MAINTAINERS
@@ -165,6 +165,7 @@ libdecnumberBen Elliston

 libgcc Ian Lance Taylor
 libgo  Ian Lance Taylor
 libgompJakub Jelinek   
+libgomp (OpenACC)  Thomas Schwinge 
 libiberty  Ian Lance Taylor
 libitm Torvald Riegel  
 libobjcNicola Pero 

@@ -236,6 +237,7 @@ auto-vectorizer Zdenek Dvorak   

 loop infrastructureZdenek Dvorak   
 loop ivoptsBin Cheng   
 loop optimizer Bin Cheng   
+OpenACCThomas Schwinge 

 OpenMP Jakub Jelinek   
 testsuite  Rainer Orth 
 testsuite  Mike Stump  


Grüße
 Thomas


Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-10-04 Thread Martin Sebor

On 10/04/2018 08:58 AM, Jeff Law wrote:

On 8/27/18 9:42 AM, Richard Biener wrote:

On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:


On 08/27/2018 02:29 AM, Richard Biener wrote:

On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:


On 08/24/2018 09:58 AM, Martin Sebor wrote:

The warning suppression for -Wstringop-truncation looks for
the next statement after a truncating strncpy to see if it
adds a terminating nul.  This only works when the next
statement can be reached using the Gimple statement iterator
which isn't until after gimplification.  As a result, strncpy
calls that truncate their constant argument that are being
folded to memcpy this early get diagnosed even if they are
followed by the nul assignment:

  const char s[] = "12345";
  char d[3];

  void f (void)
  {
strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
d[sizeof d - 1] = 0;
  }

To avoid the warning I propose to defer folding strncpy to
memcpy until the pointer to the basic block the strnpy call
is in can be used to try to reach the next statement (this
happens as early as ccp1).  I'm aware of the preference to
fold things early but in the case of strncpy (a relatively
rarely used function that is often misused), getting
the warning right while folding a bit later but still fairly
early on seems like a reasonable compromise.  I fear that
otherwise, the false positives will drive users to adopt
other unsafe solutions (like memcpy) where these kinds of
bugs cannot be as readily detected.

Tested on x86_64-linux.

Martin

PS There still are outstanding cases where the warning can
be avoided.  I xfailed them in the test for now but will
still try to get them to work for GCC 9.

gcc-87028.diff


PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy with 
global variable source string
gcc/ChangeLog:

  PR tree-optimization/87028
  * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding when
  statement doesn't belong to a basic block.
  * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle MEM_REF on
  the left hand side of assignment.

gcc/testsuite/ChangeLog:

  PR tree-optimization/87028
  * c-c++-common/Wstringop-truncation.c: Remove xfails.
  * gcc.dg/Wstringop-truncation-5.c: New test.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 07341eb..284c2fb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator *gsi,
   if (tree_int_cst_lt (ssize, len))
 return false;

+  /* Defer warning (and folding) until the next statement in the basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+return false;

I think you want cfun->cfg as the test here.  They should be equivalent
in practice.


Please do not add 'cfun' references.  Note that the next stmt is also accessible
when there is no CFG.  I guess the issue is that we fold this during
gimplification where the next stmt is not yet "there" (but still in GENERIC)?

That was my assumption.  I almost suggested peeking at gsi_next and
avoiding in that case.


So I'd rather add guards to maybe_fold_stmt in the gimplifier then.

So I think the concern with adding the guards to maybe_fold_stmt is the
possibility of further fallout.

I guess they could be written to target this case specifically to
minimize fallout, but that feels like we're doing the same thing
(band-aid) just in a different place.







We generally do not want to have unfolded stmts in the IL when we can avoid that
which is why we fold most stmts during gimplification.  We also do that because
we now do less folding on GENERIC.

But an unfolded call in the IL should always be safe and we've got
plenty of opportunities to fold it later.


Well - we do.  The very first one is forwprop though which means we'll miss to
re-write some memcpy parts into SSA:

  NEXT_PASS (pass_ccp, false /* nonzero_p */);
  /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
  NEXT_PASS (pass_forwprop);

likewise early object-size will be confused by memcpy calls that just exist
to avoid TBAA issues (another of our recommendations besides using unions).

We do fold mem* early for a reason ;)

"We can always do warnings earlier" would be a similar true sentence.

I'm not disagreeing at all.  There's a natural tension between the
benefits of folding early to enable more optimizations downstream and
leaving the IL in a state where we can give actionable warnings.


Similar trade-offs between folding early and losing information
as a result also impact high-level optimizations.

For instance, folding the strlen argument below

  void f3 (struct A* p)
  {
__builtin_strcpy (p->a, "123");

if (__builtin_strlen (p->a + 1) != 2)   // not folded
  __builtin_abort ();
  }

into

  _2 = [(void *)p_4(D) + 2B];

early on defeats the strlen optimization because there is no
mechanism to determine what member (void 

Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-10-04 Thread Jeff Law
On 8/28/18 6:12 PM, Martin Sebor wrote:
>>> Sadly, dstbase is the PARM_DECL for d.  That's where things are going
>>> "wrong".  Not sure why you're getting the PARM_DECL in that case.  I'd
>>> debug get_addr_base_and_unit_offset to understand what's going on.
>>> Essentially you're getting different results of
>>> get_addr_base_and_unit_offset in a case where they arguably should be
>>> the same.
>>
>> Probably get_attr_nonstring_decl has the same "mistake" and returns
>> the PARM_DECL instead of the SSA name pointer.  So we're comparing
>> apples and oranges here.
> 
> Returning the SSA_NAME_VAR from get_attr_nonstring_decl() is
> intentional but the function need not (perhaps should not)
> also set *REF to it.
Yea.  That does seem wrong.  I wonder if we should immediately carve out
a patch to fix that independent of everything else and see if there's
any undesirable fallout.


>>
>> I see a lot of "magic" here again in the attempt to "propagate"
>> a nonstring attribute.
> 
> That's the function's purpose: to look for the attribute.  Is
> there a better way to do this?
Well, I think the core issue here is what does the attribute mean and
that's brought up elsewhere in the discussion.   We've got the attribute
attached to the DECL node, but then we want to query an SSA_NAME.  But
the value in an SSA_NAME may have no real relationship with the DECL.

> 
>> Note
>>
>> foo (char *p __attribute__(("nonstring")))
>> {
>>   p = "bar";
>>   strlen (p); // or whatever is necessary to call get_attr_nonstring_decl
>> }
>>
>> is perfectly valid and p as passed to strlen is _not_ nonstring(?).
This is a good example of the SSA_NAME vs DECL question above (there's
others of course).  The value held by the SSA_NAME that gets passed to
strlen here has nothing to do with the underlying DECL for p.

As long as we're using the attribute on the DECL rather than tracking
when it applies to the SSA_NAME, then we have problems like above.


> 
> I don't know if you're saying that it should get a warning or
> shouldn't.  Right now it doesn't because the strlen() call is
> folded before we check for nonstring.
Put that aside for now.  It's an implementation detail.

In an ideal world, should we warn for the above code?  I'd lean towards
warning on the assignment rather than the strlen call.  The assignment
effectively drops the nonstring attribute.  It feels a lot like the case
where an assignment drops a const qualifier.

Jeff


Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-10-04 Thread Jeff Law
On 8/27/18 9:42 AM, Richard Biener wrote:
> On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:
>>
>> On 08/27/2018 02:29 AM, Richard Biener wrote:
>>> On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:

 On 08/24/2018 09:58 AM, Martin Sebor wrote:
> The warning suppression for -Wstringop-truncation looks for
> the next statement after a truncating strncpy to see if it
> adds a terminating nul.  This only works when the next
> statement can be reached using the Gimple statement iterator
> which isn't until after gimplification.  As a result, strncpy
> calls that truncate their constant argument that are being
> folded to memcpy this early get diagnosed even if they are
> followed by the nul assignment:
>
>   const char s[] = "12345";
>   char d[3];
>
>   void f (void)
>   {
> strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
> d[sizeof d - 1] = 0;
>   }
>
> To avoid the warning I propose to defer folding strncpy to
> memcpy until the pointer to the basic block the strnpy call
> is in can be used to try to reach the next statement (this
> happens as early as ccp1).  I'm aware of the preference to
> fold things early but in the case of strncpy (a relatively
> rarely used function that is often misused), getting
> the warning right while folding a bit later but still fairly
> early on seems like a reasonable compromise.  I fear that
> otherwise, the false positives will drive users to adopt
> other unsafe solutions (like memcpy) where these kinds of
> bugs cannot be as readily detected.
>
> Tested on x86_64-linux.
>
> Martin
>
> PS There still are outstanding cases where the warning can
> be avoided.  I xfailed them in the test for now but will
> still try to get them to work for GCC 9.
>
> gcc-87028.diff
>
>
> PR tree-optimization/87028 - false positive -Wstringop-truncation strncpy 
> with global variable source string
> gcc/ChangeLog:
>
>   PR tree-optimization/87028
>   * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid folding when
>   statement doesn't belong to a basic block.
>   * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle MEM_REF on
>   the left hand side of assignment.
>
> gcc/testsuite/ChangeLog:
>
>   PR tree-optimization/87028
>   * c-c++-common/Wstringop-truncation.c: Remove xfails.
>   * gcc.dg/Wstringop-truncation-5.c: New test.
>
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index 07341eb..284c2fb 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy (gimple_stmt_iterator 
> *gsi,
>if (tree_int_cst_lt (ssize, len))
>  return false;
>
> +  /* Defer warning (and folding) until the next statement in the basic
> + block is reachable.  */
> +  if (!gimple_bb (stmt))
> +return false;
 I think you want cfun->cfg as the test here.  They should be equivalent
 in practice.
>>>
>>> Please do not add 'cfun' references.  Note that the next stmt is also 
>>> accessible
>>> when there is no CFG.  I guess the issue is that we fold this during
>>> gimplification where the next stmt is not yet "there" (but still in 
>>> GENERIC)?
>> That was my assumption.  I almost suggested peeking at gsi_next and
>> avoiding in that case.
> 
> So I'd rather add guards to maybe_fold_stmt in the gimplifier then.
So I think the concern with adding the guards to maybe_fold_stmt is the
possibility of further fallout.

I guess they could be written to target this case specifically to
minimize fallout, but that feels like we're doing the same thing
(band-aid) just in a different place.



> 
>>>
>>> We generally do not want to have unfolded stmts in the IL when we can avoid 
>>> that
>>> which is why we fold most stmts during gimplification.  We also do that 
>>> because
>>> we now do less folding on GENERIC.
>> But an unfolded call in the IL should always be safe and we've got
>> plenty of opportunities to fold it later.
> 
> Well - we do.  The very first one is forwprop though which means we'll miss to
> re-write some memcpy parts into SSA:
> 
>   NEXT_PASS (pass_ccp, false /* nonzero_p */);
>   /* After CCP we rewrite no longer addressed locals into SSA
>  form if possible.  */
>   NEXT_PASS (pass_forwprop);
> 
> likewise early object-size will be confused by memcpy calls that just exist
> to avoid TBAA issues (another of our recommendations besides using unions).
> 
> We do fold mem* early for a reason ;)
> 
> "We can always do warnings earlier" would be a similar true sentence.
I'm not disagreeing at all.  There's a natural tension between the
benefits of folding early to enable more optimizations downstream and
leaving the IL in a state where we can give 

Re: [PATCH] Error about alias attribute with body definition (PR c/87483).

2018-10-04 Thread Jan Hubicka
> On 10/4/18 11:23 AM, Jan Hubicka wrote:
> >> Hi.
> >>
> >> That's new warnings that warns about ifunc having a function body.
> >> It provides following warning:
> >>
> >> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c: In function 
> >> ‘g’:
> >> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c:5:35: warning: 
> >> ‘alias’ attribute ignored because function is defined [-Wattributes]
> >> 5 | __attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
> >> "attribute ignored because function is defined" } */
> >>   |   ^
> >>
> >> Hope I found proper place in cgraphunit.c for the warning.
> >> Patch survives bootstrap and tests on x86_64-linux-gnu.
> >>
> >> Ready for trunk?
> > 
> > I think one can first define the function and later declare it again
> > with additional attribute.  I think it is better to take care of this
> > in process_function_and_variable_attributes
> 
> I've done that.
> Patch survives regression tests.
> 
> Is it ok now?
OK,
thanks!
Honza


Re: [PATCH] Redirect call within specific target attribute among MV clones (PR ipa/82625).

2018-10-04 Thread Martin Liška
On 10/4/18 4:17 PM, Jeff Law wrote:
> On 10/4/18 7:56 AM, Martin Liška wrote:
>> Hi.
>>
>> When having a pair of target clones where foo calls bar, if the target
>> attribute are equal we can redirect the call and not use ifunc dispatcher.
>>
>> Patch survives regression tests on x86_64-linux-gnu.
>> Ready for trunk?
>>
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-10-04  Martin Liska  
>>
>>  PR ipa/82625
>>  * multiple_target.c (redirect_to_specific_clone): New function.
>>  (ipa_target_clone): Use it.
>>  * tree-inline.c: Fix comment.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-10-04  Martin Liska  
>>
>>  PR ipa/82625
>>  * g++.dg/ext/pr82625.C: New test.
> Your timing is good.  The issues with unnecessary calls to ifunc
> dispatchers when we have an ifunc that is not a leaf in the call graph
> came up in some meetings I was having last week.

Funny.

> 
> I doubt this is enough to address all the issues folks raised, but ISTM
> it should certainly help.

Sure, are you talking about:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83411
https://hannes.hauswedell.net/post/2017/12/09/fmv/

or do you have any other specific situations?

> 
> OK for the trunk.

Installed as r264845.

Martin

> 
> Jeff
> 
> 



Re: [PATCH, LRA] Never reload fixed form constraints memory operand

2018-10-04 Thread Vladimir Makarov




On 10/04/2018 08:27 AM, Thomas Preudhomme wrote:

My bad, I used dg-cmp-results without verbosity which didn't show the
problem It starts to show it with -v -v, I'm not sure why. I'll have a
look right now and revert by the end of today if I cannot come up with
a fix. Does that sound ok?


Yes, that is ok.  It is pretty normal situation with LRA bugs. Sometimes 
a fix needs several iterations as one person can not test all the targets.





Re: [PATCH] Error about alias attribute with body definition (PR c/87483).

2018-10-04 Thread Martin Liška
On 10/4/18 11:23 AM, Jan Hubicka wrote:
>> Hi.
>>
>> That's new warnings that warns about ifunc having a function body.
>> It provides following warning:
>>
>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c: In function ‘g’:
>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c:5:35: warning: 
>> ‘alias’ attribute ignored because function is defined [-Wattributes]
>> 5 | __attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
>> "attribute ignored because function is defined" } */
>>   |   ^
>>
>> Hope I found proper place in cgraphunit.c for the warning.
>> Patch survives bootstrap and tests on x86_64-linux-gnu.
>>
>> Ready for trunk?
> 
> I think one can first define the function and later declare it again
> with additional attribute.  I think it is better to take care of this
> in process_function_and_variable_attributes

I've done that.
Patch survives regression tests.

Is it ok now?
Martin

> 
> Honza
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-10-03  Martin Liska  
>>
>>  PR c/87483
>>  * cgraphunit.c (cgraph_node::finalize_function):
>>  Print error for functions with alias attribute and
>>  body.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-10-03  Martin Liska  
>>
>>  PR c/87483
>>  * gcc.dg/pr87483.c: New test.
>> ---
>>  gcc/cgraphunit.c   |  6 ++
>>  gcc/testsuite/gcc.dg/pr87483.c | 16 
>>  2 files changed, 22 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/pr87483.c
>>
>>
> 
>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
>> index c0baaeaefa4..40d6d348214 100644
>> --- a/gcc/cgraphunit.c
>> +++ b/gcc/cgraphunit.c
>> @@ -488,6 +488,12 @@ cgraph_node::finalize_function (tree decl, bool 
>> no_collect)
>>if (!TREE_ASM_WRITTEN (decl))
>>  (*debug_hooks->deferred_inline_function) (decl);
>>  
>> +  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl))
>> +  && node->definition)
>> +warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
>> +"% attribute ignored"
>> +" because function is defined");
>> +
>>if (!no_collect)
>>  ggc_collect ();
>>  
>> diff --git a/gcc/testsuite/gcc.dg/pr87483.c b/gcc/testsuite/gcc.dg/pr87483.c
>> new file mode 100644
>> index 000..d3af8dfee5d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr87483.c
>> @@ -0,0 +1,16 @@
>> +/* PR c/87483 */
>> +/* { dg-require-alias "" } */
>> +
>> +int f (void) { return 0; }
>> +__attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
>> "attribute ignored because function is defined" } */
>> +__attribute__ ((alias ("f"))) int g2 ();
>> +
>> +int h (void)
>> +{
>> +  return g2 ();
>> +}
>> +
>> +int main()
>> +{
>> +  return h();
>> +}
>>
> 

>From 09ebc54a2163d50c729a04223c1cb95dfc392fac Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 3 Oct 2018 15:18:03 +0200
Subject: [PATCH] Error about alias attribute with body definition (PR
 c/87483).

gcc/ChangeLog:

2018-10-03  Martin Liska  

	PR c/87483
	* cgraphunit.c (process_function_and_variable_attributes):
	Warn about a function with alias attribute and a body.

gcc/testsuite/ChangeLog:

2018-10-03  Martin Liska  

	PR c/87483
	* gcc.dg/pr87483.c: New test.
---
 gcc/cgraphunit.c   |  6 ++
 gcc/testsuite/gcc.dg/pr87483.c | 16 
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87483.c

diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c0baaeaefa4..146855fa804 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -784,6 +784,12 @@ process_function_and_variable_attributes (cgraph_node *first,
 	  DECL_ATTRIBUTES (decl) = remove_attribute ("weakref",
 		 DECL_ATTRIBUTES (decl));
 	}
+  else if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl))
+	  && node->definition
+	  && !node->alias)
+	warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
+		"% attribute ignored"
+		" because function is defined");
 
   if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (decl))
 	  && !DECL_DECLARED_INLINE_P (decl)
diff --git a/gcc/testsuite/gcc.dg/pr87483.c b/gcc/testsuite/gcc.dg/pr87483.c
new file mode 100644
index 000..d3af8dfee5d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87483.c
@@ -0,0 +1,16 @@
+/* PR c/87483 */
+/* { dg-require-alias "" } */
+
+int f (void) { return 0; }
+__attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning "attribute ignored because function is defined" } */
+__attribute__ ((alias ("f"))) int g2 ();
+
+int h (void)
+{
+  return g2 ();
+}
+
+int main()
+{
+  return h();
+}
-- 
2.19.0



Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-04 Thread Martin Liška
On 10/4/18 11:29 AM, Kyrill Tkachov wrote:
> Hi Martin,
> 
> On 18/07/18 16:49, Martin Liška wrote:
>> Hi.
>>
>> This patch improves aarch64 feature modifier hints.
>>
>> May I please ask ARM folks to test the patch?
>> Thanks,
>> Martin
>>
> 
> I've bootstrapped and tested it on aarch64-none-linux-gnu and I didn't see 
> any problems.
> I like the functionality! It is definitely an improvement to the user 
> experience.
> 
> I'm not an aarch64 maintainer but I have some comments on the patch below.

Hi.

Thanks for it.

> 
> Thanks,
> Kyrill
> 
>> gcc/ChangeLog:
>>
>> 2018-07-18  Martin Liska  
>>
>>     PR driver/83193
>>     * common/config/aarch64/aarch64-common.c (aarch64_parse_extension):
>>     Set invalid_extension when there's any.
>>     (aarch64_get_all_extension_candidates): New function.
>>     (aarch64_rewrite_selected_cpu): Pass NULL as new argument.
>>     * config/aarch64/aarch64-protos.h 
>> (aarch64_get_all_extension_candidates):
>>     Declare new function.
>>     * config/aarch64/aarch64.c (aarch64_parse_arch): Record
>>     invalid_feature.
>>     (aarch64_parse_cpu): Likewise.
>>     (aarch64_print_hint_for_feature_modifier): New.
>>     (aarch64_validate_mcpu): Record invalid feature modifier
>>     and print hint for it.
>>     (aarch64_validate_march): Likewise.
>>     (aarch64_handle_attr_arch): Likewise.
>>     (aarch64_handle_attr_cpu): Likewise.
>>     (aarch64_handle_attr_isa_flags): Likewise.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-07-18  Martin Liska  
>>
>>     PR driver/83193
>>     * gcc.target/aarch64/spellcheck_7.c: New test.
>>     * gcc.target/aarch64/spellcheck_8.c: New test.
>> ---
>>  gcc/common/config/aarch64/aarch64-common.c    | 20 +-
>>  gcc/config/aarch64/aarch64-protos.h   |  4 +-
>>  gcc/config/aarch64/aarch64.c  | 67 +++
>>  .../gcc.target/aarch64/spellcheck_7.c | 11 +++
>>  .../gcc.target/aarch64/spellcheck_8.c | 12 
>>  5 files changed, 97 insertions(+), 17 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_8.c
>>
>>
> 
> diff --git a/gcc/common/config/aarch64/aarch64-common.c 
> b/gcc/common/config/aarch64/aarch64-common.c
> index 292fb818705..c2994514004 100644
> --- a/gcc/common/config/aarch64/aarch64-common.c
> +++ b/gcc/common/config/aarch64/aarch64-common.c
> @@ -175,7 +175,8 @@ static const struct arch_to_arch_name all_architectures[] 
> =
>     aarch64_parse_opt_result describing the result.  */
>  
>  enum aarch64_parse_opt_result
> -aarch64_parse_extension (const char *str, unsigned long *isa_flags)
> +aarch64_parse_extension (const char *str, unsigned long *isa_flags,
> + char **invalid_extension)
>  {
> 
> Please document the new argument in the function comment. Same for all the 
> other parsing functions that you extend in the patch.
> In particular, I'd like to see a comment on who allocates the memory for the 
> string and who is responsible for freeing it.
> 
> +/* Append all extension candidates and put them to CANDIDATES vector.  */
> +
> 
> Given the implementation below, how about:
> "Append all architecture extension candidates to the CANDIDATES vector." ?
> 
> +void
> +aarch64_get_all_extension_candidates (auto_vec *candidates)
> +{
> +  const struct aarch64_option_extension *opt;
> +  for (opt = all_extensions; opt->name != NULL; opt++)
> +    candidates->safe_push (opt->name);
> +}
> +
> 
> 
> 
> +
> +/* Print a hint with a suggestion for a feature modifier name
> +   that most closely resembles what the user passed in STR.  */
> +
> +void
> +aarch64_print_hint_for_feature_modifier (const char *str)
> +{
> 
> 
> Elsewhere in the backend and this patch as well we call them extensions 
> rather than feature modifiers.

But currently following error message is printed to user:
cc1: error: invalid feature modifier in '-mcpu=thunderx+cripto'

?

I'm working on another iteration of the patch.

Martin

> Let's be consistent: aarch64_print_hint_for_extension would is my suggestion
> 
> +  auto_vec candidates;
> +  aarch64_get_all_extension_candidates ();
> +  char *s;
> +  const char *hint = candidates_list_and_hint (str, s, candidates);
> +  if (hint)
> +    inform (input_location, "valid arguments are: %s;"
> + " did you mean %qs?", s, hint);
> +  else
> +    inform (input_location, "valid arguments are: %s;", s);
> +
> +  XDELETEVEC (s);
> +}
> +
> 
> diff --git a/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c 
> b/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
> new file mode 100644
> index 000..1350b865162
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "" { *-*-* } { "-march=*" } { "" } } */
> 
> Another way that the extensions are used is with -mcpu. For example 

Re: [PATCH] Redirect call within specific target attribute among MV clones (PR ipa/82625).

2018-10-04 Thread Jeff Law
On 10/4/18 7:56 AM, Martin Liška wrote:
> Hi.
> 
> When having a pair of target clones where foo calls bar, if the target
> attribute are equal we can redirect the call and not use ifunc dispatcher.
> 
> Patch survives regression tests on x86_64-linux-gnu.
> Ready for trunk?
> 
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-10-04  Martin Liska  
> 
>   PR ipa/82625
>   * multiple_target.c (redirect_to_specific_clone): New function.
>   (ipa_target_clone): Use it.
>   * tree-inline.c: Fix comment.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-04  Martin Liska  
> 
>   PR ipa/82625
>   * g++.dg/ext/pr82625.C: New test.
Your timing is good.  The issues with unnecessary calls to ifunc
dispatchers when we have an ifunc that is not a leaf in the call graph
came up in some meetings I was having last week.

I doubt this is enough to address all the issues folks raised, but ISTM
it should certainly help.

OK for the trunk.

Jeff




[PATCH 1/2] testsuite: multiline.exp: implement optional target/xfail selector

2018-10-04 Thread David Malcolm
Successfully regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/testsuite/ChangeLog:
* lib/multiline.exp (proc dg-end-multiline-output): Check argument
count.  If there's a 3rd argument, use dg-process-target on it,
bailing out, or recording expected failures as "maybe_x".
(proc handle-multiline-outputs): Extract "maybe_x", and use it
to convert pass/fail into xpass/xfail.
---
 gcc/testsuite/lib/multiline.exp | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/lib/multiline.exp b/gcc/testsuite/lib/multiline.exp
index 5f8b62f..6c7ecdf 100644
--- a/gcc/testsuite/lib/multiline.exp
+++ b/gcc/testsuite/lib/multiline.exp
@@ -78,6 +78,8 @@ proc dg-begin-multiline-output { args } {
 # Mark the end of an expected multiline output
 # All lines up to here since the last dg-begin-multiline-output are
 # expected to be seen.
+#
+# dg-end-multiline-output comment [{ target/xfail selector }]
 
 proc dg-end-multiline-output { args } {
 global _multiline_last_beginning_line
@@ -85,6 +87,23 @@ proc dg-end-multiline-output { args } {
 set line [expr [lindex $args 0] - 1]
 verbose "multiline output lines: $_multiline_last_beginning_line-$line" 3
 
+if { [llength $args] > 3 } {
+   error "[lindex $args 0]: too many arguments"
+   return
+}
+
+set maybe_x ""
+if { [llength $args] >= 3 } {
+   switch [dg-process-target [lindex $args 2]] {
+   "F" { set maybe_x "x" }
+   "P" { set maybe_x "" }
+   "N" {
+   # If we get "N", this output doesn't apply to us so ignore it.
+   return
+   }
+   }
+}
+
 upvar 1 prog prog
 verbose "prog: $prog" 3
 # "prog" now contains the filename
@@ -93,8 +112,8 @@ proc dg-end-multiline-output { args } {
 set lines [_get_lines $prog $_multiline_last_beginning_line $line]
 
 verbose "lines: $lines" 3
-# Create an entry of the form:  first-line, last-line, lines
-set entry [list $_multiline_last_beginning_line $line $lines]
+# Create an entry of the form:  first-line, last-line, lines, maybe_x
+set entry [list $_multiline_last_beginning_line $line $lines $maybe_x]
 global multiline_expected_outputs
 lappend multiline_expected_outputs $entry
 verbose "within dg-end-multiline-output: multiline_expected_outputs: 
$multiline_expected_outputs" 3
@@ -118,6 +137,7 @@ proc handle-multiline-outputs { text } {
set start_line [lindex $entry 0]
set end_line   [lindex $entry 1]
set multiline  [lindex $entry 2]
+   set maybe_x[lindex $entry 3]
verbose "  multiline: $multiline" 3
set rexp [_build_multiline_regex $multiline $index]
verbose "rexp: ${rexp}" 4
@@ -130,10 +150,10 @@ proc handle-multiline-outputs { text } {
 
# Use "regsub" to attempt to prune the pattern from $text
if {[regsub -line $rexp $text "" text]} {
-   # Success; the multiline pattern was pruned.
-   pass "$title was found: \"$escaped_regex\""
+   # The multiline pattern was pruned.
+   ${maybe_x}pass "$title was found: \"$escaped_regex\""
} else {
-   fail "$title not found: \"$escaped_regex\""
+   ${maybe_x}fail "$title not found: \"$escaped_regex\""
}
 
set index [expr $index + 1]
-- 
1.8.5.3



[PATCH 2/2] Support string locations for C++ in -Wformat (PR c++/56856)

2018-10-04 Thread David Malcolm
-Wformat in the C++ FE doesn't work as well as it could:
(a) it doesn't report precise locations within the string literal, and
(b) it doesn't underline arguments for those arguments !CAN_HAVE_LOCATION_P,
despite having location wrapper nodes.

For example:

  Wformat-ranges.C:32:10: warning: format '%s' expects argument of type 
'char*', but argument 2 has type 'int' [-Wformat=]
  32 |   printf("hello %s", 42);
 |  ^~

(a) is due to not wiring up the langhook for extracting substring
locations.

This patch uses the one in c-family; it also fixes string literal
parsing so that it records string concatenations (needed for
extracting substring locations from concatenated strings).

(b) is due to the call to maybe_constant_value here:
   fargs[j] = maybe_constant_value (argarray[j]);
within build_over_call.

The patch fixes this by building a vec of location_t values when
calling check_function_arguments.
I attempted to eliminate the maybe_constant_value call here, but
it's needed by e.g. check_function_sentinel for detecting NULL,
and that code is in "c-family", so it can't simply call into
maybe_constant_value (which is in "cp").

With this patch, the output for the above example is improved to:

  Wformat-ranges.C:32:18: warning: format '%s' expects argument of type 
'char*', but argument 2 has type 'int' [-Wformat=]
  32 |   printf("hello %s", 42);
 | ~^   ~~
 |  |   |
 |  |   int
 |  char*
 | %d

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu (on top of
the multiline.exp patch).

OK for trunk?

gcc/cp/ChangeLog:
PR c++/56856
* call.c (build_over_call): Build a vec of locations of the
arguments before the call to maybe_constant_value, and pass to
check_function_arguments.
* cp-lang.c (LANG_HOOKS_GET_SUBSTRING_LOCATION): Define as
c_get_substring_location.
* parser.c (cp_parser_string_literal): Capture string
concatenation locations.

gcc/ChangeLog:
PR c++/56856
* input.c (expand_location_to_spelling_point): Add param "aspect"
and use rather than hardcoding LOCATION_ASPECT_CARET.
(get_substring_ranges_for_loc): Handle the case of a single token
within a macro expansion.
* input.h (expand_location_to_spelling_point): Add "aspect" param,
defaulting to LOCATION_ASPECT_CARET.

gcc/testsuite/ChangeLog:
PR c++/56856
* g++.dg/ext/builtin4.C: Set expected location for warning to the
correct location within the format string.
* g++.dg/plugin/plugin.exp (plugin_test_list): Add the plugin and
files for testing locations within string literal locations from
the C frontend.
* g++.dg/warn/Wformat-method.C: New test.
* g++.dg/warn/Wformat-pr71863.C: New test.
* g++.dg/warn/Wformat-ranges-c++11.C: New test.
* g++.dg/warn/Wformat-ranges.C: New test, based on
gcc.dg/format/diagnostic-ranges.c.
* gcc.dg/plugin/diagnostic-test-string-literals-1.c
(test_multitoken_macro): Generalize expected output to work with
both C and C++.
* gcc.dg/plugin/diagnostic-test-string-literals-2.c
(test_stringified_token_1): Likewise.
(test_stringified_token_3): Likewise.
---
 gcc/cp/call.c  |   4 +-
 gcc/cp/cp-lang.c   |   3 +
 gcc/cp/parser.c|  14 +-
 gcc/input.c|  44 ++-
 gcc/input.h|   5 +-
 gcc/testsuite/g++.dg/ext/builtin4.C|   2 +-
 gcc/testsuite/g++.dg/plugin/plugin.exp |   5 +
 gcc/testsuite/g++.dg/warn/Wformat-method.C |  40 +++
 gcc/testsuite/g++.dg/warn/Wformat-pr71863.C|  33 ++
 gcc/testsuite/g++.dg/warn/Wformat-ranges-c++11.C   |  18 +
 gcc/testsuite/g++.dg/warn/Wformat-ranges.C | 374 +
 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c|  11 +-
 .../plugin/diagnostic-test-string-literals-1.c |   6 +-
 .../plugin/diagnostic-test-string-literals-2.c |   4 +-
 14 files changed, 537 insertions(+), 26 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wformat-method.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wformat-pr71863.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wformat-ranges-c++11.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wformat-ranges.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index b2ca667..747f837 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8188,6 +8188,7 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
 {
   tree *fargs = (!nargs ? argarray
: (tree *) alloca (nargs * sizeof (tree)));
+  auto_vec arglocs (nargs);
   for (j = 

Re: [PATCH] Fix typo, fixing PR87465

2018-10-04 Thread Jeff Law
On 10/4/18 2:58 AM, Richard Biener wrote:
> On Wed, 3 Oct 2018, Christophe Lyon wrote:
> 
>> On Mon, 1 Oct 2018 at 11:36, Richard Biener  wrote:
>>>
>>>
>>> The following typo-fix happens to fix a --param max-peel-branches limit
>>> caused missed peeling.  The typo is present everywhere, the missed
>>> peeling is a regression from GCC 7.
>>>
>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>>>
>>> I'm not really considering to backport this anywhere.  Note the
>>> testcase isn't fully optimized on the tree level because
>>> DOM doesn't figure out the trivial CSE after SLP vectorizes the
>>> array init (we have PRs for that issue).
>>>
>>> Richard.
>>>
>>> 2018-10-01  Richard Biener  
>>>
>>> PR tree-optimization/87465
>>> * tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Fix typo
>>> causing branch miscounts.
>>>
>>> * gcc.dg/tree-ssa/cunroll-15.c: New testcase.
>>>
>>
>> Hi,
>>
>> The new testcase fails on arm and powerpc.
> 
> Bah, it's the old issue that DOM doesn't handle CSE of
> 
>   in = *.LC0;
>   _52 = in[1];
>   _51 = in[0];
Right.  It doesn't work hard at all to hash alternate forms of memory
references to the same hash value.  It's one of the reasons I'd like to
move away from its hash routines to a standardized VN like you've
implemented elsewhere as I believe your VN code does a better job at
finding these alternate forms and generating a consistent VN for them.


> 
> I see no other way of xfailing for a specific list of targets then.
> 
> Please feel free to amend if new ones pop up.
> 
> I'm committing the following.  I really wonder if we could go
> for a cheap non-iterating FRE now (and whether late jump threading
> from DOM is important).
It's not hugely important, though with the desire to kill the VRP
threading, the late DOM threading increases slightly in importance.

jeff


[PATCH] Redirect call within specific target attribute among MV clones (PR ipa/82625).

2018-10-04 Thread Martin Liška
Hi.

When having a pair of target clones where foo calls bar, if the target
attribute are equal we can redirect the call and not use ifunc dispatcher.

Patch survives regression tests on x86_64-linux-gnu.
Ready for trunk?

Martin

gcc/ChangeLog:

2018-10-04  Martin Liska  

PR ipa/82625
* multiple_target.c (redirect_to_specific_clone): New function.
(ipa_target_clone): Use it.
* tree-inline.c: Fix comment.

gcc/testsuite/ChangeLog:

2018-10-04  Martin Liska  

PR ipa/82625
* g++.dg/ext/pr82625.C: New test.
---
 gcc/multiple_target.c  | 51 ++
 gcc/testsuite/g++.dg/ext/pr82625.C | 36 +
 gcc/tree-inline.c  |  2 +-
 3 files changed, 88 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/pr82625.C


diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index a610d9a3345..2d892f201c5 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -451,6 +451,54 @@ expand_target_clones (struct cgraph_node *node, bool definition)
   return ret;
 }
 
+/* When NODE is a target clone, consider all callees and redirect
+   to a clone with equal target attributes.  That prevents multiple
+   multi-versioning dispatches and a call-chain can be optimized.  */
+
+static void
+redirect_to_specific_clone (cgraph_node *node)
+{
+  cgraph_function_version_info *fv = node->function_version ();
+  if (fv == NULL)
+return;
+
+  tree attr_target = lookup_attribute ("target", DECL_ATTRIBUTES (node->decl));
+  if (attr_target == NULL_TREE)
+return;
+
+  /* We need to remember NEXT_CALLER as it could be modified in the loop.  */
+  for (cgraph_edge *e = node->callees; e ; e = e->next_callee)
+{
+  cgraph_function_version_info *fv2 = e->callee->function_version ();
+  if (!fv2)
+	continue;
+
+  tree attr_target2 = lookup_attribute ("target",
+	DECL_ATTRIBUTES (e->callee->decl));
+
+  /* Function is not calling proper target clone.  */
+  if (!attribute_list_equal (attr_target, attr_target2))
+	{
+	  while (fv2->prev != NULL)
+	fv2 = fv2->prev;
+
+	  /* Try to find a clone with equal target attribute.  */
+	  for (; fv2 != NULL; fv2 = fv2->next)
+	{
+	  cgraph_node *callee = fv2->this_node;
+	  attr_target2 = lookup_attribute ("target",
+	   DECL_ATTRIBUTES (callee->decl));
+	  if (attribute_list_equal (attr_target, attr_target2))
+		{
+		  e->redirect_callee (callee);
+		  e->redirect_call_stmt_to_callee ();
+		  break;
+		}
+	}
+	}
+}
+}
+
 static unsigned int
 ipa_target_clone (void)
 {
@@ -464,6 +512,9 @@ ipa_target_clone (void)
   for (unsigned i = 0; i < to_dispatch.length (); i++)
 create_dispatcher_calls (to_dispatch[i]);
 
+  FOR_EACH_FUNCTION (node)
+redirect_to_specific_clone (node);
+
   return 0;
 }
 
diff --git a/gcc/testsuite/g++.dg/ext/pr82625.C b/gcc/testsuite/g++.dg/ext/pr82625.C
new file mode 100644
index 000..47bd2df1104
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/pr82625.C
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+__attribute__ ((target ("default")))
+static unsigned foo(const char *buf, unsigned size) {
+  return 1;
+}
+
+__attribute__ ((target ("avx")))
+static unsigned foo(const char *buf, unsigned size) {
+  return 2;
+}
+
+__attribute__ ((target ("default")))
+unsigned bar() {
+  char buf[4096];
+  unsigned acc = 0;
+  for (int i = 0; i < sizeof(buf); i++) {
+acc += foo([i], 1);
+  }
+  return acc;
+}
+
+__attribute__ ((target ("avx")))
+unsigned bar() {
+  char buf[4096];
+  unsigned acc = 0;
+  for (int i = 0; i < sizeof(buf); i++) {
+acc += foo([i], 1);
+  }
+  return acc;
+}
+
+/* { dg-final { scan-tree-dump-times "return 4096;" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "return 8192;" 1 "optimized" } } */
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index ff8ee8ce78f..913425394e0 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2631,7 +2631,7 @@ copy_loops (copy_body_data *id,
 }
 }
 
-/* Call cgraph_redirect_edge_call_stmt_to_callee on all calls in BB */
+/* Call redirect_call_stmt_to_callee on all calls in BB.  */
 
 void
 redirect_all_calls (copy_body_data * id, basic_block bb)



Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Szabolcs Nagy
On 04/10/18 14:25, Jeff Law wrote:
> On 10/4/18 6:37 AM, Richard Biener wrote:
>> On Thu, Oct 4, 2018 at 2:06 PM Joseph Myers  wrote:
>>>
>>> On Thu, 4 Oct 2018, Richard Biener wrote:
>>>
 The other issue is that we're treating -fno-math-errno as disabling
 errno handling in general (not only for math functions).  That would
>>>
>>> Also, is it treated as meaning the math functions *do not set errno* (so a
>>> load of errno (or of anything that might be errno) before a math function
>>> call could be moved after the call)?
>>
>> Yes.  That's implicit in math functions becoming CONST.
> Right.  But I think this highlights the other issue here.  Namely that
> if the user's code does read errno, we don't know if the intent was to
> read errno from some other call before the math bits or the most recent
> math call.
> 

if (math_errhandling & MATH_ERRNO) == 0 a math
function may still set errno.

it can only set it if there was an error though,
not arbitrarily clobber it, but this means that

(1) reordering errno access around math calls is
invalid even with -fno-math-errno.

(2) user code reading errno after a math call can
assume it to be unmodified in exactly the same
cases with -fmath-errno and -fno-math-errno.

> So again, as much as I'd like to move to a world where we don't have to
> worry about errno for math functions, I'm not comfortable making that
> change, though I won't object if someone  with a stronger standards
> background says "just do it".
> 
> jeff
> 



Re: [PATCH][IRA,LRA] Fix PR87466, all pseudos live across setjmp are spilled

2018-10-04 Thread Peter Bergner
On 10/3/18 10:08 PM, Jeff Law wrote:
> Given the desire for reload to die and the difficulties testing this
> change on a reload target, I don't think mirroring this optimization in
> reload is necessary.

Right, it's good to have incentives to move away from reload to LRA.
Maybe we could add a few random unneeded conflicts when using reload
to push targets a little more? ;-)


> OK for the trunk.

Thank you everyone for your reviews!  Patch committed.

Peter



Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Jeff Law
On 10/4/18 6:37 AM, Richard Biener wrote:
> On Thu, Oct 4, 2018 at 2:06 PM Joseph Myers  wrote:
>>
>> On Thu, 4 Oct 2018, Richard Biener wrote:
>>
>>> The other issue is that we're treating -fno-math-errno as disabling
>>> errno handling in general (not only for math functions).  That would
>>
>> Also, is it treated as meaning the math functions *do not set errno* (so a
>> load of errno (or of anything that might be errno) before a math function
>> call could be moved after the call)?
> 
> Yes.  That's implicit in math functions becoming CONST.
Right.  But I think this highlights the other issue here.  Namely that
if the user's code does read errno, we don't know if the intent was to
read errno from some other call before the math bits or the most recent
math call.

So again, as much as I'd like to move to a world where we don't have to
worry about errno for math functions, I'm not comfortable making that
change, though I won't object if someone  with a stronger standards
background says "just do it".

jeff


Re: [PATCH][i386] Fix scalar costing in ix86_add_stmt_cost

2018-10-04 Thread Richard Biener
On Thu, 4 Oct 2018, Richard Biener wrote:

> 
> The following fixes bogus differences in scalar iteration cost
> computed by the vectorizer when comparing 128 and 256 bit vectorizations.
> This was caused by the hook looking at the vector types mode even
> for kind == scalar_stmt and thus returning vector operation costs
> for things like add or negate.
> 
> This is most noticable on targets where ix86_vec_cost applies
> multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
> Zen and Bulldozer).  But it will also adjust the actual costs
> everywhere where scalar and vector costs diverge.
> 
> The adjustments done for Silvermont also look broken since they
> are applied to both scalar and vector cost which makes it mostly
> a no-op.  The patch adjusts it to only apply for vector costing
> but of course I didn't benchmark this and the magic number of 1.7
> may not make sense after this fix so I'm happy to leave that
> out - Kirill?  Should I just go ahead with that for trunk (we can
> revert or adjust if autotesters pick up regressions on your side)?
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

So this revealed

FAIL: gcc.target/i386/vect-double-2.c scan-tree-dump-times vect 
"Vectorized loops: 1" 1

and looking closer makes it clear that for TARGET_BONNELL we
do not want to penaltize vector loads or stores but just
arithmetic(?), so the kind == vector_stmt was more correct.

Similar logic might apply to the TARGET_SILVERMONT and friends
so I'll leave that alone.

Thus please consider the patch with just the first hunk.

Thanks,
Richard.

> Richard.
> 
> 2018-10-04   Richard Biener  
> 
>   * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
>   is asked for initialize mode to the component mode of the
>   vector type.  Make sure Bonnel and esp. other Atom cost
>   adjustments are not done for scalar cost estimates.
> 
> Index: gcc/config/i386/i386.c
> ===
> --- gcc/config/i386/i386.c(revision 264837)
> +++ gcc/config/i386/i386.c(working copy)
> @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun
>  {
>unsigned *cost = (unsigned *) data;
>unsigned retval = 0;
> +  bool scalar_p
> += (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
>  
>tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
>int stmt_cost = - 1;
>  
>bool fp = false;
> -  machine_mode mode = TImode;
> +  machine_mode mode = scalar_p ? SImode : TImode;
>  
>if (vectype != NULL)
>  {
>fp = FLOAT_TYPE_P (vectype);
>mode = TYPE_MODE (vectype);
> +  if (scalar_p)
> + mode = TYPE_MODE (TREE_TYPE (vectype));
>  }
>  
>if ((kind == vector_stmt || kind == scalar_stmt)
> @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun
>  stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>  
>/* Penalize DFmode vector operations for Bonnell.  */
> -  if (TARGET_BONNELL && kind == vector_stmt
> +  if (TARGET_BONNELL && !scalar_p
>&& vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
>  stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
>  
> @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun
>   for Silvermont as it has out of order integer pipeline and can execute
>   2 scalar instruction per tick, but has in order SIMD pipeline.  */
>if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
> -   || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt)
> +   || TARGET_TREMONT || TARGET_INTEL)
> +  && !scalar_p && stmt_info && stmt_info->stmt)
>  {
>tree lhs_op = gimple_get_lhs (stmt_info->stmt);
>if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Joseph Myers
On Thu, 4 Oct 2018, Richard Biener wrote:

> So it means that GCC may elide errno setting (by inlining a function for 
> example) but otherwise assume nothing else?

Yes, on systems where libm functions do in fact set errno (but it can also 
assume that only the actual errno variable is clobbered, not e.g. some 
random int in the middle of your array).  (It would be possible to have 
libm function variants that explicitly guarantee not to set errno, if 
there are particular functions for which having such variants is useful to 
e.g. the vectorizer, but it would of course be a property of the libm 
implementation whether such variants are available and for what 
functions.)

There could be a variant option that does mean "libm functions are not 
called with arguments that make them set errno", and -ffast-math could 
imply that (-ffinite-math-only doesn't, because e.g. underflow is 
consistent with -ffinite-math-only but can result in errno setting).

Or you could say -fno-math-errno is that variant option, but the thing 
enabled by default is the weaker option meaning "may or may not set errno" 
(but if Darwin libm indeed never sets errno, the stronger version should 
be enabled by default there).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, OpenACC] Fortran "declare create"/allocate support for OpenACC

2018-10-04 Thread Julian Brown
On Sun, 23 Sep 2018 10:48:52 +0200
Bernhard Reutner-Fischer  wrote:

> On Sat, 22 Sep 2018 at 00:32, Julian Brown 
> wrote:
> 
> @@ -6218,13 +6221,20 @@ add_clause (gfc_symbol *sym, gfc_omp_map_op
> map_op) {
>gfc_omp_namelist *n;
> 
> +  if (!module_oacc_clauses)
> +module_oacc_clauses = gfc_get_omp_clauses ();
> +
> +  if (sym->backend_decl == NULL)
> +gfc_get_symbol_decl (sym);
> +
> +  for (n = module_oacc_clauses->lists[OMP_LIST_MAP]; n != NULL; n =
> n->next)
> +if (n->sym->backend_decl == sym->backend_decl)
> +  return;
> +
> 
> Didn't look too close, but should this throw an error instead of
> silently returning, or was the error emitted earlier?

The purpose of this fragment seems not to have been to do with error
reporting at all, but rather to do with de-duplicating symbols that
are listed (once) in clauses of "declare" directives in module blocks.
Variables that are listed twice are diagnosed elsewhere.

As for why the de-duplication is necessary, it seems to be because of
the way that modules are instantiated in programs and in subroutines.
E.g. in declare-allocatable-1.f90, we have something along the lines of:

  module vars
implicit none
integer, parameter :: n = 100
real*8, allocatable :: b(:)
   !$acc declare create (b)
  end module vars

  program test
use vars
...
  end program test

  subroutine sub1
use vars
...
  end subroutine sub1

  subroutine sub2
use vars
...
  end subroutine sub2

The function find_module_oacc_declare_clauses is called for each of
'test', 'sub1' and 'sub2'. But in trans-decl.c:finish_oacc_declare, the
new declare clauses are only attached to the namespace for a FL_PROGRAM
(i.e. 'test'), not for the subroutines. The module_oacc_clauses global
variable is reset only after moving the clauses to a FL_PROGRAM's
namespace, otherwise it accumulates.

Hence, with the above code, we'd scan 'test', find declare clauses, and
attach them to the namespace for 'test'. We'd then reset
module_oacc_clauses.

Then, we'd scan 'sub1', and accumulate declare clauses from 'vars' into
a fresh module_oacc_clauses.

Then we'd scan 'sub2', and accumulate declare clauses from 'vars'
again: this is why the de-duplication in the patch seemed to be
necessary.

This seems wrong to me though, and admits the possibility of clauses
instantiated in a subroutine "leaking" into a subsequent program block.
As a tentative fix, I've tried resetting module_oacc_clauses before
each time the find_module_oacc_declare_clauses traversal takes place,
and removing the de-duplication code.

This seems to work fine for the current tests in the testsuite, but I
wonder the reason that things weren't done like like that to start
with? The code dates back to 2015 (by James Norris):

https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02367.html

> Furthermore the testcase uses "call abort" which is non-standard.
> We recently moved to "STOP n" in the testsuite, please adjust the new
> testcases accordingly.

Fixed. Re-tested with offloading to NVPTX and bootstrapped. OK?

Thank you,

Julian

ChangeLog

gcc/
* omp-low.c (scan_sharing_clauses): Update handling of OpenACC declare
create, declare copyin and declare deviceptr to have local lifetimes.
(convert_to_firstprivate_int): Handle pointer types.
(convert_from_firstprivate_int): Likewise.  Create local storage for
the values being pointed to.  Add new orig_type argument.
(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
Add orig_type argument to convert_from_firstprivate_int call.
Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT.  Don't privatize
firstprivate VLAs.
* tree-pretty-print.c (dump_omp_clause): Handle
GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.

gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
OMP_MAP_DECLARE_DEALLOCATE.
(gfc_omp_clauses): Add update_allocatable.
* trans-array.c (gfc_array_allocate): Call
gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
attribute set.
* trans-decl.c (add_attributes_to_decl): Enable lowering of OpenACC
declare create, declare copyin and declare deviceptr clauses.
(find_module_oacc_declare_clauses): Relax oacc_declare_create to
OMP_MAP_ALLOC, and oacc_declare_copyin to OMP_MAP_TO, in order to
match OpenACC 2.5 semantics.
(finish_oacc_declare): Reset module_oacc_clauses before scanning each
namespace.
* trans-openmp.c (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
allocatable scalar decls.  Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
clauses.
(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
for allocatable scalar data clauses inside acc update directives.

[PATCH] i386: Correct _mm512_mask3_fmaddsub_round_pd

2018-10-04 Thread H.J. Lu
Define _mm512_mask3_fmaddsub_round_pd with
__builtin_ia32_vfmaddsubpd512_mask, instead of
__builtin_ia32_vfmaddpd512_mask.

PR target/87517
* config/i386/avx512fintrin.h (_mm512_mask_fmaddsub_round_pd):
Defined with __builtin_ia32_vfmaddsubpd512_mask.
---
 gcc/config/i386/avx512fintrin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 550c2b60886..f9bb4f3be49 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -3833,7 +3833,7 @@ _mm512_maskz_fnmsub_round_ps (__mmask16 __U, __m512 __A, 
__m512 __B,
 (__m512d)__builtin_ia32_vfmaddsubpd512_mask(A, B, C, -1, R)
 
 #define _mm512_mask_fmaddsub_round_pd(A, U, B, C, R)\
-(__m512d)__builtin_ia32_vfmaddpd512_mask(A, B, C, U, R)
+(__m512d)__builtin_ia32_vfmaddsubpd512_mask(A, B, C, U, R)
 
 #define _mm512_mask3_fmaddsub_round_pd(A, B, C, U, R)   \
 (__m512d)__builtin_ia32_vfmaddsubpd512_mask3(A, B, C, U, R)
-- 
2.17.1



Re: [PATCH] Introduce libgcov.so run-time library (PR gcov-profile/84107).

2018-10-04 Thread Martin Liška
On 9/18/18 10:00 AM, Martin Liška wrote:
> On 9/17/18 1:39 PM, Martin Liška wrote:
>> On 9/12/18 4:48 PM, Richard Biener wrote:
>>> On Wed, Sep 12, 2018 at 11:27 AM Martin Liška  wrote:

 On 9/11/18 5:08 PM, Alexander Monakov wrote:
> On Tue, 11 Sep 2018, Martin Liška wrote:
>> I've discussed the topic with Alexander on the Cauldron and we hoped
>> that the issue with unique __gcov_root can be handled with DECL_COMMON 
>> in each DSO.
>> Apparently this approach does not work as all DSOs in an executable have 
>> eventually
>> just a single instance of the __gcov_root, which is bad.
>
> No, that happens only if they have default visibility.  Emitting them with
> "hidden" visibility solves this.

 Ah, ok, thanks. So theoretically that way should work.

>
>> So that I returned back to catch the root of the failure. It shows that 
>> even with
>> -Wl,--dynamic-list-data __gcov_indirect_call_callee point to a different 
>> variable
>> among the DSOs. The reason is that the variable is TLS:
>
> (no, that the variable is TLS is not causing the bug; the issue is 
> libraries
> carry public definitions of just one or both variables depending on if 
> they
> have indirect calls or not, and the library state becomes "diverged" when
> one variable gets interposed while the other does not)

 I see, I'm attaching patch that does that. I can confirm your test-case 
 works
 fine w/o -Wl,--dynamic-list-data.

 I'm wondering if it will work as well with dlopen/dlsym machinery? Or now
 the linker flag will be needed?

>
>> That said I would like to go this direction. I would be happy to receive
>> a feedback. Then I'll test the patch.
>
> Hm, this TLS change is attacking the issue from a wrong direction.  What I
> suggested on the Cauldron as a simple and backportable way of solving this
> is to consolidate the two TLS variables into one TLS struct, which is then
> either interposed or not as a whole.

 Got it, current I prefer the solution.
>>>
>>> Sounds good.  I notice the code had this before but...
>>>
>>> +  TREE_PUBLIC (ic_tuple_var) = 1;
>>> +  DECL_EXTERNAL (ic_tuple_var) = 1;
>>> +  TREE_STATIC (ic_tuple_var) = 1;
>>>
>>> that's one flag too much?!  I suggest to drop DECL_EXTERNAL.
>>
>> I've done that. I'm sending updated version of the patch.
>> I'm currently testing the patch. Ready after it finishes tests?
>>
>> Martin
>>
>>>
>>> Richard.
>>>
 Martin

>
> Alexander
>

>>
> 
> Hi.
> 
> This is tested version where DECL_EXTERNAL is needed for LTO partitioning
> to work properly. Note that the previous variables emitted in tree-profile.c 
> were
> also DECL_EXTERNAL.
> 
> Patch survives tests on ppcl64-linux-gnu.
> 
> Martin
> 

Ok.

I'm sending final version of the patch that I'm going to install.
I removed TREE_STATIC and also removed few places where ptr_type_node
can replace build_pointer_type (void_type_node).

Martin
>From 3aa0d5a783bcef796d2dee09ff788d736ab0672d Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 12 Sep 2018 11:24:59 +0200
Subject: [PATCH] Fix divergence in indirect profiling (PR gcov-profile/84107).

gcc/ChangeLog:

2018-09-17  Martin Liska  

	PR gcov-profile/84107
	* tree-profile.c (init_ic_make_global_vars):
	Remove ic_void_ptr_var and ic_gcov_type_ptr_var.
	Come up with new ic_tuple* variables.  Emit
	__gcov_indirect_call{,_topn} variables.
	(gimple_gen_ic_profiler): Access the variable
	and emit gimple.
	(gimple_gen_ic_func_profiler): Access
	__gcov_indirect_call.callee field.
	(gimple_init_gcov_profiler): Use ptr_type_node.
	* value-prof.c (gimple_ic): Use ptr_type_node.

libgcc/ChangeLog:

2018-09-17  Martin Liska  

	PR gcov-profile/84107
	* libgcov-profiler.c (__gcov_indirect_call):
	Change type to indirect_call_tuple.
	(struct indirect_call_tuple): New struct.
	(__gcov_indirect_call_topn_profiler): Change type.
	(__gcov_indirect_call_profiler_v2): Use the new
	variables.
	* libgcov.h (struct indirect_call_tuple): New struct
	definition.
---
 gcc/tree-profile.c| 84 +--
 gcc/value-prof.c  |  7 ++--
 libgcc/libgcov-profiler.c | 25 
 libgcc/libgcov.h  |  9 +
 4 files changed, 66 insertions(+), 59 deletions(-)

diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index f96bd4b9704..d8f2a3b1ba4 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -53,6 +53,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "tree-pretty-print.h"
+#include "langhooks.h"
+#include "stor-layout.h"
 
 static GTY(()) tree gcov_type_node;
 static GTY(()) tree tree_interval_profiler_fn;
@@ -64,9 +66,9 @@ static GTY(()) tree tree_ior_profiler_fn;
 static GTY(()) tree tree_time_profiler_counter;
 
 
-static GTY(()) tree ic_void_ptr_var;
-static GTY(()) tree 

Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Richard Biener
On Thu, Oct 4, 2018 at 2:06 PM Joseph Myers  wrote:
>
> On Thu, 4 Oct 2018, Richard Biener wrote:
>
> > The other issue is that we're treating -fno-math-errno as disabling
> > errno handling in general (not only for math functions).  That would
>
> Also, is it treated as meaning the math functions *do not set errno* (so a
> load of errno (or of anything that might be errno) before a math function
> call could be moved after the call)?

Yes.  That's implicit in math functions becoming CONST.

>  Because the proper meaning is simply
> *may not set errno* (so you shouldn't be able to move an errno load across
> a call like that, since the call might still happen to clobber it - of
> course if it happens to be a call that GCC inlines and eliminates any
> possibility of errno setting that way, e.g. sqrt on many platforms, or a
> platform where libm never sets errno at all, then such a move is safe).

So it means that GCC may elide errno setting (by inlining a function
for example)
but otherwise assume nothing else?  That makes it quite useless to optimizers.
For example the vectorizer would need to know pow() doesn't clobber
random memory
and thus causes dependency issues.  It's of course also inconsistent with
the fact that we make those math functions CONST with -fno-math-errno.

I see that the user documentation suggests what you say but I'll note that
implementation-wise we've done what I explained for a long time now.

Richard.

> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH] Optimize sin(atan(x)), take 2

2018-10-04 Thread Marc Glisse

On Wed, 3 Oct 2018, Jeff Law wrote:


+/* Simplify cos(atan(x)) -> 1 / sqrt(x*x + 1). */
+ (for coss (COS)
+  atans (ATAN)
+  sqrts (SQRT)
+  (simplify
+   (coss (atans:s @0))
+   (rdiv {build_one_cst (type);}
+   (sqrts (plus (mult @0 @0) {build_one_cst (type);})

Don't we have the same kinds of issues with the x*x in here?  As X gets
large it will overflow, but the result is going to be approaching zero.
So we might be able to use a similar trick here.


(I have not read the patch)

The similar trick would say that for X large, this is the same as 1/abs(X) 
I guess. Note that it may be simpler to generate a call to hypot (C99).


A related remark would be: with the precision of double, for x>=cst (about 
2^53), atan(x) is constant, within .5 ulp of pi/2 if the math library is 
super precise (which it probably isn't). Returning 0 for its cos (what 
happens if x*x gives +Inf) is thus completely fine unless you are using 
crlibm, but then you wouldn't use flag_unsafe_math_optimizations. The main 
issue is that if we have -ffast-math, we have -ffinite-math-only, and we 
are possibly introducing an infinity as intermediate result...


--
Marc Glisse


Re: [patch] Fix PR tree-optimization/86659

2018-10-04 Thread Richard Biener
On Tue, Oct 2, 2018 at 12:07 PM Eric Botcazou  wrote:
>
> > so the fix is to simply not optimize here?
>
> Yes, we cannot turn a BIT_FIELD_REF with rev-storage into a VIEW_CONVERT_EXPR.
>
> > Are there correctness issues with the patterns we have for rev-storage?  But
> > then some cases are let through via the realpart/imagpart/v_c_e case?  I
> > suppose we should never see REF_REVERSE_STORAGE_ORDER on refs operating on
> > registers (SSA_NAMEs or even is_gimple_reg()s)?
>
> REF_REVERSE_STORAGE_ORDER is set on BIT_FIELD_REF and MEM_REF only.

OK.

So I wonder why it is necessary to track 'reverse' in gimple_match_op
at all given we
bail out without optimizing as far as I can see?

That is, isn't the gimple_simplify hunk, adjusted to check
REF_REVERSE_STORAGE_ORDER
instead of res_op->reverse, enough to fix the issue?

> > Note that I think you need to adjust the GENERIC side as well, for example
> >
> > [...]
> >
> > where we lose the reverse-storage attribute as well.  You'd probably
> > have to cut out rev-storage refs somewhere in genmatch.c.
>
> I don't think that's necessary since we never fold top-level BIT_FIELD_REFs at
> the GENERIC level so far.  And given that it's impossible to control the top
> level in match.pd, I'd rather not try something impossible unless rforced to
> do it.  This ought to be controlled directly from fold-const.c, if need be.

OK, fine with me.

Richard.

> --
> Eric Botcazou


Re: [PATCH, LRA] Never reload fixed form constraints memory operand

2018-10-04 Thread Thomas Preudhomme
My bad, I used dg-cmp-results without verbosity which didn't show the
problem It starts to show it with -v -v, I'm not sure why. I'll have a
look right now and revert by the end of today if I cannot come up with
a fix. Does that sound ok?

Best regards,

Thomas
On Thu, 4 Oct 2018 at 12:31, H.J. Lu  wrote:
>
> On Wed, Oct 3, 2018 at 8:12 PM Vladimir Makarov  wrote:
> >
> > On 10/03/2018 12:47 PM, Thomas Preudhomme wrote:
> > > Best regards,
> > >
> > > Thomas
> > >
> > > never_reload_fixed_address_operand.patch
> > >
> > >
> > >  From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
> > > From: Thomas Preud'homme
> > > Date: Thu, 27 Sep 2018 09:50:12 +0100
> > > Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
> > >   operand
> > >
> > > Hi,
> > >
> > > The unconditional reload of address operand for recognized instruction
> > > in process_address_1 prevent the patch for fixing "PR85434: Address of
> > > stack protector guard spilled to stack on ARM" proposed at [1]. The code
> > > in this patch attempt to control which registers are used to make PIC
> > > access but the reload performed by process_address_1 will use generic
> > > PIC access. This patch removes the test for the instruction to be
> > > unrecognized to do the reload, thus always avoiding to reload address
> > > operand for fixed constraints (such as "X" used in the patch).
> > >
> > > [1]https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html
> > >
> > > ChangeLog entry is as follows:
> > >
> > > *** gcc/ChangeLog ***
> > >
> > > 2018-10-03  Thomas Preud'homme
> > >
> > >   * lra-constraints.c (process_address_1): Bail out for all
> > >   satisfied fixed constraints.
> > >
> > > Testing: Successfully bootstrapped and regtested on:
> > > - arm-linux-gnueabihf (both Arm and Thumb2 mode)
> > > - aarch64-linux-gnu
> > > - x86_64-linux-gnu
> > > - i386-linux-gnu
> > > - sparc64-linux-gnu (gcc202)
> > > - powerpc64le-linux-gnu (gcc112)
> > >
> > > Is this ok for trunk?
> > >
> > OK. Thank you for testing all these targets, Thomas.
> >
>
> This caused:
>
> FAIL: gcc.target/i386/pr83317.c (internal compiler error)
> FAIL: gcc.target/i386/pr83317.c (test for excess errors)
>
> [hjl@gnu-4 gcc]$
> /export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
> -B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c
>  -m32   -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
> -fdiagnostics-color=never   -O1 -fPIC -msse2 -mfpmath=sse -S -o
> pr83317.s
> during RTL pass: reload
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:
> In function \u2018foo\u2019:
> /export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:21:1:
> internal compiler error: in lra_eliminate_reg_if_possible, at
> lra-eliminations.c:1393
> 0xea4875 lra_eliminate_reg_if_possible(rtx_def**)
> /export/gnu/import/git/sources/gcc/gcc/lra-eliminations.c:1393
> 0xe8a94c address_eliminator
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:362
> 0xe8aaf7 satisfies_memory_constraint_p
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:401
> 0xe8f947 process_alt_operands
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:2248
> 0xe93d1f curr_insn_transform
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:3861
> 0xe975f2 lra_constraints(bool)
> /export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:4878
> 0xe7c458 lra(_IO_FILE*)
> /export/gnu/import/git/sources/gcc/gcc/lra.c:2446
> 0xe111ff do_reload
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5469
> 0xe116f2 execute
> /export/gnu/import/git/sources/gcc/gcc/ira.c:5653
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> [hjl@gnu-4 gcc]$
>
>
> --
> H.J.


Re: [PATCH][RFC][i386] Change sminmax reduction patterns

2018-10-04 Thread Richard Biener
On Thu, 4 Oct 2018, Richard Biener wrote:

> 
> This tries to apply the same trick to sminmax reduction patterns
> as for the reduc_plus_scal ones, namely reduce %zmm -> %ymm -> %xmm
> first.  On a microbenchmark this improves performance on Zen
> by ~30% for AVX2 and on Skylake-SP by ~10% for AVX512 (for AVX2
> there's no measurable difference).
> 
> I guess I'm mostly looking for feedback on the approach I took
> in not rewriting ix86_expand_reduc but instead "recurse" on the
> expanders as well as the need to define recursion stops for
> SSE modes previously not covered.
> 
> I'll throw this on a bootstrap & regtest on x86_64-unknown-linux-gnu
> later.
> 
> Any comments sofar?  Writing .md patterns is new for me ;)

Btw, ICC does the same for AVX2 but for KNL (the only one I
could force it to use AVX512) it does

vextractf64x4 $1, %zmm4, %ymm0  #5.15 c1
vmaxpd%ymm4, %ymm0, %ymm1   #5.15 c3
valignq   $3, %zmm1, %zmm1, %zmm16  #5.15 c7
valignq   $2, %zmm1, %zmm1, %zmm17  #5.15 c7
valignq   $1, %zmm1, %zmm1, %zmm18  #5.15 c9
vmaxsd%xmm18, %xmm17, %xmm2 #5.15 c13
vmaxsd%xmm1, %xmm16, %xmm3  #5.15 c15
vmaxsd%xmm3, %xmm2, %xmm4   #5.15 c15

I suppose that pipelines a bit better at the expense of using more
registers (but I wonder why it doesn't use the VL %ymm encodings
for the valignq instructions).  Similar tricks would be possible
for AVX2 but ICC doesn't use them, possibly because the required
shuffles have higher latency or lower throughput.

Richard.

> Thanks,
> Richard.
> 
> 2018-10-04  Richard Biener  
> 
>   * config/i386/sse.md (reduc__scal_): Split
>   into part reducing to half width and recursing and
>   SSE2 vector variant doing the final reduction with
>   ix86_expand_reduc.
> 
> Index: gcc/config/i386/sse.md
> ===
> --- gcc/config/i386/sse.md(revision 264837)
> +++ gcc/config/i386/sse.md(working copy)
> @@ -2544,11 +2544,29 @@ (define_expand "reduc_plus_scal_v4sf"
>  })
>  
>  ;; Modes handled by reduc_sm{in,ax}* patterns.
> +(define_mode_iterator REDUC_SSE_SMINMAX_MODE
> +  [(V4SF "TARGET_SSE") (V2DF "TARGET_SSE")
> +   (V2DI "TARGET_SSE") (V4SI "TARGET_SSE") (V8HI "TARGET_SSE")
> +   (V16QI "TARGET_SSE")])
> +
> +(define_expand "reduc__scal_"
> +  [(smaxmin:REDUC_SSE_SMINMAX_MODE
> + (match_operand: 0 "register_operand")
> + (match_operand:REDUC_SSE_SMINMAX_MODE 1 "register_operand"))]
> +  ""
> +{
> +  rtx tmp = gen_reg_rtx (mode);
> +  ix86_expand_reduc (gen_3, tmp, operands[1]);
> +  emit_insn (gen_vec_extract (operands[0], tmp,
> + const0_rtx));
> +  DONE;
> +})
> +
>  (define_mode_iterator REDUC_SMINMAX_MODE
>[(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
> (V8SI "TARGET_AVX2") (V4DI "TARGET_AVX2")
> (V8SF "TARGET_AVX") (V4DF "TARGET_AVX")
> -   (V4SF "TARGET_SSE") (V64QI "TARGET_AVX512BW")
> +   (V64QI "TARGET_AVX512BW")
> (V32HI "TARGET_AVX512BW") (V16SI "TARGET_AVX512F")
> (V8DI "TARGET_AVX512F") (V16SF "TARGET_AVX512F")
> (V8DF "TARGET_AVX512F")])
> @@ -2559,10 +2577,12 @@ (define_expand "reduc__scal_
>   (match_operand:REDUC_SMINMAX_MODE 1 "register_operand"))]
>""
>  {
> -  rtx tmp = gen_reg_rtx (mode);
> -  ix86_expand_reduc (gen_3, tmp, operands[1]);
> -  emit_insn (gen_vec_extract (operands[0], tmp,
> - const0_rtx));
> +  rtx tmp = gen_reg_rtx (mode);
> +  emit_insn (gen_vec_extract_hi_ (tmp, operands[1]));
> +  rtx tmp2 = gen_reg_rtx (mode);
> +  emit_insn (gen_3
> +(tmp2, tmp, gen_lowpart (mode, operands[1])));
> +  emit_insn (gen_reduc__scal_ (operands[0], 
> tmp2));
>DONE;
>  })
>  
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Joseph Myers
On Thu, 4 Oct 2018, Richard Biener wrote:

> The other issue is that we're treating -fno-math-errno as disabling
> errno handling in general (not only for math functions).  That would

Also, is it treated as meaning the math functions *do not set errno* (so a 
load of errno (or of anything that might be errno) before a math function 
call could be moved after the call)?  Because the proper meaning is simply 
*may not set errno* (so you shouldn't be able to move an errno load across 
a call like that, since the call might still happen to clobber it - of 
course if it happens to be a call that GCC inlines and eliminates any 
possibility of errno setting that way, e.g. sqrt on many platforms, or a 
platform where libm never sets errno at all, then such a move is safe).

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH][RFC][i386] Change sminmax reduction patterns

2018-10-04 Thread Richard Biener


This tries to apply the same trick to sminmax reduction patterns
as for the reduc_plus_scal ones, namely reduce %zmm -> %ymm -> %xmm
first.  On a microbenchmark this improves performance on Zen
by ~30% for AVX2 and on Skylake-SP by ~10% for AVX512 (for AVX2
there's no measurable difference).

I guess I'm mostly looking for feedback on the approach I took
in not rewriting ix86_expand_reduc but instead "recurse" on the
expanders as well as the need to define recursion stops for
SSE modes previously not covered.

I'll throw this on a bootstrap & regtest on x86_64-unknown-linux-gnu
later.

Any comments sofar?  Writing .md patterns is new for me ;)

Thanks,
Richard.

2018-10-04  Richard Biener  

* config/i386/sse.md (reduc__scal_): Split
into part reducing to half width and recursing and
SSE2 vector variant doing the final reduction with
ix86_expand_reduc.

Index: gcc/config/i386/sse.md
===
--- gcc/config/i386/sse.md  (revision 264837)
+++ gcc/config/i386/sse.md  (working copy)
@@ -2544,11 +2544,29 @@ (define_expand "reduc_plus_scal_v4sf"
 })
 
 ;; Modes handled by reduc_sm{in,ax}* patterns.
+(define_mode_iterator REDUC_SSE_SMINMAX_MODE
+  [(V4SF "TARGET_SSE") (V2DF "TARGET_SSE")
+   (V2DI "TARGET_SSE") (V4SI "TARGET_SSE") (V8HI "TARGET_SSE")
+   (V16QI "TARGET_SSE")])
+
+(define_expand "reduc__scal_"
+  [(smaxmin:REDUC_SSE_SMINMAX_MODE
+ (match_operand: 0 "register_operand")
+ (match_operand:REDUC_SSE_SMINMAX_MODE 1 "register_operand"))]
+  ""
+{
+  rtx tmp = gen_reg_rtx (mode);
+  ix86_expand_reduc (gen_3, tmp, operands[1]);
+  emit_insn (gen_vec_extract (operands[0], tmp,
+   const0_rtx));
+  DONE;
+})
+
 (define_mode_iterator REDUC_SMINMAX_MODE
   [(V32QI "TARGET_AVX2") (V16HI "TARGET_AVX2")
(V8SI "TARGET_AVX2") (V4DI "TARGET_AVX2")
(V8SF "TARGET_AVX") (V4DF "TARGET_AVX")
-   (V4SF "TARGET_SSE") (V64QI "TARGET_AVX512BW")
+   (V64QI "TARGET_AVX512BW")
(V32HI "TARGET_AVX512BW") (V16SI "TARGET_AVX512F")
(V8DI "TARGET_AVX512F") (V16SF "TARGET_AVX512F")
(V8DF "TARGET_AVX512F")])
@@ -2559,10 +2577,12 @@ (define_expand "reduc__scal_
  (match_operand:REDUC_SMINMAX_MODE 1 "register_operand"))]
   ""
 {
-  rtx tmp = gen_reg_rtx (mode);
-  ix86_expand_reduc (gen_3, tmp, operands[1]);
-  emit_insn (gen_vec_extract (operands[0], tmp,
-   const0_rtx));
+  rtx tmp = gen_reg_rtx (mode);
+  emit_insn (gen_vec_extract_hi_ (tmp, operands[1]));
+  rtx tmp2 = gen_reg_rtx (mode);
+  emit_insn (gen_3
+(tmp2, tmp, gen_lowpart (mode, operands[1])));
+  emit_insn (gen_reduc__scal_ (operands[0], tmp2));
   DONE;
 })
 


[C++ PATCH] String concatenation is a thing

2018-10-04 Thread Nathan Sidwell
We've had string literal concatenation since we dropped K build 
support.  Let's use it to make things more readable.


Applying to trunk.

nathan
--
Nathan Sidwell
2018-10-04  Nathan Sidwell  

	* lang-specs.h: Use string contatenation, not line splicing.

Index: lang-specs.h
===
--- lang-specs.h	(revision 264838)
+++ lang-specs.h	(working copy)
@@ -40,29 +40,31 @@ along with GCC; see the file COPYING3.
   {".tcc", "@c++-header", 0, 0, 0},
   {".hh",  "@c++-header", 0, 0, 0},
   {"@c++-header",
-"%{E|M|MM:cc1plus -E %(cpp_options) %2 %(cpp_debug_options)}\
- %{!E:%{!M:%{!MM:\
-   %{save-temps*|no-integrated-cpp:cc1plus -E\
-		%(cpp_options) %2 -o %{save-temps*:%b.ii} %{!save-temps*:%g.ii} \n}\
-  cc1plus %{save-temps*|no-integrated-cpp:-fpreprocessed %{save-temps*:%b.ii} %{!save-temps*:%g.ii}}\
-	  %{!save-temps*:%{!no-integrated-cpp:%(cpp_unique_options)}}\
-	%(cc1_options) %2\
-	%{!fsyntax-only:%{!S:-o %g.s} \
-	%{!fdump-ada-spec*:%{!o*:--output-pch=%i.gch}\
-			   %W{o*:--output-pch=%*}}%V",
+  "%{E|M|MM:cc1plus -E %(cpp_options) %2 %(cpp_debug_options)}"
+  "%{!E:%{!M:%{!MM:"
+  "  %{save-temps*|no-integrated-cpp:cc1plus -E"
+  "	   %(cpp_options) %2 -o %{save-temps*:%b.ii} %{!save-temps*:%g.ii} \n}"
+  "  cc1plus %{save-temps*|no-integrated-cpp:-fpreprocessed"
+  " 	   %{save-temps*:%b.ii} %{!save-temps*:%g.ii}}"
+  "  %{!save-temps*:%{!no-integrated-cpp:%(cpp_unique_options)}}"
+  "  %(cc1_options) %2"
+  "  %{!fsyntax-only:%{!S:-o %g.s}"
+  "%{!fdump-ada-spec*:%{!o*:--output-pch=%i.gch}"
+  "  %W{o*:--output-pch=%*}}%V",
  CPLUSPLUS_CPP_SPEC, 0, 0},
   {"@c++",
-"%{E|M|MM:cc1plus -E %(cpp_options) %2 %(cpp_debug_options)}\
- %{!E:%{!M:%{!MM:\
-   %{save-temps*|no-integrated-cpp:cc1plus -E\
-		%(cpp_options) %2 -o %{save-temps*:%b.ii} %{!save-temps*:%g.ii} \n}\
-  cc1plus %{save-temps*|no-integrated-cpp:-fpreprocessed %{save-temps*:%b.ii} %{!save-temps*:%g.ii}}\
-	  %{!save-temps*:%{!no-integrated-cpp:%(cpp_unique_options)}}\
-	%(cc1_options) %2\
-   %{!fsyntax-only:%(invoke_as)",
- CPLUSPLUS_CPP_SPEC, 0, 0},
+  "%{E|M|MM:cc1plus -E %(cpp_options) %2 %(cpp_debug_options)}"
+  "%{!E:%{!M:%{!MM:"
+  "  %{save-temps*|no-integrated-cpp:cc1plus -E"
+  "	   %(cpp_options) %2 -o %{save-temps*:%b.ii} %{!save-temps*:%g.ii} \n}"
+  "  cc1plus %{save-temps*|no-integrated-cpp:-fpreprocessed"
+  " 	   %{save-temps*:%b.ii} %{!save-temps*:%g.ii}}"
+  "  %{!save-temps*:%{!no-integrated-cpp:%(cpp_unique_options)}}"
+  "	 %(cc1_options) %2"
+  "  %{!fsyntax-only:%(invoke_as)",
+  CPLUSPLUS_CPP_SPEC, 0, 0},
   {".ii", "@c++-cpp-output", 0, 0, 0},
   {"@c++-cpp-output",
-   "%{!M:%{!MM:%{!E:\
-cc1plus -fpreprocessed %i %(cc1_options) %2\
-%{!fsyntax-only:%(invoke_as)", 0, 0, 0},
+  "%{!E:%{!M:%{!MM:"
+  "  cc1plus -fpreprocessed %i %(cc1_options) %2"
+  "  %{!fsyntax-only:%(invoke_as)", 0, 0, 0},


Re: libgo patch committed: Update to 1.11 release

2018-10-04 Thread Rainer Orth
Hi Ian,

> Thanks for the note.  This patch should fix the problem.  Bootstrapped
> and ran Go tests on x86_64-pc-linux-gnu, both normally and pretending
> that the system had no memmem.  Committed to mainline.

I've now verified the patch on i386-pc-solaris2.10: bootstrap restored.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, LRA] Never reload fixed form constraints memory operand

2018-10-04 Thread H.J. Lu
On Wed, Oct 3, 2018 at 8:12 PM Vladimir Makarov  wrote:
>
> On 10/03/2018 12:47 PM, Thomas Preudhomme wrote:
> > Best regards,
> >
> > Thomas
> >
> > never_reload_fixed_address_operand.patch
> >
> >
> >  From 2831d8b886d92513c2d30d43a6a989d2bbd0ceee Mon Sep 17 00:00:00 2001
> > From: Thomas Preud'homme
> > Date: Thu, 27 Sep 2018 09:50:12 +0100
> > Subject: [PATCH] [PATCH, LRA] Never reload fixed form constraints memory
> >   operand
> >
> > Hi,
> >
> > The unconditional reload of address operand for recognized instruction
> > in process_address_1 prevent the patch for fixing "PR85434: Address of
> > stack protector guard spilled to stack on ARM" proposed at [1]. The code
> > in this patch attempt to control which registers are used to make PIC
> > access but the reload performed by process_address_1 will use generic
> > PIC access. This patch removes the test for the instruction to be
> > unrecognized to do the reload, thus always avoiding to reload address
> > operand for fixed constraints (such as "X" used in the patch).
> >
> > [1]https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01838.html
> >
> > ChangeLog entry is as follows:
> >
> > *** gcc/ChangeLog ***
> >
> > 2018-10-03  Thomas Preud'homme
> >
> >   * lra-constraints.c (process_address_1): Bail out for all
> >   satisfied fixed constraints.
> >
> > Testing: Successfully bootstrapped and regtested on:
> > - arm-linux-gnueabihf (both Arm and Thumb2 mode)
> > - aarch64-linux-gnu
> > - x86_64-linux-gnu
> > - i386-linux-gnu
> > - sparc64-linux-gnu (gcc202)
> > - powerpc64le-linux-gnu (gcc112)
> >
> > Is this ok for trunk?
> >
> OK. Thank you for testing all these targets, Thomas.
>

This caused:

FAIL: gcc.target/i386/pr83317.c (internal compiler error)
FAIL: gcc.target/i386/pr83317.c (test for excess errors)

[hjl@gnu-4 gcc]$
/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-debug/build-x86_64-linux/gcc/
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c
 -m32   -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers
-fdiagnostics-color=never   -O1 -fPIC -msse2 -mfpmath=sse -S -o
pr83317.s
during RTL pass: reload
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:
In function \u2018foo\u2019:
/export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.target/i386/pr83317.c:21:1:
internal compiler error: in lra_eliminate_reg_if_possible, at
lra-eliminations.c:1393
0xea4875 lra_eliminate_reg_if_possible(rtx_def**)
/export/gnu/import/git/sources/gcc/gcc/lra-eliminations.c:1393
0xe8a94c address_eliminator
/export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:362
0xe8aaf7 satisfies_memory_constraint_p
/export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:401
0xe8f947 process_alt_operands
/export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:2248
0xe93d1f curr_insn_transform
/export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:3861
0xe975f2 lra_constraints(bool)
/export/gnu/import/git/sources/gcc/gcc/lra-constraints.c:4878
0xe7c458 lra(_IO_FILE*)
/export/gnu/import/git/sources/gcc/gcc/lra.c:2446
0xe111ff do_reload
/export/gnu/import/git/sources/gcc/gcc/ira.c:5469
0xe116f2 execute
/export/gnu/import/git/sources/gcc/gcc/ira.c:5653
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
[hjl@gnu-4 gcc]$


-- 
H.J.


Re: [PATCH] dumpfile.c: use prefixes other that 'note: ' for MSG_{OPTIMIZED_LOCATIONS|MISSED_OPTIMIZATION}

2018-10-04 Thread Rainer Orth
Hi Andreas,

> On Okt 01 2018, David Malcolm  wrote:
>
>> Is there a link to the .log files somewhere so I can see the precise
>> messages in question?  (e.g. are they all "loop versioned for
>> vectorization to enhance alignment"?).
>
> Yes, they are all the same message.

just for the record: I'm seeing exactly the same on Solaris/SPARC (32
and 64-bit).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Make associative container operators inline friend

2018-10-04 Thread Jonathan Wakely

On 04/10/18 07:37 +0200, François Dumont wrote:

Here is the cleanup version.

    * include/bits/stl_tree.h
    (_Rb_tree_iterator<>::operator==): Make inline friend.
    (_Rb_tree_iterator<>::operator!=): Likewise.
    (_Rb_tree_const_iterator<>::operator==): Likewise.
    (_Rb_tree_const_iterator<>::operator!=): Likewise.
    (operator==(const _Rb_tree_iterator<>&,
    const _Rb_tree_const_iterator&)): Remove.
    (operator!=(const _Rb_tree_iterator<>&,
    const _Rb_tree_const_iterator&)): Remove.
    (operator==(const _Rb_tree<>&, const _Rb_tree<>&)): Make inline friend.
    (operator<(const _Rb_tree<>&, const _Rb_tree<>&)): Likewise.
    (operator!=(const _Rb_tree<>&, const _Rb_tree<>&)): Likewise and 
deprecate.


Nice.


    (operator>(const _Rb_tree<>&, const _Rb_tree<>&)): Likewise.
    (operator<=(const _Rb_tree<>&, const _Rb_tree<>&)): Likewise.
    (operator>=(const _Rb_tree<>&, const _Rb_tree<>&)): Likewise.
    * include/debug/map.h (map<>::erase(const_iterator, const_iterator)):
    Compare __victim with _Base::cend().
    * include/debug/multimap.h (multimap<>::erase(const_iterator, 
const_iterator)):

    Likewise.
    * include/debug/set.h (set<>::erase(const_iterator, const_iterator)):
    Compare __victim with _Base::cend().
    * include/debug/multiset.h (multiset<>::erase(const_iterator, 
const_iterator)):

    Likewise.

Tested under Linux x86_64.

Ok to commit ?


Yes, thanks very much.




Re: PR85787: Extend malloc_candidate_p to handle multiple phis.

2018-10-04 Thread Prathamesh Kulkarni
On Fri, 14 Sep 2018 at 22:49, Jeff Law  wrote:
>
> On 8/28/18 5:26 AM, Prathamesh Kulkarni wrote:
> > H
> > The attached patch extends malloc_candidate_p to handle multiple phis.
> > There's a lot of noise in the patch because I moved most of
> > malloc_candidate_p into
> > new function malloc_candidate_p_1. The only real change is following hunk:
> >
> > +   gimple *arg_def = SSA_NAME_DEF_STMT (arg);
> > +   if (is_a (arg_def))
> > + {
> > +   if (!malloc_candidate_p_1 (fun, arg, phi, ipa))
> > +   DUMP_AND_RETURN ("nested phi fail")
> > +   continue;
> > + }
> > +
> >
> > Which checks recursively that the phi argument is used only within
> > comparisons against 0
> > and the phi.
> >
> > Bootstrapped+tested on x86_64-unknown-linux-gnu.
> > OK to commit ?
> >
> > Thanks,
> > Prathamesh
> >
> >
> > pr85787-1.txt
> >
> > 2018-08-28  Prathamesh Kulkarni  
> >
> >   PR tree-optimization/85787
> >   * ipa-pure-const.c (malloc_candidate_p_1): Move most of 
> > malloc_candidate_p
> >   into this function and add support for detecting multiple phis.
> >   (DUMP_AND_RETURN): Move from malloc_candidate_p into top-level macro.
> >
> OK.
Hi Jeff,
Thanks for the review, and sorry for the delay.
Committed it as r264838 after re-bootstrap+test on x86_64 with
--enable-languages=all,ada,go.

Thanks,
Prathamesh
> jeff
>
>


Re: [PATCH 2/3] Add -fopt-info-internals

2018-10-04 Thread Richard Biener
On Fri, 28 Sep 2018, David Malcolm wrote:

> This patch introduces a verbosity level to dump messages:
> "user-facing" vs "internals".
> 
> By default, messages at the top-level dump scope are "user-facing",
> whereas those that are in nested scopes are implicitly "internals",
> and are filtered out by -fopt-info unless a new "-internals" sub-option
> of "-fopt-info" is supplied (intended purely for use by GCC developers).
> Dumpfiles are unaffected by the change.
> 
> Given that the vectorizer is the only subsystem using AUTO_DUMP_SCOPE
> (via DUMP_VECT_SCOPE), this only affects the vectorizer.
> 
> Filtering out these implementation-detail messages goes a long way
> towards making -fopt-info-vec-all more accessible to advanced end-users;
> the follow-up patch restores the most pertinent missing details.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?

What happens for flags & MSG_ALL_PRIORITIES == 0?  That is,
do we require two bits for the priority at all?

It's an implementation detail of course, generally I like this
(maybe not so much conflating with nesting, but ...).

Richard.

> gcc/ChangeLog:
>   * doc/invoke.texi (-fopt-info): Document new "internals"
>   sub-option.
>   * dump-context.h (dump_context::apply_dump_filter_p): New decl.
>   * dumpfile.c (dump_options): Update for renaming of MSG_ALL to
>   MSG_ALL_KINDS.
>   (optinfo_verbosity_options): Add "internals".
>   (kind_as_string): Update for renaming of MSG_ALL to MSG_ALL_KINDS.
>   (dump_context::apply_dump_filter_p): New member function.
>   (dump_context::dump_loc): Use apply_dump_filter_p rather than
>   explicitly masking the dump_kind.
>   (dump_context::begin_scope): Increment the scope depth first.  Use
>   apply_dump_filter_p rather than explicitly masking the dump_kind.
>   (dump_context::emit_item): Use apply_dump_filter_p rather than
>   explicitly masking the dump_kind.
>   (dump_dec): Likewise.
>   (dump_hex): Likewise.
>   (dump_switch_p_1): Default to MSG_ALL_PRIORITIES.
>   (opt_info_switch_p_1): Default to MSG_PRIORITY_USER_FACING.
>   (opt_info_switch_p): Update handling of default
>   MSG_OPTIMIZED_LOCATIONS to cope with default of
>   MSG_PRIORITY_USER_FACING.
>   (dump_basic_block): Use apply_dump_filter_p rather than explicitly
>   masking the dump_kind.
>   (selftest::test_capture_of_dump_calls): Update test_dump_context
>   instances to use MSG_ALL_KINDS | MSG_PRIORITY_USER_FACING rather
>   than MSG_ALL.  Generalize scope test to be run at all four
>   combinations of with/without MSG_PRIORITY_USER_FACING and
>   MSG_PRIORITY_INTERNALS, adding examples of explicit priority
>   for each of the two values.
>   * dumpfile.h (enum dump_flag): Add comment about the MSG_* flags.
>   Rename MSG_ALL to MSG_ALL_KINDS.  Add MSG_PRIORITY_USER_FACING,
>   MSG_PRIORITY_INTERNALS, and MSG_ALL_PRIORITIES, updating the
>   values for TDF_COMPARE_DEBUG and TDF_ALL_VALUES.
>   (AUTO_DUMP_SCOPE): Add a note to the comment about the interaction
>   with MSG_PRIORITY_*.
>   * tree-vect-loop-manip.c (vect_loop_versioning): Mark versioning
>   dump messages as MSG_PRIORITY_USER_FACING.
>   * tree-vectorizer.h (DUMP_VECT_SCOPE): Add a note to the comment
>   about the interaction with MSG_PRIORITY_*.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/plugin/dump-1.c: Update expected output for test_scopes
>   due to "-internals" not being selected.
>   * gcc.dg/plugin/dump-2.c: New test, based on dump-1.c, with
>   "-internals" added to re-enable the output from test_scopes.
>   * gcc.dg/plugin/plugin.exp (plugin_test_list): Add dump-2.c.
> ---
>  gcc/doc/invoke.texi|  31 -
>  gcc/dump-context.h |   2 +
>  gcc/dumpfile.c | 248 
> +++--
>  gcc/dumpfile.h |  41 +-
>  gcc/testsuite/gcc.dg/plugin/dump-1.c   |  10 +-
>  gcc/testsuite/gcc.dg/plugin/dump-2.c   |  30 
>  gcc/testsuite/gcc.dg/plugin/plugin.exp |   3 +-
>  gcc/tree-vect-loop-manip.c |   6 +-
>  gcc/tree-vectorizer.h  |   6 +-
>  9 files changed, 279 insertions(+), 98 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/dump-2.c
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 5c95f67..ab8d9b7e8 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -14207,14 +14207,21 @@ Controls optimization dumps from various 
> optimization passes. If the
>  @samp{-} separated option keywords to select the dump details and
>  optimizations.  
>  
> -The @var{options} can be divided into two groups: options describing the
> -verbosity of the dump, and options describing which optimizations
> -should be included. The options from both the groups can be freely
> -mixed as they are non-overlapping. 

Re: [PATCH 1/3] Fix -fopt-info for plugin passes

2018-10-04 Thread Richard Biener
On Fri, 28 Sep 2018, David Malcolm wrote:

> Attempts to dump via -fopt-info from a plugin pass fail, due
> to the dfi->alt_state for such passes never being set.
> 
> This is because the -fopt-info options were being set up per-pass
> during option-parsing (via gcc::dump_manager::opt_info_enable_passes),
> but this data was not retained or used it for passes created later
> (for plugins and target-specific passes).
> 
> This patch fixes the issue by storing the -fopt-info options into
> gcc::dump_manager, refactoring the dfi-setup code out of
> opt_info_enable_passes, and reusing it for such passes, fixing the
> issue.  The patch adds a demo plugin to test that dumping from a
> plugin works.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> OK for trunk?

OK.

Richard.

> gcc/ChangeLog:
>   * dumpfile.c (gcc::dump_manager::dump_manager): Initialize new
>   fields.
>   (gcc::dump_manager::~dump_manager): Free m_optinfo_filename.
>   (gcc::dump_manager::register_pass): New member function, adapted
>   from loop body in gcc::pass_manager::register_pass, adding a
>   call to update_dfi_for_opt_info.
>   (gcc::dump_manager::opt_info_enable_passes): Store the
>   -fopt-info options into the new fields.  Move the loop
>   bodies into...
>   (gcc::dump_manager::update_dfi_for_opt_info): ...this new member
>   function.
>   * dumpfile.h (struct opt_pass): New forward decl.
>   (gcc::dump_manager::register_pass): New decl.
>   (gcc::dump_manager::update_dfi_for_opt_info): New decl.
>   (class gcc::dump_manager): Add fields "m_optgroup_flags",
>   "m_optinfo_flags", and "m_optinfo_filename".
>   * passes.c (gcc::pass_manager::register_pass): Move all of the
>   dump-handling code to gcc::dump_manager::register_pass.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/plugin/dump-1.c: New test.
>   * gcc.dg/plugin/dump_plugin.c: New test plugin.
>   * gcc.dg/plugin/plugin.exp (plugin_test_list): Add the above.
> ---
>  gcc/dumpfile.c| 120 +
>  gcc/dumpfile.h|  12 +++
>  gcc/passes.c  |  30 ++-
>  gcc/testsuite/gcc.dg/plugin/dump-1.c  |  28 ++
>  gcc/testsuite/gcc.dg/plugin/dump_plugin.c | 143 
> ++
>  gcc/testsuite/gcc.dg/plugin/plugin.exp|   2 +
>  6 files changed, 275 insertions(+), 60 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/dump-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/dump_plugin.c
> 
> diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
> index d430ea3..d359e41 100644
> --- a/gcc/dumpfile.c
> +++ b/gcc/dumpfile.c
> @@ -177,12 +177,16 @@ gcc::dump_manager::dump_manager ():
>m_next_dump (FIRST_AUTO_NUMBERED_DUMP),
>m_extra_dump_files (NULL),
>m_extra_dump_files_in_use (0),
> -  m_extra_dump_files_alloced (0)
> +  m_extra_dump_files_alloced (0),
> +  m_optgroup_flags (OPTGROUP_NONE),
> +  m_optinfo_flags (TDF_NONE),
> +  m_optinfo_filename (NULL)
>  {
>  }
>  
>  gcc::dump_manager::~dump_manager ()
>  {
> +  free (m_optinfo_filename);
>for (size_t i = 0; i < m_extra_dump_files_in_use; i++)
>  {
>dump_file_info *dfi = _extra_dump_files[i];
> @@ -1512,6 +1516,50 @@ dump_flag_name (int phase) const
>return dfi->swtch;
>  }
>  
> +/* Handle -fdump-* and -fopt-info for a pass added after
> +   command-line options are parsed (those from plugins and
> +   those from backends).
> +
> +   Because the registration of plugin/backend passes happens after the
> +   command-line options are parsed, the options that specify single
> +   pass dumping (e.g. -fdump-tree-PASSNAME) cannot be used for new
> +   passes. Therefore we currently can only enable dumping of
> +   new passes when the 'dump-all' flags (e.g. -fdump-tree-all)
> +   are specified.  This is done here.
> +
> +   Similarly, the saved -fopt-info options are wired up to the new pass.  */
> +
> +void
> +gcc::dump_manager::register_pass (opt_pass *pass)
> +{
> +  gcc_assert (pass);
> +
> +  register_one_dump_file (pass);
> +
> +  dump_file_info *pass_dfi = get_dump_file_info (pass->static_pass_number);
> +  gcc_assert (pass_dfi);
> +
> +  enum tree_dump_index tdi;
> +  if (pass->type == SIMPLE_IPA_PASS
> +  || pass->type == IPA_PASS)
> +tdi = TDI_ipa_all;
> +  else if (pass->type == GIMPLE_PASS)
> +tdi = TDI_tree_all;
> +  else
> +tdi = TDI_rtl_all;
> +  const dump_file_info *tdi_dfi = get_dump_file_info (tdi);
> +  gcc_assert (tdi_dfi);
> +
> +  /* Check if dump-all flag is specified.  */
> +  if (tdi_dfi->pstate)
> +{
> +  pass_dfi->pstate = tdi_dfi->pstate;
> +  pass_dfi->pflags = tdi_dfi->pflags;
> +}
> +
> +  update_dfi_for_opt_info (pass_dfi);
> +}
> +
>  /* Finish a tree dump for PHASE. STREAM is the stream created by
> dump_begin.  */
>  
> @@ -1587,47 +1635,47 @@ opt_info_enable_passes (optgroup_flags_t 
> 

[C++ Patch] PR 71139 ("[concepts] ill-formed compound-requirement lacking a semicolon not rejected")

2018-10-04 Thread Paolo Carlini

Hi,

yesterday I didn't notice that we have a separate bug for a similar 
issue affecting cp_parser_compound_requirement. Tested x86_64-linux.


Thanks, Paolo.

PS: while working on these issues, I noticed that somewhere else in the 
concepts parsing bits we check the return value of cp_parser_require 
and, in case, immediately return error_mark_node. That doesn't seem a 
good idea - and isn't what we do for all the other cp_parser_require 
semicolon calls - because it normally degrades error recovery when the 
only bug in the code is the missing semicolon. I mean to further look 
into that...


///

/cp
2018-10-04  Paolo Carlini  

PR c++/71140
* parser.c (cp_parser_compound_requirement): Require a terminating
semicolon, per 7.5.7.3.

/testsuite
2018-10-04  Paolo Carlini  

PR c++/71140
* g++.dg/concepts/pr71139.C: New.
* g++.dg/concepts/pr67595.C: Adjust.
* g++.dg/concepts/diagnostic1.C: Likewise.
* g++.dg/concepts/disjunction1.C: Likewise.
Index: cp/parser.c
===
--- cp/parser.c (revision 264837)
+++ cp/parser.c (working copy)
@@ -26108,6 +26112,8 @@ cp_parser_compound_requirement (cp_parser *parser)
 return error_mark_node;
 }
 
+  cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
+
   return finish_compound_requirement (expr, type, noexcept_p);
 }
 
Index: testsuite/g++.dg/concepts/diagnostic1.C
===
--- testsuite/g++.dg/concepts/diagnostic1.C (revision 264837)
+++ testsuite/g++.dg/concepts/diagnostic1.C (working copy)
@@ -6,7 +6,7 @@ concept bool SameAs = __is_same_as(T, U);
 
 template 
 concept bool R1 = requires (T& t) {
-  { t.begin() } -> T
+  { t.begin() } -> T;
   { t.end() } -> SameAs;
 };
 
Index: testsuite/g++.dg/concepts/disjunction1.C
===
--- testsuite/g++.dg/concepts/disjunction1.C(revision 264837)
+++ testsuite/g++.dg/concepts/disjunction1.C(working copy)
@@ -29,7 +29,7 @@ template  concept bool Movable() {
 }
 template  int Swappable_ = requires { 0; };
 template  int Swappable();
-template  concept bool Dereferencable = requires{{0}};
+template  concept bool Dereferencable = requires{{0};};
 template  using RvalueReferenceType = decltype(0);
 template  int IsValueType;
 template  struct value_type;
@@ -39,7 +39,7 @@ requires IsValueType<
 _t>>;
 template  concept bool Readable() {
   return Movable() && DefaultConstructible() &&
- Dereferencable && requires{{0}};
+ Dereferencable && requires{{0};};
 }
 template  concept bool MoveWritable() {
   return Movable() && DefaultConstructible() &&
Index: testsuite/g++.dg/concepts/pr67595.C
===
--- testsuite/g++.dg/concepts/pr67595.C (revision 264837)
+++ testsuite/g++.dg/concepts/pr67595.C (working copy)
@@ -1,9 +1,9 @@
 // { dg-options "-std=c++17 -fconcepts" }
 
-template  concept bool allocatable = requires{{new X}->X * };
+template  concept bool allocatable = requires{{new X}->X *;};
 template  concept bool semiregular = allocatable;
 template  concept bool readable = requires{requires semiregular};
-template  int weak_input_iterator = requires{{0}->readable};
+template  int weak_input_iterator = requires{{0}->readable;};
 template  bool input_iterator{weak_input_iterator}; // { 
dg-warning "narrowing conversion" }
 template  bool forward_iterator{input_iterator};
 template  bool bidirectional_iterator{forward_iterator};
Index: testsuite/g++.dg/concepts/pr71139.C
===
--- testsuite/g++.dg/concepts/pr71139.C (nonexistent)
+++ testsuite/g++.dg/concepts/pr71139.C (working copy)
@@ -0,0 +1,8 @@
+// { dg-do compile { target c++14 } }
+// { dg-additional-options "-fconcepts" }
+
+template
+concept bool C = requires(T t) {
+{ +t }  // { dg-error "expected .\;. before .\}. token" }
+};
+static_assert(C, "");


[PATCH][i386] Fix scalar costing in ix86_add_stmt_cost

2018-10-04 Thread Richard Biener


The following fixes bogus differences in scalar iteration cost
computed by the vectorizer when comparing 128 and 256 bit vectorizations.
This was caused by the hook looking at the vector types mode even
for kind == scalar_stmt and thus returning vector operation costs
for things like add or negate.

This is most noticable on targets where ix86_vec_cost applies
multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
Zen and Bulldozer).  But it will also adjust the actual costs
everywhere where scalar and vector costs diverge.

The adjustments done for Silvermont also look broken since they
are applied to both scalar and vector cost which makes it mostly
a no-op.  The patch adjusts it to only apply for vector costing
but of course I didn't benchmark this and the magic number of 1.7
may not make sense after this fix so I'm happy to leave that
out - Kirill?  Should I just go ahead with that for trunk (we can
revert or adjust if autotesters pick up regressions on your side)?

Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

Richard.

2018-10-04   Richard Biener  

* config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
is asked for initialize mode to the component mode of the
vector type.  Make sure Bonnel and esp. other Atom cost
adjustments are not done for scalar cost estimates.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c  (revision 264837)
+++ gcc/config/i386/i386.c  (working copy)
@@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun
 {
   unsigned *cost = (unsigned *) data;
   unsigned retval = 0;
+  bool scalar_p
+= (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
 
   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost = - 1;
 
   bool fp = false;
-  machine_mode mode = TImode;
+  machine_mode mode = scalar_p ? SImode : TImode;
 
   if (vectype != NULL)
 {
   fp = FLOAT_TYPE_P (vectype);
   mode = TYPE_MODE (vectype);
+  if (scalar_p)
+   mode = TYPE_MODE (TREE_TYPE (vectype));
 }
 
   if ((kind == vector_stmt || kind == scalar_stmt)
@@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun
 stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
 
   /* Penalize DFmode vector operations for Bonnell.  */
-  if (TARGET_BONNELL && kind == vector_stmt
+  if (TARGET_BONNELL && !scalar_p
   && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
 stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
 
@@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun
  for Silvermont as it has out of order integer pipeline and can execute
  2 scalar instruction per tick, but has in order SIMD pipeline.  */
   if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
-   || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt)
+   || TARGET_TREMONT || TARGET_INTEL)
+  && !scalar_p && stmt_info && stmt_info->stmt)
 {
   tree lhs_op = gimple_get_lhs (stmt_info->stmt);
   if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)


Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-04 Thread Kyrill Tkachov

Hi Martin,

On 18/07/18 16:49, Martin Liška wrote:

Hi.

This patch improves aarch64 feature modifier hints.

May I please ask ARM folks to test the patch?
Thanks,
Martin



I've bootstrapped and tested it on aarch64-none-linux-gnu and I didn't see any 
problems.
I like the functionality! It is definitely an improvement to the user 
experience.

I'm not an aarch64 maintainer but I have some comments on the patch below.

Thanks,
Kyrill


gcc/ChangeLog:

2018-07-18  Martin Liska  

PR driver/83193
* common/config/aarch64/aarch64-common.c (aarch64_parse_extension):
Set invalid_extension when there's any.
(aarch64_get_all_extension_candidates): New function.
(aarch64_rewrite_selected_cpu): Pass NULL as new argument.
* config/aarch64/aarch64-protos.h 
(aarch64_get_all_extension_candidates):
Declare new function.
* config/aarch64/aarch64.c (aarch64_parse_arch): Record
invalid_feature.
(aarch64_parse_cpu): Likewise.
(aarch64_print_hint_for_feature_modifier): New.
(aarch64_validate_mcpu): Record invalid feature modifier
and print hint for it.
(aarch64_validate_march): Likewise.
(aarch64_handle_attr_arch): Likewise.
(aarch64_handle_attr_cpu): Likewise.
(aarch64_handle_attr_isa_flags): Likewise.

gcc/testsuite/ChangeLog:

2018-07-18  Martin Liska  

PR driver/83193
* gcc.target/aarch64/spellcheck_7.c: New test.
* gcc.target/aarch64/spellcheck_8.c: New test.
---
 gcc/common/config/aarch64/aarch64-common.c| 20 +-
 gcc/config/aarch64/aarch64-protos.h   |  4 +-
 gcc/config/aarch64/aarch64.c  | 67 +++
 .../gcc.target/aarch64/spellcheck_7.c | 11 +++
 .../gcc.target/aarch64/spellcheck_8.c | 12 
 5 files changed, 97 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_8.c




diff --git a/gcc/common/config/aarch64/aarch64-common.c 
b/gcc/common/config/aarch64/aarch64-common.c
index 292fb818705..c2994514004 100644
--- a/gcc/common/config/aarch64/aarch64-common.c
+++ b/gcc/common/config/aarch64/aarch64-common.c
@@ -175,7 +175,8 @@ static const struct arch_to_arch_name all_architectures[] =
aarch64_parse_opt_result describing the result.  */
 
 enum aarch64_parse_opt_result

-aarch64_parse_extension (const char *str, unsigned long *isa_flags)
+aarch64_parse_extension (const char *str, unsigned long *isa_flags,
+char **invalid_extension)
 {

Please document the new argument in the function comment. Same for all the 
other parsing functions that you extend in the patch.
In particular, I'd like to see a comment on who allocates the memory for the 
string and who is responsible for freeing it.

+/* Append all extension candidates and put them to CANDIDATES vector.  */
+

Given the implementation below, how about:
"Append all architecture extension candidates to the CANDIDATES vector." ?

+void
+aarch64_get_all_extension_candidates (auto_vec *candidates)
+{
+  const struct aarch64_option_extension *opt;
+  for (opt = all_extensions; opt->name != NULL; opt++)
+candidates->safe_push (opt->name);
+}
+



+
+/* Print a hint with a suggestion for a feature modifier name
+   that most closely resembles what the user passed in STR.  */
+
+void
+aarch64_print_hint_for_feature_modifier (const char *str)
+{


Elsewhere in the backend and this patch as well we call them extensions rather 
than feature modifiers.
Let's be consistent: aarch64_print_hint_for_extension would is my suggestion

+  auto_vec candidates;
+  aarch64_get_all_extension_candidates ();
+  char *s;
+  const char *hint = candidates_list_and_hint (str, s, candidates);
+  if (hint)
+inform (input_location, "valid arguments are: %s;"
+" did you mean %qs?", s, hint);
+  else
+inform (input_location, "valid arguments are: %s;", s);
+
+  XDELETEVEC (s);
+}
+

diff --git a/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c 
b/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
new file mode 100644
index 000..1350b865162
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-march=*" } { "" } } */

Another way that the extensions are used is with -mcpu. For example 
-mcpu=cortex-a57+crypto.
So you'll need to skip if -mcpu is given as well.
And we'll want a test for -mcpu error-checking as well.

+/* { dg-options "-march=armv8-a+typo" } */
+
+void
+foo ()
+{
+}
+



Re: [PATCH] Error about alias attribute with body definition (PR c/87483).

2018-10-04 Thread Jan Hubicka
> Hi.
> 
> That's new warnings that warns about ifunc having a function body.
> It provides following warning:
> 
> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c: In function ‘g’:
> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c:5:35: warning: 
> ‘alias’ attribute ignored because function is defined [-Wattributes]
> 5 | __attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
> "attribute ignored because function is defined" } */
>   |   ^
> 
> Hope I found proper place in cgraphunit.c for the warning.
> Patch survives bootstrap and tests on x86_64-linux-gnu.
> 
> Ready for trunk?

I think one can first define the function and later declare it again
with additional attribute.  I think it is better to take care of this
in process_function_and_variable_attributes

Honza
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-10-03  Martin Liska  
> 
>   PR c/87483
>   * cgraphunit.c (cgraph_node::finalize_function):
>   Print error for functions with alias attribute and
>   body.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-10-03  Martin Liska  
> 
>   PR c/87483
>   * gcc.dg/pr87483.c: New test.
> ---
>  gcc/cgraphunit.c   |  6 ++
>  gcc/testsuite/gcc.dg/pr87483.c | 16 
>  2 files changed, 22 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr87483.c
> 
> 

> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
> index c0baaeaefa4..40d6d348214 100644
> --- a/gcc/cgraphunit.c
> +++ b/gcc/cgraphunit.c
> @@ -488,6 +488,12 @@ cgraph_node::finalize_function (tree decl, bool 
> no_collect)
>if (!TREE_ASM_WRITTEN (decl))
>  (*debug_hooks->deferred_inline_function) (decl);
>  
> +  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl))
> +  && node->definition)
> +warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
> + "% attribute ignored"
> + " because function is defined");
> +
>if (!no_collect)
>  ggc_collect ();
>  
> diff --git a/gcc/testsuite/gcc.dg/pr87483.c b/gcc/testsuite/gcc.dg/pr87483.c
> new file mode 100644
> index 000..d3af8dfee5d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr87483.c
> @@ -0,0 +1,16 @@
> +/* PR c/87483 */
> +/* { dg-require-alias "" } */
> +
> +int f (void) { return 0; }
> +__attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
> "attribute ignored because function is defined" } */
> +__attribute__ ((alias ("f"))) int g2 ();
> +
> +int h (void)
> +{
> +  return g2 ();
> +}
> +
> +int main()
> +{
> +  return h();
> +}
> 



[PATCH] Error about alias attribute with body definition (PR c/87483).

2018-10-04 Thread Martin Liška
Hi.

That's new warnings that warns about ifunc having a function body.
It provides following warning:

/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c: In function ‘g’:
/home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87483.c:5:35: warning: 
‘alias’ attribute ignored because function is defined [-Wattributes]
5 | __attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning 
"attribute ignored because function is defined" } */
  |   ^

Hope I found proper place in cgraphunit.c for the warning.
Patch survives bootstrap and tests on x86_64-linux-gnu.

Ready for trunk?
Martin

gcc/ChangeLog:

2018-10-03  Martin Liska  

PR c/87483
* cgraphunit.c (cgraph_node::finalize_function):
Print error for functions with alias attribute and
body.

gcc/testsuite/ChangeLog:

2018-10-03  Martin Liska  

PR c/87483
* gcc.dg/pr87483.c: New test.
---
 gcc/cgraphunit.c   |  6 ++
 gcc/testsuite/gcc.dg/pr87483.c | 16 
 2 files changed, 22 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr87483.c


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index c0baaeaefa4..40d6d348214 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -488,6 +488,12 @@ cgraph_node::finalize_function (tree decl, bool no_collect)
   if (!TREE_ASM_WRITTEN (decl))
 (*debug_hooks->deferred_inline_function) (decl);
 
+  if (lookup_attribute ("alias", DECL_ATTRIBUTES (decl))
+  && node->definition)
+warning_at (DECL_SOURCE_LOCATION (node->decl), OPT_Wattributes,
+		"% attribute ignored"
+		" because function is defined");
+
   if (!no_collect)
 ggc_collect ();
 
diff --git a/gcc/testsuite/gcc.dg/pr87483.c b/gcc/testsuite/gcc.dg/pr87483.c
new file mode 100644
index 000..d3af8dfee5d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87483.c
@@ -0,0 +1,16 @@
+/* PR c/87483 */
+/* { dg-require-alias "" } */
+
+int f (void) { return 0; }
+__attribute__ ((alias ("f"))) int g () { return 1; } /* { dg-warning "attribute ignored because function is defined" } */
+__attribute__ ((alias ("f"))) int g2 ();
+
+int h (void)
+{
+  return g2 ();
+}
+
+int main()
+{
+  return h();
+}



Re: [PATCH] Fix typo, fixing PR87465

2018-10-04 Thread Richard Biener
On Wed, 3 Oct 2018, Christophe Lyon wrote:

> On Mon, 1 Oct 2018 at 11:36, Richard Biener  wrote:
> >
> >
> > The following typo-fix happens to fix a --param max-peel-branches limit
> > caused missed peeling.  The typo is present everywhere, the missed
> > peeling is a regression from GCC 7.
> >
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >
> > I'm not really considering to backport this anywhere.  Note the
> > testcase isn't fully optimized on the tree level because
> > DOM doesn't figure out the trivial CSE after SLP vectorizes the
> > array init (we have PRs for that issue).
> >
> > Richard.
> >
> > 2018-10-01  Richard Biener  
> >
> > PR tree-optimization/87465
> > * tree-ssa-loop-ivcanon.c (tree_estimate_loop_size): Fix typo
> > causing branch miscounts.
> >
> > * gcc.dg/tree-ssa/cunroll-15.c: New testcase.
> >
> 
> Hi,
> 
> The new testcase fails on arm and powerpc.

Bah, it's the old issue that DOM doesn't handle CSE of

  in = *.LC0;
  _52 = in[1];
  _51 = in[0];

I see no other way of xfailing for a specific list of targets then.

Please feel free to amend if new ones pop up.

I'm committing the following.  I really wonder if we could go
for a cheap non-iterating FRE now (and whether late jump threading
from DOM is important).

Richard.

2018-10-04  Richard Biener  

* gcc.dg/tree-ssa/cunroll-15.c: Add XFAILs for arm and powerpc.

Index: gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c  (revision 264835)
+++ gcc/testsuite/gcc.dg/tree-ssa/cunroll-15.c  (working copy)
@@ -19,4 +19,5 @@ int Test(void)
 /* { dg-final { scan-tree-dump-times "optimized:\[^\n\r\]*completely unrolled" 
2 "cunroll" } } */
 /* When SLP vectorization is enabled the following will fail because DOM
doesn't know how to deal with the vectorized initializer of in.  */
-/* { dg-final { scan-tree-dump "return 1;" "optimized" } } */
+/* DOM also doesn't know to CSE in[1] with in = *.LC0 thus the list of targets 
this fails.  */
+/* { dg-final { scan-tree-dump "return 1;" "optimized" { xfail { arm*-*-* 
powerpc-*-* } } } } */


Re: [PATCH] Optimize sin(atan(x)), take 2

2018-10-04 Thread Richard Biener
On Thu, Oct 4, 2018 at 6:39 AM Jeff Law  wrote:
>
> On 9/3/18 1:11 PM, Giuliano Augusto Faulin Belinassi wrote:
> > Fixed the issues pointed by the previous discussions. Closes PR86829.
> >
> > Adds substitution rules for sin(atan(x)) and cos(atan(x)), being
> > careful with overflow issues by constructing a assumed convergence
> > constant (see comment in real.c).
> >
> > 2018-09-03  Giuliano Belinassi 
> >
> > * match.pd: add simplification rules to sin(atan(x)) and cos(atan(x)).
> > * real.c: add code for assumed convergence constant to sin(atan(x)).
> > * real.h: allows the added code from real.c to be called externally.
> > * tree.c: add code for bulding nodes with the convergence constant.
> > * tree.h: allows the added code from tree.c to be called externally.

Look at examples to fix up your ChangeLog, you need to list functions
you add/modify.  I don't think you want the tree.[ch] forwarders, instead use

 (with
   {
  REAL_VALUE_TYPE rcst;
  build_sinatan_real (, type);
  tree tcst = build_real (type, cst);
   }

in the pattern then just use (cond (le (abs @0) { tcst; })

or even include the function here in full (well, maybe not, but maybe
include some of its comment to make it clear what kind of constant
we build here).

> > * sinatan-1.c: tests assumed convergence constant.
> > * sinatan-2.c: tests simplification rule.
> > * sinatan-3.c: likewise.
> >
> > There seems to be no broken tests in trunk that are related to this
> > modification.
> Pretty cool.
>
>
> >
> >
> > sinatanv2.patch
> >
> > Index: gcc/match.pd
> > ===
> > --- gcc/match.pd  (revisão 264058)
> > +++ gcc/match.pd  (cópia de trabalho)
> > @@ -4169,6 +4169,39 @@
> > (tans (atans @0))
> > @0)))
> >
> > + /* Simplify sin(atan(x)) -> x / sqrt(x*x + 1). */
> > + (for sins (SIN)
> > +  atans (ATAN)
> > +  sqrts (SQRT)
> > +  (simplify
> > +   (sins (atans:s @0))
> > +   (if (SCALAR_FLOAT_TYPE_P (type))
> > +(switch
> > + (if (types_match (type, float_type_node))
> > +  (cond (le (abs @0) { build_sinatan_cst (float_type_node); })
> > +   (rdiv @0 (sqrts (plus (mult @0 @0)
> > +   {build_one_cst (float_type_node);})))
> > +   (BUILT_IN_COPYSIGNF { build_one_cst (float_type_node); } @0)))
> > + (if (types_match (type, double_type_node))
> > +  (cond (le (abs @0) { build_sinatan_cst (double_type_node); })
> > +   (rdiv @0 (sqrts (plus (mult @0 @0)
> > +   {build_one_cst (double_type_node);})))
> > +   (BUILT_IN_COPYSIGN  { build_one_cst (double_type_node); } @0)))
> > + (if (types_match (type, long_double_type_node))
> > +  (cond (le (abs @0) { build_sinatan_cst (long_double_type_node); })
> > +   (rdiv @0 (sqrts (plus (mult @0 @0)
> > +   {build_one_cst (long_double_type_node);})))
> > +   (BUILT_IN_COPYSIGNL { build_one_cst (long_double_type_node); } 
> > @0)))
> So you don't want to build the constants as a float, double or long
> double.  Instead you want to build it as "type".  I think that should
> let you simplify this a bit.It turns into
>
>   (if (SCALAR_FLOAT_TYPE_P (type))
>(cond (le (abs @0) {build_sinatan_cst (type); })
>  (rdiv @0 (sqrts (plus (mult @0 @0)
>{build_one_cst (type); })))
>  (BUILT_IN_COPYSIGNL { build_one_cst (type); } @0
>
> Or something along those lines I think.  Richi is much better at
> match.pd stuff than I.

Yeah, you'd need to add copysigns (COPYSIGN) to the for iterator
and use that instead of BUILT_IN_COPYSIGNL of course.

Richard.

>
>
>
>
>
>
> > +
> > +/* Simplify cos(atan(x)) -> 1 / sqrt(x*x + 1). */
> > + (for coss (COS)
> > +  atans (ATAN)
> > +  sqrts (SQRT)
> > +  (simplify
> > +   (coss (atans:s @0))
> > +   (rdiv {build_one_cst (type);}
> > +   (sqrts (plus (mult @0 @0) {build_one_cst (type);})
> Don't we have the same kinds of issues with the x*x in here?  As X gets
> large it will overflow, but the result is going to be approaching zero.
>  So we might be able to use a similar trick here.
>
>
> > +
> > +/*  Build real constant used by sin(atan(x)) optimization.
> > +The logic here is as follows: It can be mathematically
> > +shown that sin(atan(x)) = x / sqrt(1 + x*x), but notice
> > +that the second formula has an x*x, which can overflow
> > +if x is big enough. However, x / sqrt(1 + x*x) converges
> > +to 1 for large x. What must be the value of x such that
> > +when computing x / sqrt (1 + x*x) = 1?
> > +
> > +Therefore, we must then solve x / sqrt(1 + x*x) > eps
> > +for x, where eps is the largest number smaller than 1
> > +representable by the target. Hence, solving for eps
> > +yields that x > eps / sqrt(1 - eps*eps). This eps can
> > +be easily calculated by calling nextafter. Likewise for
> > +the negative x.  */
> Imagine a pause here while I lookup isolation 

Re: [PATCH v3] Change default to -fno-math-errno

2018-10-04 Thread Richard Biener
On Thu, Oct 4, 2018 at 1:10 AM Jeff Law  wrote:
>
> On 9/4/18 9:27 AM, Wilco Dijkstra wrote:
> >
> > ping
> >
> >
> >
> >
> > From: Wilco Dijkstra
> > Sent: 18 June 2018 15:01
> > To: GCC Patches
> > Cc: nd; Joseph Myers
> > Subject: [PATCH v3] Change default to -fno-math-errno
> >
> >
> > GCC currently defaults to -fmath-errno.  This generates code assuming math
> > functions set errno and the application checks errno.  Few applications
> > test errno and various systems and math libraries no longer set errno since 
> > it
> > is optional.  GCC generates much faster code for simple math functions with
> > -fno-math-errno such as sqrt and lround (avoiding a call and PLT 
> > redirection).
> > Therefore it is reasonable to change the default to -fno-math-errno.  This 
> > is
> > already the case for non-C languages.  Only change the default for C99 and
> > later.
> >
> > long f(float x) { return lroundf(x) + 1; }
> >
> > by default:
> >
> > f:
> > str x30, [sp, -16]!
> > bl  lroundf
> > add x0, x0, 1
> > ldr x30, [sp], 16
> > ret
> >
> > With -fno-math-errno:
> >
> > f:
> > fcvtas  x0, s0
> > add x0, x0, 1
> > ret
> >
> > Passes regress on AArch64. OK for commit?
> >
> > ChangeLog:
> > 2018-06-18  Wilco Dijkstra  
> >
> > * common.opt (fmath-errno): Change default to 0.
> > * opts.c (set_fast_math_flags): Force -fno-math-errno with 
> > -ffast-math.
> > * c-family/c-opts.c (c_common_init_options_struct): Set 
> > flag_errno_math
> > to special value.
> > (c_common_post_options): Set flag_errno_math default based on 
> > language.
> >
> > doc/
> > * invoke.texi (-fmath-errno) Update documentation.
> >
> > testsuite/
> >
> > * gcc.dg/errno-1.c: Add -fmath-errno.
> > * gcc.dg/torture/pr68264.c: Likewise.
> > * gcc.dg/tree-ssa/ssa-dse-15.c: Likewise.
> > * gcc.target/aarch64/no-inline-lrint_1.c: Likewise.
> So I went back and reviewed all the discussion around this.  I'm still
> having trouble getting comfortable with flipping the default -- unless
> we know ahead of time that the target runtime doesn't set errno on any
> of the math routines.  That implies a target hook to describe the
> runtime's handling off errno in math functions.  It also introduces
> target dependencies early in the GIMPLE pipeline which is generally
> counter to design goals around GIMPLE.
>
> Essentially if we flip the default, we run the risk that user code which
> does check this stuff will silently break.  That's not a good position
> to take IMHO.
>
> One could also argue that if we're using -fno-math-errno and we see
> errno read after a math call without it being set/clobbered by some
> other non-math call then we should be warning the user.  That'd be some
> kind of propagation engine in gimple.  That would make flipping the
> default more reasonable -- we'd have a reasonable chance of warning the
> user that their code is potentially buggy and may need adjustment.
>
> That led me to wonder if we could prove that for the majority of FP
> intensive codes that even if the library set errno that nobody could
> possibly be reading it, then we could in large part treat those builtin
> calls as -fno-math-errno (we'd mark them somehow and check the mark at
> GIMPLE->RTL expansion time (and possibly other points) to allow us to
> generate machine instructions rather than library calls).  That would
> get most of the benefit without the possibility to breaking user code.

One major issue here is that there's no way for GCC to know what
access constitutes an access to errno.  We already have "hacks" in place
that even though things like pow() techincally clobber memory we only
consider (with TBAA) clobbers of 'int' typed memory.

glibc currently calls __errno_location to get at a pointer to errno, so
theoretically we could add a __attribute__((returns_errno_location))
and a __attribute__((errno)) for implementations that have a global
(TLS?) variable __errno.  Then points-to analysis could track this,
but that also means this tracking would be weak.

The other issue is that we're treating -fno-math-errno as disabling
errno handling in general (not only for math functions).  That would
have to be fixed when we switch the default (or even now...  though
nobody uses errno set from malloc & friends which is where we abuse
the flag).

Richard.

> I'm willing to go with Joseph's recommendation here, but I'm not really
> comfortable acking the patch as-is on my own.
>
> jeff


Re: [PATCH] Use get_create for fn_summary (PR ipa/87491).

2018-10-04 Thread Martin Liška
On 10/3/18 6:56 PM, Jeff Law wrote:
> On 10/3/18 3:57 AM, Martin Liška wrote:
>> On 10/3/18 11:07 AM, Jan Hubicka wrote:
 Hi.

 There's one another ICE when calling fn_summary->get for a node
 that's not present. Back-trace is:

 (gdb) bt
 #0  inline_to_all_callers (node=>>> "*.LTHUNK0"/2>, data=0x7fffd690) at 
 /home/marxin/Programming/gcc/gcc/ipa-inline.c:2260
 #1  0x00b0b2c1 in cgraph_node::call_for_symbol_and_aliases 
 (include_overwritable=, data=, 
 callback=, this=) at 
 /home/marxin/Programming/gcc/gcc/cgraph.h:3221
 #2  cgraph_node::call_for_symbol_and_aliases_1 
 (this=this@entry=, 
 callback=callback@entry=0x1737440 >>> void*)>, data=data@entry=0x7fffd690, 
 include_overwritable=include_overwritable@entry=true) at 
 /home/marxin/Programming/gcc/gcc/cgraph.c:3745
 #3  0x01739422 in cgraph_node::call_for_symbol_and_aliases 
 (include_overwritable=true, data=0x7fffd690, callback=0x1737440 
 , this=>>> 0x7698f438 "a"/1>)
 at /home/marxin/Programming/gcc/gcc/cgraph.h:3225
 #4  ipa_inline () at /home/marxin/Programming/gcc/gcc/ipa-inline.c:2581

 with -fdump-ipa-all -fno-inline-small-functions  -O3.
 The problematic symbol is:

 (gdb) p node
 $1 = 
 (gdb) p node->debug()
 *.LTHUNK0/2 (int B::*.LTHUNK0(int, int)) @0x7698f5a0
   Type: function definition analyzed alias cpp_implicit_alias
   Visibility: prevailing_def_ironly artificial
   References: _ZN1B1aEii/1 (alias)
   Referring: 
   Availability: available
   First run: 0
   Function flags:
   Called by: virtual int B::_ZTv0_n24_N1B1aEii(int, int)/3 (can throw 
 external) 
   Calls: 

 Hope it's fine to use get_create and let the symbol be inlined?
 Patch survives tests on x86_64-linux-gnu.

 Ready for trunk?
 Martin

 gcc/ChangeLog:

 2018-10-03  Martin Liska  

PR ipa/87491
* ipa-inline.c (inline_to_all_callers_1): Use ::get_create
at place where we can have a function that's not
in summary.
>>> Creating empty (and thus bogus) summary is not a giid idea. 
>>> We want to print summary of the ultimate alias target of node.
>> I see, fixed in attached patch.
>> Is it ok now?
>>
>> Martin
>>
>>> Honza
 ---
  gcc/ipa-inline.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


 diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
 index 025788522fb..eea2888011c 100644
 --- a/gcc/ipa-inline.c
 +++ b/gcc/ipa-inline.c
 @@ -,7 +,7 @@ inline_to_all_callers_1 (struct cgraph_node *node, 
 void *data,
  fprintf (dump_file,
   "\nInlining %s size %i.\n",
   node->name (),
 - ipa_fn_summaries->get (node)->size);
 + ipa_fn_summaries->get_create (node)->size);
  fprintf (dump_file,
   " Called once from %s %i insns.\n",
   node->callers->caller->name (),

>>
>> 0001-Call-ultimate_alias_target-for-node-being-inlined-PR.patch
>>
>> From a68cd435af85bf61811c6947bc5c97d65e2af2a3 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Wed, 3 Oct 2018 10:14:14 +0200
>> Subject: [PATCH] Call ultimate_alias_target for node being inlined (PR
>>  ipa/87491).
>>
>> gcc/ChangeLog:
>>
>> 2018-10-03  Martin Liska  
>>
>>  PR ipa/87491
>>  * ipa-inline.c (inline_to_all_callers_1):
>>  Call ultimate_alias_target for node being inlined.
> OK.  FWIW, I would be suspicious of a get_create call in dumping code --
> couldn't that result in creating a node in the graph only when dumping
> was enabled?  If that interacted badly with something else it'd be a
> pain to find.

Yes, that's why my second patch uses ultimate alias target.
Patch installed, thanks for review.

Martin

> 
> Jeff
> 



Re: [PATCH] Provide extension hint for aarch64 target (PR driver/83193).

2018-10-04 Thread Martin Liška
PING^4

On 9/19/18 8:02 PM, Martin Liška wrote:
> PING^3
> 
> On 8/23/18 1:00 PM, Martin Liška wrote:
>> PING^2
>>
>> On 08/01/2018 03:56 PM, Martin Liška wrote:
>>> PING^1
>>>
>>> On 07/18/2018 05:49 PM, Martin Liška wrote:
 Hi.

 This patch improves aarch64 feature modifier hints.

 May I please ask ARM folks to test the patch?
 Thanks,
 Martin

 gcc/ChangeLog:

 2018-07-18  Martin Liska  

  PR driver/83193
 * common/config/aarch64/aarch64-common.c (aarch64_parse_extension):
  Set invalid_extension when there's any.
 (aarch64_get_all_extension_candidates): New function.
 (aarch64_rewrite_selected_cpu): Pass NULL as new argument.
 * config/aarch64/aarch64-protos.h 
 (aarch64_get_all_extension_candidates):
  Declare new function.
 * config/aarch64/aarch64.c (aarch64_parse_arch): Record
  invalid_feature.
 (aarch64_parse_cpu): Likewise.
 (aarch64_print_hint_for_feature_modifier): New.
 (aarch64_validate_mcpu): Record invalid feature modifier
  and print hint for it.
 (aarch64_validate_march): Likewise.
 (aarch64_handle_attr_arch): Likewise.
 (aarch64_handle_attr_cpu): Likewise.
 (aarch64_handle_attr_isa_flags): Likewise.

 gcc/testsuite/ChangeLog:

 2018-07-18  Martin Liska  

  PR driver/83193
 * gcc.target/aarch64/spellcheck_7.c: New test.
 * gcc.target/aarch64/spellcheck_8.c: New test.
 ---
   gcc/common/config/aarch64/aarch64-common.c    | 20 +-
   gcc/config/aarch64/aarch64-protos.h   |  4 +-
   gcc/config/aarch64/aarch64.c  | 67 +++
   .../gcc.target/aarch64/spellcheck_7.c | 11 +++
   .../gcc.target/aarch64/spellcheck_8.c | 12 
   5 files changed, 97 insertions(+), 17 deletions(-)
   create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_7.c
   create mode 100644 gcc/testsuite/gcc.target/aarch64/spellcheck_8.c


>>>
>>
> 



Re: Fold more boolean expressions

2018-10-04 Thread Richard Biener
On Tue, Oct 2, 2018 at 3:38 PM MCC CS  wrote:
>
> Thanks a lot!

Well, thank you for the patch (and the patience...)!

Richard.


Re: [PATCH c++/pr87295] -fdebug-types-section ICE

2018-10-04 Thread Richard Biener
On Tue, Oct 2, 2018 at 4:18 PM Nathan Sidwell  wrote:
>
> Ping?

I'm not sure this is the right kind of fix for sth that only triggers
with fat objects.

My suggestion is to watch the difference to the DIE tree(s) (including
the comdat ones) before early-LTO debug write-out (we already have
a dumpfile for that, but it doesn't dump the comdat units, see
dwarf2out_early_finish) and the state before we finally output
for the fat part (just before we call output_comp_unit).  I believe
we are not properly "undoing" the effect of optimize_external_refs ().

That is, I see

#4  0x00dcfdd1 in optimize_external_refs_1 (
die=>, map=0x31c1c50) at
/tmp/trunk/gcc/dwarf2out.c:8904
8904  ref_p = lookup_external_ref (map, c);
(gdb) p debug_dwarf_die (c)
DIE0: DW_TAG_structure_type (0x768a9a50)
  abbrev id: 0 offset: 0 mark: 1
  DW_AT_name: "integral_constant"
  DW_AT_signature: die -> signature: 2e365acce74a6a1f (0x768a92d0)
  DW_AT_declaration: 1
  DW_AT_sibling: die -> 0 (0x768a9aa0)
DIE0: DW_TAG_typedef (0x768a9a00)
  abbrev id: 0 offset: 0 mark: 1
  DW_AT_name: "value_type"
  DW_AT_decl_file: "t.ii" (1)
  DW_AT_decl_line: 5
  DW_AT_decl_column: 15
  DW_AT_type: die -> 0 (0x768a9aa0)

$16 = void
(gdb) p debug_dwarf_die (die)
DIE0: DW_TAG_structure_type (0x768a9820)
  abbrev id: 0 offset: 0 mark: 1
  signature: 2e365acce74a6a1f
  DW_AT_name: "integral_constant"
  DW_AT_signature: die -> 0 (0x768a9a50)

that looks bougs (a DIE with a signature which has a signature itself).

Richard.

> On 9/17/18 7:30 AM, Nathan Sidwell wrote:
> > Richard,
> > this patch makes the ICE go away, but I really don't know if it's
> > correct.  When cloning the type die I copy die_id, so it is found during
> > the (currently ICEing) hash lookup.  In this particular testcase we
> > clone the die twice.  Once from break_out_comdat_types and once from
> > copy_decls_for_unworthy_types. That seems a little strange (but I have
> > only the smallest understanding of debug data) As these tracebacks show:
> >
> > Breakpoint 5, clone_as_declaration (
> >  die= > >)
> >  at ../../../src/gcc/dwarf2out.c:8179
> > 8179  clone->die_id = die->die_id;
> > (gdb) back
> > #0  clone_as_declaration (die= > DW_TAG_structure_type >)
> >  at ../../../src/gcc/dwarf2out.c:8179
> > #1  0x00e34faf in generate_skeleton_ancestor_tree
> > (node=0x7fffe230) at ../../../src/gcc/dwarf2out.c:8345
> > #2  0x00e351c3 in generate_skeleton_bottom_up
> > (parent=0x7fffe230) at ../../../src/gcc/dwarf2out.c:8419
> > #3  0x00e35291 in generate_skeleton (
> >  die= > >)
> >  at ../../../src/gcc/dwarf2out.c:8449
> > #4  0x00e352ce in remove_child_or_replace_with_skeleton
> > (unit=,
> >  child= > >,
> >  prev= > >)
> >  at ../../../src/gcc/dwarf2out.c:8471
> > #5  0x00e3570a in break_out_comdat_types (die= > 0x773260a0 DW_TAG_compile_unit>)
> >  at ../../../src/gcc/dwarf2out.c:8636
> > #6  0x00e75153 in dwarf2out_early_finish (filename=0x34de9d0
> > "bug.ii") at ../../../src/gcc/dwarf2out.c:32034
> > #7  0x00daa210 in symbol_table::finalize_compilation_unit
> > (this=0x7730d100) at ../../../src/gcc/cgraphunit.c:2783
> > #8  0x01417ec2 in compile_file () at ../../../src/gcc/toplev.c:480
> > #9  0x0141a90c in do_compile () at ../../../src/gcc/toplev.c:2170
> > #10 0x0141abf8 in toplev::main (this=0x7fffe56e, argc=8,
> > argv=0x7fffe668) at ../../../src/gcc/toplev.c:2305
> > #11 0x0227731e in main (argc=8, argv=0x7fffe668) at
> > ../../../src/gcc/main.c:39
> > (gdb) p clone
> > $2 = 
> > (gdb) c
> > Continuing.
> >
> > Breakpoint 5, clone_as_declaration (die= > DW_TAG_structure_type >)
> >  at ../../../src/gcc/dwarf2out.c:8179
> > 8179  clone->die_id = die->die_id;
> > (gdb) p clone
> > $3 = 
> > (gdb) back
> > #0  clone_as_declaration (die= > DW_TAG_structure_type >)
> >  at ../../../src/gcc/dwarf2out.c:8179
> > #1  0x00e34dac in copy_ancestor_tree (unit= > 0x773260a0 DW_TAG_compile_unit>,
> >  die= > >, decl_table=0x7fffe2f0)
> >  at ../../../src/gcc/dwarf2out.c:8267
> > #2  0x00e35adb in copy_decls_walk (unit= > 0x773260a0 DW_TAG_compile_unit>,
> >  die= > >, decl_table=0x7fffe2f0)
> >  at ../../../src/gcc/dwarf2out.c:8764
> > #3  0x00e35ba6 in copy_decls_walk (unit= > 0x773260a0 DW_TAG_compile_unit>,
> >  die= > >, decl_table=0x7fffe2f0)
> >  at ../../../src/gcc/dwarf2out.c:8790
> > #4  0x00e35ba6 in copy_decls_walk (unit= > 0x773260a0 DW_TAG_compile_unit>,
> >  die=,
> > decl_table=0x7fffe2f0) at ../../../src/gcc/dwarf2out.c:8790
> > #5  0x00e35c09 in copy_decls_for_unworthy_types
> > (unit=)
> >  at ../../../src/gcc/dwarf2out.c:8804
> > #6  0x00e7519a in dwarf2out_early_finish (filename=0x34de9d0
> > "bug.ii") at