Re: [PATCH v3 1/6, Committed] [MIPS] Split Loongson (MMI) from loongson3a

2018-11-07 Thread Paul Hua
On Wed, Nov 7, 2018 at 5:12 PM Paul Hua  wrote:
>
> Hi, Matthew:
>
> I committed the patch. Thanks for your review.
>

After committing this patch, some tests fail when GCC is configured with
--with-arch=mips64r2 (I had only tested with --with-arch=loongson3a).

  664 FAIL: gcc.target/mips/insn-casesi.c   -O0  (test for excess errors)
  665 FAIL: gcc.target/mips/insn-casesi.c   -O1  (test for excess errors)
  666 FAIL: gcc.target/mips/insn-casesi.c   -O2  (test for excess errors)
  667 FAIL: gcc.target/mips/insn-casesi.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
  668 FAIL: gcc.target/mips/insn-casesi.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
  669 FAIL: gcc.target/mips/insn-casesi.c   -O3 -g  (test for excess errors)
  670 FAIL: gcc.target/mips/insn-casesi.c   -Os  (test for excess errors)

The error message is "/usr/bin/as: unrecognized option '-mno-loongson-mmi'".

These errors come from the following option dependencies:
>   mips_option_dependency options "-mips16" "-mno-loongson-mmi"
>   mips_option_dependency options "-mmicromips" "-mno-loongson-mmi"
>   mips_option_dependency options "-msoft-float" "-mno-loongson-mmi"
>   mips_option_dependency options "-mmicromips" "-mno-loongson-ext"

We should add these dependencies only when GCC is configured with
--with-arch=loongson3a/gs464/gs464e/gs264e.
I committed the attached patch as obvious.

Paul Hua
From 11a0bec83b3a0f2765d35b6aa84263016836f86e Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Thu, 8 Nov 2018 15:01:35 +0800
Subject: [PATCH] Add mips option dependency only config with loongson target.

gcc/testsuite/
	* gcc.target/mips/mips.exp (mips-dg-options):
	Add mips_option_dependency msoft-float vs no-mmi and
	mips16/micromips vs no-mmi/ext/ext2 only gcc
	config with Loongson target.
---
 gcc/testsuite/gcc.target/mips/mips.exp | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index e70d416d0dd..002cc280e30 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -1054,10 +1054,19 @@ proc mips-dg-options { args } {
 mips_option_dependency options "-mno-plt" "addressing=unknown"
 mips_option_dependency options "-mabicalls" "-G0"
 mips_option_dependency options "-mno-gpopt" "-mexplicit-relocs"
-mips_option_dependency options "-mips16" "-mno-loongson-mmi"
-mips_option_dependency options "-mmicromips" "-mno-loongson-mmi"
-mips_option_dependency options "-msoft-float" "-mno-loongson-mmi"
-mips_option_dependency options "-mmicromips" "-mno-loongson-ext"
+
+if { [check_configured_with "with-arch=loongson3a"] 
+	 || [check_configured_with "with-arch=gs464"]
+	 || [check_configured_with "with-arch=gs464e"]
+	 || [check_configured_with "with-arch=gs264e"] } {
+	mips_option_dependency options "-msoft-float" "-mno-loongson-mmi"
+	mips_option_dependency options "-mips16" "-mno-loongson-mmi"
+	mips_option_dependency options "-mips16" "-mno-loongson-ext"
+	mips_option_dependency options "-mips16" "-mno-loongson-ext2"
+	mips_option_dependency options "-mmicromips" "-mno-loongson-mmi"
+	mips_option_dependency options "-mmicromips" "-mno-loongson-ext"
+	mips_option_dependency options "-mmicromips" "-mno-loongson-ext2"
+}
 
 # Work out information about the current ABI.
 set abi_test_option_p [mips_test_option_p options abi]
-- 
2.18.0



Re: [PATCH] Fix PR87906

2018-11-07 Thread Richard Biener
On November 7, 2018 7:47:43 PM GMT+01:00, Rainer Orth 
 wrote:
>Hi Richard,
>
>> This adds a workaround for LTO decl merging prevailing a
>> non-ultimate origin decl, breaking invariants of the middle-end.
>> In the future (GCC 10) I hope to have DIE references here so
>> this will not be an issue there anymore.
>>
>> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>>
>> Richard.
>>
>> From ff035da8314ea8e0889b99bb338e67dd5dae455b Mon Sep 17 00:00:00
>2001
>> From: Richard Guenther 
>> Date: Wed, 7 Nov 2018 08:56:52 +0100
>> Subject: [PATCH] fix-pr87906
>>
>> 2018-11-07  Richard Biener  
>>
>>  PR lto/87906
>>  * tree-streamer-in.c (lto_input_ts_block_tree_pointers): Fixup
>>  BLOCK_ABSTRACT_ORIGIN to be the ultimate origin.
>>
>>  * g++.dg/lto/pr87906_0.C: New testcase.
>>  * g++.dg/lto/pr87906_1.C: Likewise.
>>
>> diff --git a/gcc/testsuite/g++.dg/lto/pr87906_0.C
>b/gcc/testsuite/g++.dg/lto/pr87906_0.C
>> new file mode 100644
>> index 000..08e7ed3ba07
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/lto/pr87906_0.C
>> @@ -0,0 +1,35 @@
>> +// { dg-lto-do link }
>> +// { dg-lto-options { { -O -fPIC -flto } } }
>> +// { dg-extra-ld-options "-shared -nostdlib" }
>> +
>> +namespace com {
>> +namespace sun {
>> +namespace star {}
>> +} // namespace sun
>> +} // namespace com
>> +namespace a = com::sun::star;
>> +namespace com {
>> +namespace sun {
>> +namespace star {
>> +namespace uno {
>
>the new testcase FAILs on Solaris:
>
>+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_0.o assemble,  -O -fPIC -flto 
>+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o
>execute  -O -fPIC -flto 
>+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o
>link  -O -fPIC -flto 
>+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_1.o assemble,  -O -fPIC -flto 
>
>/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11:
>error: expected identifier before numeric constant
>/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11:
>error: expected unqualified-id before numeric constant
>
>and several more due to the -Dsun default.  How about
>sed -e 's/sun/moon/g' instead ;-)

Argh. Works for me. 

Richard. 

>   Rainer
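
For readers unfamiliar with this failure mode: Solaris compilers predefine
the macro sun (the "-Dsun default" Rainer mentions), so "namespace sun" is
preprocessed into "namespace 1".  Below is a minimal C sketch of the same
effect.  The #define only simulates the Solaris predefine, and the
identifier names are invented for illustration.

```c
/* Simulate the Solaris target predefine.  On a real Solaris compiler
   the macro is already defined, without any #define in the source.  */
#ifndef sun
# define sun 1
#endif

/* With the macro active, `int sun;` would expand to `int 1;` and fail
   to parse, so this declaration is disabled.  */
#if 0
int sun;
#endif

/* Fix 1 (what the thread suggests): use a name that is not a macro.  */
int moon = 42;

/* Fix 2: drop the macro before using the identifier.  */
#undef sun
int sun = 7;

int
sum_of_bodies (void)
{
  return moon + sun;
}
```

Renaming is the more robust choice for a testcase, since it does not depend
on which macros a given target happens to predefine.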



Re: [patches] Re: [PATCH] Update soft-fp from glibc.

2018-11-07 Thread Kito Cheng
Hi Joseph:

I don't have commit rights; could you please commit it for me? Thanks :)

On Thu, Nov 8, 2018 at 1:14 AM Joseph Myers  wrote:
>
> This patch is OK.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com


[doc, committed] clarify -fno-common behavior

2018-11-07 Thread Sandra Loosemore
I've checked in this patch to fix a minor but long-standing bug in the 
description of -fno-common, PR 42726.


-Sandra
2018-11-07  Sandra Loosemore  

	PR middle-end/42726

	gcc/
	* doc/invoke.texi (Code Gen Options): Clarify -fno-common behavior.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 265904)
+++ gcc/doc/invoke.texi	(working copy)
@@ -13294,7 +13294,7 @@ C, and on some targets may carry a speed
 variable references.
 
 The @option{-fno-common} option specifies that the compiler should instead
-place uninitialized global variables in the data section of the object file.
+place uninitialized global variables in the BSS section of the object file.
 This inhibits the merging of tentative definitions by the linker so
 you get a multiple-definition error if the same 
 variable is defined in more than one compilation unit.
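
A small sketch of what "tentative definition" means in the paragraph above,
with made-up names.  Within one translation unit C allows repeated tentative
definitions of the same object.  Across translation units the outcome
depends on -fcommon versus -fno-common: with -fno-common, a second file that
also defines shared_counter would produce a multiple-definition link error.

```c
/* Two tentative definitions of one object: valid C within a single
   translation unit; both name the same zero-initialized variable.  */
int shared_counter;
int shared_counter;

/* If another .c file also contained `int shared_counter;`, linking
   with -fno-common would report a multiple definition.  Declaring it
   there as `extern int shared_counter;` avoids that.  */
int
bump_counter (void)
{
  return ++shared_counter;
}
```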


[doc, committed] remove leading dash from @opindex entries

2018-11-07 Thread Sandra Loosemore

I noticed this buglet when working on the PR80828 fix:

The introductory paragraph to the Option Index appendix says: "GCC’s 
command line options are indexed here without any initial ‘-’ or ‘--’." 
Indeed, that was mostly true, but there were ~20 index entries that 
incorrectly included the leading dash.  Fixed thusly.


-Sandra
2018-11-07  Sandra Loosemore  

	gcc/
	* doc/invoke.texi: Remove leading dash from @opindex entries
	throughout the file.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 265903)
+++ gcc/doc/invoke.texi	(working copy)
@@ -2934,7 +2934,7 @@ union U @{
 
 @item -Wabi-tag @r{(C++ and Objective-C++ only)}
 @opindex Wabi-tag
-@opindex -Wabi-tag
+@opindex Wabi-tag
 Warn when a type with an ABI tag is used in a context that does not
 have that ABI tag.  See @ref{C++ Attributes} for more information
 about ABI tags.
@@ -3845,7 +3845,7 @@ a left margin is printed, showing line n
 left margin.
 
 @item -fdiagnostics-minimum-margin-width=@var{width}
-@opindex -fdiagnostics-minimum-margin-width
+@opindex fdiagnostics-minimum-margin-width
 This option controls the minimum width of the left margin printed by
 @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
 
@@ -5734,8 +5734,8 @@ larger.
 This option warns on all uses of @code{alloca} in the source.
 
 @item -Walloca-larger-than=@var{byte-size}
-@opindex -Walloca-larger-than=
-@opindex -Wno-alloca-larger-than
+@opindex Walloca-larger-than=
+@opindex Wno-alloca-larger-than
 This option warns on calls to @code{alloca} with an integer argument whose
 value is either zero, or that is not bounded by a controlling predicate
 that limits its value to at most @var{byte-size}.  It also warns for calls
@@ -6661,8 +6661,8 @@ real to lower precision real values.  Th
 @option{-Wconversion}.
 
 @item -Wno-scalar-storage-order
-@opindex -Wno-scalar-storage-order
-@opindex -Wscalar-storage-order
+@opindex Wno-scalar-storage-order
+@opindex Wscalar-storage-order
 Do not warn on suspicious constructs involving reverse scalar storage order.
 
 @item -Wsized-deallocation @r{(C++ and Objective-C++ only)}
@@ -7263,8 +7263,8 @@ Warn if a variable-length array is used
 the variable-length array.
 
 @item -Wvla-larger-than=@var{byte-size}
-@opindex -Wvla-larger-than=
-@opindex -Wno-vla-larger-than
+@opindex Wvla-larger-than=
+@opindex Wno-vla-larger-than
 If this option is used, the compiler will warn for declarations of
 variable-length arrays whose size is either unbounded, or bounded
 by an argument that allows the array size to exceed @var{byte-size}
@@ -8942,13 +8942,13 @@ it may significantly increase code size
 This flag is enabled by default at @option{-O3}.
 
 @item -fipa-bit-cp
-@opindex -fipa-bit-cp
+@opindex fipa-bit-cp
 When enabled, perform interprocedural bitwise constant
 propagation. This flag is enabled by default at @option{-O2}. It
 requires that @option{-fipa-cp} is enabled.
 
 @item -fipa-vrp
-@opindex -fipa-vrp
+@opindex fipa-vrp
 When enabled, perform interprocedural propagation of value
 ranges. This flag is enabled by default at @option{-O2}. It requires
 that @option{-fipa-cp} is enabled.
@@ -12559,7 +12559,7 @@ object file names should not be used as
 Options}.
 
 @item -flinker-output=@var{type}
-@opindex -flinker-output
+@opindex flinker-output
 This option controls the code generation of the link time optimizer.  By
 default the linker output is determined by the linker plugin automatically. For
 debugging the compiler and in the case of incremental linking to non-lto object
@@ -15105,8 +15105,8 @@ single precision and to 32 bits for doub
 
 @item -mlow-precision-sqrt
 @itemx -mno-low-precision-sqrt
-@opindex -mlow-precision-sqrt
-@opindex -mno-low-precision-sqrt
+@opindex mlow-precision-sqrt
+@opindex mno-low-precision-sqrt
 Enable or disable the square root approximation.
 This option only has an effect if @option{-ffast-math} or
 @option{-funsafe-math-optimizations} is used as well.  Enabling this reduces
@@ -15116,8 +15116,8 @@ If enabled, it implies @option{-mlow-pre
 
 @item -mlow-precision-div
 @itemx -mno-low-precision-div
-@opindex -mlow-precision-div
-@opindex -mno-low-precision-div
+@opindex mlow-precision-div
+@opindex mno-low-precision-div
 Enable or disable the division approximation.
 This option only has an effect if @option{-ffast-math} or
 @option{-funsafe-math-optimizations} is used as well.  Enabling this reduces
@@ -18109,11 +18109,11 @@ Specify the C-SKY target processor.  Val
 @item -mbig-endian
 @opindex mbig-endian
 @itemx -EB
-@opindex -EB
+@opindex EB
 @itemx -mlittle-endian
 @opindex mlittle-endian
 @itemx -EL
-@opindex -EL
+@opindex EL
 
 Select big- or little-endian code.  The default is little-endian.
 
@@ -27950,7 +27950,7 @@ preferred alignment to @option{-mpreferr
 @opindex mvaes
 @need 200
 @itemx -mwaitpkg
-@opindex -mwaitpkg
+@opindex mwaitpkg
 @need 200
 @itemx -mvpclmulqdq
 @opindex mvpclmulqdq
@@ -28536,7 

Re: [RFC][PATCH LRA] WIP patch to fix one part of PR87507

2018-11-07 Thread Peter Bergner
On 11/7/18 11:36 AM, Jeff Law wrote:
> OK with this change.

Before I commit, how about I add the following test cases to test
both valid and invalid asm constraints?  I think I have the reg
numbers for the other architectures defined correctly. 

Peter


gcc/testsuite/
PR rtl-optimization/87600
* gcc.dg/pr87600.h: New.
* gcc.dg/pr87600-1.c: Likewise.
* gcc.dg/pr87600-2.c: Likewise.

Index: gcc/testsuite/gcc.dg/pr87600.h
===
--- gcc/testsuite/gcc.dg/pr87600.h  (nonexistent)
+++ gcc/testsuite/gcc.dg/pr87600.h  (working copy)
@@ -0,0 +1,19 @@
+#if defined (__aarch64__)
+# define REG1 "x0"
+# define REG2 "x1"
+#elif defined (__arm__)
+# define REG1 "r0"
+# define REG2 "r1"
+#elif defined (__i386__)
+# define REG1 "%eax"
+# define REG2 "%edx"
+#elif defined (__powerpc__)
+# define REG1 "r3"
+# define REG2 "r4"
+#elif defined (__s390__)
+# define REG1 "0"
+# define REG2 "1"
+#elif defined (__x86_64__)
+# define REG1 "rax"
+# define REG2 "rdx"
+#endif
Index: gcc/testsuite/gcc.dg/pr87600-1.c
===
--- gcc/testsuite/gcc.dg/pr87600-1.c(nonexistent)
+++ gcc/testsuite/gcc.dg/pr87600-1.c(working copy)
@@ -0,0 +1,52 @@
+/* PR rtl-optimization/87600  */
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2" } */
+
+#include "pr87600.h"
+
+/* The following are all valid uses of local register variables.  */
+
+long
+test0 (long arg)
+{
+  register long var asm (REG1);
+  asm ("blah %0 %1" : "+&r" (var) : "r" (arg));
+  return var;
+}
+
+long
+test1 (long arg0, long arg1)
+{
+  register long var asm (REG1);
+  asm ("blah %0, %1, %2" : "=&r" (var) : "r" (arg0), "0" (arg1));
+  return var + arg1;
+}
+
+long
+test2 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG1);
+  asm ("blah %0 %1" : "=&r" (var1) : "0" (var2));
+  return var1;
+}
+
+long
+test3 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG2);
+  long var3;
+  asm ("blah %0 %1" : "=&r" (var1), "=r" (var3) : "1" (var2));
+  return var1 + var3;
+}
+
+long
+test4 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG2);
+  register long var3 asm (REG2);
+  asm ("blah %0 %1" : "=&r" (var1), "=r" (var2) : "1" (var3));
+  return var1;
+}
Index: gcc/testsuite/gcc.dg/pr87600-2.c
===
--- gcc/testsuite/gcc.dg/pr87600-2.c(nonexistent)
+++ gcc/testsuite/gcc.dg/pr87600-2.c(working copy)
@@ -0,0 +1,44 @@
+/* PR rtl-optimization/87600  */
+/* { dg-do compile { target aarch64*-*-* arm*-*-* i?86-*-* powerpc*-*-* s390*-*-* x86_64-*-* } } */
+/* { dg-options "-O2" } */
+
+#include "pr87600.h"
+
+/* The following are all invalid uses of local register variables.  */
+
+long
+test0 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG1);
+  asm ("blah %0 %1" : "=r" (var1), "=r" (var2)); /* { dg-error "invalid hard register usage between output operands" } */
+  return var1;
+}
+
+long
+test1 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG2);
+  asm ("blah %0 %1" : "=r" (var1) : "0" (var2)); /* { dg-error "invalid hard register usage between output operand and matching constraint operand" } */
+  return var1;
+}
+
+long
+test2 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG1);
+  asm ("blah %0 %1" : "=&r" (var1) : "r" (var2)); /* { dg-error "invalid hard register usage between earlyclobber operand and input operand" } */
+  return var1;
+}
+
+long
+test3 (void)
+{
+  register long var1 asm (REG1);
+  register long var2 asm (REG1);
+  long var3;
+  asm ("blah %0 %1" : "=&r" (var1), "=r" (var3) : "1" (var2)); /* { dg-error "invalid hard register usage between earlyclobber operand and input operand" } */
+  return var1 + var3;
+}



Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-11-07 Thread Segher Boessenkool
On Wed, Nov 07, 2018 at 10:34:30PM +, Wilco Dijkstra wrote:
> Hi Jeff,
> 
> > So if we're going from 0->2 ULPs in some cases, do we want to guard it
> > with one of the various options, if so, which?  Giuliano's follow-up
> > will still have the potential for 2ULPs.
> 
> The ULP difference is not important since the individual math functions 
> already have ULP of 3 or higher. Changing ULP error for some or all inputs
> (like we did with the rewritten math functions) is not considered an issue as
> long as worst-case ULP error doesn't increase.

But the max. error in sinh/cosh/atanh is less than 2 ULP, with some math
libraries.  It could be < 1 ULP, in theory, so sinh(atanh(x)) less than
2 ULP even.

> The question is more like whether errno and trapping/exception behaviour
> is identical - I guess it is not so I would expect this to be fastmath only.
> Which particular flag one uses is a detail given there isn't a clear 
> definition
> for most of them.

And signed zeroes.  Yeah.  I think it would have to be
flag_unsafe_math_optimizations + some more.


Segher
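
For reference, a rewrite of this kind rests on the standard hyperbolic
identities below, assuming (as the body of the thread suggests) that the
rules target sinh(atanh(x)) and cosh(atanh(x)) for |x| < 1.  This is only
the exact-arithmetic derivation.  It says nothing about rounding error,
errno or signed zeros, which are the concerns raised above.

```latex
\begin{align*}
y &= \operatorname{atanh}(x), \quad |x| < 1
  &&\Longrightarrow\quad \tanh y = x \\
\cosh^2 y - \sinh^2 y &= 1
  &&\Longrightarrow\quad 1 - \tanh^2 y = \frac{1}{\cosh^2 y} \\
\cosh(\operatorname{atanh}(x)) &= \frac{1}{\sqrt{1 - x^2}}
  && (\cosh y > 0) \\
\sinh(\operatorname{atanh}(x)) &= \tanh y \,\cosh y = \frac{x}{\sqrt{1 - x^2}}
\end{align*}
```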


[doc, committed] document -e and --entry

2018-11-07 Thread Sandra Loosemore

Trying to knock off some easy documentation bugs from bugzilla...

I've checked in this patch to fix PR driver/80828, about missing 
documentation for these options that are passed through to the linker.


-Sandra

2018-11-07  Sandra Loosemore  

	PR driver/80828

	gcc/
	* doc/invoke.texi (Option Summary): Add -e and --entry.
	(Link Options): Likewise.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 265902)
+++ gcc/doc/invoke.texi	(working copy)
@@ -524,6 +524,7 @@ Objective-C and Objective-C++ Dialects}.
 @xref{Link Options,,Options for Linking}.
 @gccoptlist{@var{object-file-name}  -fuse-ld=@var{linker}  -l@var{library} @gol
 -nostartfiles  -nodefaultlibs  -nolibc  -nostdlib @gol
+-e @var{entry}  --entry=@var{entry} @gol
 -pie  -pthread  -r  -rdynamic @gol
 -s  -static -static-pie -static-libgcc  -static-libstdc++ @gol
 -static-libasan  -static-libtsan  -static-liblsan  -static-libubsan @gol
@@ -12712,6 +12713,15 @@ library subroutines.
 constructors are called; @pxref{Collect2,,@code{collect2}, gccint,
 GNU Compiler Collection (GCC) Internals}.)
 
+@item -e @var{entry}
+@itemx --entry=@var{entry}
+@opindex e
+@opindex entry
+
+Specify that the program entry point is @var{entry}.  The argument is
+interpreted by the linker; the GNU linker accepts either a symbol name
+or an address.
+
 @item -pie
 @opindex pie
 Produce a dynamically linked position independent executable on targets


Re: [PATCH 4/6] [RS6000] Remove constraints on call rounded_stack_size_rtx arg

2018-11-07 Thread Segher Boessenkool
On Wed, Nov 07, 2018 at 04:09:06PM +1030, Alan Modra wrote:
> This call arg is unused on rs6000.

This is fine.  Okay for trunk.  Thank you!


Segher


>   * config/rs6000/darwin.md (call_indirect_nonlocal_darwin64,
>   call_nonlocal_darwin64, call_value_indirect_nonlocal_darwin64,
>   call_value_nonlocal_darwin64): Remove constraints from second call
>   arg, the rounded_stack_size_rtx arg.
>   * config/rs6000/rs6000.md (tls_gd_aix, tls_gd_sysv,
>   tls_gd_call_aix, tls_gd_call_sysv, tls_ld_aix, tls_ld_sysv,
>   tls_ld_call_aix, tls_ld_call_sysv, call_local32, call_local64,
>   call_value_local32, call_value_local64, call_indirect_nonlocal_sysv,
>   call_nonlocal_sysv, call_nonlocal_sysv_secure,
>   call_value_indirect_nonlocal_sysv, call_value_nonlocal_sysv,
>   call_value_nonlocal_sysv_secure, call_local_aix,
>   call_value_local_aix, call_nonlocal_aix, call_value_nonlocal_aix,
>   call_indirect_aix, call_value_indirect_aix, call_indirect_elfv2,
>   call_value_indirect_elfv2, sibcall_local32, sibcall_local64,
>   sibcall_value_local32, sibcall_value_local64, sibcall_aix,
>   sibcall_value_aix): Likewise.


Re: [PATCH 3/6] [RS6000] Replace TLSmode with P, and correct tls call mems

2018-11-07 Thread Segher Boessenkool
On Wed, Nov 07, 2018 at 04:08:26PM +1030, Alan Modra wrote:
> There is really no need to define a TLSmode mode iterator that is
> identical (since !TARGET_64BIT == TARGET_32BIT) to the much used P
> mode iterator.

Nice :-)

> It's nonsense to think we might ever want to support
> 32-bit TLS on 64-bit or vice versa!  The patch also fixes a minor
> error in the call mems.  All other direct calls use (call (mem:SI ..)).

You can also replace  with ,  with
, and l with .  Also, was "TLSmode:"
needed anywhere?  I don't see any other iterator used in those patterns.

>   * config/rs6000/rs6000.md (TLSmode): Delete mode iterator.  Replace
>   with P throughout except for call mems which should use SI.

Approved for trunk.  Further cleanup as above pre-approved.  Thanks!


Segher


[PATCH] minor FDO profile related fixes

2018-11-07 Thread Indu Bhagat

I have been looking at the -fdump-ipa-profile dump with the intention of
cleaning up some of the information it prints, so that one may use it to
judge the "quality of a profile" in FDO.

The overall question I want to address is: are there ways to know which
functions were not run in the training run, i.e. have a ZERO profile?
(This patch corrects some dumped info; in a subsequent patch I would like to add
some more explicit information/diagnostics.)

Towards that end, I noticed a couple of bits of information that are
misleading (so I think) in the symbol table dump listing all functions in the
compilation unit:
   --- "globally 0" appears even when no profile data has been fed in by a
       feedback profile (which is not the intent, as the documentation of
       profile_guessed_global0 in profile-count.h suggests).
   --- "unlikely_executed" should appear only when there is profile feedback or
       a function attribute is specified (as per the documentation of
       node_frequency in coretypes.h). "unlikely_executed" in the case of a
       STALE or NO profile is misleading in my opinion.

Summary of changes:

1. This patch adjusts how the x_profile_status of a function is set:
x_profile_status should be set to PROFILE_READ only when a profile for the
function has been read from the .gcda file. So, instead of relying on
profile_info (set whenever the .gcda feedback file is present, even if the
function does not have a profile available in the file), use exec_counts
(non-null when the function has a profile, which may or may not be zero). In
essence, x_profile_status and profile_count::m_quality are set consistently
with the intent stated in the code comments.

2. A minor change in coverage.c gives a more precise location for the message.

The following -fdump-ipa-profile dump excerpts show the effect:


 -O1, -O2, -O3


0. APPLICABLE PROFILE
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot

1. STALE PROFILE
(i.e., those cases covered by Wcoverage-mismatch; when control flow changes
 between profile-generate and profile-use)
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot

2. NO PROFILE
(i.e., those cases covered by Wmissing-profile; when function has no profile
 available in the .gcda file)
Trunk (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
Trunk (missing function) : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
After Patch (missing .gcda file) : Function flags: count:1073741824 (estimated locally) body
After Patch (missing function) : Function flags: count:1073741824 (estimated locally) body

3. ZERO PROFILE (functions not run in training run)
Trunk : Function flags: count: 1073741824 (estimated locally, globally 0) body unlikely_executed
After Patch (remains the same) : count: 1073741824 (estimated locally, globally 0) body unlikely_executed

--
O0
--
At -O0, flag_guess_branch_prob is not set, so the profile quality is reported
as (precise) for most of the above cases.

0. APPLICABLE PROFILE
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot

1. STALE PROFILE
(i.e., those cases covered by Wcoverage-mismatch; when control flow changes
 between profile-generate and profile-use)
Trunk : Function flags: count:224114269 body hot
After Patch : Function flags: count:224114269 (precise) body hot

2. NO PROFILE
(i.e., those cases covered by Wmissing-profile; when function has no profile
 available in the .gcda file)
Trunk (missing file) : Function flags: body
Trunk (missing function) : Function flags: count:0 body unlikely_executed
After Patch (missing file) :  Function flags: body
*** After Patch (missing function) : Function flags: count:0 (precise) body
(*** This remains misleading, and I do not have a solution for it, since using
 heuristics to guess branch probability is not allowed at -O0.)

3. ZERO PROFILE (functions not run in training run)
Trunk : Function flags: count:0 body unlikely_executed
After Patch : Function flags: count:0 (precise) body

--

make check-gcc on x86_64 shows no new failures.

(A related PR was https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86957 where we
added diagnostics for the NO PROFILE case.)

diff --git a/gcc/coverage.c b/gcc/coverage.c
index 599a3bb..7595e6c 100644
--- a/gcc/coverage.c
+++ b/gcc/coverage.c
@@ -358,7 +358,7 @@ get_coverage_counts (unsigned counter, unsigned cfg_checksum,
   if (warning_printed && dump_enabled_p ())
{
  dump_user_location_t loc
-   = dump_user_location_t::from_location_t (input_location);
+   = 

Re: [PATCH 1/6] [RS6000] rs6000_output_call for external call insn assembly output

2018-11-07 Thread Segher Boessenkool
On Wed, Nov 07, 2018 at 04:07:15PM +1030, Alan Modra wrote:
> +extern const char *rs6000_output_call (rtx *, unsigned int, bool, const char *);

Maybe have a separate rs6000_output_call and rs6000_output_sibcall?  Bare
boolean function parameters aren't great.  (They can of course both call
rs6000_output_call_1 or whatever, if that makes sense).

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -21380,6 +21380,37 @@ rs6000_assemble_integer (rtx x, unsigned int size, 
> int aligned_p)
>return default_assemble_integer (x, size, aligned_p);
>  }
>  
> +/* Return a template string for assembly to emit when making an
> +   external call.  FUN is the %z argument, ARG is either NULL or
> +   a @TLSGD or @TLSLD __tls_get_addr argument specifier.  */
> +
> +const char *
> +rs6000_output_call (rtx *operands, unsigned int fun, bool sibcall,
> + const char *arg)
> +{
> +  /* -Wformat-overflow workaround, without which gcc thinks that %u
> +  might produce 10 digits.  FUN is 0 or 1 as of 2018-03.  */
> +  gcc_assert (fun <= 6);

So "fun" is the operand number.  Rename it, and use MAX_RECOG_OPERANDS
instead of 6?  And allow for it to take 2 or 3 chars to print :-)

"operands" is unused here, compiling this will warn.

"output" is a lie, this function doesn't output anything.  Hardly the
only case of this in the rs6000 port, but it is annoying.  What would be
a good name for this...  "rs6000_template_for_call"?


Are there some patterns that can be collapsed to one after this?


Segher
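
One way to read the suggestions above, as a rough sketch rather than the
actual rs6000 code: two entry points without a bare boolean parameter,
sharing a helper that only builds a template string.  All names here
(template_for_call, template_for_sibcall, SKETCH_MAX_OPERANDS) and the exact
template layout are hypothetical.  SKETCH_MAX_OPERANDS merely stands in for
MAX_RECOG_OPERANDS.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for GCC's MAX_RECOG_OPERANDS.  */
#define SKETCH_MAX_OPERANDS 30

/* Build "<insn> %z<N>[<arg>]" into a static buffer and return it.
   Nothing is output here; the caller receives a template string.  */
static const char *
template_for_call_1 (const char *insn, unsigned int operand_no,
		     const char *arg)
{
  static char buf[64];

  /* Bounding the operand number keeps %u to at most two digits.  */
  assert (operand_no < SKETCH_MAX_OPERANDS);
  snprintf (buf, sizeof buf, "%s %%z%u%s", insn, operand_no,
	    arg ? arg : "");
  return buf;
}

/* Template for a normal external call.  */
const char *
template_for_call (unsigned int operand_no, const char *arg)
{
  return template_for_call_1 ("bl", operand_no, arg);
}

/* Template for a sibling call.  */
const char *
template_for_sibcall (unsigned int operand_no, const char *arg)
{
  return template_for_call_1 ("b", operand_no, arg);
}
```

The shared static buffer means each returned template is only valid until
the next call, which is adequate for a sketch but worth keeping in mind.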


Re: [PATCH] combine: Do not combine moves from hard registers

2018-11-07 Thread Segher Boessenkool
Hi!

On Mon, Nov 05, 2018 at 06:16:16PM +, Renlin Li wrote:
> Sorry, this is not correct. Instructions scheduled between x and x+1 
> directly use hard register r1.
> It is not IRA/LRA assigning r1 to the operands.
> 
> 
> To reproduce this particular case, you could use:
> cc1  -O3 -marm -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp 
> gcc.c-torture/execute/builtins/memcpy-chk.c
> 
> This insn has been split.
> 
> (insn 152 150 154 11 (set (mem/c:QI (plus:SI (reg/f:SI 266)
> (const_int 24 [0x18])) [0 MEM[(void *) + 20B]+4 S1 A32])
> (reg:QI 1 r1)) "memcpy-chk-reduce.c":48:3 189 {*arm_movqi_insn}
>  (expr_list:REG_DEAD (reg:QI 1 r1)
> (nil)))

Okay, I see what is going on.  The arm port often expands movmem to use
hard registers (so that it can use load/store multiple?).  This does not
work well with scheduling and RA, and this combine feature makes it worse.

I don't know what to do about it in combine.


Segher


Re: [PATCH] Verify that last argument of __builtin_expect_with_probability is a real cst (PR c/87811).

2018-11-07 Thread Jeff Law
On 11/7/18 2:36 AM, Martin Liška wrote:
> On 11/5/18 7:00 PM, Martin Sebor wrote:
>> On 11/01/2018 07:45 AM, Martin Liška wrote:
>>> On 11/1/18 1:15 PM, Jakub Jelinek wrote:
>>>> On Thu, Nov 01, 2018 at 01:09:16PM +0100, Martin Liška wrote:
>>>>> -range 0.0 to 1.0, inclusive.
>>>>> +range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
>>>>> +a compiler time constant.
>>>> When you say must, I think error_at should be used rather than warning_at.
>>>> If others disagree I'm open for leaving it as is.
>>> Error is fine for me as well.
>>>
>>>>> @@ -2474,6 +2481,11 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>>>>>    *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
>>>>>    *probability = probi;
>>>>>  }
>>>>> +  else
>>>>> +  warning_at (gimple_location (def), 0,
>>>>> +  "probability argument %qE must be a in the "
>>>>> +  "range 0.0 to 1.0", prob);
>>>> Wrong indentation.
>>>>
>>>> And, no diagnostics for -O0 (which should also be covered by a testcase).
>>> Test for that added.
>>>
>>>>> +/* { dg-options "-O2 -fdump-tree-profile_estimate -frounding-math" } */
>>>> Why the -frounding-math options?
>>> I remember I had some issue with:
>>>   tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
>>>     MULT_EXPR, t, prob, base);
>>>
>>> on targets with a non-IEEE floating point arithmetics (s390?).
>>>
>>>> I think test
>>>> coverage should handle both that and when that option is not used
>>>> if that option makes any difference.
>>> It will eventually pop up if we install new tests w/o rounding math.
>>>
>>>> Jakub
>>>>
>>>
>>> Martin
>>>
>> I noticed a few minor issues in the hunks below:
>>
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -12046,7 +12046,8 @@
>>  when testing pointer or floating-point values.
>>
>>  This function has the same semantics as @code{__builtin_expect},
>>  but the caller provides the expected probability that @var{exp} == @var{c}.
>>  The last argument, @var{probability}, is a floating-point value in the
>> -range 0.0 to 1.0, inclusive.
>> +range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
>> +a compiler time constant.
>>
>> The term is "compile-time constant" but please see below.
>>
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -2467,6 +2467,13 @@
>>  expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>>    base = build_real_from_int_cst (t, base);
>>    tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
>>  MULT_EXPR, t, prob, base);
>> +  if (TREE_CODE (r) != REAL_CST)
>> +    {
>> +  error_at (gimple_location (def),
>> +    "probability argument %qE must be a compile "
>> +    "time constant", prob);
>> +  return NULL;
>>     }
>>
>> According to GCC coding conventions, when used as an adjective
>> the term "compile-time" should be hyphenated.  But the term used
>> in other diagnostics is either "constant integer" or "constant
>> integer expressions" so I would suggest to use it instead, here
>> and in the manual.
>>
>> @@ -2474,6 +2481,11 @@
>>  expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>>    *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
>>    *probability = probi;
>>  }
>> +  else
>> +    error_at (gimple_location (def),
>> +  "probability argument %qE must be a in the "
>> +  "range 0.0 to 1.0", prob);
>> +
>>
>> There's a stray 'a' in the text of the error.
>>
>> But it's not really meaningful to say
>>
>>   3.14 must be in the range 0.0 to 1.0
>>
>> because that simply cannot happen.  We could say "argument 2 must
>> be in the range" but I would instead suggest to rephrase the error
>> along the same lines as other similar messages GCC already issues:
>>
>>   "probability %qE is outside the range [0.0, 1.0]"
>>
>> Martin
> Hi Martin.
> 
> Thanks for help with the wording. Please take a look at attached patch
> candidate.
> 
> Martin
> 
> 
> 0001-Change-wording-of-__builtin_expect_with_probability-.patch
> 
> From 94b61505be171b6b16f7a85c62c722d3c9e13c2f Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 7 Nov 2018 10:27:00 +0100
> Subject: [PATCH] Change wording of __builtin_expect_with_probability errors.
> 
> gcc/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * doc/extend.texi: Reword.
>   * predict.c (expr_expected_value_1): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * gcc.dg/pr87811.c: Update scanned pattern.
>   * gcc.dg/pr87811-2.c: Likewise.
OK.
jeff
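
A minimal sketch of the rule being reworded above, assuming a compiler that provides __builtin_expect_with_probability.  The LIKELY_WITH_PROB wrapper, its fallback, and count_nonzero are illustrative names, not part of the patch.  The third argument is written as a literal so that it folds to a constant inside the range [0.0, 1.0]:

```c
#include <assert.h>

/* Older compilers may lack __has_builtin; treat every query as "no".  */
#ifndef __has_builtin
# define __has_builtin(x) 0
#endif

/* Hypothetical wrapper: use the builtin when it is known to exist,
   otherwise fall back to the bare expression.  */
#if __has_builtin(__builtin_expect_with_probability)
# define LIKELY_WITH_PROB(expr, prob) \
    __builtin_expect_with_probability ((expr), 1, (prob))
#else
# define LIKELY_WITH_PROB(expr, prob) (expr)
#endif

/* Count the non-zero entries of V, hinting that about 90% are non-zero.
   The probability 0.9 is a literal constant in [0.0, 1.0].  */
static int
count_nonzero (const int *v, int n)
{
  assert (n >= 0);
  int count = 0;
  for (int i = 0; i < n; i++)
    if (LIKELY_WITH_PROB (v[i] != 0, 0.9))
      count++;
  return count;
}

/* Sample data for trying the function out.  */
static const int sample_values[5] = { 1, 0, 2, 3, 0 };
```

Passing a run-time variable, or a value such as 3.14, as the probability is what the two diagnostics discussed in this thread are meant to reject.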


Re: [PR87793] reject non-toplevel unspecs in debug loc exprs on x86

2018-11-07 Thread Jeff Law
On 11/7/18 12:42 AM, Alexandre Oliva wrote:
> Before revision 254025, we'd reject UNSPECs in debug loc exprs.
> TARGET_CONST_NOT_OK_FOR_DEBUG_P still rejects that by default, on all
> ports that override it, except for x86, that accepts @gotoff unspecs.
> We can indeed accept them in top-level expressions, but not as
> subexpressions: the assembler rejects the difference between two
> @gotoff symbols, for example.
> 
> We could simplify such a difference and drop the @gotoffs, provided
> that the symbols are in the same section; we could also accept
> @gotoffs plus literal constants.  However, accepting those but
> rejecting such combinations as subexpressions would be ugly, and most
> likely not worth the trouble: sym@gotoff+litconst hardly makes sense
> as a standalone expression, and the difference between @gotoffs should
> be avoided to begin with, as follows.
> 
> Ideally, the debug loc exprs would use the symbolic data in
> REG_EQUIV/REG_EQUAL notes, or delegitimized addresses, instead of
> simplifying the difference between two legitimized addresses so that
> the occurrences of the GOT register cancel each other.  That would
> require some more elaborate surgery in var-tracking and cselib than
> would be appropriate at this stage.
> 
> Regstrapped on x86_64- and i686-linux-gnu.  Ok to install?
> 
> 
> for  gcc/ChangeLog
> 
>   PR target/87793
>   * config/i386/i386.c (ix86_const_not_ok_for_debug_p): Reject
>   non-toplevel UNSPEC.
> 
> for  gcc/testsuite/ChangeLog
> 
>   PR target/87793
>   * gcc.dg/pr87793.c: New.
OK.
jeff


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 03:23:55PM -0700, Jeff Law wrote:
> > @@ -882,8 +883,12 @@ hash_table
> >if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
> >  expand ();
> >  
> > -  m_searches++;
> > +#if ENABLE_EXTRA_CHECKING
> > +if (insert == INSERT)
> > +  verify (comparable, hash);
> > +#endif

Plus formatting, the above is indented too much.

Jakub


[PR/87926] --disable-checking bootstrap break

2018-11-07 Thread Nathan Sidwell
I'm committing this to unbreak a --disable-checking bootstrap build 
failure.  As documented in the PR we think there's an out of bound array 
access.


nathan
--
Nathan Sidwell
2018-11-07  Nathan Sidwell  

	PR 87926
	* Makefile.in (bitmap.o-warn): Add -Wno-error to unbreak
	--disable-checking bootstrap.

Index: Makefile.in
===
--- Makefile.in	(revision 265883)
+++ Makefile.in	(working copy)
@@ -221,6 +221,7 @@ libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
+bitmap.o-warn = -Wno-error # PR 87926
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either


Re: PING [PATCH] use MAX_OFILE_ALIGNMENT to validate attribute aligned (PR 87795)

2018-11-07 Thread Jeff Law
On 11/6/18 5:06 PM, Martin Sebor wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02081.html
I thought I'd already ACK'd this one...


OK.

Jeff


Re: Small typo in iconv.m4

2018-11-07 Thread Jeff Law
On 11/6/18 9:37 AM, Hafiz Abid Qadeer wrote:
> Hi All,
> I was investigating a character set related problem with windows hosted
> GDB and I tracked it down to a typo in iconv.m4. This typo caused
> libiconv detection to fail and related support was not built into gdb.
> 
> The problem is with the following line.
> CPPFLAGS="$LIBS $INCICONV"
> which should have been
> CPPFLAGS="$CPPFLAGS $INCICONV"
> 
> OK to commit the attached patch?
> 
> 2018-11-06  Hafiz Abid Qadeer  
> 
>   * config/iconv.m4 (AM_ICONV_LINK): Don't overwrite CPPFLAGS.
>   Append $INCICONV to it.
>   * gcc/configure: Regenerate.
>   * libcpp/configure: Likewise.
>   * libstdc++-v3/configure: Likewise.
>   * intl/configure: Likewise.
> 
> Thanks,
> 
Thanks.  I wasn't sure if you had commit privs, so I went ahead and
installed the patch.

Jeff


Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-11-07 Thread Wilco Dijkstra
Hi Jeff,

> So if we're going from 0->2 ULPs in some cases, do we want to guard it
> with one of the various options, if so, which?  Giuliano's follow-up
> will still have the potential for 2ULPs.

The ULP difference is not important since the individual math functions 
already have ULP of 3 or higher. Changing ULP error for some or all inputs
(like we did with the rewritten math functions) is not considered an issue as
long as worst-case ULP error doesn't increase.

The question is more like whether errno and trapping/exception behaviour
is identical - I guess it is not so I would expect this to be fastmath only.
Which particular flag one uses is a detail given there isn't a clear definition
for most of them.

Wilco
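
For readers who want to reproduce ULP figures like the ones discussed in this thread, here is one common way to measure the distance in ULPs between a computed float and a reference float.  It is a sketch under the assumption of IEEE 754 binary32 floats; the helper names are hypothetical, and the thread itself measures against an mpfr reference rather than with this code:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build a float from its raw bit pattern (assumes 32-bit IEEE 754).  */
static float
float_from_bits (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Map a float onto an integer line that is ordered the same way as the
   floats themselves: +0 and -0 both map to 0, negatives map below 0.  */
static int64_t
float_order_key (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  if (bits & 0x80000000u)
    return -(int64_t) (bits & 0x7fffffffu);
  return (int64_t) bits;
}

/* Number of representable floats between A and B.  NaNs are rejected.  */
static int64_t
ulp_distance (float a, float b)
{
  assert (a == a && b == b);
  int64_t d = float_order_key (a) - float_order_key (b);
  return d < 0 ? -d : d;
}
```

With such a helper, "ULP error" is the distance between the value a function returns and the correctly rounded reference result.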


Re: [PATCH AutoFDO/4]Fix profile count computation/propagation.

2018-11-07 Thread Jeff Law
On 10/31/18 12:34 AM, bin.cheng wrote:
> Hi,
> This patch fixes AutoFDO breakage on trunk.  The main reason for the breakage
> is that AutoFDO relies on standalone edge count computing and on propagating
> profile count/probability info on the CFG, but in the new infra, edge count is
> actually computed from probability, which leads to a chicken-and-egg problem
> and corrupted profile counts.  This patch fixes the issue by using explicit
> edge counts.
> 
> There is another issue not touched yet: in the quite common case where
> profiled samples are not enough, the profile info computed for lots of blocks
> is ZERO.  In the future, we may add some heuristics checking the quality of
> sampled counts and reverting to the guessed profile count if necessary.  I
> think the change made in this patch is also needed for that.
> 
> The mysql server package is used to test this patch set.  It can't be compiled
> with autofdo on trunk; even with the compilation issues worked around, there
> isn't any performance improvement.  In local experiments, with this patch set
> it's improved by 12.3% and 4.3% respectively for the read-only/write-heavy
> benchmarks.  Unfortunately, this patch set was written against the GCC 8
> branch a while ago; the improvement gets worse on trunk and I haven't
> investigated the reason yet.  I guess there are still other issues which need
> to be fixed in the future.
> 
> Bootstrapped and tested on x86_64 as part of the patch set.  Is it OK?
> 
> Thanks,
> bin
> 2018-10-31  Bin Cheng  
> 
>   * auto-profile.c (AFDO_EINFO): New macro.
>   (struct edge_info): New structure.
>   (is_edge_annotated, set_edge_annotated): Delete.
>   (afdo_propagate_edge, afdo_propagate_circuit, afdo_propagate): Remove
>   parameter.  Adjust edge count computation and annotation using struct
>   edge_info.
>   (afdo_calculate_branch_prob): Ditto.
>   (afdo_annotate_cfg): Simplify code setting basic block profile count.
> 
> 
> 0004-Fix-AutoFDO-breakage-after-profile-count-rewriting.patch
> 
> From 6506c12d1b633b6d1bfae839b3633a4f99b3a481 Mon Sep 17 00:00:00 2001
> From: chengbin 
> Date: Mon, 20 Aug 2018 15:25:02 +0800
> Subject: [PATCH 4/4] Fix AutoFDO breakage after profile count rewriting.
> 
> ---
>  gcc/auto-profile.c | 190 ++---
>  1 file changed, 95 insertions(+), 95 deletions(-)
> 
> diff --git a/gcc/auto-profile.c b/gcc/auto-profile.c
> index cde4f41c1d9..ff3ea23d830 100644
> --- a/gcc/auto-profile.c
> +++ b/gcc/auto-profile.c
> @@ -101,6 +101,17 @@ along with GCC; see the file COPYING3.  If not see
>  namespace autofdo
>  {
>  
> +/* Intermediate edge info used when propagating AutoFDO profile information.
> +   We can't edge->count() directly since it's computed from edge's probability
> +   while probability is yet not decided during propagation.  */
> +#define AFDO_EINFO(e) ((struct edge_info *) e->aux)
> +struct edge_info
> +{
> +  edge_info () : count (profile_count::zero ().afdo ()), annotated_p (false) {}
> +  profile_count count;
> +  bool annotated_p;
> +};
edge_info isn't POD, so make it a class rather than a struct.

OK with that change assuming it does not have a hard dependency on prior
patches in this series.

jeff


[committed] Fix type of "num" argument to memcpy in gcc.c-torture/compile/pr65595.c

2018-11-07 Thread Jozef Lawrynowicz

The test uses "unsigned long" as the "num" argument to memcpy, but it should be
size_t, and these types are not equivalent on all targets.

Committed to trunk.


>From 3ebbb8102bd9b984c6f1a1eaf0bca45fe4fd23e1 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Tue, 6 Nov 2018 12:49:00 +
Subject: [PATCH 04/12] [TESTSUITE] size_type memcpy

2018-11-07  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:
  
	* gcc.c-torture/compile/pr65595.c: Change type of "num" argument to
	memcpy from "unsigned long" to __SIZE_TYPE__.
---
 gcc/testsuite/gcc.c-torture/compile/pr65595.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.c-torture/compile/pr65595.c b/gcc/testsuite/gcc.c-torture/compile/pr65595.c
index 0ab7161..b6a0aa4 100644
--- a/gcc/testsuite/gcc.c-torture/compile/pr65595.c
+++ b/gcc/testsuite/gcc.c-torture/compile/pr65595.c
@@ -1,4 +1,4 @@
-extern void *memcpy(void *, const void *, unsigned long);
+extern void *memcpy(void *, const void *, __SIZE_TYPE__);
 struct in6_addr {
   struct {
 int u6_addr32[4];
-- 
2.7.4
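
A small sketch of why the prototype matters, assuming a GCC-compatible compiler that predefines __SIZE_TYPE__ (copy_and_sum and sample_words are illustrative names only).  Declaring memcpy by hand with __SIZE_TYPE__ matches size_t on every target, whereas "unsigned long" only happens to match on some of them:

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written prototype, as in the test case.  __SIZE_TYPE__ is the
   compiler's own spelling of the type behind size_t.  */
extern void *memcpy (void *, const void *, __SIZE_TYPE__);

/* The two spellings have the same size by construction; the same check
   against "unsigned long" could fail on targets where size_t is
   narrower or wider than long.  */
_Static_assert (sizeof (__SIZE_TYPE__) == sizeof (size_t),
                "__SIZE_TYPE__ and size_t must have the same size");

/* Copy up to 8 ints through memcpy and return their sum.  */
static int
copy_and_sum (const int *src, size_t n)
{
  int buf[8];
  assert (n <= 8);
  memcpy (buf, src, n * sizeof buf[0]);

  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += buf[i];
  return sum;
}

static const int sample_words[4] = { 1, 2, 3, 4 };
```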



Re: PR fortran/87919 patch for -fno-dec-structure

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 05:05:13PM -0500, Fritz Reese wrote:

--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -32,6 +32,20 @@ along with GCC; see the file COPYING3.  If not see
 
 gfc_option_t gfc_option;
 
+#define _expand(m) m

I think it would be better to avoid names like _expand: it is too generic a
name and starts with an underscore.  Name it e.g. SET_BITFLAG_1 or something
similar.  Also, it isn't mentioned in the ChangeLog.

@@ -62,14 +75,30 @@ set_dec_flags (int value)
 }

What about the
  /* Allow legacy code without warnings.  */
  gfc_option.allow_std |= GFC_STD_F95_OBS | GFC_STD_F95_DEL
| GFC_STD_GNU | GFC_STD_LEGACY;
  gfc_option.warn_std &= ~(GFC_STD_LEGACY | GFC_STD_F95_DEL);
that is done for value, shouldn't set_dec_flags remove those
flags again?  Maybe not the allow_std ones, because those are set already by
default, perhaps just the warn_std flags?

   /* Set other DEC compatibility extensions.  */
-  flag_dollar_ok |= value;
-  flag_cray_pointer |= value;
-  flag_dec_structure |= value;
-  flag_dec_intrinsic_ints |= value;
-  flag_dec_static |= value;
-  flag_dec_math |= value;
+  SET_BITFLAG (flag_dollar_ok, value, value);
+  SET_BITFLAG (flag_cray_pointer, value, value);
+  SET_BITFLAG (flag_dec_structure, value, value);
+  SET_BITFLAG (flag_dec_intrinsic_ints, value, value);
+  SET_BITFLAG (flag_dec_static, value, value);
+  SET_BITFLAG (flag_dec_math, value, value);
 }
 
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/array_temporaries_5.f90
@@ -0,0 +1,20 @@
+! { dg-do run }
+! { dg-options "-fcheck-array-temporaries -fno-check-array-temporaries" }
+!
+! PR fortran/87919
+!
+! Ensure -fno-check-array-temporaries disables array temporary checking.
+! Copied from array_temporaries_2.f90.

For tests where you expect no errors and that are just copies of other
testcases, perhaps
include 'array_temporaries_2.f90'
or similar instead?

Jakub


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2018-11-07 Thread Jeff Law
On 10/30/18 6:28 AM, Martin Liška wrote:
> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>>> +hashtab_chk_error ()
>>> +{
>>> +  fprintf (stderr, "hash table checking failed: "
>>> +  "equal operator returns true for a pair "
>>> +  "of values with a different hash value");
>> BTW, either use internal_error here, or at least if using fprintf
>> terminate with \n, in your recent mail I saw:
>> ...different hash valueduring RTL pass: vartrack
>> ^^
> Sure, fixed in attached patch.
> 
> Martin
> 
>>> +  gcc_unreachable ();
>>> +}
>>  Jakub
>>
> 
> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
> 
> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Mon, 29 Oct 2018 09:38:21 +0100
> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
> 
> ---
>  gcc/hash-table.h | 40 +++-
>  1 file changed, 39 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
> index bd83345c7b8..694eedfc4be 100644
> --- a/gcc/hash-table.h
> +++ b/gcc/hash-table.h
> @@ -503,6 +503,7 @@ private:
>  
>value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>value_type *find_empty_slot_for_expand (hashval_t);
> +  void verify (const compare_type , hashval_t hash);
>bool too_empty_p (unsigned int);
>void expand ();
>static bool is_deleted (value_type )
> @@ -882,8 +883,12 @@ hash_table
>if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>  expand ();
>  
> -  m_searches++;
> +#if ENABLE_EXTRA_CHECKING
> +if (insert == INSERT)
> +  verify (comparable, hash);
> +#endif
>  
> +  m_searches++;
>value_type *first_deleted_slot = NULL;
>hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
> @@ -930,6 +935,39 @@ hash_table
>return _entries[index];
>  }
>  
> +#if ENABLE_EXTRA_CHECKING
> +
> +/* Report a hash table checking error.  */
> +
> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
> +static void
> +hashtab_chk_error ()
> +{
> +  fprintf (stderr, "hash table checking failed: "
> +"equal operator returns true for a pair "
> +"of values with a different hash value\n");
> +  gcc_unreachable ();
> +}
I think an internal_error here is probably still better than a simple
fprintf, even if the fprintf is terminated with a \n :-)

The question then becomes can we bootstrap with this stuff enabled and
if not, are we likely to soon?  It'd be a shame to put it into
EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
because we've got too many bugs to fix.

> +
> +/* Verify that all existing elements in th hash table which are
s/th/the/


Jeff
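
The invariant that the proposed check enforces (values that compare equal must also hash equal) can be illustrated outside of hash_table with a small sketch.  Everything below is hypothetical: it uses plain ints and a flat array instead of GCC's hash_table, and verify_sketch returns a status instead of reporting an internal error:

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned (*hash_fn) (int);
typedef int (*equal_fn) (int, int);

/* Return 1 if every existing entry that compares equal to COMPARABLE
   also has the hash value HASH, and 0 if a mismatching pair is found.  */
static int
verify_sketch (const int *entries, size_t n, int comparable, unsigned hash,
               hash_fn hf, equal_fn eq)
{
  for (size_t i = 0; i < n; i++)
    if (eq (entries[i], comparable) && hf (entries[i]) != hash)
      return 0;
  return 1;
}

/* Equality on the last decimal digit (non-negative inputs only).  */
static int
equal_mod10 (int a, int b)
{
  return a % 10 == b % 10;
}

/* Consistent with equal_mod10: equal values get equal hashes.  */
static unsigned
hash_mod10 (int v)
{
  return (unsigned) (v % 10);
}

/* Inconsistent with equal_mod10: 3 and 13 are "equal" but hash apart.  */
static unsigned
hash_identity (int v)
{
  return (unsigned) v;
}

static const int sample_entries[2] = { 3, 24 };
```

Here inserting 13 is fine with hash_mod10 but trips the check with hash_identity, which is the kind of bug the message "equal operator returns true for a pair of values with a different hash value" describes.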


Re: [PATCH] Add sinh(tanh(x)) and cosh(tanh(x)) rules

2018-11-07 Thread Jeff Law
On 10/23/18 3:17 AM, Richard Biener wrote:
> On Mon, Oct 22, 2018 at 10:09 PM Jeff Law  wrote:
>>
>> On 10/20/18 9:47 AM, Giuliano Augusto Faulin Belinassi wrote:
>>> So I did some further investigation comparing the ULP error.
>>>
>>> With the formula that Wilco Dijkstra provided, there are cases where
>>> the substitution is super precise.
>>> With floats:
>>> with input  :  = 9.9940395355224609375000e-01
>>> sinh: before:  = 2.89631005859375e+03
>>> sinh: after :  = 2.896309326171875000e+03
>>> sinh: mpfr  :  = 2.89630924626497842670468162463283783344599446025119e+03
>>> ulp err befr:  = 3
>>> ulp err aftr:  = 0
>>>
>>> With doubles:
>>> with input  :  = 9.99888977697537484345957636833190917969e-01
>>> sinh: before:  = 6.710886400029802322387695312500e+07
>>> sinh: after :  = 6.71088632549419403076171875e+07
>>> sinh: mpfr  :  = 6.710886344120645523071287770030292885894208e+07
>>> ulp err befr:  = 3
>>> ulp err aftr:  = 0
>>>
>>> *However*, there are cases where some error shows up. The biggest ULP
>>> error that I could find was 2.
>>>
>>> With floats:
>>> with input  :  = 9.99968349933624267578125000e-01
>>> sinh: before:  = 1.2568613433837890625000e+02
>>> sinh: after :  = 1.2568614959716796875000e+02
>>> sinh: mpfr  :  = 1.25686137592274042266452526368087062890399889097864e+02
>>> ulp err befr:  = 0
>>> ulp err aftr:  = 2
>>>
>>> With doubles:
>>> with input  :  = 9.999463651256803586875321343541145324707031e-01
>>> sinh: before:  = 9.65520209507428342476487159729003906250e+05
>>> sinh: after :  = 9.6552020950742810964584350585937500e+05
>>> sinh: mpfr  :  = 9.65520209507428288553227922831618987450806468855883e+05
>>> ulp err befr:  = 0
>>> ulp err aftr:  = 2
>>>
>>> And with FMA we have the same results shown above (super precise
>>> cases, and a maximum ULP error equal to 2).
>>>
>>> So maybe update the patch with the following rules?
>>>    * If FMA is available, then compute 1 - x*x with it.
>>>    * If FMA is not available, then do the Dijkstra substitution when |x| > 0.5.
>> So I think the runtime math libraries shoot for .5 ULP (yes, they don't
>> always make it, but that's their goal).  We should probably have the
>> same goal.  Going from 0 to 2 ULPs would be considered bad.
> 
> But we do that everywhere (with -funsafe-math-optimizations or
> -fassociative-math).
So if we're going from 0->2 ULPs in some cases, do we want to guard it
with one of the various options, if so, which?  Giuliano's follow-up
will still have the potential for 2ULPs.

jeff
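
For reference, the identity behind the figures quoted above can be written out as follows.  This assumes that the rule under discussion rewrites sinh(atanh(x)); the quoted inputs and results are consistent with that reading, although the subject line says tanh.  The remark about exact subtraction is an assumption about why the |x| > 0.5 condition was proposed, not something stated in the thread:

```latex
\begin{align*}
y &= \operatorname{atanh}(x), \quad |x| < 1
  \;\Longrightarrow\; \tanh y = x \\
\cosh^2 y - \sinh^2 y = 1
  &\;\Longrightarrow\; \cosh y = \frac{1}{\sqrt{1 - \tanh^2 y}}
                               = \frac{1}{\sqrt{1 - x^2}} \\
\sinh y &= \tanh y \, \cosh y
         = \frac{x}{\sqrt{1 - x^2}}
         = \frac{x}{\sqrt{(1 - x)(1 + x)}}
\end{align*}
% Assumption: for 1/2 <= x <= 1 the subtraction 1 - x is exact in binary
% floating point (Sterbenz lemma), so the factored form (1 - x)(1 + x)
% avoids the cancellation that 1 - x*x suffers when |x| is close to 1.
% Example: x = 1 - 2^{-24} gives (1 - x)(1 + x) \approx 2^{-23},
% hence \sinh y \approx 2^{11.5} \approx 2896.31.
```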


Re: [PATCH] handle attribute positional arguments consistently (PR 87541, 87542)

2018-11-07 Thread Jeff Law
On 10/24/18 8:02 PM, Martin Sebor wrote:

>> No camel case.  Make the enum type lower case and its values upper case.
> 
> Done.
> 
> As an aside, I almost thought that after nearly four years
> I've adjusted to most reviewers' preferences but I'm clearly
> not quite there yet.  As usual, this is not mentioned in
> the coding conventions (except for C++ template parameters
> where it is the preferred spelling), and there are also
> examples of different styles in GCC, including the one
> I chose. For instance, in c-format.c the format_type enum
> uses all lowercase enumerators.  There are also quite a few
> (over a hundred in fact) examples of CamelCase enums in
> various front-ends, back-ends, and other parts of GCC.
It happens.  Avoiding camel case is more important than the upper vs
lower case on the enum constants (and I thought avoiding camel case was
mentioned somewhere, but I could be wrong).  If it weren't for the camel
case I probably wouldn't have said anything about the enum constants.

Jeff


> 
>>> @@ -326,17 +331,18 @@ static bool
>>>  }
>>>  }
>>>
>>> -  if (!get_constant (format_num_expr, >format_num, validated_p))
>>> -    {
>>> -  error ("format string has invalid operand number");
>>> -  return false;
>>> -    }
>>> +  if (tree val = get_constant (fntype, atname, *format_num_expr,
>>> +   2, >format_num, 0, validated_p))
>>> +    *format_num_expr = val;
>>> +  else
>>> +    return false;
>> Is it really a good idea to be modifying something inside of ARGS like
>> this?  At the very least the function comments need updating.
> 
> That's what the code does.  My patch doesn't change it, it just
> adds a new variable.  I added a comment to mention it nonetheless.
Ah.  Must have missed that in the original.  In that case ignore my
comment.  Similarly for the other instance.
> 
>>
>>> +{
>>> +  /* Treat zero the same as an out-of-bounds argument number.  */
>>> +  if (!argno)
>>> +    return void_type_node;
>>> +
>>> +  unsigned i = 1;
>>> +
>>> +  for (tree t = TYPE_ARG_TYPES (type); ; t = TREE_CHAIN (t), ++i)
>> There's already iterators to walk over TYPE_ARG_TYPES.  See
>> FOREACH_FUNCTION_ARGS
> 
> As I explained above, I just copied another function just above
> the one I added.
> 
> Besides the one in the function I copied there are six other
> loops in this file and just one use of FOREACH_FUNCTION_ARGS.
> I also think the macro is harder to understand and poor style.
> It requires declaring an extra variable outside the scope of
> the loop even if it isn't used.  So I kept the loop as is.
We're generally trying to use iterators more than open coding loops all
the time.  The existence of code that doesn't use iterators (when
iterators exist) isn't a justification for adding new open coded loops.

Please use the iterator.  OK with that change.

jeff
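
To make the iterator-versus-open-coded-loop point concrete, here is a generic sketch.  It does not use GCC's tree type or its FOREACH_FUNCTION_ARGS macro; struct arg_node, FOREACH_ARG and nth_arg_type are hypothetical stand-ins, with -1 playing the role that void_type_node plays in the quoted code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chained list of argument types.  */
struct arg_node
{
  int type_id;
  const struct arg_node *next;
};

/* Hypothetical iterator macro: the traversal is written once here, so
   callers cannot get the chain-walking step wrong.  */
#define FOREACH_ARG(list, it) \
  for (const struct arg_node *it = (list); it != NULL; it = it->next)

/* Return the type of the 1-based argument ARGNO, or -1 when ARGNO is
   zero or past the end of the list.  */
static int
nth_arg_type (const struct arg_node *list, unsigned argno)
{
  /* Treat zero the same as an out-of-bounds argument number.  */
  if (argno == 0)
    return -1;

  unsigned i = 1;
  FOREACH_ARG (list, it)
    {
      if (i == argno)
        return it->type_id;
      ++i;
    }
  return -1;
}

/* A three-argument sample list: 10 -> 20 -> 30.  */
static const struct arg_node sample_arg3 = { 30, NULL };
static const struct arg_node sample_arg2 = { 20, &sample_arg3 };
static const struct arg_node sample_arg1 = { 10, &sample_arg2 };
```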


Re: PR fortran/87919 patch for -fno-dec-structure

2018-11-07 Thread Fritz Reese
On 11/7/18, Jakub Jelinek  wrote:
> On Wed, Nov 07, 2018 at 03:07:04PM +, Mark Eggleston wrote:
>
>>  PR fortran/87919
>>  * options.c (gfc_handle_option): Removed case OPT_fdec_structure
>>  as it breaks the handling of -fno-dec-structure.
>
> No entries for the tests, i.e.
>   * gfortran.dg/pr87919-dec-structure-1.f: New test.
>   * gfortran.dg/pr87919-dec-structure-2.f: New test.
>   * gfortran.dg/pr87919-dec-structure-3.f: New test.
>   * gfortran.dg/pr87919-dec-structure-4.f: New test.
>
>> diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
>> index 73f5389361d9..3b7c2d40fe8a 100644
>> --- a/gcc/fortran/options.c
>> +++ b/gcc/fortran/options.c
>> @@ -761,10 +761,6 @@ gfc_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
>>/* Enable all DEC extensions.  */
>>set_dec_flags (1);
>>break;
>> -
>> -case OPT_fdec_structure:
>> -  flag_dec_structure = 1;
>> -  break;
>>  }
>>
>>Fortran_handle_option_auto (_options, _options_set,
>
> LGTM, but I'll defer the final review to Fortran maintainers.

Thanks for the patch Mark, I concur with Jakub that it is correct for
what it does. However, I have a few comments in addition to the fixes
recommended by Jakub regarding the test cases.

First, I would prefer to name these test cases as "dec_structure_*.f"
to align with the other (23) -fdec-structure test cases. Second, the
third case (*dec-structure-3.f) is unnecessary because it is identical
in function to dec_structure_1.f90. I concur with the remaining test
cases, as well as Jakub's suggestion to cover "-fdec-structure
-fno-dec-structure" with an additional test. I would name the final
four (= 4 - 1 + 1) tests as "dec_structure_[24-27].f".


I have taken the liberty of extending this patch to cover the
remainder of PR 87919. That is, to fix -fno-* for -fno-dec,
-fno-check-array-temporaries and -fno-init-local-zero. In the extended
patch, the 'value' set for the aforementioned options is no longer
ignored, so that value=1 truly means set and value=0 truly means
"unset". Previously, the aforementioned flags effectively ignored the
value=0 condition. Similarly to the tests Mark provided with
-fdec-structure, I've provided new tests for the various facets of
-fno-dec, -fno-check-array-temporaries, and -fno-init-local-zero.

Below is the changelog. Bootstraps and regtests fine for me on
x86_64-redhat-linux. If it looks OK I'll commit to trunk (and probably
backport to 8-branch and 7-branch since the affected code appears to
be the same for those branches).
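
The underlying problem can be shown in a few lines.  This is a sketch only: SET_OR_CLEAR_BITS is a hypothetical macro written for this note, and the actual SET_FLAG/SET_BITFLAG definitions in the patch may differ.  OR-ing the option value into a flag can turn it on but never off, so a -fno-* option (value == 0) is silently ignored:

```c
#include <assert.h>

/* Hypothetical helper: set the bits in MASK when VALUE is non-zero and
   clear them when VALUE is zero.  */
#define SET_OR_CLEAR_BITS(flags, value, mask) \
  ((flags) = (value) ? ((flags) | (mask)) : ((flags) & ~(mask)))

/* Old behaviour: "flag |= value" cannot clear a flag that is set.  */
static int
apply_or_only (int flag, int value)
{
  flag |= value;
  return flag;
}

/* New behaviour: value == 0 really means "unset".  */
static int
apply_set_or_clear (int flag, int value)
{
  SET_OR_CLEAR_BITS (flag, value, 1);
  return flag;
}
```

With the flag already set, apply_or_only (1, 0) leaves it set, while apply_set_or_clear (1, 0) clears it, which is the behaviour a user expects from the negative form of an option.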


>From 2d9e39bbf4a179ae433f33f4e7039b85078ba72f Mon Sep 17 00:00:00 2001
From: Fritz Reese 
Date: Wed, 7 Nov 2018 15:13:50 -0500
Subject: [PATCH] PR fortran/87919

Fix handling -fno-* prefix for init-local-zero, check-array-temporaries and dec.

gcc/fortran/
* options.c (SET_FLAG, SET_BITFLAG): New macros.
(set_dec_flags): Unset DEC flags with value==0.
(set_init_local_zero): New helper for -finit-local-zero flag group.
(gfc_init_options): Fix disabling of init flags, array temporaries
check, and dec flags when value is zero (from -fno-*).

gcc/testsuite/
* gfortran.dg/array_temporaries_5.f90: New test.
* gfortran.dg/dec_bitwise_ops_3.f90: Ditto.
* gfortran.dg/dec_d_lines_3.f: Ditto.
* gfortran.dg/dec_exp_4.f90: Ditto.
* gfortran.dg/dec_exp_5.f90: Ditto.
* gfortran.dg/dec_io_7.f90: Ditto.
* gfortran.dg/dec_structure_24.f: Ditto.
* gfortran.dg/dec_structure_25.f: Ditto.
* gfortran.dg/dec_structure_26.f: Ditto.
* gfortran.dg/dec_structure_27.f: Ditto.
* gfortran.dg/dec_type_print_3.f90: Ditto.
* gfortran.dg/init_flag_20.f90: Ditto.
---
 gcc/fortran/options.c | 70 +++
 gcc/testsuite/gfortran.dg/array_temporaries_5.f90 | 20 +++
 gcc/testsuite/gfortran.dg/dec_bitwise_ops_3.f90   | 19 ++
 gcc/testsuite/gfortran.dg/dec_d_lines_3.f | 10 
 gcc/testsuite/gfortran.dg/dec_exp_4.f90   | 13 +
 gcc/testsuite/gfortran.dg/dec_exp_5.f90   | 15 +
 gcc/testsuite/gfortran.dg/dec_io_7.f90| 22 +++
 gcc/testsuite/gfortran.dg/dec_structure_24.f  | 21 +++
 gcc/testsuite/gfortran.dg/dec_structure_25.f  | 22 +++
 gcc/testsuite/gfortran.dg/dec_structure_26.f  | 22 +++
 gcc/testsuite/gfortran.dg/dec_structure_27.f  | 20 +++
 gcc/testsuite/gfortran.dg/dec_type_print_3.f90| 29 ++
 gcc/testsuite/gfortran.dg/init_flag_20.f90| 62 
 13 files changed, 320 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/array_temporaries_5.f90
 create mode 100644 gcc/testsuite/gfortran.dg/dec_bitwise_ops_3.f90
 create mode 100644 gcc/testsuite/gfortran.dg/dec_d_lines_3.f
 create mode 100644 gcc/testsuite/gfortran.dg/dec_exp_4.f90
 create mode 100644 

Re: [PATCH] detect attribute mismatches in alias declarations (PR 81824)

2018-11-07 Thread Jeff Law
On 10/23/18 7:50 PM, Martin Sebor wrote:
> On 10/23/2018 03:53 PM, Joseph Myers wrote:
>> On Mon, 22 Oct 2018, Martin Sebor wrote:
>>
>>> between aliases and ifunc resolvers.  With -Wattribute-alias=1
>>> that reduced the number of unique instances of the warnings for
>>> a Glibc build to just 27.  Of those, all but one of
>>> the -Wattributes instances are of the form:
>>>
>>>   warning: ‘leaf’ attribute has no effect on unit local functions
>>
>> What do the macro expansions look like there?  All the places where you're
>> adding "copy" attributes are for extern declarations, not static ones,
>> whereas your list of warnings seems to indicate this is appearing for
>> ifunc resolvers (which are static, but should not be copying attributes
>> from anywhere).
> 
> These must have been caused by the bug in the patch (below).
> They have cleared up with it fixed.  I'm down to just 18
> instances of a -Wmissing-attributes warning, all for string
> functions.  The cause of those is described below.
> 
>>
>>> All the -Wmissing-attributes instances are due to a missing
>>> nonnull attribute on the __EI__ kinds of functions, like:
>>>
>>>   warning: ‘__EI_vfprintf’ specifies less restrictive attribute than its
>>> target ‘vfprintf’: ‘nonnull’
>>
>> That looks like a bug in the GCC patch to me; you appear to be adding copy
>> attributes in the correct place.  Note that __EI_* gets declared twice
>> (first with __asm__, second with an alias attribute), so anything related
>> to handling of such duplicate declarations might be a cause for such a
>> bug (and an indication of what you need to add a test for when fixing such
>> a bug).
> 
> There was a bug in the patch, but there is also an issue in Glibc
> that made it tricky to see the problem.
> 
> The tests I had in place were too simple to catch the GCC bug:
> the problem there was that when the decl didn't have an attribute
> that the type of the "template" did, the check would fail without
> also considering the decl's type.  Tricky stuff!  I've added tests
> to exercise this.
> 
> The Glibc issue has to do with the use of __hidden_ver1 macro
> to declare string functions.  sysdeps/x86_64/multiarch/strcmp.c
> for instance has:
> 
>   __hidden_ver1 (strcmp, __GI_strcmp, __redirect_strcmp)
> __attribute__ ((visibility ("hidden")));
> 
> and __redirect_strcmp is missing the nonnull attribute because
> it's #undefined in include/sys/cdefs.h.  An example of one of
> these warnings is attached.
> 
> Using strcmp instead of __redirect_strcmp would solve this but
> __redirect_strcmp should have all the same attributes as strcmp.
> But nonnull is removed from the declaration because the __nonnull
> macro that controls it is undefined in include/sys/cdefs.h.  There
> is a comment above the #undef in the header that reads:
> 
> /* The compiler will optimize based on the knowledge the parameter is
>    not NULL.  This will omit tests.  A robust implementation cannot allow
>    this so when compiling glibc itself we ignore this attribute.  */
> # undef __nonnull
> # define __nonnull(params)
> 
> I don't think this is actually true for recent versions of GCC.
> The nonnull optimization is controlled by
> -fisolate-erroneous-paths-attribute and according to the manual
> and common.opt the option is disabled by default.
> 
> But if you do want to avoid the attribute on declarations of
> these functions regardless it should be safe to add it after
> the declaration in the .c file, like so:
> 
> __hidden_ver1 (strcmp, __GI_strcmp, __redirect_strcmp)
>   __attribute__ ((visibility ("hidden"), copy (strcmp)));
> 
> That should make it straightforward to adopt the enhancement
> and experiment with -Wattribute-alias=2 to see if it does what
> you had  in mind.
> 
> The latest GCC patch with the fix mentioned above is attached.
> 
> Martin
> 
> gcc-81824.diff
> 
> PR middle-end/81824 - Warn for missing attributes with function aliases
> 
> gcc/c-family/ChangeLog:
> 
>   PR middle-end/81824
>   * c-attribs.c (handle_copy_attribute_impl): New function.
>   (handle_copy_attribute): Same.
> 
> gcc/cp/ChangeLog:
> 
>   PR middle-end/81824
>   * pt.c (warn_spec_missing_attributes): Move code to attribs.c.
>   Call decls_mismatched_attributes.
> 
> gcc/ChangeLog:
> 
>   PR middle-end/81824
>   * attribs.c (has_attribute): New helper function.
>   (decls_mismatched_attributes, maybe_diag_alias_attributes): Same.
>   * attribs.h (decls_mismatched_attributes): Declare.
>   * cgraphunit.c (handle_alias_pairs): Call maybe_diag_alias_attributes.
>   (maybe_diag_incompatible_alias): Use OPT_Wattribute_alias_.
>   * common.opt (-Wattribute-alias): Take an argument.
>   (-Wno-attribute-alias): New option.
>   * doc/extend.texi (Common Function Attributes): Document copy.
>   (Common Variable Attributes): Same.
>   * doc/invoke.texi (-Wmissing-attributes): Document enhancement.
>   (-Wattribute-alias): Document new option 

Re: [PATCH] Fix PR87691: transparent_union attribute does not work with MODE_PARTIAL_INT

2018-11-07 Thread Marek Polacek
On Tue, Oct 23, 2018 at 08:49:26PM +0100, Jozef Lawrynowicz wrote:
> msp430-elf uses the partial int type __int20 for pointers in the large memory
> model. __int20 has PSImode, with bitsize of 20.
> 
> A few DejaGNU tests fail when built with -mlarge for msp430-elf, when
> transparent unions are used containing pointers.
> These are:
> - gcc.c-torture/compile/pr34885.c
> - gcc.dg/transparent-union-{1,2,3,4,5}.c
> 
> The issue is that the union is considered to have size of 32 bits (the
> in-memory size of __int20), so unless mode_for_size as called by
> compute_record_mode (both in stor-layout.c) is explicitly told to look for a
> mode of class MODE_PARTIAL_INT, then a size of 32 will always return MODE_INT.
> In this case, the union will have TYPE_MODE of SImode, but its field is
> PSImode, so transparent_union has no effect.
> 
> The attached patch fixes the issue by allowing the TYPE_MODE of a union to be
> set to the DECL_MODE of the widest field, if the mode is of class
> MODE_PARTIAL_INT and the union would be passed by reference.
> 
> Some target ABIs mandate that unions be passed in integer registers, so to
> avoid any potential ABI violations, the mode of the union is only changed if
> it would be passed by reference.
> 
> Successfully bootstrapped and regtested trunk for x86_64-pc-linux-gnu, and
> msp430-elf with -mlarge. For msp430-elf with -mlarge, the above DejaGNU tests
> are also fixed.
> 
> Ok for trunk?
> 

> From cc1ccfcc0d8adf7b0e1ca95a47a8a8e7e12fc99c Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 22 Oct 2018 21:02:10 +0100
> Subject: [PATCH] Allow union TYPE_MODE to be set to the mode of the widest
>  element if the union would be passed by reference
> 
> 2018-10-23  Jozef Lawrynowicz  
> 
>   PR c/87691
>   * gcc/stor-layout.c (compute_record_mode): Set TYPE_MODE of UNION_TYPE
>   to the mode of the widest field iff the widest field has mode class
>   MODE_INT, or MODE_PARTIAL_INT and the union would be passed by
>   reference.
>   * gcc/testsuite/gcc.target/msp430/pr87691.c: New test.

I'll just point out that you should drop the gcc/ and gcc/testsuite/ prefixes;
the first entry will go to gcc/ChangeLog while the pr87691.c one to
gcc/testsuite/ChangeLog.

Marek


[PATCH] [aarch64] Correct the maximum shift amount for shifted operands.

2018-11-07 Thread christoph . muellner
From: Christoph Muellner 

The aarch64 ISA specification allows a left shift amount to be applied
after extension in the range of 0 to 4 (encoded in the imm3 field).

This is true for at least the following instructions:

 * ADD (extended register)
 * ADDS (extended register)
 * SUB (extended register)

The result of this patch can be seen when compiling the following code:

uint64_t myadd(uint64_t a, uint64_t b)
{
  return a+(((uint8_t)b)<<4);
}

Without the patch the following sequence will be generated:

0000000000000000 <myadd>:
   0:   d37c1c21        ubfiz   x1, x1, #4, #8
   4:   8b000020        add     x0, x1, x0
   8:   d65f03c0        ret

With the patch the ubfiz will be merged into the add instruction:

0000000000000000 <myadd>:
   0:   8b211000        add     x0, x0, w1, uxtb #4
   4:   d65f03c0        ret
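
For reference, the arithmetic performed by the merged instruction can be written out in plain C. This is only a sketch: add_uxtb_shifted is a hypothetical helper, not part of the patch, and it models "add x0, x0, w1, uxtb #n" for the shift amounts 0 to 4 described above.

```c
#include <stdint.h>

/* Hypothetical model of ADD (extended register) with UXTB:
   zero-extend the low byte of b, shift it left, add it to a.  */
static uint64_t
add_uxtb_shifted (uint64_t a, uint64_t b, unsigned shift)
{
  uint64_t ext = (uint64_t) (uint8_t) b;
  return a + (ext << shift);
}

/* The function from the mail, with the include it needs.  */
static uint64_t
myadd (uint64_t a, uint64_t b)
{
  return a + (((uint8_t) b) << 4);
}
```

Both functions should agree for a shift of 4, which is the case the patch newly allows to be folded into the add.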

*** gcc/ChangeLog ***

2018-xx-xx  Christoph Muellner 

* gcc/config/aarch64/aarch64.c: Correct the maximum shift amount
for shifted operands.
* gcc/testsuite/gcc.target/aarch64/extend.c: Adjust the
testcases to cover the changed shift amount.

Signed-off-by: Christoph Muellner 
Signed-off-by: Philipp Tomsich 
---
 gcc/config/aarch64/aarch64.c  |  2 +-
 gcc/testsuite/gcc.target/aarch64/extend.c | 16 
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c82c7b6..c85988a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8190,7 +8190,7 @@ aarch64_output_casesi (rtx *operands)
 int
 aarch64_uxt_size (int shift, HOST_WIDE_INT mask)
 {
-  if (shift >= 0 && shift <= 3)
+  if (shift >= 0 && shift <= 4)
 {
   int size;
   for (size = 8; size <= 32; size *= 2)
diff --git a/gcc/testsuite/gcc.target/aarch64/extend.c b/gcc/testsuite/gcc.target/aarch64/extend.c
index f399e55..7986c5b 100644
--- a/gcc/testsuite/gcc.target/aarch64/extend.c
+++ b/gcc/testsuite/gcc.target/aarch64/extend.c
@@ -32,8 +32,8 @@ ldr_sxtw0 (char *arr, int i)
 unsigned long long
 adddi_uxtw (unsigned long long a, unsigned int i)
 {
-  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?3" } } */
-  return a + ((unsigned long long)i << 3);
+  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*uxtw #?4" } } */
+  return a + ((unsigned long long)i << 4);
 }
 
 unsigned long long
@@ -46,8 +46,8 @@ adddi_uxtw0 (unsigned long long a, unsigned int i)
 long long
 adddi_sxtw (long long a, int i)
 {
-  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*sxtw #?3" } } */
-  return a + ((long long)i << 3);
+  /* { dg-final { scan-assembler "add\tx\[0-9\]+,.*sxtw #?4" } } */
+  return a + ((long long)i << 4);
 }
 
 long long
@@ -60,8 +60,8 @@ adddi_sxtw0 (long long a, int i)
 unsigned long long
 subdi_uxtw (unsigned long long a, unsigned int i)
 {
-  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*uxtw #?3" } } */
-  return a - ((unsigned long long)i << 3);
+  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*uxtw #?4" } } */
+  return a - ((unsigned long long)i << 4);
 }
 
 unsigned long long
@@ -74,8 +74,8 @@ subdi_uxtw0 (unsigned long long a, unsigned int i)
 long long
 subdi_sxtw (long long a, int i)
 {
-  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*sxtw #?3" } } */
-  return a - ((long long)i << 3);
+  /* { dg-final { scan-assembler "sub\tx\[0-9\]+,.*sxtw #?4" } } */
+  return a - ((long long)i << 4);
 }
 
 long long
-- 
2.9.5
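
As a rough illustration of what the changed bound affects, here is a standalone sketch of the kind of check aarch64_uxt_size appears to make. The hunk above only shows the start of the function, so the mask comparison inside the loop is an assumption, and uxt_size_sketch is a hypothetical name that uses int64_t in place of HOST_WIDE_INT.

```c
#include <stdint.h>

/* Sketch: return the width (8, 16 or 32) of the zero-extended field
   if MASK is exactly that many low bits shifted left by SHIFT, and
   0 otherwise.  The upper bound on SHIFT is the value the patch
   raises from 3 to 4.  */
static int
uxt_size_sketch (int shift, int64_t mask)
{
  if (shift >= 0 && shift <= 4)
    {
      int size;
      for (size = 8; size <= 32; size *= 2)
        {
          /* All-ones field of SIZE bits.  */
          int64_t bits = ((int64_t) 1 << size) - 1;
          if (mask == bits << shift)
            return size;
        }
    }
  return 0;
}
```

With the old bound of 3, a call such as uxt_size_sketch (4, 0xff0) would have returned 0, which matches the separate ubfiz seen in the "without the patch" listing.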



Re: [PATCH] Fix PR87691: transparent_union attribute does not work with MODE_PARTIAL_INT

2018-11-07 Thread Jeff Law
On 10/23/18 1:49 PM, Jozef Lawrynowicz wrote:
> msp430-elf uses the partial int type __int20 for pointers in the large memory
> model. __int20 has PSImode, with bitsize of 20.
> 
> A few DejaGNU tests fail when built with -mlarge for msp430-elf, when
> transparent unions are used containing pointers.
> These are:
> - gcc.c-torture/compile/pr34885.c
> - gcc.dg/transparent-union-{1,2,3,4,5}.c
> 
> The issue is that the union is considered to have size of 32 bits (the
> in-memory size of __int20), so unless mode_for_size as called by
> compute_record_mode (both in stor-layout.c) is explicitly told to look for a
> mode of class MODE_PARTIAL_INT, then a size of 32 will always return MODE_INT.
> In this case, the union will have TYPE_MODE of SImode, but its field is
> PSImode, so transparent_union has no effect.
> 
> The attached patch fixes the issue by allowing the TYPE_MODE of a union to be
> set to the DECL_MODE of the widest field, if the mode is of class
> MODE_PARTIAL_INT and the union would be passed by reference.
> 
> Some target ABIs mandate that unions be passed in integer registers, so to
> avoid any potential ABI violations, the mode of the union is only changed if
> it would be passed by reference.
> 
> Successfully bootstrapped and regtested trunk for x86_64-pc-linux-gnu, and
> msp430-elf with -mlarge. For msp430-elf with -mlarge, the above DejaGNU tests
> are also fixed.
> 
> Ok for trunk?
> 
> 
> 0001-Allow-union-TYPE_MODE-to-be-set-to-the-mode-of-the-w.patch
> 
> From cc1ccfcc0d8adf7b0e1ca95a47a8a8e7e12fc99c Mon Sep 17 00:00:00 2001
> From: Jozef Lawrynowicz 
> Date: Mon, 22 Oct 2018 21:02:10 +0100
> Subject: [PATCH] Allow union TYPE_MODE to be set to the mode of the widest
>  element if the union would be passed by reference
> 
> 2018-10-23  Jozef Lawrynowicz  
> 
>   PR c/87691
>   * gcc/stor-layout.c (compute_record_mode): Set TYPE_MODE of UNION_TYPE
>   to the mode of the widest field iff the widest field has mode class
>   MODE_INT, or MODE_PARTIAL_INT and the union would be passed by
>   reference.
>   * gcc/testsuite/gcc.target/msp430/pr87691.c: New test.
OK.  Sorry for the delay.

jeff


Re: [PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-07 Thread Jeff Law
On 10/20/18 6:01 PM, Martin Sebor wrote:


> 
> The warning only triggers when the bound is less than or equal
> to the length of the constant source string (i.e, when strncpy
> truncates).  So IIUC, your suggestion would defer folding only
> such strncpy calls and let gimple_fold_builtin_strncpy fold
> those with a constant bound that's greater than the length of
> the constant source string.  That would be fine with me, but
> since strncpy calls with a bound that's greater than the length
> of the source are pointless I don't think they are important
> enough to worry about folding super early.  The constant ones
> that serve any purpose (and that are presumably important to
> optimize) are those that truncate.
I was focused exclusively on the case where we have to look for a
subsequent statement that handled termination.  The idea was to only
leave in the cases that we might need to warn for because we couldn't
search subsequent statements for the termination.
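
To make the case under discussion concrete, here is a minimal sketch of a truncating strncpy from a constant string whose terminator is supplied by the very next statement. copy_prefix is a hypothetical function, not one from the PR or the testsuite, and it assumes dst_size is at least 1.

```c
#include <string.h>

/* Copy at most dst_size - 1 characters of a constant string.
   When the bound is not larger than the source length, strncpy
   truncates and writes no terminating NUL.  */
static void
copy_prefix (char *dst, size_t dst_size)
{
  strncpy (dst, "0123456789", dst_size - 1);
  /* The following statement handles termination; this is the
     statement a warning pass would have to look for.  */
  dst[dst_size - 1] = '\0';
}
```

If the call is folded before the following statement is visible, there is nothing to tell this pattern apart from a genuinely unterminated copy.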

Splitting up was primarily meant to get the warning out of the folder
with a minimal impact on code generation.  But if the common case would
result in deferral of folding, then I'd fully expect Richi to object.

> 
> That said, when optimization isn't enabled, I don't think users
> expect calls to library functions to be transformed to calls to
> other  functions, or inlined.  Yet that's just what GCC does.
> For example, besides triggering the warning, the following:
I don't think we should drag this into the issue at hand.  Though I do
generally agree that folding this stuff into low level memory operations
is not what most would expect at -O0.


Jeff


Re: [PATCH] Add option to control warnings added through attribute "warning"

2018-11-07 Thread Jeff Law
On 10/15/18 9:21 AM, Nikolai Merinov wrote:
> Hi Martin,
> 
> On 10/15/18 6:20 PM, Martin Sebor wrote:
>> On 10/15/2018 01:55 AM, Nikolai Merinov wrote:
>>> Hi Martin,
>>>
>>> On 10/12/18 9:58 PM, Martin Sebor wrote:
>>>> On 10/12/2018 04:14 AM, Nikolai Merinov wrote:
>>>>> Hello,
>>>>>
>>>>> In https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01795.html mail I
>>>>> suggested patch to have ability to control behavior of
>>>>> "__attribute__((warning))" in case when option "-Werror" enabled.
>>>>> Usage
>>>>> example:
>>>>>
>>>>>> #include 
>>>>>> int a() __attribute__((warning("Warning: `a' was used")));
>>>>>> int a() { return 1; }
>>>>>> int main () { return a(); }
>>>>>
>>>>>> $ gcc -Werror test.c
>>>>>> test.c: In function ‘main’:
>>>>>> test.c:4:22: error: call to ‘a’ declared with attribute warning:
>>>>>> Warning: `a' was used [-Werror]
>>>>>>  int main () { return a(); }
>>>>>>   ^
>>>>>> cc1: all warnings being treated as errors
>>>>>> $ gcc -Werror -Wno-error=warning-attribute test.c
>>>>>> test.c: In function ‘main’:
>>>>>> test.c:4:22: warning: call to ‘a’ declared with attribute warning:
>>>>>> Warning: `a' was used
>>>>>>  int main () { return a(); }
>>>>>>   ^
>>>>> Can you provide any feedback on suggested changes?
>>>>
>>>> It seems like a useful feature and in line with the philosophy
>>>> that distinct warnings should be controlled by their own options.
>>>>
>>>> I would only suggest to consider changing the name to
>>>> -Wattribute-warning, because it applies specifically to that
>>>> attribute (as opposed to warnings about attributes in general).
>>>>
>>>> There are many attributes in GCC and diagnosing problems that
>>>> are unique to each, under the same -Wattributes option, is
>>>> becoming too coarse and overly limiting.  To make it more
>>>> flexible, I expect new options will need to be introduced,
>>>> such as -Wattribute-alias (to control aspects of the alias
>>>> attribute and others related to it), or -Wattribute-const
>>>> (to control diagnostics about functions declared with
>>>> attribute const that violate the attribute's constraints).
>>>>
>>>> An alternative might be to introduce a single -Wattribute=
>>>>  option where the  gives
>>>> the names of all the distinct attributes whose unique
>>>> diagnostics one might need to control.
>>>>
>>>> Martin
>>>
>>> Currently there is several styles already in use:
>>>
>>> -Wattribute-alias where "attribute" word used as prefix for name of
>>> attribute,
>>> -Wsuggest-attribute=[pure|const|noreturn|format|malloc] where name of
>>> attribute passed as possible argument,
>>> -Wmissing-format-attribute where "attribute" word used as suffix,
>>> -Wdeprecated-declarations where "attribute" word not used at all even
>>> if this warning option was created especially for "deprecated"
>>> attribute.
>>>
>>> I changed name to "-Wattribute-warning" as you suggested, but
>>> unifying style for all attribute related warning looks like separate
>>> activity. Please check new patch in attachments.
>>>
>>
>> Thanks for survey!  I agree that making the existing options
>> consistent (if that's what we want) should be done separately.
>>
>> Martin
>>
>> PS It doesn't look like your latest attachments made it to
>> the list.
>>
> Thank you for mentioning. There was my mistake. Now it's attached
>>
>>> Updated changelog:
>>>
>>> gcc/Changelog
>>>
>>> 2018-10-14  Nikolai Merinov 
>>>
>>>  * gcc/common.opt: Add -Wattribute-warning.
>>>  * gcc/doc/invoke.texi: Add documentation for
>>> -Wno-attribute-warning.
>>>  * gcc/testsuite/gcc.dg/Wno-attribute-warning.c: New test.
>>>  * gcc/expr.c (expand_expr_real_1): Add new attribute to
>>> warning_at
>>>  call to allow user configure behavior of "warning" attribute
I split up the ChangeLog and fixed a very minor whitespace issue in the
docs and installed the patch.

I went round and round over whether or not to change the doc text.  It
discusses the attribute and the warning and it's easy to mix up the two.
 But ultimately I decided not to change it.

Thanks and sorry for the long delays.

Jeff
> 



[gomp5] Merge from trunk

2018-11-07 Thread Jakub Jelinek
Hi!

I've merged trunk into gomp-5_0-branch.  atomic-5.C testcase needed some
adjustments for recent C++ FE changes and the taskloop-reduction-1.c
testcase wasn't correct for 32-bit targets.

Tested on x86_64-linux and on i686-linux (the latter libgomp only),
committed to gomp-5_0-branch.

2018-11-07  Jakub Jelinek  

* g++.dg/gomp/atomic-5.C (f1): Adjust expected lines of read-only
variable messages.

* testsuite/libgomp.c-c++-common/taskloop-reduction-1.c (S): Change
type of s and t members from unsigned long int to
unsigned long long int.

--- gcc/testsuite/g++.dg/gomp/atomic-5.C(revision 265885)
+++ gcc/testsuite/g++.dg/gomp/atomic-5.C(working copy)
@@ -12,12 +12,12 @@ void f1(void)
 x = x + 1;
   #pragma omp atomic
 x = 1; /* { dg-error "invalid form" } */
-  #pragma omp atomic
+  #pragma omp atomic   /* { dg-error "read-only variable" } */
 ++y;   /* { dg-error "read-only variable" } */
-  #pragma omp atomic
+  #pragma omp atomic   /* { dg-error "read-only variable" } */
 y--;   /* { dg-error "read-only variable" } */
-  #pragma omp atomic
-y += 1;/* { dg-error "read-only variable" } */
+  #pragma omp atomic   /* { dg-error "read-only variable" } */
+y += 1;
   #pragma omp atomic
 bar(); /* { dg-error "invalid operator" } */
   #pragma omp atomic
--- libgomp/testsuite/libgomp.c-c++-common/taskloop-reduction-1.c   (revision 265885)
+++ libgomp/testsuite/libgomp.c-c++-common/taskloop-reduction-1.c   (working copy)
@@ -4,7 +4,7 @@ extern
 #endif
 void abort (void);
 
-struct S { unsigned long int s, t; };
+struct S { unsigned long long int s, t; };
 
 void
 rbar (struct S *p, struct S *o)

Jakub


Re: Free TYPE_VALUES of enums

2018-11-07 Thread Bernhard Reutner-Fischer
On Wed, 7 Nov 2018 14:09:24 +0100
Richard Biener  wrote:

> On Wed, Nov 7, 2018 at 1:34 PM Jan Hubicka  wrote:

> > Bootstrapped/regtested x86_64-linux, will commit it after
> > lto-bootstrapping unless there are complaints.  

> > +/* Save some WPA->ltrans streaming by freeing enum values.  */
> > +
> > +static void
> > +free_enum_values ()
> > +{
> > +  static bool enum_values_freed = false;
> > +  if (enum_values_freed || !flag_wpa || !odr_types_ptr)
> > +return;
> > +  enum_values_freed = true;
[]
> > +  enum_values_freed = true;
> >  }

Short of a "verytrue" choice for boolean, I think setting it to true
once should be enough though.

cheers,


Re: [PATCH, OpenACC] Properly handle wait clause with no arguments

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 08:13:29PM +0100, Thomas Schwinge wrote:
> Isn't that sufficient for the ABI compatibility that we promise, which is
> (unless I'm confused now?) that old (existing) executables continue to
> run correctly when dynamically linking against a new libgomp.  Or do we
> also have to care about the case that an executable built with a new
> version of GCC has to work when dynamically linked against an old
> libgomp?

Only old executables/libraries need to continue running correctly when
linking against new libgomp.  New programs against old libgomp might work,
or might not.

Jakub


Re: [PATCH, OpenACC] Properly handle wait clause with no arguments

2018-11-07 Thread Thomas Schwinge
Hi Chung-Lin!

On Thu, 30 Aug 2018 21:27:22 +0800, Chung-Lin Tang wrote:
> Hi, this patch properly handles OpenACC 'wait' clauses without arguments,
> making it an equivalent of "wait all".

Thanks!

> (current trunk basically discards and ignores such argument-less wait
> clauses)

Bugs should be filed, for later reference.  Now done:
 "OpenACC wait clauses without
async-arguments".  (I couldn't put you in CC because "clt...@gcc.gnu.org
did not match anything"?)

> This adds additional handling in the pack/unpack of the wait argument
> across the compiler/libgomp interface, but is done in a manner that
> doesn't affect binary compatibility.

Hmm.  See below.  (Jakub, could you please review the last paragraph of
this email?)

> This patch was part of the OpenACC async re-work that was done on the gomp4 
> branch (later merged to OG7/OG8), see [1].
> I'm separating this part out and submitting it first because it's logically 
> independent.
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01842.html

Thanks for splitting it out!

> Re-tested with offloading to ensure no regressions, is this okay for trunk?

A few comments.

No test cases included.  I'm working on a few, will post/commit later.

>  gcc/c/
>  * c-parser.c (c_parser_oacc_clause_wait): Add representation of wait
>  clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust 
> comments.
> 
>  gcc/cp/
>  * parser.c (cp_parser_oacc_clause_wait): Add representation of wait
>  clause without argument as 'wait (GOMP_ASYNC_NOVAL)', adjust 
> comments.
> 
>  gcc/fortran/
>  * trans-openmp.c (gfc_trans_omp_clauses_1): Add representation of 
> wait
>  clause without argument as 'wait (GOMP_ASYNC_NOVAL)'.
> 
>  gcc/
>  * omp-low.c (expand_omp_target): Add middle-end support for handling
>  OMP_CLAUSE_WAIT clause with a GOMP_ASYNC_NOVAL(-1) as the argument.
> 
>  include/
>  * gomp-constants.h (GOMP_LAUNCH_OP_MASK): Define.
>  (GOMP_LAUNCH_PACK): Add bitwise-and of GOMP_LAUNCH_OP_MASK.
>  (GOMP_LAUNCH_OP): Likewise.
> 
>  libgomp/
>  * oacc-parallel.c (GOACC_parallel_keyed): Interpret launch op as
>  signed 16-bit field, adjust num_waits handling.
>  (GOACC_enter_exit_data): Adjust num_waits handling.
>  (GOACC_update): Adjust num_waits handling.
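
To illustrate why masking the op field and reading it back as a signed 16-bit value matters for GOMP_ASYNC_NOVAL (-1), here is a standalone sketch. All SKETCH_ names and both functions are hypothetical; the shift positions 28 and 16 are assumptions chosen for the example and are not copied from gomp-constants.h. Only the 16-bit op field and the value -1 come from the ChangeLog above.

```c
#include <stdint.h>

/* Assumed layout, for illustration only: code in the top bits,
   device in the middle, op in the low 16 bits.  */
#define SKETCH_CODE_SHIFT   28
#define SKETCH_DEVICE_SHIFT 16
#define SKETCH_OP_MASK      0xffffu
#define SKETCH_ASYNC_NOVAL  (-1)

/* Pack the three fields.  Without the mask, a negative op would
   set every higher bit and corrupt the code and device fields.  */
static uint32_t
sketch_launch_pack (unsigned code, unsigned device, int op)
{
  return ((uint32_t) code << SKETCH_CODE_SHIFT)
         | ((uint32_t) device << SKETCH_DEVICE_SHIFT)
         | ((uint32_t) op & SKETCH_OP_MASK);
}

/* Unpack the op field as a signed 16-bit quantity, so that
   0xffff reads back as -1 rather than 65535.  */
static int
sketch_launch_op (uint32_t tag)
{
  int v = (int) (tag & SKETCH_OP_MASK);
  return v >= 0x8000 ? v - 0x10000 : v;
}
```

Non-negative op values round-trip unchanged under this scheme.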

> --- gcc/c/c-parser.c  (revision 263981)
> +++ gcc/c/c-parser.c  (working copy)
> @@ -12719,7 +12719,7 @@ c_parser_oacc_clause_tile (c_parser *parser, tree
>  }
>  
>  /* OpenACC:
> -   wait ( int-expr-list ) */
> +   wait [( int-expr-list )] */
>  
>  static tree
>  c_parser_oacc_clause_wait (c_parser *parser, tree list)
> @@ -12728,7 +12728,15 @@ c_parser_oacc_clause_wait (c_parser *parser, tree
>  
>if (c_parser_peek_token (parser)->type == CPP_OPEN_PAREN)
>  list = c_parser_oacc_wait_list (parser, clause_loc, list);
> +  else
> +{
> +  tree c = build_omp_clause (clause_loc, OMP_CLAUSE_WAIT);
>  
> +  OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, 
> GOMP_ASYNC_NOVAL);
> +  OMP_CLAUSE_CHAIN (c) = list;
> +  list = c;
> +}
> +
>return list;
>  }

ACK.

> --- gcc/cp/parser.c   (revision 263981)
> +++ gcc/cp/parser.c   (working copy)
> @@ -32137,7 +32137,7 @@ cp_parser_oacc_wait_list (cp_parser *parser, locat
>  }
>  
>  /* OpenACC:
> -   wait ( int-expr-list ) */
> +   wait [( int-expr-list )] */
>  
>  static tree
>  cp_parser_oacc_clause_wait (cp_parser *parser, tree list)
> @@ -32144,10 +32144,16 @@ cp_parser_oacc_clause_wait (cp_parser *parser, tre
>  {
>location_t location = cp_lexer_peek_token (parser->lexer)->location;
>  
> -  if (cp_lexer_peek_token (parser->lexer)->type != CPP_OPEN_PAREN)
> -return list;
> +  if (cp_lexer_peek_token (parser->lexer)->type == CPP_OPEN_PAREN)
> +list = cp_parser_oacc_wait_list (parser, location, list);
> +  else
> +{
> +  tree c = build_omp_clause (location, OMP_CLAUSE_WAIT);
>  
> -  list = cp_parser_oacc_wait_list (parser, location, list);
> +  OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, 
> GOMP_ASYNC_NOVAL);
> +  OMP_CLAUSE_CHAIN (c) = list;
> +  list = c;
> +}
>  
>return list;
>  }

ACK.

> --- gcc/fortran/trans-openmp.c(revision 263981)
> +++ gcc/fortran/trans-openmp.c(working copy)
> @@ -2922,6 +2922,13 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp
> omp_clauses = c;
>   }
>  }
> +  else if (clauses->wait)
> +{
> +  c = build_omp_clause (where.lb->location, OMP_CLAUSE_WAIT);
> +  OMP_CLAUSE_DECL (c) = build_int_cst (integer_type_node, 
> GOMP_ASYNC_NOVAL);
> +  OMP_CLAUSE_CHAIN (c) = omp_clauses;
> +  omp_clauses = c;
> +}
>if (clauses->num_gangs_expr)
>  {
>tree num_gangs_var

NACK.  Instead let's do the following, similar to C, C++, and also
similar to Fortran's OpenACC async 

Re: [PATCH] Implement std::pmr::unsynchronized_pool_resource

2018-11-07 Thread Jonathan Wakely

On 07/11/18 19:55 +0100, Rainer Orth wrote:

> Hi Jonathan,
>
>> Implement std::pmr::unsynchronized_pool_resource
>> * config/abi/pre/gnu.ver: Add new symbols.
>> * include/std/memory_resource (std::pmr::__pool_resource): New class.
>> (std::pmr::unsynchronized_pool_resource): New class.
>> * src/c++17/Makefile.am: Add -fimplicit-templates to flags for
>> memory_resource.cc
>> * src/c++17/Makefile.in: Regenerate.
>> * src/c++17/memory_resource.cc (bitset, chunk, big_block): New
>> internal classes.
>> (__pool_resource::_Pool): Define new class.
>> (munge_options, pool_index, select_num_pools): New internal functions.
>> (__pool_resource::__pool_resource, __pool_resource::~__pool_resource)
>> (__pool_resource::allocate, __pool_resource::deallocate)
>> (__pool_resource::_M_alloc_pools): Define member functions.
>> (unsynchronized_pool_resource::unsynchronized_pool_resource)
>> (unsynchronized_pool_resource::~unsynchronized_pool_resource)
>> (unsynchronized_pool_resource::release)
>> (unsynchronized_pool_resource::_M_find_pool)
>> (unsynchronized_pool_resource::do_allocate)
>> (unsynchronized_pool_resource::do_deallocate): Define member
>> functions.
>> * testsuite/20_util/unsynchronized_pool_resource/allocate.cc: New
>> test.
>> * testsuite/20_util/unsynchronized_pool_resource/is_equal.cc: New
>> test.
>> * testsuite/20_util/unsynchronized_pool_resource/options.cc: New
>> test.
>> * testsuite/20_util/unsynchronized_pool_resource/release.cc: New
>> test.
>>
>> The new tests being added here are pretty minimal, because we can't
>> assume machines running the testsuite will be able to allocate large
>> amounts of memory. I've tested it more thoroughly with much larger
>> tests though, and will try to get some of them in shape for the
>> testsuite/performance/20_util directory.
>>
>> Tested powerpc64le-linux. Committed to trunk.
>
> two of the new tests FAIL on 32-bit targets (seen on
> i386-pc-solaris2.11, but there are other reports as well):
>
> +FAIL: 20_util/unsynchronized_pool_resource/allocate.cc (test for excess errors)
> +UNRESOLVED: 20_util/unsynchronized_pool_resource/allocate.cc compilation 
> failed to produce executable
>
> Excess errors:
> Undefined   first referenced
> symbol in file
> std::pmr::unsynchronized_pool_resource::do_deallocate(void*, unsigned int, 
> unsigned int) /var/tmp//ccUR6CSd.o
> std::pmr::unsynchronized_pool_resource::do_allocate(unsigned int, unsigned int) 
> /var/tmp//ccUR6CSd.o
> ld: fatal: symbol referencing errors
>
> +FAIL: 20_util/unsynchronized_pool_resource/release.cc (test for excess errors)
> +UNRESOLVED: 20_util/unsynchronized_pool_resource/release.cc compilation failed 
> to produce executable
>
> Excess errors:
> Undefined   first referenced
> symbol in file
> std::pmr::unsynchronized_pool_resource::do_allocate(unsigned int, unsigned int) 
> /var/tmp//ccrQoKEb.o
> ld: fatal: symbol referencing errors


Sorry about that, should be fixed by this patch. Committed to trunk.


commit f36c97256c4173516116f4c64c39c81ffec98d70
Author: Jonathan Wakely 
Date:   Wed Nov 7 19:07:26 2018 +0000

Fix linker script to use [jmy] to match size_t parameters

* config/abi/pre/gnu.ver: Fix patterns for size_t parameters.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index b55038b8845..9d66f908e1a 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2061,8 +2061,8 @@ GLIBCXX_3.4.26 {
 _ZNSt3pmr28unsynchronized_pool_resourceC[12]ERKNS_12pool_optionsEPNS_15memory_resourceE;
 _ZNSt3pmr28unsynchronized_pool_resourceD[12]Ev;
 _ZNSt3pmr28unsynchronized_pool_resource7releaseEv;
-_ZNSt3pmr28unsynchronized_pool_resource11do_allocateEmm;
-_ZNSt3pmr28unsynchronized_pool_resource13do_deallocateEPvmm;
+_ZNSt3pmr28unsynchronized_pool_resource11do_allocateE[jmy][jmy];
+_ZNSt3pmr28unsynchronized_pool_resource13do_deallocateEPv[jmy][jmy];
 
 } GLIBCXX_3.4.25;
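
For anyone wondering where [jmy] comes from: in the Itanium C++ ABI mangling, 'j' is unsigned int, 'm' is unsigned long and 'y' is unsigned long long, and size_t is a typedef for one of those depending on the target. The following C11 sketch picks the letter for the host it is compiled on and builds the corresponding do_allocate symbol. Both function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Return the Itanium ABI builtin-type code that size_t mangles to
   on this target, or '?' if it is none of the three usual types.  */
static char
size_t_mangling_code (void)
{
  return _Generic ((size_t) 0,
                   unsigned int: 'j',
                   unsigned long: 'm',
                   unsigned long long: 'y',
                   default: '?');
}

/* Write the mangled name of
   unsynchronized_pool_resource::do_allocate(size_t, size_t)
   for this target into buf; returns the snprintf result.  */
static int
mangled_do_allocate (char *buf, size_t len)
{
  char c = size_t_mangling_code ();
  return snprintf (buf, len,
                   "_ZNSt3pmr28unsynchronized_pool_resource11do_allocateE%c%c",
                   c, c);
}
```

On a target where size_t is unsigned int the name ends in "Ejj", which the old "Emm" pattern could not match, hence the undefined symbols reported for 32-bit targets.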
 


Re: [PATCH] Implement std::pmr::unsynchronized_pool_resource

2018-11-07 Thread Rainer Orth
Hi Jonathan,

>   Implement std::pmr::unsynchronized_pool_resource
>   * config/abi/pre/gnu.ver: Add new symbols.
>   * include/std/memory_resource (std::pmr::__pool_resource): New class.
>   (std::pmr::unsynchronized_pool_resource): New class.
>   * src/c++17/Makefile.am: Add -fimplicit-templates to flags for
>   memory_resource.cc
>   * src/c++17/Makefile.in: Regenerate.
>   * src/c++17/memory_resource.cc (bitset, chunk, big_block): New
>   internal classes.
>   (__pool_resource::_Pool): Define new class.
>   (munge_options, pool_index, select_num_pools): New internal functions.
>   (__pool_resource::__pool_resource, __pool_resource::~__pool_resource)
>   (__pool_resource::allocate, __pool_resource::deallocate)
>   (__pool_resource::_M_alloc_pools): Define member functions.
>   (unsynchronized_pool_resource::unsynchronized_pool_resource)
>   (unsynchronized_pool_resource::~unsynchronized_pool_resource)
>   (unsynchronized_pool_resource::release)
>   (unsynchronized_pool_resource::_M_find_pool)
>   (unsynchronized_pool_resource::do_allocate)
>   (unsynchronized_pool_resource::do_deallocate): Define member
>   functions.
>   * testsuite/20_util/unsynchronized_pool_resource/allocate.cc: New
>   test.
>   * testsuite/20_util/unsynchronized_pool_resource/is_equal.cc: New
>   test.
>   * testsuite/20_util/unsynchronized_pool_resource/options.cc: New
>   test.
>   * testsuite/20_util/unsynchronized_pool_resource/release.cc: New
>   test.
>
> The new tests being added here are pretty minimal, because we can't
> assume machines running the testsuite will be able to allocate large
> amounts of memory. I've tested it more thoroughly with much larger
> tests though, and will try to get some of them in shape for the
> testsuite/performance/20_util directory.
>
> Tested powerpc64le-linux. Committed to trunk.

two of the new tests FAIL on 32-bit targets (seen on
i386-pc-solaris2.11, but there are other reports as well):

+FAIL: 20_util/unsynchronized_pool_resource/allocate.cc (test for excess errors)
+UNRESOLVED: 20_util/unsynchronized_pool_resource/allocate.cc compilation 
failed to produce executable

Excess errors:
Undefined   first referenced
 symbol in file
std::pmr::unsynchronized_pool_resource::do_deallocate(void*, unsigned int, 
unsigned int) /var/tmp//ccUR6CSd.o
std::pmr::unsynchronized_pool_resource::do_allocate(unsigned int, unsigned int) 
/var/tmp//ccUR6CSd.o
ld: fatal: symbol referencing errors

+FAIL: 20_util/unsynchronized_pool_resource/release.cc (test for excess errors)
+UNRESOLVED: 20_util/unsynchronized_pool_resource/release.cc compilation failed 
to produce executable

Excess errors:
Undefined   first referenced
 symbol in file
std::pmr::unsynchronized_pool_resource::do_allocate(unsigned int, unsigned int) 
/var/tmp//ccrQoKEb.o
ld: fatal: symbol referencing errors

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Fix PR87906

2018-11-07 Thread Rainer Orth
Hi Richard,

> This adds a workaround for LTO decl merging prevailing a
> non-ultimate origin decl, breaking invariants of the middle-end.
> In the future (GCC 10) I hope to have DIE references here so
> this will not be an issue there anymore.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
>
> Richard.
>
> From ff035da8314ea8e0889b99bb338e67dd5dae455b Mon Sep 17 00:00:00 2001
> From: Richard Guenther 
> Date: Wed, 7 Nov 2018 08:56:52 +0100
> Subject: [PATCH] fix-pr87906
>
> 2018-11-07  Richard Biener  
>
>   PR lto/87906
>   * tree-streamer-in.c (lto_input_ts_block_tree_pointers): Fixup
>   BLOCK_ABSTRACT_ORIGIN to be the ultimate origin.
>
>   * g++.dg/lto/pr87906_0.C: New testcase.
>   * g++.dg/lto/pr87906_1.C: Likewise.
>
> diff --git a/gcc/testsuite/g++.dg/lto/pr87906_0.C b/gcc/testsuite/g++.dg/lto/pr87906_0.C
> new file mode 100644
> index 00000000000..08e7ed3ba07
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lto/pr87906_0.C
> @@ -0,0 +1,35 @@
> +// { dg-lto-do link }
> +// { dg-lto-options { { -O -fPIC -flto } } }
> +// { dg-extra-ld-options "-shared -nostdlib" }
> +
> +namespace com {
> +namespace sun {
> +namespace star {}
> +} // namespace sun
> +} // namespace com
> +namespace a = com::sun::star;
> +namespace com {
> +namespace sun {
> +namespace star {
> +namespace uno {

the new testcase FAILs on Solaris:

+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_0.o assemble,  -O -fPIC -flto 
+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o execute  -O -fPIC -flto 
+UNRESOLVED: g++.dg/lto/pr87906 cp_lto_pr87906_0.o-cp_lto_pr87906_1.o link  -O -fPIC -flto 
+FAIL: g++.dg/lto/pr87906 cp_lto_pr87906_1.o assemble,  -O -fPIC -flto 

/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11: error: expected identifier before numeric constant
/vol/gcc/src/hg/trunk/local/gcc/testsuite/g++.dg/lto/pr87906_0.C:6:11: error: expected unqualified-id before numeric constant

and several more due to the -Dsun default.  How about
sed -e 's/sun/moon/g' instead ;-)
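
The same clash can be reproduced in plain C on any host, which may help when checking a fix without a Solaris box. The sketch below emulates the predefined macro by hand (on a compiler that does not already define it) and then shows one way around it. struct sun and count_planets are hypothetical names for the illustration only.

```c
/* Emulate a compiler that predefines 'sun' to 1, as the -Dsun
   default mentioned above does.  */
#ifndef sun
#define sun 1
#endif

/* While the macro is active, 'struct sun' preprocesses to
   'struct 1', the same kind of "expected identifier before numeric
   constant" error as in the log.  Removing the macro first (or
   renaming the identifier) avoids it.  */
#undef sun

struct sun
{
  int planets;
};

static int
count_planets (void)
{
  struct sun s = { 8 };
  return s.planets;
}
```

Renaming the namespace in the testcase is the less intrusive option there, since it needs no target-specific preprocessor lines.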

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH], Remove power9 fusion support

2018-11-07 Thread Michael Meissner
On Mon, Nov 05, 2018 at 04:09:23PM -0600, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Fri, Nov 02, 2018 at 02:37:34PM -0400, Michael Meissner wrote:
> > This patch removes all of the so-called power9 fusion support for the GCC
> > compiler.  It leaves -mpower9-fusion as a deprecated switch in case somebody
> > used it (the switch was never documented).
> 
> As Mike Stump says, please just remove it.  The option was never documented,
> most likely zero people use it, and those that do shouldn't have and can
> easily adjust.
> 
> > [gcc]
> > 2018-11-02  Michael Meissner  
> > 
> > * config/rs6000/constraints.md (wF constraint): Only document the
> > wF constraint for power8 fusion.  Remove documentation for power9
> > fusion.
> 
> It wasn't documented as being anything for p8 before.  So that was wrong?

The switch wasn't documented.  In the constraint (which is what I'm changing
here), the constraint mentioned p9 fusion in the documentation string.

> > (rs6000_option_override_internal): Delete power9 fusion option
> > support.  If we do -mcpu=power8 -mtune=power9, turn off power8
> > fusion.
> 
> That doesn't sound right.  Either the -mcpu= or the -mtune= should turn
> it on, but neither should turn it off.  It sounds like you want -mtune
> to say whether fusion is enabled or not?  That sounds fine, but this
> should be implemented more directly (or more generically).

Ok, I will look at it.

> >  mpower9-fusion
> > -Target Undocumented Report Mask(P9_FUSION) Var(rs6000_isa_flags)
> > -Fuse certain operations together for better performance on power9.
> > +Target Undocumented Mask(P9_FUSION) Var(rs6000_isa_flags) Deprecated
> 
> Yeah just delete this please.

Ok.
 
> > @@ -1692,11 +1650,7 @@ (define_predicate "fusion_gpr_addis"
> >  return 0;
> >  
> >/* Power8 currently will only do the fusion if the top 11 bits of the 
> > addis
> > - value are all 1's or 0's.  Ignore this restriction if we are testing
> > - advanced fusion.  */
> > -  if (TARGET_P9_FUSION)
> > -return 1;
> > -
> > + value are all 1's or 0's.  */
> >return (IN_RANGE (value >> 16, -32, 31));
> >  })
> 
> I think this is top 12 bits equal, not 11, so [-16..15].

It is 11 bits; check section 12.1.12 in the power8 book IV.

addis(SI) first 11 bits must be all 0’s or all 1’s
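
The arithmetic behind the two readings can be checked mechanically. The sketch below treats the addis operand as a signed 16-bit immediate and tests, for every possible value, whether "top n bits all 0's or all 1's" is the same condition as a given signed range. The function names are hypothetical; nothing here comes from rs6000 sources, and it says nothing about what the hardware does, only which range corresponds to which bit count.

```c
#include <stdint.h>

/* Nonzero if the top n bits (1 <= n <= 16) of the 16-bit
   immediate are all 0's or all 1's.  */
static int
top_bits_uniform (int16_t imm, int n)
{
  unsigned top = (uint16_t) imm >> (16 - n);
  unsigned all_ones = (1u << n) - 1;
  return top == 0 || top == all_ones;
}

/* Plain inclusive range check, standing in for IN_RANGE.  */
static int
in_range (int value, int lo, int hi)
{
  return value >= lo && value <= hi;
}

/* Return 1 if, over all 16-bit immediates, the n-bit rule accepts
   exactly the values in [lo, hi]; return 0 otherwise.  */
static int
rule_matches_range (int n, int lo, int hi)
{
  for (int imm = -32768; imm <= 32767; imm++)
    if (top_bits_uniform ((int16_t) imm, n) != in_range (imm, lo, hi))
      return 0;
  return 1;
}
```

Eleven uniform top bits leave five free low bits plus the sign, giving [-32, 31] as in the predicate; twelve uniform bits would give [-16, 15].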

> > @@ -1762,14 +1718,13 @@ (define_predicate "fusion_gpr_mem_load"
> >  ;; Match a GPR load (lbz, lhz, lwz, ld) that uses a combined address in the
> >  ;; memory field with both the addis and the memory offset.  Sign extension
> >  ;; is not handled here, since lha and lwa are not fused.
> > -;; With P9 fusion, also match a fpr/vector load and float_extend
> >  (define_predicate "fusion_addis_mem_combo_load"
> >(match_code "mem,zero_extend,float_extend")
> 
> So float_extend should be deleted here?

Yes.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



[PATCH, arm] Backport -- Fix ICE during thunk generation with -mlong-calls

2018-11-07 Thread Mihail Ionescu

Hi All,

This is a backport from trunk for GCC 8 and 7.

SVN revision: r264595.

Regression tested on arm-none-eabi.


gcc/ChangeLog

2018-11-02  Mihail Ionescu  

Backport from mainline
2018-09-26  Eric Botcazou  

* config/arm/arm.c (arm_reorg): Skip Thumb reorg pass for thunks.
(arm32_output_mi_thunk): Deal with long calls.

gcc/testsuite/ChangeLog

2018-11-02  Mihail Ionescu  

Backport from mainline
2018-09-17  Eric Botcazou  

* g++.dg/other/thunk2a.C: New test.
* g++.dg/other/thunk2b.C: Likewise.


If everything is ok, could someone commit it on my behalf?

Best regards,
   Mihail
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 2ece668219f3ca34883cd882431b0a3c390d4d3c..c68311e0fa192c350d03eb2dd37eca92ae7b3cfa 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17663,7 +17663,11 @@ arm_reorg (void)
 
   if (use_cmse)
 cmse_nonsecure_call_clear_caller_saved ();
-  if (TARGET_THUMB1)
+
+  /* We cannot run the Thumb passes for thunks because there is no CFG.  */
+  if (cfun->is_thunk)
+;
+  else if (TARGET_THUMB1)
 thumb1_reorg ();
   else if (TARGET_THUMB2)
 thumb2_reorg ();
@@ -26737,6 +26741,8 @@ static void
 arm32_output_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
   HOST_WIDE_INT vcall_offset, tree function)
 {
+  const bool long_call_p = arm_is_long_call_p (function);
+
   /* On ARM, this_regno is R0 or R1 depending on
  whether the function returns an aggregate or not.
   */
@@ -26774,9 +26780,22 @@ arm32_output_mi_thunk (FILE *file, tree, HOST_WIDE_INT delta,
   TREE_USED (function) = 1;
 }
   rtx funexp = XEXP (DECL_RTL (function), 0);
+  if (long_call_p)
+{
+  emit_move_insn (temp, funexp);
+  funexp = temp;
+}
   funexp = gen_rtx_MEM (FUNCTION_MODE, funexp);
-  rtx_insn * insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
+  rtx_insn *insn = emit_call_insn (gen_sibcall (funexp, const0_rtx, NULL_RTX));
   SIBLING_CALL_P (insn) = 1;
+  emit_barrier ();
+
+  /* Indirect calls require a bit of fixup in PIC mode.  */
+  if (long_call_p)
+{
+  split_all_insns_noflow ();
+  arm_reorg ();
+}
 
   insn = get_insns ();
   shorten_branches (insn);
diff --git a/gcc/testsuite/g++.dg/other/thunk2a.C 
b/gcc/testsuite/g++.dg/other/thunk2a.C
new file mode 100644
index 
..8e5ebd4960df758fa77ff08b019e104870f36b45
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/thunk2a.C
@@ -0,0 +1,15 @@
+// { dg-do compile { target arm*-*-* } }
+// { dg-options "-mlong-calls -ffunction-sections" }
+
+class a {
+public:
+  virtual ~a();
+};
+
+class b : virtual a {};
+
+class c : b {
+  ~c();
+};
+
+c::~c() {}
diff --git a/gcc/testsuite/g++.dg/other/thunk2b.C 
b/gcc/testsuite/g++.dg/other/thunk2b.C
new file mode 100644
index 
..c8f4570923d8bde71547dd343de45edc0efeb2c7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/thunk2b.C
@@ -0,0 +1,16 @@
+// { dg-do compile { target arm*-*-* } }
+// { dg-options "-mlong-calls -ffunction-sections" }
+// { dg-additional-options "-fPIC" { target fpic } }
+
+class a {
+public:
+  virtual ~a();
+};
+
+class b : virtual a {};
+
+class c : b {
+  ~c();
+};
+
+c::~c() {}
diff --git a/gcc/testsuite/g++.dg/other/vthunk1.C 
b/gcc/testsuite/g++.dg/other/thunk1.C
similarity index 100%
rename from gcc/testsuite/g++.dg/other/vthunk1.C
rename to gcc/testsuite/g++.dg/other/thunk1.C


Re: [PATCH, ARM] Clean up arm backend using the @ construct for MD patterns

2018-11-07 Thread Mihail Ionescu



On 10/09/2018 09:52 AM, Ramana Radhakrishnan wrote:

On 09/10/2018 09:27, Mihail Ionescu wrote:

Hi all,

This patch removes some of the machine mode checks from the arm backend when
emitting instructions by using the '@' construct (Parameterized Names[2]). It
is based on the previous AArch64 patch[1].

[1]https://gcc.gnu.org/ml/gcc-patches/2018-07/msg00673.html
[2]https://gcc.gnu.org/onlinedocs/gccint/Parameterized-Names.html#Parameterized-Names
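
For readers unfamiliar with the construct, a standalone C sketch of the shape of the change follows. The enum and function names are invented for the example and are not the generated GCC interfaces; the sketch only shows a per-caller switch on the mode being replaced by one entry point that takes the mode as an argument:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for a few vector modes.  */
enum sketch_mode { SK_V8QI, SK_V16QI, SK_V4HI };

/* Per-mode generators, as a caller would otherwise pick by hand.  */
static const char *gen_vuzp_v8qi (void)  { return "vuzp v8qi"; }
static const char *gen_vuzp_v16qi (void) { return "vuzp v16qi"; }
static const char *gen_vuzp_v4hi (void)  { return "vuzp v4hi"; }

/* Single entry point taking the mode as an argument; the dispatch
   lives here once instead of being repeated in every caller.  */
static const char *
gen_vuzp (enum sketch_mode mode)
{
  switch (mode)
    {
    case SK_V8QI:  return gen_vuzp_v8qi ();
    case SK_V16QI: return gen_vuzp_v16qi ();
    case SK_V4HI:  return gen_vuzp_v4hi ();
    }
  return NULL;
}
```

A caller then writes gen_vuzp (mode) and no longer needs a local function pointer or a switch of its own.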

Ran the tests on arm-none-eabi.


Thanks for the patch - It would be good to split this into 2 patches,
one for the cleanup with the permute instructions and the other for the
atomics. That makes life easier with reviews and it's logically grouped
as well.

Testing on just arm-none-eabi for the atomic_compare_and_swap changes
is not sufficient. I would prefer a bootstrap and test run on
arm-none-linux-gnueabihf with (--with-arch=armv7-a --with-fpu=vfpv3-d16
--with-float=hard as your configure options).

Alternatively I'd be happy if you could ensure that the libraries built
for arm-none-eabi show no difference in code generation for your change
? That will give us some more confidence that nothing else is wrong here.






gcc/ChangeLog:
2018-10-03  Mihail Ionescu

  * config/arm/arm.c (arm_expand_compare_and_swap): Use 
gen_atomic_compare_and_swap_1
  instead of explicit mode checks.


Simplify and call gen_atomic_compare_swap_1.


  (arm_evpc_neon_vuzp): Likewise gen_neon_vuzp_internal.


Simplify and call gen_neon_vuzp_internal.


  (arm_evpc_neon_vtrn): Likewise gen_neon_vtrn_internal.
  (arm_evpc_neon_vext): Likewise gen_neon_vext.
  (arm_evpc_neon_vzip): Likewise gen_neon_vzip_internal.
  (arm_evpc_neon_vrev): Replaced the function pointer and simplified 
the mode
  checks.


and so on...


  * config/arm/arm.md (neon_vext)
  (neon_vrev64, neon_vrev32)
  (neon_vrev16, neon_vtrn_internal)
  (neon_vzip_internal, neon_vuzp_internal): Add an 
'@'character
  before the pattern name.


Separate all pattern names across lines with commas.


  * config/arm/sync.md:
  (atomic_compare_and_swap_1)
  (atomic_compare_and_swap_1): Likewise.


Same as above.



If everything is ok for trunk, can someone commit it on my behalf?

Best regards,
  Mihail


diff.txt

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 8810df53aa34798b5e3e1eb3a870101d530702e4..51441efa934f5f2a5963750fcd7e077951406d5a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -28539,8 +28539,7 @@ void
   arm_expand_compare_and_swap (rtx operands[])
   {
 rtx bval, bdst, rval, mem, oldval, newval, is_weak, mod_s, mod_f, x;
-  machine_mode mode;
-  rtx (*gen) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx);
+  machine_mode mode, arch_mode;


s/arch_mode/compare_mode

The mode here is the mode for the comparison, whether we use SImode as
for Thumb1 or a CC_Zmode in !TARGET_THUMB1 cases. arch_mode doesn't tell
me much on reading the name.

   
 bval = operands[0];

 rval = operands[1];
@@ -28588,32 +28587,13 @@ arm_expand_compare_and_swap (rtx operands[])
   }
   
 if (TARGET_THUMB1)

-{
-  switch (mode)
-   {
-   case E_QImode: gen = gen_atomic_compare_and_swapt1qi_1; break;
-   case E_HImode: gen = gen_atomic_compare_and_swapt1hi_1; break;
-   case E_SImode: gen = gen_atomic_compare_and_swapt1si_1; break;
-   case E_DImode: gen = gen_atomic_compare_and_swapt1di_1; break;
-   default:
- gcc_unreachable ();
-   }
-}
+arch_mode = E_SImode;
 else
-{
-  switch (mode)
-   {
-   case E_QImode: gen = gen_atomic_compare_and_swap32qi_1; break;
-   case E_HImode: gen = gen_atomic_compare_and_swap32hi_1; break;
-   case E_SImode: gen = gen_atomic_compare_and_swap32si_1; break;
-   case E_DImode: gen = gen_atomic_compare_and_swap32di_1; break;
-   default:
- gcc_unreachable ();
-   }
-}
+arch_mode = CC_Zmode;
   
 bdst = TARGET_THUMB1 ? bval : gen_rtx_REG (CC_Zmode, CC_REGNUM);

-  emit_insn (gen (bdst, rval, mem, oldval, newval, is_weak, mod_s, mod_f));
+  emit_insn (gen_atomic_compare_and_swap_1 (arch_mode, mode, bdst, rval, mem, 
oldval,
+  newval, is_weak, mod_s, mod_f));
   
 if (mode == QImode || mode == HImode)

   emit_move_insn (operands[1], gen_lowpart (mode, rval));
@@ -28979,7 +28959,6 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)
   {
 unsigned int i, odd, mask, nelt = d->perm.length ();
 rtx out0, out1, in0, in1;
-  rtx (*gen)(rtx, rtx, rtx, rtx);
 int first_elem;
 int swap_nelt;
   
@@ -29013,22 +28992,6 @@ arm_evpc_neon_vuzp (struct expand_vec_perm_d *d)

 if (d->testing_p)
   return true;
   
-  switch (d->vmode)

-{
-case E_V16QImode: gen = gen_neon_vuzpv16qi_internal; break;
-case E_V8QImode:  

Re: [RFC][PATCH LRA] WIP patch to fix one part of PR87507

2018-11-07 Thread Peter Bergner
On 11/7/18 11:36 AM, Jeff Law wrote:
> I was referring to a more fundamental check in the IL checkers.

Yes, I understood that.  I was just replying to Segher's specific issue
with this code.  I do plan on looking at adding IL verifier checks for
subregs of subregs like you requested.


> Segher may have been referring to this specific code.  This is obviously
> safe to do as well.
> 
> OK with this change.

He was.  Thanks, I'll do one more bootstrap and then commit.  Thanks!

Peter



Re: [RFC][PATCH LRA] WIP patch to fix one part of PR87507

2018-11-07 Thread Jeff Law
On 11/7/18 9:29 AM, Peter Bergner wrote:
> On 11/6/18 6:14 PM, Segher Boessenkool wrote:
>> Or more general, that what is inside the subreg is a reg, because the
>> code does rely on that.
> 
> I think you mean to beef up the following from:
> 
> + if (HARD_REGISTER_P (nop_reg)
> + && REG_USERVAR_P (nop_reg)
> + && HARD_REGISTER_P (m_reg)
> + && REG_USERVAR_P (m_reg))
> +   break;
> 
> to:
> 
> +   if (REG_P (nop_reg)
> +   && HARD_REGISTER_P (nop_reg)
> +   && REG_USERVAR_P (nop_reg)
> +   && REG_P (m_reg)
> +   && HARD_REGISTER_P (m_reg)
> +   && REG_USERVAR_P (m_reg))
> + break;
> 
> ...correct?  I can add that.  I don't think we need to modify
> the other patch hunks, since we know operand_reg[x] is already
> a reg.
I was referring to a more fundamental check in the IL checkers.  Segher
may have been referring to this specific code.  This is obviously safe
to do as well.

OK with this change.
jeff


Re: [PATCH] Update soft-fp from glibc.

2018-11-07 Thread Joseph Myers
This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] MIPS: Default to --with-llsc for the R5900 Linux target as well

2018-11-07 Thread Fredrik Noring
Hello global GCC reviewers,

Would it be possible to apply the reviewed patch below?

Thank you,
Fredrik

On Fri, Oct 19, 2018 at 08:33:33PM +0200, Fredrik Noring wrote:
> The Linux kernel requires and emulates LL and SC for the R5900 too.  The
> special --without-llsc default for the R5900 is therefore not applicable
> in that case.
> 
> Reviewed-by: Maciej W. Rozycki 
> ---
> Changes in v2:
> - Double spacing instead of single spacing in commit message
> 
> ---
>  gcc/config.gcc | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 720e6a7373d..68c34b16123 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -3711,14 +3711,14 @@ fi
>  # Infer a default setting for --with-llsc.
>  if test x$with_llsc = x; then
>case ${target} in
> -mips64r5900-*-* | mips64r5900el-*-* | mipsr5900-*-* | mipsr5900el-*-*)
> -  # The R5900 doesn't support LL(D) and SC(D).
> -  with_llsc=no
> -  ;;
>  mips*-*-linux*)
># The kernel emulates LL and SC where necessary.
>with_llsc=yes
>;;
> +mips64r5900-*-* | mips64r5900el-*-* | mipsr5900-*-* | mipsr5900el-*-*)
> +  # The R5900 doesn't support LL(D) and SC(D).
> +  with_llsc=no
> +  ;;
>esac
>  fi
>  
> -- 
> 2.18.1
> 
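
As an aside, the reordering matters because a shell `case` takes the first pattern that matches. The same first-match behaviour can be sketched in C with POSIX fnmatch() standing in for the shell. The function name infer_with_llsc and its return convention are made up for this illustration, and only the two arms visible in the hunk are modelled:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 for with_llsc=yes, 0 for with_llsc=no, and -1 if no
   default is inferred.  Patterns are tried in order, as in a shell
   case statement.  */
static int
infer_with_llsc (const char *target)
{
  static const struct { const char *pattern; int llsc; } rules[] = {
    { "mips*-*-linux*",    1 },  /* the kernel emulates LL and SC */
    { "mips64r5900-*-*",   0 },  /* the R5900 lacks LL(D) and SC(D) */
    { "mips64r5900el-*-*", 0 },
    { "mipsr5900-*-*",     0 },
    { "mipsr5900el-*-*",   0 },
  };

  for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++)
    if (fnmatch (rules[i].pattern, target, 0) == 0)
      return rules[i].llsc;
  return -1;
}
```

With the Linux pattern listed first, an R5900 Linux target gets with_llsc=yes, while a non-Linux R5900 target still falls through to with_llsc=no.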


Re: [RFC][PATCH LRA] WIP patch to fix one part of PR87507

2018-11-07 Thread Peter Bergner
On 11/6/18 6:14 PM, Segher Boessenkool wrote:
> Or more general, that what is inside the subreg is a reg, because the
> code does rely on that.

I think you mean to beef up the following from:

+   if (HARD_REGISTER_P (nop_reg)
+   && REG_USERVAR_P (nop_reg)
+   && HARD_REGISTER_P (m_reg)
+   && REG_USERVAR_P (m_reg))
+ break;

to:

+   if (REG_P (nop_reg)
+   && HARD_REGISTER_P (nop_reg)
+   && REG_USERVAR_P (nop_reg)
+   && REG_P (m_reg)
+   && HARD_REGISTER_P (m_reg)
+   && REG_USERVAR_P (m_reg))
+ break;

...correct?  I can add that.  I don't think we need to modify
the other patch hunks, since we know operand_reg[x] is already
a reg.
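
For anyone following along, the point of the extra REG_P tests is that the register-only accessors are only meaningful on a REG. A minimal standalone sketch with invented types (not GCC's rtx, and with an assumed hard/pseudo boundary) showing the guarded form:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much simplified stand-in for an rtx.  */
enum sk_code { SK_REG, SK_SUBREG, SK_MEM };

struct sk_rtx
{
  enum sk_code code;
  unsigned regno;   /* only meaningful when code == SK_REG */
  bool user_var;    /* likewise */
};

#define SK_FIRST_PSEUDO 64   /* assumed hard/pseudo register boundary */

static bool
sk_reg_p (const struct sk_rtx *x)
{
  return x->code == SK_REG;
}

static bool
sk_hard_register_p (const struct sk_rtx *x)
{
  assert (sk_reg_p (x));   /* the accessor is only valid on a REG */
  return x->regno < SK_FIRST_PSEUDO;
}

/* Guarded test in the shape of the updated hunk: check the code
   first, so the register accessors never see a non-REG.  */
static bool
both_user_hard_regs (const struct sk_rtx *a, const struct sk_rtx *b)
{
  return sk_reg_p (a) && sk_hard_register_p (a) && a->user_var
         && sk_reg_p (b) && sk_hard_register_p (b) && b->user_var;
}
```

Because && short-circuits, a MEM or SUBREG operand makes the whole test false before any register-only accessor runs.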

Peter



[PATCH] MIPS: Add `-mfix-r5900' option for the R5900 short loop erratum

2018-11-07 Thread Fredrik Noring
The short loop bug under certain conditions causes loops to
execute only once or twice, due to a hardware bug in the R5900 chip.

`-march=r5900' already enables the R5900 short loop workaround.
However, the R5900 ISA and most other MIPS ISAs are mutually
exclusive since R5900-specific instructions are generated as well.

The `-mfix-r5900' option can be used in combination with e.g.
`-mips2' or `-mips3' to generate generic MIPS binaries that also
work with the R5900 target.  The workaround is implemented by GAS
rather than by GCC.

The following small `shortloop.c' file has been used as a test
with GCC 8.2.0:

void shortloop(void)
{
__asm__ __volatile__ (
"   li $3, 300\n"
"loop:\n"
"   addi $3, -1\n"
"   addi $4, -1\n"
"   bne $3, $0, loop\n"
"   li $4, 3\n"
::);
}

The following six combinations have been tested:

% mipsr5900el-unknown-linux-gnu-gcc -O1 -c shortloop.c
% mipsr5900el-unknown-linux-gnu-gcc -O1 -c shortloop.c -mfix-r5900
% mipsr5900el-unknown-linux-gnu-gcc -O1 -c shortloop.c -mno-fix-r5900

% mipsr4000el-unknown-linux-gnu-gcc -O1 -c shortloop.c
% mipsr4000el-unknown-linux-gnu-gcc -O1 -c shortloop.c -mfix-r5900
% mipsr4000el-unknown-linux-gnu-gcc -O1 -c shortloop.c -mno-fix-r5900

The R5900 short loop erratum is corrected in exactly three cases:

1. for the target `mipsr5900el' by default;

2. for the target `mipsr5900el' with `-mfix-r5900';

3. for any other MIPS target (e.g. `mipsr4000el') with `-mfix-r5900'.

In all other cases the correction is not made.
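
The defaulting rule behind those cases can be modelled in a few lines of standalone C. This is only a sketch of the decision, with invented names, not the code in mips_option_override: an explicit option wins, and otherwise the workaround is on only when the selected architecture is r5900.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* How -mfix-r5900 / -mno-fix-r5900 appeared on the command line.  */
enum fix_opt { FIX_UNSET, FIX_ON, FIX_OFF };

/* Hypothetical model of the defaulting rule: an explicit option
   wins; otherwise enable the workaround only for the r5900 arch.  */
static bool
r5900_fix_enabled (const char *arch, enum fix_opt opt)
{
  if (opt != FIX_UNSET)
    return opt == FIX_ON;
  return strcmp (arch, "r5900") == 0;
}
```

Running the six tested combinations through this function gives the three "corrected" cases listed above and three uncorrected ones.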

* gcc/config/mips/mips.c (mips_reorg_process_insns)
  (mips_option_override): Default to working around R5900
  errata only if the processor was selected explicitly.
* gcc/config/mips/mips.h: Declare `mfix-r5900' and
  `mno-fix-r5900'.
* gcc/config/mips/mips.opt: Define MASK_FIX_R5900.
* gcc/doc/invoke.texi: Document the R5900, `mfix-r5900' and
  `mno-fix-r5900'.
---
 gcc/config/mips/mips.c   | 14 ++
 gcc/config/mips/mips.h   |  1 +
 gcc/config/mips/mips.opt |  4 
 gcc/doc/invoke.texi  | 14 +-
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index ea2fae1d6db..5763ce21427 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -18881,13 +18881,13 @@ mips_reorg_process_insns (void)
   if (crtl->profile)
 cfun->machine->all_noreorder_p = false;
 
-  /* Code compiled with -mfix-vr4120, -mfix-rm7000 or -mfix-24k can't be
- all noreorder because we rely on the assembler to work around some
- errata.  The R5900 too has several bugs.  */
+  /* Code compiled with -mfix-vr4120, -mfix-r5900, -mfix-rm7000 or
+ -mfix-24k can't be all noreorder because we rely on the assembler
+ to work around some errata.  The R5900 target has several bugs.  */
   if (TARGET_FIX_VR4120
   || TARGET_FIX_RM7000
   || TARGET_FIX_24K
-  || TARGET_MIPS5900)
+  || TARGET_FIX_R5900)
 cfun->machine->all_noreorder_p = false;
 
   /* The same is true for -mfix-vr4130 if we might generate MFLO or
@@ -20244,6 +20244,12 @@ mips_option_override (void)
   && strcmp (mips_arch_info->name, "r4400") == 0)
 target_flags |= MASK_FIX_R4400;
 
+  /* Default to working around R5900 errata only if the processor
+ was selected explicitly.  */
+  if ((target_flags_explicit & MASK_FIX_R5900) == 0
+  && strcmp (mips_arch_info->name, "r5900") == 0)
+target_flags |= MASK_FIX_R5900;
+
   /* Default to working around R10000 errata only if the processor
  was selected explicitly.  */
   if ((target_flags_explicit & MASK_FIX_R10000) == 0
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 32a88edc910..7dd19fc6f2d 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1363,6 +1363,7 @@ struct mips_cpu_info {
 %{mmsa} %{mno-msa} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
+%{mfix-r5900} %{mno-fix-r5900} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
 %{mfix-vr4120} %{mfix-vr4130} \
 %{mfix-24k} \
diff --git a/gcc/config/mips/mips.opt b/gcc/config/mips/mips.opt
index 5a9f255fe20..427ac4913fc 100644
--- a/gcc/config/mips/mips.opt
+++ b/gcc/config/mips/mips.opt
@@ -165,6 +165,10 @@ mfix-r4400
 Target Report Mask(FIX_R4400)
 Work around certain R4400 errata.
 
+mfix-r5900
+Target Report Mask(FIX_R5900)
+Work around the R5900 short loop erratum.
+
 mfix-rm7000
 Target Report Var(TARGET_FIX_RM7000)
 Work around certain RM7000 errata.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e290128f535..c9846d96304 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -939,6 +939,7 @@ Objective-C and Objective-C++ Dialects}.
 -mmad  -mno-mad  -mimadd  -mno-imadd  -mfused-madd  -mno-fused-madd  -nocpp 
@gol
 -mfix-24k  -mno-fix-24k @gol
 -mfix-r4000  -mno-fix-r4000  -mfix-r4400  -mno-fix-r4400 @gol
+-mfix-r5900  -mno-fix-r5900 @gol
 -mfix-r1  

[PR C++/87904] lookup ICE

2018-11-07 Thread Nathan Sidwell
My recent relaxing of overload ordering broke an invariant that 
unhiding a hidden decl was assuming.  Fixed thusly.


nathan
--
Nathan Sidwell
2018-11-07  Nathan Sidwell  

	PR c++/87904
	* cp-tree.h (struct tree_overload): Fix comment.
	* tree.c (ovl_iterator::reveal_node): Propagate OVL_DEDUP_P.

	PR c++/87904
	* g++.dg/lookup/pr87904.C: New.

Index: gcc/cp/cp-tree.h
===
--- gcc/cp/cp-tree.h	(revision 265851)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -723,8 +723,7 @@ typedef struct ptrmem_cst * ptrmem_cst_t
 #define OVL_SINGLE_P(NODE) \
   (TREE_CODE (NODE) != OVERLOAD || !OVL_CHAIN (NODE))
 
-/* OVL_HIDDEN_P nodes come first, then OVL_USING_P nodes, then regular
-   fns.  */
+/* OVL_HIDDEN_P nodes come before other nodes.  */
 
 struct GTY(()) tree_overload {
   struct tree_common common;
Index: gcc/cp/tree.c
===
--- gcc/cp/tree.c	(revision 265851)
+++ gcc/cp/tree.c	(working copy)
@@ -2261,13 +2261,17 @@ ovl_iterator::reveal_node (tree overload
 
   OVL_HIDDEN_P (node) = false;
   if (tree chain = OVL_CHAIN (node))
-if (TREE_CODE (chain) == OVERLOAD
-	&& (OVL_USING_P (chain) || OVL_HIDDEN_P (chain)))
+if (TREE_CODE (chain) == OVERLOAD)
   {
-	/* The node needs moving, and the simplest way is to remove it
-	   and reinsert.  */
-	overload = remove_node (overload, node);
-	overload = ovl_insert (OVL_FUNCTION (node), overload);
+	if (OVL_HIDDEN_P (chain))
+	  {
+	/* The node needs moving, and the simplest way is to remove it
+	   and reinsert.  */
+	overload = remove_node (overload, node);
+	overload = ovl_insert (OVL_FUNCTION (node), overload);
+	  }
+	else if (OVL_DEDUP_P (chain))
+	  OVL_DEDUP_P (node) = true;
   }
   return overload;
 }
Index: gcc/testsuite/g++.dg/lookup/pr87904.C
===
--- gcc/testsuite/g++.dg/lookup/pr87904.C	(nonexistent)
+++ gcc/testsuite/g++.dg/lookup/pr87904.C	(working copy)
@@ -0,0 +1,21 @@
+// PR c++ 87904 ICE failing to initiate deduping
+
+namespace X {
+  void Foo (char);
+}
+
+struct B {
+  friend void Foo (int);
+};
+
+using X::Foo;
+
+void Foo (float);
+void Foo(int);
+
+void frob ()
+{
+  using namespace X;
+
+  Foo (1);
+}


[PATCH 2/4] dump_printf: add "%C" for dumping cgraph_node *

2018-11-07 Thread David Malcolm
This patch implements support for %C in dump_printf for dumping
cgraph_node *.
(I would have preferred to have a code for printing symtab_node *
and both subclasses, but there doesn't seem to be a good way for
-Wformat to handle inheritance, so, failing that, I went with
this approach).
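
To make the idea concrete outside of GCC, here is a toy standalone formatter that understands a single custom code, "%C", taking a pointer to a node and printing it as name/order. Everything in it (struct sk_node, sk_dump_printf) is invented for the sketch and is not the dump_printf implementation:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a call-graph node: a name and an order.  */
struct sk_node { const char *name; int order; };

/* Tiny formatter understanding only "%C" (a struct sk_node *) and
   "%%"; every other character is copied through unchanged.  */
static void
sk_dump_printf (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  size_t used = 0;

  if (size > 0)
    buf[0] = '\0';
  va_start (ap, fmt);
  for (const char *p = fmt; *p && used + 1 < size; p++)
    {
      if (p[0] == '%' && p[1] == 'C')
        {
          struct sk_node *n = va_arg (ap, struct sk_node *);
          size_t room = size - used;
          int w = snprintf (buf + used, room, "%s/%d", n->name, n->order);
          if (w < 0)
            break;
          /* On truncation snprintf has filled the buffer.  */
          used += (size_t) w < room ? (size_t) w : room - 1;
          p++;   /* skip the 'C' */
        }
      else
        {
          if (p[0] == '%' && p[1] == '%')
            p++;   /* collapse "%%" to a single '%' */
          buf[used++] = *p;
          buf[used] = '\0';
        }
    }
  va_end (ap);
}
```

The format-checking side (teaching -Wformat that %C wants a "cgraph_node *") is a separate concern and is what the c-format.c part of the patch handles.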

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
conjunction with the rest of the patch kit.

OK for trunk?

gcc/c-family/ChangeLog:
* c-format.c (local_cgraph_node_ptr_node): New variable.
(gcc_dump_printf_char_table): Add entry for %C.
(get_pointer_to_named_type): New function, taken from the handling
code for "gimple *" from...
(init_dynamic_diag_info): ...here.  Add handling for
"cgraph_node *".
* c-format.h (T_CGRAPH_NODE): New.

gcc/ChangeLog:
* dump-context.h (ASSERT_IS_CGRAPH_NODE): New macro.
* dumpfile.c (make_item_for_dump_cgraph_node): Move to before...
(dump_pretty_printer::decode_format): Implement "%C" for
cgraph_node *.
(selftest::test_capture_of_dump_calls): Rename "where" to
"stmt_loc".  Convert test_decl to a function decl and set its
location.  Add a symbol_table_test RAII instance and a
cgraph_node, using it to test "%C" and dump_symtab_node.

gcc/testsuite/ChangeLog:
* gcc.dg/format/gcc_diag-10.c (cgraph_node): New typedef.
(test_dump): Add testing of %C.
---
 gcc/c-family/c-format.c   |  56 ++-
 gcc/c-family/c-format.h   |   1 +
 gcc/dump-context.h|   8 +++
 gcc/dumpfile.c| 115 ++
 gcc/testsuite/gcc.dg/format/gcc_diag-10.c |   5 +-
 5 files changed, 135 insertions(+), 50 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index a1133c7..385ee1a 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -60,6 +60,7 @@ struct function_format_info
 /* Initialized in init_dynamic_diag_info.  */
 static GTY(()) tree local_tree_type_node;
 static GTY(()) tree local_gimple_ptr_node;
+static GTY(()) tree local_cgraph_node_ptr_node;
 static GTY(()) tree locus;
 
 static bool decode_format_attr (tree, function_format_info *, int);
@@ -803,6 +804,9 @@ static const format_char_info gcc_dump_printf_char_table[] =
   /* E and G require a "gimple *" argument at runtime.  */
   { "EG",   1, STD_C89, { T89_G,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, 
 BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   NULL },
 
+  /* C requires a "cgraph_node *" argument at runtime.  */
+  { "C",   1, STD_C89, { T_CGRAPH_NODE,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   
NULL },
+
   /* T requires a "tree" at runtime.  */
   { "T",   1, STD_C89, { T89_T,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   NULL },
 
@@ -3879,6 +3883,33 @@ init_dynamic_gfc_info (void)
 }
 }
 
+/* Lookup the type named NAME and return a pointer-to-NAME type if found.
+   Otherwise, return void_type_node if NAME has not been used yet, or 
NULL_TREE if
+   NAME is not a type (issuing an error).  */
+
+static tree
+get_pointer_to_named_type (const char *name)
+{
+  tree result;
+  if ((result = maybe_get_identifier (name)))
+{
+  result = identifier_global_value (result);
+  if (result)
+   {
+ if (TREE_CODE (result) != TYPE_DECL)
+   {
+ error ("%qs is not defined as a type", name);
+ result = NULL_TREE;
+   }
+ else
+   result = TREE_TYPE (result);
+   }
+}
+  else
+result = void_type_node;
+  return result;
+}
+
 /* Determine the types of "tree" and "location_t" in the code being
compiled for use in GCC's diagnostic custom format attributes.  You
must have set dynamic_format_types before calling this function.  */
@@ -3932,25 +3963,12 @@ init_dynamic_diag_info (void)
   /* Similar to the above but for gimple*.  */
   if (!local_gimple_ptr_node
   || local_gimple_ptr_node == void_type_node)
-{
-  if ((local_gimple_ptr_node = maybe_get_identifier ("gimple")))
-   {
- local_gimple_ptr_node
-   = identifier_global_value (local_gimple_ptr_node);
- if (local_gimple_ptr_node)
-   {
- if (TREE_CODE (local_gimple_ptr_node) != TYPE_DECL)
-   {
- error ("%<gimple%> is not defined as a type");
- local_gimple_ptr_node = 0;
-   }
- else
-   local_gimple_ptr_node = TREE_TYPE (local_gimple_ptr_node);
-   }
-   }
-  else
-   local_gimple_ptr_node = void_type_node;
-}
+local_gimple_ptr_node = get_pointer_to_named_type ("gimple");
+
+  /* Similar to the above but for cgraph_node*.  */
+  if (!local_cgraph_node_ptr_node
+  || 

[PATCH 3/4] support %f in pp_format

2018-11-07 Thread David Malcolm
Numerous formatted messages from the inliner use %f, mostly as %f, but
occasionally with length modifiers.

This patch implements the simplest case of "%f" for pp_format (with no
modifier support) to make it easier to port these messages from fprintf
to dump_printf_loc.

The selftest has an assertion that %f on 1.0 is printed as "1.000000".
This comes from the host's sprintf, and I believe this is guaranteed by
POSIX: "If the precision is missing, it shall be taken as 6".  If this is
an issue I can drop the selftest.
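
The default-precision rule is easy to check on a host with a standalone program. The helper below is hypothetical and simply formats a double with a bare "%f" in the default "C" locale; with no precision given, six digits follow the decimal point:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format VALUE with a bare "%f", relying on
   the host's printf family for the default precision.  */
static void
format_bare_f (char *buf, size_t size, double value)
{
  snprintf (buf, size, "%f", value);
}
```

Formatting 1.0 this way yields the string "1.000000".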

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
conjunction with the rest of the patch kit.

OK for trunk?

gcc/c-family/ChangeLog:
* c-format.c (gcc_dump_printf_char_table): Add entry for %f.

gcc/ChangeLog:
* pretty-print.c (pp_format): Handle %f.
(selftest::test_pp_format): Add test of %f.
* pretty-print.h (pp_double): New macro.

gcc/testsuite/ChangeLog:
* gcc.dg/format/gcc_diag-10.c: Add coverage for %f.
---
 gcc/c-family/c-format.c   | 3 +++
 gcc/pretty-print.c| 6 ++
 gcc/pretty-print.h| 1 +
 gcc/testsuite/gcc.dg/format/gcc_diag-10.c | 2 ++
 4 files changed, 12 insertions(+)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 385ee1a..8d91a77 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -810,6 +810,9 @@ static const format_char_info gcc_dump_printf_char_table[] =
   /* T requires a "tree" at runtime.  */
   { "T",   1, STD_C89, { T89_T,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   NULL },
 
+  /* %f requires a "double"; it doesn't support modifiers.  */
+  { "f",   0, STD_C89, { T89_D,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   NULL },
+
   { NULL,  0, STD_C89, NOLENGTHS, NULL, NULL, NULL }
 };
 
diff --git a/gcc/pretty-print.c b/gcc/pretty-print.c
index 7dd900b..19ef75b 100644
--- a/gcc/pretty-print.c
+++ b/gcc/pretty-print.c
@@ -977,6 +977,7 @@ pp_indent (pretty_printer *pp)
%ld, %li, %lo, %lu, %lx: long versions of the above.
%lld, %lli, %llo, %llu, %llx: long long versions.
%wd, %wi, %wo, %wu, %wx: HOST_WIDE_INT versions.
+   %f: double
%c: character.
%s: string.
%p: pointer (printed in a host-dependent manner).
@@ -1307,6 +1308,10 @@ pp_format (pretty_printer *pp, text_info *text)
  (pp, *text->args_ptr, precision, unsigned, "u");
  break;
 
+   case 'f':
+ pp_double (pp, va_arg (*text->args_ptr, double));
+ break;
+
case 'Z':
  {
int *v = va_arg (*text->args_ptr, int *);
@@ -2160,6 +2165,7 @@ test_pp_format ()
   ASSERT_PP_FORMAT_2 ("17 12345678", "%wo %x", (HOST_WIDE_INT)15, 0x12345678);
   ASSERT_PP_FORMAT_2 ("0xcafebabe 12345678", "%wx %x", 
(HOST_WIDE_INT)0xcafebabe,
  0x12345678);
+  ASSERT_PP_FORMAT_2 ("1.000000 12345678", "%f %x", 1.0, 0x12345678);
   ASSERT_PP_FORMAT_2 ("A 12345678", "%c %x", 'A', 0x12345678);
   ASSERT_PP_FORMAT_2 ("hello world 12345678", "%s %x", "hello world",
  0x12345678);
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 2decc51..a6e60f1 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -330,6 +330,7 @@ pp_get_prefix (const pretty_printer *pp) { return 
pp->prefix; }
   pp_string (PP, pp_buffer (PP)->digit_buffer);\
 }  \
   while (0)
+#define pp_double(PP, F)   pp_scalar (PP, "%f", F)
 #define pp_pointer(PP, P)  pp_scalar (PP, "%p", P)
 
 #define pp_identifier(PP, ID)  pp_string (PP, (pp_translate_identifiers (PP) \
diff --git a/gcc/testsuite/gcc.dg/format/gcc_diag-10.c 
b/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
index 97a1993..ba2629b 100644
--- a/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
+++ b/gcc/testsuite/gcc.dg/format/gcc_diag-10.c
@@ -183,4 +183,6 @@ void test_dump (tree t, gimple *stmt, cgraph_node *node)
   dump ("%T", t);
   dump ("%G", stmt);
   dump ("%C", node);
+  dump ("%f", 1.0);
+  dump ("%4.2f", 1.0); /* { dg-warning "format" } */
 }
-- 
1.8.5.3



[PATCH 4/4] ipa-inline.c/tree-inline.c: port from fprintf to dump API (PR ipa/86395)

2018-11-07 Thread David Malcolm
This patch ports various fprintf calls in the inlining code to using
the dump API, using the %C format code for printing cgraph_node *.
I focused on the dump messages that seemed most significant to
end-users; I didn't port all of the calls.

Doing so makes this information appear in -fopt-info and in
optimization records, rather than just in the dump_file.

It also changes the affected dumpfile-dumps from being unconditional
(assuming the dump_file is enabled) to being guarded by the MSG_*
status.  Hence various tests with dg-final scan-*-dump directives
need to gain "-all" or "-optimized" suffixes to -fdump-ipa-inline.

The use of %C throughout also slightly changes the dump format for
several messages, e.g. changing:

 Inlining void inline_me(char*) into int main(int, char**).

to:

../../src/gcc/testsuite/g++.dg/tree-ssa/inline-1.C:13:8: optimized:  Inlining 
void inline_me(char*)/0 into int main(int, char**)/2.

amongst other things adding "/order" suffixes to the cgraph node
names.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
conjunction with the rest of the patch kit.

OK for trunk?

gcc/ChangeLog:
PR ipa/86395
* doc/invoke.texi (-fdump-ipa-): Document the "-optimized",
"-missed", "-note", and "-all" sub-options.
* ipa-inline.c (caller_growth_limits): Port from fprintf to dump
API.
(can_early_inline_edge_p): Likewise.
(want_early_inline_function_p): Likewise.
(want_inline_self_recursive_call_p): Likewise.
(recursive_inlining): Likewise.
(inline_small_functions): Likewise.
(flatten_function): Likewise.
(ipa_inline): Likewise.
(inline_always_inline_functions): Likewise.
(early_inline_small_functions): Likewise.
(early_inliner): Likewise.
* tree-inline.c (expand_call_inline): Likewise.

gcc/testsuite/ChangeLog:
PR ipa/86395
* g++.dg/ipa/devirt-12.C: Add "-all" suffix to
"-fdump-ipa-inline".
* g++.dg/ipa/imm-devirt-1.C: Add "-optimized" suffix to
"-fdump-tree-einline".
* g++.dg/tree-prof/inline_mismatch_args.C: Add "-all" suffix to
"-fdump-tree-einline".
* g++.dg/tree-ssa/inline-1.C: Add "-optimized" suffix to
"-fdump-tree-einline".
* g++.dg/tree-ssa/inline-2.C: Likewise.
* g++.dg/tree-ssa/inline-3.C: Likewise.
* g++.dg/tree-ssa/inline-4.C: New test, based on inline-1.C, but
using "-fopt-info-inline".
* gcc.dg/ipa/fopt-info-inline-1.c: New test.
* gcc.dg/ipa/inline-4.c:  Add "-all" suffix to
"-fdump-ipa-inline".  Add "-fopt-info-inline" and dg-optimized
directive.
* gcc.dg/ipa/inline-7.c: Add "-optimized" suffix to
"-fdump-tree-einline".  Add "-fopt-info-inline" and dg-optimized
directive.  Update scan-tree-dump-times to reflect /order
suffixes.
* gcc.dg/ipa/inlinehint-4.c: Update scan-tree-dump-times to
reflect /order suffixes.
* gcc.dg/plugin/dump-1.c: Add "-loop" to "-fopt-info-note" to
avoid getting extra messages from inliner.
* gcc.dg/plugin/dump-2.c: Likewise.
* gcc.dg/pr26570.c: Add dg-prune-output to ignore new
"function body not available" missed optimization messages.
* gcc.dg/pr71969-2.c: Update scan-tree-dump-times to reflect
/order suffixes.
* gcc.dg/pr71969-3.c: Likewise.
* gcc.dg/tree-ssa/inline-11.c: Add "-all" suffix to
"-fdump-tree-einline".
* gcc.dg/tree-ssa/inline-3.c: Add "-optimized" suffix to
"-fdump-tree-einline".  Update scan-tree-dump-times to reflect
/order suffixes.
* gcc.dg/tree-ssa/inline-4.c: Add "-optimized" suffix to
"-fdump-tree-einline".  Add "-fopt-info-inline" and dg-optimized
directive.
* gcc.dg/tree-ssa/inline-8.c: Add "-optimized" suffix to
"-fdump-tree-einline".
* gfortran.dg/pr79966.f90: Update scan-ipa-dump to reflect /order
suffixes.
---
 gcc/doc/invoke.texi|  13 ++
 gcc/ipa-inline.c   | 191 +++--
 gcc/testsuite/g++.dg/ipa/devirt-12.C   |   2 +-
 gcc/testsuite/g++.dg/ipa/imm-devirt-1.C|   2 +-
 .../g++.dg/tree-prof/inline_mismatch_args.C|   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-1.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-2.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-3.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-4.C   |  32 
 gcc/testsuite/gcc.dg/ipa/fopt-info-inline-1.c  |  44 +
 gcc/testsuite/gcc.dg/ipa/inline-4.c|   4 +-
 gcc/testsuite/gcc.dg/ipa/inline-7.c|   6 +-
 gcc/testsuite/gcc.dg/ipa/inlinehint-4.c|   4 +-
 gcc/testsuite/gcc.dg/plugin/dump-1.c   |   2 +-
 gcc/testsuite/gcc.dg/plugin/dump-2.c   |   2 +-
 

[PATCH 1/4] cgraph: add selftest::symbol_table_test

2018-11-07 Thread David Malcolm
This patch adds a selftest fixture for overriding the "symtab" global,
so that selftests involving symtab nodes can be isolated from each
other: each selftest can have its own symbol_table instance.

In particular, this ensures that nodes can have a predictable "order"
and thus predictable dump names within selftests.
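
The save/replace/restore pattern can be sketched in standalone C. C has no constructors or destructors, so the sketch uses an explicit begin/end pair where the patch uses an RAII class. All names are invented, and the "symbol table" is reduced to an order counter:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the symbol table: just an order counter.  */
struct sk_symtab { int next_order; };

static struct sk_symtab *sk_symtab_current;  /* role of "symtab" */
static struct sk_symtab *sk_symtab_saved;    /* role of "saved_symtab" */

/* Start of a test: stash the live table and install a fresh one.  */
static void
sk_fixture_begin (void)
{
  assert (sk_symtab_saved == NULL);   /* fixtures must not nest */
  sk_symtab_saved = sk_symtab_current;
  sk_symtab_current = calloc (1, sizeof *sk_symtab_current);
  assert (sk_symtab_current != NULL);
}

/* End of a test: drop the fresh table and restore the stashed one.  */
static void
sk_fixture_end (void)
{
  free (sk_symtab_current);
  sk_symtab_current = sk_symtab_saved;
  sk_symtab_saved = NULL;
}

/* Hand out the next order number from whichever table is current.  */
static int
sk_new_node_order (void)
{
  return sk_symtab_current->next_order++;
}
```

Each begin/end pair starts numbering from zero again, which is what makes the dump names predictable, and the live table is untouched afterwards.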

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu, in
conjunction with the rest of the patch kit.

OK for trunk?

gcc/ChangeLog:
* cgraph.c: Include "selftest.h".
(saved_symtab): New variable.
(selftest::symbol_table_test::symbol_table_test): New ctor.
(selftest::symbol_table_test::~symbol_table_test): New dtor.
(selftest::test_symbol_table_test): New test.
(selftest::cgraph_c_tests): New.
* cgraph.h (saved_symtab): New decl.
(selftest::symbol_table_test): New class.
* selftest-run-tests.c (selftest::run_tests): Call
selftest::cgraph_c_tests.
* selftest.h (selftest::cgraph_c_tests): New decl.
---
 gcc/cgraph.c | 67 
 gcc/cgraph.h | 23 +
 gcc/selftest-run-tests.c |  1 +
 gcc/selftest.h   |  1 +
 4 files changed, 92 insertions(+)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index b432f7e..b3dd429 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "stringpool.h"
 #include "attribs.h"
+#include "selftest.h"
 
 /* FIXME: Only for PROP_loops, but cgraph shouldn't have to know about this.  */
 #include "tree-pass.h"
@@ -3765,4 +3766,70 @@ cgraph_edge::sreal_frequency ()
   : caller->count);
 }
 
+/* A stashed copy of "symtab" for use by selftest::symbol_table_test.
+   This needs to be a global so that it can be a GC root, and thus
+   prevent the stashed copy from being garbage-collected if the GC runs
+   during a symbol_table_test.  */
+
+symbol_table *saved_symtab;
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* class selftest::symbol_table_test.  */
+
+/* Constructor.  Store the old value of symtab, and create a new one.  */
+
+symbol_table_test::symbol_table_test ()
+{
+  gcc_assert (saved_symtab == NULL);
+  saved_symtab = symtab;
+  symtab = new (ggc_cleared_alloc <symbol_table> ()) symbol_table ();
+}
+
+/* Destructor.  Restore the old value of symtab.  */
+
+symbol_table_test::~symbol_table_test ()
+{
+  gcc_assert (saved_symtab != NULL);
+  symtab = saved_symtab;
+  saved_symtab = NULL;
+}
+
+/* Verify that symbol_table_test works.  */
+
+static void
+test_symbol_table_test ()
+{
+  /* Simulate running two selftests involving symbol tables.  */
+  for (int i = 0; i < 2; i++)
+{
+  symbol_table_test stt;
+  tree test_decl = build_decl (UNKNOWN_LOCATION, FUNCTION_DECL,
+  get_identifier ("test_decl"),
+  build_function_type_list (void_type_node,
+NULL_TREE));
+  cgraph_node *node = cgraph_node::get_create (test_decl);
+  gcc_assert (node);
+
+  /* Verify that the node has order 0 on both iterations,
+and thus that nodes have predictable dump names in selftests.  */
+  ASSERT_EQ (node->order, 0);
+  ASSERT_STREQ (node->dump_name (), "test_decl/0");
+}
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+cgraph_c_tests ()
+{
+  test_symbol_table_test ();
+}
+
+} // namespace selftest
+
+#endif /* CHECKING_P */
+
 #include "gt-cgraph.h"
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 71c5453..d326866 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -3350,4 +3350,27 @@ xstrdup_for_dump (const char *transient_str)
   return ggc_strdup (transient_str);
 }
 
+extern GTY(()) symbol_table *saved_symtab;
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* An RAII-style class for use in selftests for temporarily using a different
+   symbol_table, so that such tests can be isolated from each other.  */
+
+class symbol_table_test
+{
+ public:
+  /* Constructor.  Override "symtab".  */
+  symbol_table_test ();
+
+  /* Destructor.  Restore the saved_symtab.  */
+  ~symbol_table_test ();
+};
+
+} // namespace selftest
+
+#endif /* CHECKING_P */
+
 #endif  /* GCC_CGRAPH_H  */
diff --git a/gcc/selftest-run-tests.c b/gcc/selftest-run-tests.c
index 562ada7..6d65d24 100644
--- a/gcc/selftest-run-tests.c
+++ b/gcc/selftest-run-tests.c
@@ -73,6 +73,7 @@ selftest::run_tests ()
   unique_ptr_tests_cc_tests ();
   opt_proposer_c_tests ();
   json_cc_tests ();
+  cgraph_c_tests ();
   optinfo_emit_json_cc_tests ();
   opt_problem_cc_tests ();
 
diff --git a/gcc/selftest.h b/gcc/selftest.h
index 8da7c4a..4e4c755 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -215,6 +215,7 @@ class test_runner
alphabetical order.  */
 extern void attribute_c_tests ();
 extern void bitmap_c_tests ();
+extern void cgraph_c_tests ();
 extern void 

[PATCH 0/4] Fix -fopt-info-inline (PR ipa/86395)

2018-11-07 Thread David Malcolm
Currently -fopt-info-inline does nothing, as all of the dumping
relating to inlining uses fprintf rather than dumpfile.h's dump_*
interface.

This patch kit adds a %C format code to dumpfile.c for printing
cgraph_node *, and uses it to port many of the IPA dump messages
to the dump API.  I focused on the dump messages that seemed most
significant to end-users.

With this kit, -fopt-info-inline prints messages at callsites
such as:

t.c:16:6: optimized:  Inlining bar/1 into boo/2.
t.c:15:11: optimized:  Inlining foo/0 into boo/2.
t.c:24:12: optimized:  Inlining boo/2 into compute/3.

and this can be checked for via the dg-optimized test directive
(and appears in the -fsave-optimization-record result, etc).

Dave

David Malcolm (4):
  cgraph: add selftest::symbol_table_test
  dump_printf: add "%C" for dumping cgraph_node *
  support %f in pp_format
  ipa-inline.c/tree-inline.c: port from fprintf to dump API (PR ipa/86395)

 gcc/c-family/c-format.c|  59 +--
 gcc/c-family/c-format.h|   1 +
 gcc/cgraph.c   |  67 
 gcc/cgraph.h   |  23 +++
 gcc/doc/invoke.texi|  13 ++
 gcc/dump-context.h |   8 +
 gcc/dumpfile.c | 115 +
 gcc/ipa-inline.c   | 191 +++--
 gcc/pretty-print.c |   6 +
 gcc/pretty-print.h |   1 +
 gcc/selftest-run-tests.c   |   1 +
 gcc/selftest.h |   1 +
 gcc/testsuite/g++.dg/ipa/devirt-12.C   |   2 +-
 gcc/testsuite/g++.dg/ipa/imm-devirt-1.C|   2 +-
 .../g++.dg/tree-prof/inline_mismatch_args.C|   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-1.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-2.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-3.C   |   2 +-
 gcc/testsuite/g++.dg/tree-ssa/inline-4.C   |  32 
 gcc/testsuite/gcc.dg/format/gcc_diag-10.c  |   7 +-
 gcc/testsuite/gcc.dg/ipa/fopt-info-inline-1.c  |  44 +
 gcc/testsuite/gcc.dg/ipa/inline-4.c|   4 +-
 gcc/testsuite/gcc.dg/ipa/inline-7.c|   6 +-
 gcc/testsuite/gcc.dg/ipa/inlinehint-4.c|   4 +-
 gcc/testsuite/gcc.dg/plugin/dump-1.c   |   2 +-
 gcc/testsuite/gcc.dg/plugin/dump-2.c   |   2 +-
 gcc/testsuite/gcc.dg/pr26570.c |   1 +
 gcc/testsuite/gcc.dg/pr71969-2.c   |   2 +-
 gcc/testsuite/gcc.dg/pr71969-3.c   |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/inline-11.c  |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/inline-3.c   |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/inline-4.c   |   6 +-
 gcc/testsuite/gcc.dg/tree-ssa/inline-8.c   |   2 +-
 gcc/testsuite/gfortran.dg/pr79966.f90  |   2 +-
 gcc/tree-inline.c  |  20 ++-
 35 files changed, 468 insertions(+), 174 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/inline-4.C
 create mode 100644 gcc/testsuite/gcc.dg/ipa/fopt-info-inline-1.c

-- 
1.8.5.3



[PATCH, testsuite]: Use int128 effective target for gcc.dg/pr87874.c

2018-11-07 Thread Uros Bizjak
2018-11-07  Uros Bizjak  

* gcc.dg/pr87874.c: Compile only for int128 effective target.

Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.

Uros.
diff --git a/gcc/testsuite/gcc.dg/pr87874.c b/gcc/testsuite/gcc.dg/pr87874.c
index 3ab5dcf68ffb..1480a5e54937 100644
--- a/gcc/testsuite/gcc.dg/pr87874.c
+++ b/gcc/testsuite/gcc.dg/pr87874.c
@@ -1,9 +1,8 @@
-/* { dg-do compile } */
+/* { dg-do compile { target int128 } } */
 /* { dg-options "-g -O1 -fgcse -fno-dce -fno-tree-ccp -fno-tree-coalesce-vars 
-fno-tree-copy-prop -fno-tree-dce -fno-tree-dominator-opts -fno-tree-fre 
-fno-tree-loop-optimize -fno-tree-sink" } */
 
 int *vk;
 int m2;
-#if __SIZEOF_INT128__
 __int128 nb;
 
 void
@@ -32,4 +31,3 @@ em (int u5, int fo, int s7)
   }
 }
 }
-#endif


Simplify types of arrays

2018-11-07 Thread Jan Hubicka
Hi,
this patch simplifies types of arrays so we don't propagate duplicates
when a record/union contains an array of pointers.
With this we still miss simplification of pointers to arrays of
structures (where we need to rebuild the array the same way as we rebuild
pointers) and enumerated types.  Handling those should make simplification
of types complete.  Neither of the two seems very critical for the GCC
build; with the patch we are down to 24 duplicated types in bootstrap
(from tens of thousands a week ago).  I will collect data on Firefox, but
things look quite good.

The patch works, but I am somewhat nervous because modifying a type in place
affects its type_hash_canon_hash and friends.  There are pre-existing
modifications to function parameters and things seem to just work,
but I wonder if we have any strategy for keeping hashes in tree.c
consistent across free-lang data?  Or are all those hashes unused/freed at
this time?

lto-bootstrapped/regtested x86_64-linux.

Honza

* tree.c (free_lang_data_in_type): Simplify types of arrays.
Index: tree.c
===
--- tree.c  (revision 265877)
+++ tree.c  (working copy)
@@ -5320,6 +5320,8 @@ free_lang_data_in_type (tree type, struc
  TREE_PURPOSE (p) = NULL;
}
 }
+  else if (TREE_CODE (type) == ARRAY_TYPE)
+TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
   else if (RECORD_OR_UNION_TYPE_P (type))
 {
   /* Remove members that are not FIELD_DECLs from the field list


Re: [PATCH, GCC, AARCH64, 6/6] Enable BTI: Add configure option for BTI and PAC-RET

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:38:46PM -0500, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
> 
> This patch adds a new configure option for enabling BTI and return
> address signing by default with --enable-standard-branch-protection.
> This is equivalent to -mbranch-protection=standard which would
> imply -mbranch-protection=pac-ret+bti.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu with
> and without the configure option turned on.
> Also tested on aarch64-none-elf with and without configure option with a
> BTI enabled aem. Only 2 regressions and these were because newlib
> requires patches to protect hand coded libraries with BTI.
> 
> Is this ok for trunk?

With a tweak to the comment above your changes in aarch64.c, yes this is OK.

> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>   * config/aarch64/aarch64.c (aarch64_override_options): Add case to check
>   configure option to set BTI and Return Address Signing.
>   * configure.ac: Add --enable-standard-branch-protection and
>   --disable-standard-branch-protection.
>   * configure: Regenerated.
>   * doc/install.texi: Document the same.
> 
> *** gcc/testsuite/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>   * gcc.target/aarch64/bti-1.c: Update test to not add command
>   line option when configure with bti.
>   * gcc.target/aarch64/bti-2.c: Likewise.
>   * lib/target-supports.exp
>   (check_effective_target_default_branch_protection):
>   Add configure check for --enable-standard-branch-protection.
> 

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 12a55a640de4fdc5df21d313c7ea6841f1daf3f2..a1a5b7b464eaa2ce67ac66d9aea837159590aa07
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11558,6 +11558,26 @@ aarch64_override_options (void)
>if (!selected_tune)
>  selected_tune = selected_cpu;
>  
> +  if (aarch64_enable_bti == 2)
> +{
> +#ifdef TARGET_ENABLE_BTI
> +  aarch64_enable_bti = 1;
> +#else
> +  aarch64_enable_bti = 0;
> +#endif
> +}
> +
> +  /* No command-line option yet.  */

This is too broad. Can you narrow this down to which command line option this
relates to, and what the expected default behaviours are (for both LP64 and
ILP32)?

Thanks,
James

> +  if (accepted_branch_protection_string == NULL && !TARGET_ILP32)
> +{
> +#ifdef TARGET_ENABLE_PAC_RET
> +  aarch64_ra_sign_scope = AARCH64_FUNCTION_NON_LEAF;
> +  aarch64_ra_sign_key = AARCH64_KEY_A;
> +#else
> +  aarch64_ra_sign_scope = AARCH64_FUNCTION_NONE;
> +#endif
> +}
> +
>  #ifndef HAVE_AS_MABI_OPTION
>/* The compiler may have been configured with 2.23.* binutils, which does
>   not have support for ILP32.  */



Re: [PATCH, GCC, AARCH64, 4/6] Enable BTI: Add new to -mbranch-protection.

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:38:25PM -0500, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
> 
> NOTE: This patch is dependent on Sam Tebbs patch to deprecate
> -msign-return-address and add new -mbranch-protection option
> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00104.html
> 
> This patch updates the CLI of -mbranch-protection to add "bti" as a new
> type of branch protection and also adds it to the definitions of "none" and
> "standard". Since the BTI instructions, just like the return address
> signing instructions, are in the HINT space, this option is not limited
> to the ARMv8.5-A architecture version.
> 
> The option does not really do anything functional.
> The functional changes are in the next patch. I am initializing the 
> target variable aarch64_enable_bti to 2 since I am also adding a
> configure option in a later patch and a value different from 0 and 1
> would help identify if it has already been updated.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu.
> Is this ok for trunk?

OK.

Thanks,
James

> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>   * config/aarch64/aarch64-protos.h (aarch64_bti_enabled):
>   Declare.
>   * config/aarch64/aarch64.c
>   (aarch64_handle_no_branch_protection): Disable bti for
>   -mbranch-protection=none.
>   (aarch64_handle_standard_branch_protection): Enable bti for
>   -mbranch-protection=standard.
>   (aarch64_handle_bti_protection): Enable bti for "bti" in the
>   string to -mbranch-protection.
>   (aarch64_bti_enabled): Check if bti is enabled.
>   * config/aarch64/aarch64.opt: Declare target variable.
>   * doc/invoke.texi: Add bti to the -mbranch-protection
>   documentation.
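
The "initialize to 2" trick mentioned in the quoted description is a common tri-state default.  Below is a minimal sketch of how such a value can be resolved later, assuming a configure-time default is available as a boolean.  The names BTI_OFF, BTI_ON, BTI_UNSET and resolve_bti are hypothetical; only the 0/1/2 encoding comes from the message.

```cpp
#include <cassert>

// Hypothetical names for the three states; the message above only says that
// the variable starts at 2, a value different from 0 and 1.
enum { BTI_OFF = 0, BTI_ON = 1, BTI_UNSET = 2 };

// Resolve the tri-state value.  An explicit command-line choice (0 or 1)
// always wins; the configure-time default is used only if the value still
// carries its initial "unset" marker.
int resolve_bti (int option_value, bool configured_default_on)
{
  if (option_value == BTI_UNSET)
    return configured_default_on ? BTI_ON : BTI_OFF;
  return option_value;
}
```

The point of the third state is that "the user said no" and "the user said nothing" stay distinguishable until the defaults are applied.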



Re: Small typo in iconv.m4

2018-11-07 Thread Simon Marchi

On 2018-11-06 11:37, Hafiz Abid Qadeer wrote:

Hi All,
I was investigating a character set related problem with windows hosted
GDB and I tracked it down to a typo in iconv.m4. This typo caused
libiconv detection to fail and related support was not built into gdb.

The problem is with the following line.
CPPFLAGS="$LIBS $INCICONV"
which should have been
CPPFLAGS="$CPPFLAGS $INCICONV"

OK to commit the attached patch?

2018-11-06  Hafiz Abid Qadeer  

* config/iconv.m4 (AM_ICONV_LINK): Don't overwrite CPPFLAGS.
Append $INCICONV to it.
* gcc/configure: Regenerate.
* libcpp/configure: Likewise.
* libstdc++-v3/configure: Likewise.
* intl/configure: Likewise.

Thanks,


Seems good from my point of view, but I can't approve.

Simon


[PATCH] Fix part of PR87913

2018-11-07 Thread Richard Biener


The following fixes MIN/MAX recognition for comparisons that
we turned into equality compares (for tests like unsigned < 1).

It turns out we don't do a very good job in expanding them,
nevertheless this GIMPLE level fix is good and we get slight
improvements in code generation.
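
To illustrate why the equality form is still a MIN/MAX, here is a small standalone sketch (plain C++, not GIMPLE).  The function names are made up for the example; the expressions mirror f() and i() from the new testcase.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>

// Form as written in the source: an ordered comparison against 1.
unsigned clamp_low_ordered (unsigned num) { return num < 1 ? 1 : num; }

// The same test after "unsigned < 1" has been turned into "== 0".
unsigned clamp_low_equality (unsigned num) { return num == 0 ? 1 : num; }

// What a MAX_EXPR <num, 1> computes.
unsigned clamp_low_max (unsigned num) { return std::max (num, 1u); }

// Upper-bound counterpart: "num > INT_MAX - 1" is the same as
// "num == INT_MAX".
int clamp_high_equality (int num)
{
  return num == INT_MAX ? INT_MAX - 1 : num;
}

// What a MIN_EXPR <num, INT_MAX - 1> computes.
int clamp_high_min (int num) { return std::min (num, INT_MAX - 1); }
```

All three lower-bound variants agree for every unsigned input, and both upper-bound variants agree for every int input, which is what allows an EQ/NE compare against the type's extreme value to be rewritten as an ordered compare against the neighbouring value.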

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

From 0b801cf0ed81d8bd0945e68196efc7d1fc676562 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Wed, 7 Nov 2018 16:20:54 +0100
Subject: [PATCH] fix-pr87913

PR tree-optimization/87913
* tree-ssa-phiopt.c (minmax_replacement): Turn EQ/NE compares
of extreme values to ordered comparisons.

* gcc.dg/tree-ssa/phi-opt-20.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-20.c 
b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-20.c
new file mode 100644
index 000..c310308e3a6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-20.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-phiopt1" } */
+
+unsigned int f(unsigned int num)
+{
+  return num < 1 ? 1 : num;
+}
+
+unsigned int g(unsigned int num)
+{
+  return num > (unsigned)__INT_MAX__ * 2 ? (unsigned)__INT_MAX__ * 2 : num;
+}
+
+int h(int num)
+{
+  return num < -__INT_MAX__ ? -__INT_MAX__ : num;
+}
+
+int i(int num)
+{
+  return num > __INT_MAX__-1 ? __INT_MAX__-1 : num;
+}
+
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 2 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 2 "phiopt1" } } */
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index 07845101b86..64039e2484e 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -1204,7 +1204,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb,
edge e0, edge e1, gimple *phi,
tree arg0, tree arg1)
 {
-  tree result, type;
+  tree result, type, rhs;
   gcond *cond;
   gassign *new_stmt;
   edge true_edge, false_edge;
@@ -1220,6 +1220,25 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb,
 
   cond = as_a <gcond *> (last_stmt (cond_bb));
   cmp = gimple_cond_code (cond);
+  rhs = gimple_cond_rhs (cond);
+
+  /* Turn EQ/NE of extreme values to order comparisons.  */
+  if ((cmp == NE_EXPR || cmp == EQ_EXPR)
+  && TREE_CODE (rhs) == INTEGER_CST)
+{
+  if (wi::eq_p (wi::to_wide (rhs), wi::min_value (TREE_TYPE (rhs))))
+   {
+ cmp = (cmp == EQ_EXPR) ? LT_EXPR : GE_EXPR;
+ rhs = wide_int_to_tree (TREE_TYPE (rhs),
+ wi::min_value (TREE_TYPE (rhs)) + 1);
+   }
+  else if (wi::eq_p (wi::to_wide (rhs), wi::max_value (TREE_TYPE (rhs))))
+   {
+ cmp = (cmp == EQ_EXPR) ? GT_EXPR : LE_EXPR;
+ rhs = wide_int_to_tree (TREE_TYPE (rhs),
+ wi::max_value (TREE_TYPE (rhs)) - 1);
+   }
+}
 
   /* This transformation is only valid for order comparisons.  Record which
  operand is smaller/larger if the result of the comparison is true.  */
@@ -1228,7 +1247,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb,
   if (cmp == LT_EXPR || cmp == LE_EXPR)
 {
   smaller = gimple_cond_lhs (cond);
-  larger = gimple_cond_rhs (cond);
+  larger = rhs;
   /* If we have smaller < CST it is equivalent to smaller <= CST-1.
 Likewise smaller <= CST is equivalent to smaller < CST+1.  */
   if (TREE_CODE (larger) == INTEGER_CST)
@@ -1255,7 +1274,7 @@ minmax_replacement (basic_block cond_bb, basic_block 
middle_bb,
 }
   else if (cmp == GT_EXPR || cmp == GE_EXPR)
 {
-  smaller = gimple_cond_rhs (cond);
+  smaller = rhs;
   larger = gimple_cond_lhs (cond);
   /* If we have larger > CST it is equivalent to larger >= CST+1.
 Likewise larger >= CST is equivalent to larger > CST-1.  */


Re: PR fortran/87919 patch for -fno-dec-structure

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 03:07:04PM +, Mark Eggleston wrote:

>   PR fortran/87919
>   * options.c (gfc_handle_option): Removed case OPT_fdec_structure
>   as it breaks the handling of -fno-dec-structure.

No entries for the tests, i.e.
* gfortran.dg/pr87919-dec-structure-1.f: New test.
* gfortran.dg/pr87919-dec-structure-2.f: New test.
* gfortran.dg/pr87919-dec-structure-3.f: New test.
* gfortran.dg/pr87919-dec-structure-4.f: New test.

> diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
> index 73f5389361d9..3b7c2d40fe8a 100644
> --- a/gcc/fortran/options.c
> +++ b/gcc/fortran/options.c
> @@ -761,10 +761,6 @@ gfc_handle_option (size_t scode, const char *arg, 
> HOST_WIDE_INT value,
>/* Enable all DEC extensions.  */
>set_dec_flags (1);
>break;
> -
> -case OPT_fdec_structure:
> -  flag_dec_structure = 1;
> -  break;
>  }
>  
>Fortran_handle_option_auto (&global_options, &global_options_set, 

LGTM, but I'll defer the final review to Fortran maintainers.

> diff --git a/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f 
> b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f
> new file mode 100644
> index ..4dd34082b97a
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f
> @@ -0,0 +1,21 @@
> +! { dg-do compile }
> +!
> +! PR/fortran/87919

Without the first /, i.e.
! PR fortran/87919
(in all tests).

> +++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-2.f
> @@ -0,0 +1,22 @@
> +! { dg-do run }
> +! { dg-options "-fdec" }
> +!
> +! PR/fortran/87919
> +!
> +! Should compile anf run with the -fdec option

s/anf/and/ (in several tests).

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-3.f
> @@ -0,0 +1,22 @@
> +! { dg-do run }
> +! { dg-options "-fdec-structure" }
> +!
> +! PR/fortran/87919
> +!
> +! Should compile anf run with the -fdec option

s/-fdec/-fdec-structure/ in this case.

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-4.f
> @@ -0,0 +1,22 @@
> +! { dg-do compile }
> +! { dg-options "-fdec -fno-dec-structure" }
> +!
> +! PR/fortran/87919
> +!
> +! Should fail to compile with the -fdec and -fno-dec-structure option

s/option/options/

I'd suggest adding another test, with
! { dg-options "-fdec-structure -fno-dec-structure" }
where the options cancel each other and the result is no DEC structure
support.

Jakub


Re: [PATCH, GCC, AARCH64, 2/6] Add new arch command line features from ARMv8.5-A

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:37:41PM -0500, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
> 
> This patch adds all the command line features that are added by ARMv8.5.
> Optional extensions to armv8.5-a:
> +rng : Random number Generation Instructions.
> +memtag : Memory Tagging Extension.
> 
> ARMv8.5-A features that are optional to older arch:
> +sb : Speculation barrier instruction.
> +ssbs: Speculative Store Bypass Safe instruction.
> +predres: Execution and Data Prediction Restriction instructions.
> 
> All of the above only affect the assembler and have already (or almost
> for a couple of cases) gone in the trunk of binutils.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu.
> 
> Is this ok for trunk?

OK, but this will need to be rebased to keep the AARCH64_FL_* in order.

Thanks,
James

> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>   * config/aarch64/aarch64-option-extensions.def: Define
>   AARCH64_OPT_EXTENSION for memtag, rng, sb, ssbs and predres.
>   * gcc/config/aarch64/aarch64.h (AARCH64_FL_RNG): New.
>   (AARCH64_FL_MEMTAG, AARCH64_FL_SB, AARCH64_FL_SSBS): New.
>   (AARCH64_FL_PREDRES): New.
>   (AARCH64_FL_FOR_ARCH8_5): Add AARCH64_FL_SB, AARCH64_FL_SSBS and
>   AARCH64_FL_PREDRES by default.
>   * gcc/doc/invoke.texi: Document rng, memtag, sb, ssbs and
>   predres.
> 


Re: [PATCH, GCC, AARCH64, 1/6] Enable ARMv8.5-A in gcc

2018-11-07 Thread James Greenhalgh
On Fri, Nov 02, 2018 at 01:37:33PM -0500, Sudakshina Das wrote:
> Hi
> 
> This patch is part of a series that enables ARMv8.5-A in GCC and
> adds Branch Target Identification Mechanism.
> (https://developer.arm.com/products/architecture/cpu-architecture/a-profile/exploration-tools)
> 
> This patch adds the -march option for armv8.5-a.
> 
> Bootstrapped and regression tested with aarch64-none-linux-gnu.
> Is this ok for trunk?

One minor tweak, otherwise OK.

> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Sudakshina Das  
> 
>   * config/aarch64/aarch64-arches.def: Define AARCH64_ARCH for
>   ARMv8.5-A.
>   * gcc/config/aarch64/aarch64.h (AARCH64_FL_V8_5): New.
>   (AARCH64_FL_FOR_ARCH8_5, AARCH64_ISA_V8_5): New.
>   * gcc/doc/invoke.texi: Document ARMv8.5-A.

> diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
> index 
> fa9af26fd40fd23b1c9cd6da9b6300fd77089103..b324cdd2fede33af13c03362750401f9eb1c9a90
>  100644
> --- a/gcc/config/aarch64/aarch64.h
> +++ b/gcc/config/aarch64/aarch64.h
> @@ -170,6 +170,8 @@ extern unsigned aarch64_architecture_version;
>  #define AARCH64_FL_SHA3       (1 << 18)  /* Has ARMv8.4-a SHA3 and SHA512.  */
>  #define AARCH64_FL_F16FML     (1 << 19)  /* Has ARMv8.4-a FP16 extensions.  */
>  #define AARCH64_FL_RCPC8_4    (1 << 20)  /* Has ARMv8.4-a RCPC extensions.  */
> +/* ARMv8.5-A architecture extensions.  */
> +#define AARCH64_FL_V8_5       (1 << 22)  /* Has ARMv8.5-A features.  */
>  
>  /* Statistical Profiling extensions.  */
>  #define AARCH64_FL_PROFILE    (1 << 21)

Let's keep this in order. 20, 21, 22.

Thanks,
James




Re: Clear more useless flags

2018-11-07 Thread Richard Biener
On Wed, 7 Nov 2018, Jan Hubicka wrote:

> > > +  /* TREE_PUBLIC is used to tell if type is anonymous.  */
> > > +  DECL_EXTERNAL (decl) = 0;
> > > +  TYPE_DECL_SUPPRESS_DEBUG (decl) = 0;
> > 
> > DECL_EXTERNAL and TYPE_DECL_SUPPRESS_DEBUG map to the same decl_flag_1 ...
> > so I'd say you should use TYPE_DECL_SUPPRESS_DEBUG only here.
> 
> I see, print_tree prints both; that is why I added both.  Probably
> something to fix :)

Yeah ;)

> thanks!
> Honza
> > 
> > >DECL_INITIAL (decl) = NULL_TREE;
> > >DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
> > > +  DECL_MODE (decl) = VOIDmode;
> > >TREE_TYPE (decl) = void_type_node;
> > >SET_DECL_ALIGN (decl, 0);
> > >  }
> > 
> > OK with that change.
> > 
> > Thanks,
> > Richard.
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: Clear more useless flags

2018-11-07 Thread Jan Hubicka
> > +  /* TREE_PUBLIC is used to tell if type is anonymous.  */
> > +  DECL_EXTERNAL (decl) = 0;
> > +  TYPE_DECL_SUPPRESS_DEBUG (decl) = 0;
> 
> DECL_EXTERNAL and TYPE_DECL_SUPPRESS_DEBUG map to the same decl_flag_1 ...
> so I'd say you should use TYPE_DECL_SUPPRESS_DEBUG only here.

I see, print_tree prints both; that is why I added both.  Probably
something to fix :)
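
A toy sketch of the aliasing Richard points out, for anyone puzzled by why clearing both is redundant.  This is not GCC's tree representation: toy_decl and the four accessors are hypothetical, and only the field name decl_flag_1 is taken from the thread.

```cpp
#include <cassert>

// Toy declaration node.  Only the field name comes from the thread; the
// layout is an assumption made for the example.
struct toy_decl
{
  unsigned decl_flag_1 : 1;
};

// Two accessors with different meanings that share one storage bit, in the
// way DECL_EXTERNAL and TYPE_DECL_SUPPRESS_DEBUG are described above.
inline void set_external (toy_decl &d, bool v) { d.decl_flag_1 = v; }
inline void set_suppress_debug (toy_decl &d, bool v) { d.decl_flag_1 = v; }
inline bool is_external (const toy_decl &d) { return d.decl_flag_1; }
inline bool suppresses_debug (const toy_decl &d) { return d.decl_flag_1; }
```

Setting the bit through one name makes it visible through the other, so a dumper that prints both names reports the same bit twice, and clearing it through either accessor is enough.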

thanks!
Honza
> 
> >DECL_INITIAL (decl) = NULL_TREE;
> >DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
> > +  DECL_MODE (decl) = VOIDmode;
> >TREE_TYPE (decl) = void_type_node;
> >SET_DECL_ALIGN (decl, 0);
> >  }
> 
> OK with that change.
> 
> Thanks,
> Richard.


PR fortran/87919 patch for -fno-dec-structure

2018-11-07 Thread Mark Eggleston
Please find attached the patch and a ChangeLog entry. This is my first 
patch; apologies for any mistakes in the submission process.


This patch is the simple removal of an OPT_fdec_structure case from a 
switch statement.  It was noticeable because there is no corresponding 
code for the other DEC-specific options.  I spotted this before the 
creation of PR fortran/87919, so I now know that this fixes the broken 
behaviour of -fno-dec-structure.
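
To make the failure mode concrete, here is a minimal sketch of how a handler that ignores the option's value defeats the negated form.  It is a simplification under stated assumptions, not gfortran's option machinery: process and its parameters are hypothetical, each element of VALUES stands for one occurrence of the option on the command line (1 for -fdec-structure, 0 for -fno-dec-structure), and the generic code is assumed to store that value before the hand-written case runs.

```cpp
#include <cassert>
#include <vector>

// Returns the final state of a toy flag after processing VALUES in order.
// With BUGGY set, the sketch mimics a hand-written case that forces the
// flag to 1 no matter which form of the option was given.
int process (const std::vector<int> &values, bool buggy)
{
  int flag = 0;
  for (int value : values)
    {
      flag = value;  // generic step: store what the user asked for
      if (buggy)
	flag = 1;    // hand-written case: the value is ignored
    }
  return flag;
}
```

With the forcing case present, process ({1, 0}, true) leaves the flag at 1, so the negated option has no effect; without it, process ({1, 0}, false) gives 0 and the last option on the command line wins.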


The patch contains a change to gcc/fortran/options.c and four testcases 
pr87919-dec-structure-*.f where * is one of 1, 2, 3 and 4.


The testcases are specified to compile with no specific options, -fdec, 
-fdec-structure and both -fdec and -fno-dec-structure.


After building the compiler on x86_64 I got the following results 
aggregated from make -j 5 check-fortran:


        === gfortran Summary ===

# of expected passes        48184
# of expected failures        103
# of unsupported tests        79


--
https://www.codethink.co.uk/privacy.html

PR fortran/87919
* options.c (gfc_handle_option): Removed case OPT_fdec_structure
as it breaks the handling of -fno-dec-structure.

diff --git a/gcc/fortran/options.c b/gcc/fortran/options.c
index 73f5389361d9..3b7c2d40fe8a 100644
--- a/gcc/fortran/options.c
+++ b/gcc/fortran/options.c
@@ -761,10 +761,6 @@ gfc_handle_option (size_t scode, const char *arg, HOST_WIDE_INT value,
   /* Enable all DEC extensions.  */
   set_dec_flags (1);
   break;
-
-case OPT_fdec_structure:
-  flag_dec_structure = 1;
-  break;
 }
 
   Fortran_handle_option_auto (&global_options, &global_options_set, 
diff --git a/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f
new file mode 100644
index ..4dd34082b97a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-1.f
@@ -0,0 +1,21 @@
+! { dg-do compile }
+!
+! PR/fortran/87919
+!
+! Should fail to compile without the -fdec or -fdec-structure options
+!
+! Contributed by Mark Eggleston 
+
+  program test
+
+structure /info/ ! { dg-error "-fdec-structure" }
+  integer a
+  real b
+ end structure   ! { dg-error "END PROGRAM" }
+
+record /info/ s  ! { dg-error "-fdec-structure" }
+s.a = 199! { dg-error "Unclassifiable" }
+s.b = 7.6! { dg-error "Unclassifiable" }
+write (*,*) s.a  ! { dg-error "Syntax error in WRITE" }
+write (*,*) s.b  ! { dg-error "Syntax error in WRITE" }
+  end program test
diff --git a/gcc/testsuite/gfortran.dg/pr87919-dec-structure-2.f b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-2.f
new file mode 100644
index ..e2ebfe344abb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-2.f
@@ -0,0 +1,22 @@
+! { dg-do run }
+! { dg-options "-fdec" }
+!
+! PR/fortran/87919
+!
+! Should compile anf run with the -fdec option
+!
+! Contributed by Mark Eggleston 
+!
+ program test
+
+structure /info/
+  integer a
+  real b
+ end structure
+
+record /info/ s
+s.a = 199
+s.b = 7.6
+if (s.a .ne. 199) stop 1
+if (abs(s.b - 7.6) .gt. 1e-5) stop 2
+  end program test
diff --git a/gcc/testsuite/gfortran.dg/pr87919-dec-structure-3.f b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-3.f
new file mode 100644
index ..e543b37a4371
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-3.f
@@ -0,0 +1,22 @@
+! { dg-do run }
+! { dg-options "-fdec-structure" }
+!
+! PR/fortran/87919
+!
+! Should compile anf run with the -fdec option
+!
+! Contributed by Mark Eggleston 
+!
+  program test
+
+structure /info/
+  integer a
+  real b
+ end structure
+
+record /info/ s
+s.a = 199
+s.b = 7.6
+if (s.a .ne. 199) stop 1
+if (abs(s.b - 7.6) .gt. 1e-5) stop 2
+  end program test
diff --git a/gcc/testsuite/gfortran.dg/pr87919-dec-structure-4.f b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-4.f
new file mode 100644
index ..fa5f1d7a3436
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr87919-dec-structure-4.f
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! { dg-options "-fdec -fno-dec-structure" }
+!
+! PR/fortran/87919
+!
+! Should fail to compile with the -fdec and -fno-dec-structure option
+!
+! Contributed by Mark Eggleston 
+!
+  program test
+
+structure /info/ ! { dg-error "-fdec-structure" }
+  integer a
+  real b
+ end structure   ! { dg-error "END PROGRAM" }
+
+record /info/ s  ! { dg-error "-fdec-structure" }
+s.a = 199! { dg-error "Unclassifiable" }
+s.b = 7.6! { dg-error "Unclassifiable" }
+write (*,*) s.a  ! { dg-error "Syntax error in WRITE" }
+write (*,*) s.b  ! { dg-error "Syntax error in WRITE" }
+  end program test


[PATCH] Fix PR87914

2018-11-07 Thread Richard Biener


This PR shows one example (IIRC I've seen others recently) where we
fail to handle outer loop vectorization because we do a poor job
identifying "safe" nested cycles.  This improves the situation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

I've also built SPEC 2006 CPU with and without LTO on a Haswell machine.

I do expect fallout since the reduction code is still incredibly 
fragile...

Richard.

From 854d80f1822ae6b37afa865ae49d64ceaee68b26 Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Wed, 7 Nov 2018 12:19:45 +0100
Subject: [PATCH] fix-pr87914

2018-11-07  Richard Biener  

PR tree-optimization/87914
* tree-vect-loop.c (vect_is_simple_reduction): Improve detection
of nested cycles.
(vectorizable_reduction): Handle shifts and rotates by dispatching
to vectorizable_shift.
* tree-vect-stmts.c (vect_get_vec_def_for_operand_1): Handle
in-loop uses of vect_nested_cycle defs.  Merge cycle and internal
def cases.
(vectorizable_shift): Export and handle being called as
vect_nested_cycle.
(vect_analyze_stmt): Call vectorizable_shift after
vectorizable_reduction.
* tree-vectorizer.h (vectorizable_shift): Declare.

* lib/target-supports.exp (check_effective_target_vect_var_shift): New.
(check_avx2_available): Likewise.
* g++.dg/vect/pr87914.cc: New testcase.

diff --git a/gcc/testsuite/g++.dg/vect/pr87914.cc 
b/gcc/testsuite/g++.dg/vect/pr87914.cc
new file mode 100644
index 000..12fbba3af2f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr87914.cc
@@ -0,0 +1,49 @@
+// { dg-do run }
+// { dg-additional-options "-fopenmp-simd" }
+// { dg-additional-options "-mavx2" { target { avx2_runtime } } }
+
+extern "C" int memcmp(const void *s1, const void *s2, __SIZE_TYPE__ n);
+extern "C" void abort(void);
+
+template <typename T>
+T reverseBits(T x)
+{
+  unsigned int s = sizeof(x) * 8;
+  T mask = ~T(0);
+  while ((s >>= 1) > 0)
+{
+  mask ^= (mask << s);
+  x = ((x >> s) & mask) | ((x << s) & ~mask); // unsupported use in stmt
+}
+  return x;
+}
+
+void __attribute__((noinline,noipa))
+test_reverseBits(unsigned* x)
+{
+#pragma omp simd aligned(x:32)
+  for (int i = 0; i < 16; ++i)
+x[i] = reverseBits(x[i]); // couldn't vectorize loop
+}
+
+int main()
+{
+  unsigned arr[16] __attribute__((aligned(32)))
+= { 0x01020304, 0x05060708, 0x0a0b0c0d, 0x0e0f1011,
+0x11121314, 0x45065708, 0xfa0b3c0du, 0x0e0f1211,
+0x21222324, 0x55066708, 0xfa0b2c0du, 0x1e0f1011,
+0x31323334, 0x65067708, 0xfa0b5c0du, 0x0e3f1011 };
+  unsigned arr2[16]
+= { 0x20c04080, 0x10e060a0, 0xb030d050, 0x8808f070u,
+0x28c84888, 0x10ea60a2, 0xb03cd05f, 0x8848f070u,
+0x24c44484, 0x10e660aa, 0xb034d05f, 0x8808f078u, 
+0x2ccc4c8c, 0x10ee60a6, 0xb03ad05f, 0x8808fc70u };
+
+  test_reverseBits (arr);
+
+  if (memcmp (arr, arr2, sizeof (arr)) != 0)
+abort ();
+  return 0;
+}
+
+// { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" { target { vect_var_shift && vect_int } } } }
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 9780e53dfc0..1d5ad9abdca 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -5316,6 +5316,15 @@ proc check_effective_target_vect_shift { } {
 && [check_effective_target_s390_vx]) }}]
 }
 
+# Return 1 if the target supports hardware vector shift by register operation.
+
+proc check_effective_target_vect_var_shift { } {
+return [check_cached_effective_target_indexed vect_var_shift {
+  expr {(([istarget i?86-*-*] || [istarget x86_64-*-*])
+&& [check_avx2_available])
+  }}]
+}
+
 proc check_effective_target_whole_vector_shift { } {
 if { [istarget i?86-*-*] || [istarget x86_64-*-*]
 || [istarget ia64-*-*]
@@ -7150,6 +7159,19 @@ proc check_avx_available { } {
   return 0;
 }
 
+# Return true if we are compiling for AVX2 target.
+
+proc check_avx2_available { } {
+  if { [check_no_compiler_messages avx_available assembly {
+#ifndef __AVX2__
+#error unsupported
+#endif
+  } ""] } {
+return 1;
+  }
+  return 0;
+}
+
 # Return true if we are compiling for SSSE3 target.
 
 proc check_ssse3_available { } {
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 51be405b5a0..e392aab1d52 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2880,6 +2880,11 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info,
   return NULL;
 }
 
+  /* For inner loop reductions in nested vectorization there are no
+ constraints on the number of uses in the inner loop.  */
+  if (loop == vect_loop->inner)
+   continue;
+
   nloop_uses++;
   if (nloop_uses > 1)
 {
@@ -2938,13 +2943,19 @@ vect_is_simple_reduction (loop_vec_info loop_info, stmt_vec_info phi_info,
   else
  

Re: Clear more useless flags

2018-11-07 Thread Richard Biener
On Wed, 7 Nov 2018, Jan Hubicka wrote:

> Hi,
> this patch enables a bit more merging by clearing more flags that are
> unnecessary and differ in practice across different copies of the same ODR
> types.
> 
> lto-bootstrapped/regtested x86_64-linux, OK?
>   * tree.c (fld_incomplete_type_of): Clear TREE_ADDRESSABLE flag.
>   (free_lang_data_in_decl): Set TREE_ADDRESSABLE for public functions
>   and variables; clear DECL_EXTERNAL, TYPE_DECL_SUPPRESS_DEBUG and
>   DECL_MODE on TYPE_DECLs.
> Index: tree.c
> ===
> --- tree.c(revision 265875)
> +++ tree.c(working copy)
> @@ -5197,6 +5197,7 @@ fld_incomplete_type_of (tree t, struct f
> TYPE_SIZE_UNIT (copy) = NULL;
> TYPE_CANONICAL (copy) = TYPE_CANONICAL (t);
> TYPE_TYPELESS_STORAGE (copy) = 0;
> +   TREE_ADDRESSABLE (copy) = 0;
> if (AGGREGATE_TYPE_P (t))
>   {
> TYPE_FIELDS (copy) = NULL;
> @@ -5496,6 +5497,17 @@ free_lang_data_in_decl (tree decl, struc
>   if (TREE_CODE (decl) == FUNCTION_DECL)
>  {
>struct cgraph_node *node;
> +  /* Frontends do not set TREE_ADDRESSABLE on public variables even though
> +  the address may be taken in other unit, so this flag has no practical
> +  use for middle-end.
> +
> +  It would make more sense if frontends set TREE_ADDRESSABLE to 0 only
> +  for public objects that indeed can not be adressed, but it is not
> +  the case.  Set the flag to true so we do not get merge failures for
> +  i.e. virtual tables between units that take address of it and
> +  units that don't.  */
> +  if (TREE_PUBLIC (decl))
> + TREE_ADDRESSABLE (decl) = true;
>TREE_TYPE (decl) = fld_simplified_type (TREE_TYPE (decl), fld);
>if (!(node = cgraph_node::get (decl))
> || (!node->definition && !node->clones))
> @@ -5551,6 +5563,9 @@ free_lang_data_in_decl (tree decl, struc
>  }
>else if (VAR_P (decl))
>  {
> +  /* See comment above why we set the flag for functions.  */
> +  if (TREE_PUBLIC (decl))
> + TREE_ADDRESSABLE (decl) = true;
>if ((DECL_EXTERNAL (decl)
>  && (!TREE_STATIC (decl) || !TREE_READONLY (decl)))
> || (decl_function_context (decl) && !TREE_STATIC (decl)))
> @@ -5560,8 +5575,12 @@ free_lang_data_in_decl (tree decl, struc
>  {
>DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
>DECL_VISIBILITY_SPECIFIED (decl) = 0;
> +  /* TREE_PUBLIC is used to tell if type is anonymous.  */
> +  DECL_EXTERNAL (decl) = 0;
> +  TYPE_DECL_SUPPRESS_DEBUG (decl) = 0;

DECL_EXTERNAL and TYPE_DECL_SUPPRESS_DEBUG map to the same decl_flag_1 ...
so I'd say you should use TYPE_DECL_SUPPRESS_DEBUG only here.
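
To illustrate the point about the shared bit, here is a minimal sketch. It assumes a much simplified stand-in for the decl structure; decl_sketch and the SKETCH_ macros are hypothetical and only model the fact, stated above, that both accessors resolve to decl_flag_1.

```cpp
// Hypothetical, simplified stand-in for the decl structure; only the
// shared flag bit is modelled here.
struct decl_sketch
{
  unsigned decl_flag_1 : 1;
};

// Two accessors written in the style of GCC's macros.  Both name the
// same underlying bit, so they are not independent flags.
#define SKETCH_DECL_EXTERNAL(d)            ((d).decl_flag_1)
#define SKETCH_TYPE_DECL_SUPPRESS_DEBUG(d) ((d).decl_flag_1)

// Clearing the flag through one accessor is visible through the other,
// which is why a second assignment would be redundant.
bool clearing_one_clears_both()
{
  decl_sketch d;
  d.decl_flag_1 = 1;
  SKETCH_TYPE_DECL_SUPPRESS_DEBUG (d) = 0;
  return SKETCH_DECL_EXTERNAL (d) == 0;
}
```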

>DECL_INITIAL (decl) = NULL_TREE;
>DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
> +  DECL_MODE (decl) = VOIDmode;
>TREE_TYPE (decl) = void_type_node;
>SET_DECL_ALIGN (decl, 0);
>  }

OK with that change.

Thanks,
Richard.


Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

2018-11-07 Thread Martin Liška
On 11/7/18 3:27 PM, Jan Hubicka wrote:
>> On 11/5/18 10:51 AM, Jan Hubicka wrote:
 @honza: PING

 On 10/3/18 12:53 PM, Martin Liška wrote:
> On 10/3/18 11:04 AM, Jan Hubicka wrote:
>>>
>>> That was promised to be done by Honza Hubička. He's very skilled in IPA 
>>> optimizations and he's aware
>>> of optimizations that cause troubles for live-patching.
>>
>> :) I am not sure how skilful I am, but here is what I arrived to.
>
> Heh! Thanks for the analysis.
>
>>
>>  We have transformations that are modeled as cloning, which are
>>   - inlining  (can't be disabled completely because of always inline, 
>> but -fno-inline
>> does most of stuff)
>>   - cloning (disabled via -fno-ipa-cp)
>>   - ipa-sra (-fno-ipa-sra)
>>   - splitting (-fno-partial-inlining)
>>  These should play well with Martin's tracking code
>
> I hope so!
>
>>
>>  We propagate info about side effects of function:
>>   - function attribute discovery (pure, const, nothrow, malloc)
>> Some of this can be disabled by -fno-ipa-pure-const, but not all
>> of it.
>
> Would it be possible to add option for the remaining ones?
>>>
>>> Sure, I can prepare patch unless you beat me :)
>>
>> Are you sure there's a call to 'analyze_function' where the analysis is done
>> when one sets -fno-ipa-pure-const?
> 
> In set_nothrow_function_flags.  Probably would be good to grep for
> places where node->set__flag is used.

Ok, so for nothrow there are 2 extra passes (GIMPLE and RTL "nothrow" pass) 
that set it.
But if I'm correct, we should not care in the case of the C language, right?

- set_malloc_flag - only called from pass_local_pure_const pass
- set_pure_flag
  a) call from set_const_flag_1 - should be fine as it's from set_const context
  b) from pass_ipa_pure_const::
  c) from tree-profile.c:   node->set_pure_flag (false, false); - which is fine
- set_const_flag - likewise to set_pure_const (except set_const_flag_1)

>> 2018-11-07  Martin Liska  
>>
>>  * common.opt: Add -fipa-stack-alignment flag.
>>  * doc/invoke.texi: Document it.
>>  * final.c (rest_of_clean_state): Guard stack
>>  shrinking with flag.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-11-07  Martin Liska  
>>
>>  * gcc.target/i386/ipa-stack-alignment.c: New test.
>> From 8691490a142228021ed65313a72d176d06966829 Mon Sep 17 00:00:00 2001
>> From: marxin 
>> Date: Wed, 7 Nov 2018 13:31:41 +0100
>> Subject: [PATCH 1/2] Come up with -fipa-reference-addressable flag.
>>
>> gcc/ChangeLog:
>>
>> 2018-11-07  Martin Liska  
>>
>>  * cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename
>>  to ...
>>  (ipa_discover_nonaddressable_vars): ... this.
>>  * common.opt: Come up with new flag -fipa-reference-addressable.
>>  * doc/invoke.texi: Document it.
>>  * ipa-reference.c (propagate): Call the renamed fn.
>>  * ipa-visibility.c (whole_program_function_and_variable_visibility):
>>  Likewise.
>>  * ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to
>>  ...
>>  (ipa_discover_nonaddressable_vars): ... this.  Discover
>>  non-addressable variables only with the newly added flag.
>>  * opts.c: Enable the newly added flag with -O1 and higher
>>  optimization level.
> 
> Hmm, the write-only and readonly flags are not handled in here?

You mean the ChangeLog is not mentioning that ipa_discover_nonaddressable_vars
does read-only and write-only discovery?

Martin

> 
> Honza
> 



Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

2018-11-07 Thread Jan Hubicka
> On 11/5/18 10:51 AM, Jan Hubicka wrote:
> >> @honza: PING
> >>
> >> On 10/3/18 12:53 PM, Martin Liška wrote:
> >>> On 10/3/18 11:04 AM, Jan Hubicka wrote:
> >
> > That was promised to be done by Honza Hubička. He's very skilled in IPA 
> > optimizations and he's aware
> > of optimizations that cause troubles for live-patching.
> 
>  :) I am not sure how skilful I am, but here is what I arrived to.
> >>>
> >>> Heh! Thanks for the analysis.
> >>>
> 
>   We have transformations that are modeled as cloning, which are
>    - inlining  (can't be disabled completely because of always inline, 
>  but -fno-inline
>  does most of stuff)
>    - cloning (disabled via -fno-ipa-cp)
>    - ipa-sra (-fno-ipa-sra)
>    - splitting (-fno-partial-inlining)
>   These should play well with Martin's tracking code
> >>>
> >>> I hope so!
> >>>
> 
>   We propagate info about side effects of function:
>    - function attribute discovery (pure, const, nothrow, malloc)
>  Some of this can be disabled by -fno-ipa-pure-const, but not all
>  of it.
> >>>
> >>> Would it be possible to add option for the remaining ones?
> > 
> > Sure, I can prepare patch unless you beat me :)
> 
> Are you sure there's a call to 'analyze_function' where the analysis is done
> when one sets -fno-ipa-pure-const?

In set_nothrow_function_flags.  Probably would be good to grep for
places where node->set__flag is used.
> 2018-11-07  Martin Liska  
> 
>   * common.opt: Add -fipa-stack-alignment flag.
>   * doc/invoke.texi: Document it.
>   * final.c (rest_of_clean_state): Guard stack
>   shrinking with flag.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * gcc.target/i386/ipa-stack-alignment.c: New test.
> From 8691490a142228021ed65313a72d176d06966829 Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 7 Nov 2018 13:31:41 +0100
> Subject: [PATCH 1/2] Come up with -fipa-reference-addressable flag.
> 
> gcc/ChangeLog:
> 
> 2018-11-07  Martin Liska  
> 
>   * cgraph.h (ipa_discover_readonly_nonaddressable_vars): Rename
>   to ...
>   (ipa_discover_nonaddressable_vars): ... this.
>   * common.opt: Come up with new flag -fipa-reference-addressable.
>   * doc/invoke.texi: Document it.
>   * ipa-reference.c (propagate): Call the renamed fn.
>   * ipa-visibility.c (whole_program_function_and_variable_visibility):
>   Likewise.
>   * ipa.c (ipa_discover_readonly_nonaddressable_vars): Renamed to
>   ...
>   (ipa_discover_nonaddressable_vars): ... this.  Discover
>   non-addressable variables only with the newly added flag.
>   * opts.c: Enable the newly added flag with -O1 and higher
>   optimization level.

Hmm, the write-only and readonly flags are not handled in here?

Honza


Re: GCC options for kernel live-patching (Was: Add a new option to control inlining only on static functions)

2018-11-07 Thread Martin Liška
On 11/5/18 10:51 AM, Jan Hubicka wrote:
>> @honza: PING
>>
>> On 10/3/18 12:53 PM, Martin Liška wrote:
>>> On 10/3/18 11:04 AM, Jan Hubicka wrote:
>
> That was promised to be done by Honza Hubička. He's very skilled in IPA 
> optimizations and he's aware
> of optimizations that cause troubles for live-patching.

 :) I am not sure how skilful I am, but here is what I arrived to.
>>>
>>> Heh! Thanks for the analysis.
>>>

  We have transformations that are modeled as cloning, which are
   - inlining  (can't be disabled completely because of always inline, but 
 -fno-inline
 does most of stuff)
   - cloning (disabled via -fno-ipa-cp)
   - ipa-sra (-fno-ipa-sra)
   - splitting (-fno-partial-inlining)
  These should play well with Martin's tracking code
>>>
>>> I hope so!
>>>

  We propagate info about side effects of function:
   - function attribute discovery (pure, const, nothrow, malloc)
 Some of this can be disabled by -fno-ipa-pure-const, but not all
 of it.
>>>
>>> Would it be possible to add option for the remaining ones?
> 
> Sure, I can prepare patch unless you beat me :)

Are you sure there's a call to 'analyze_function' where the analysis is done
when one sets -fno-ipa-pure-const?

>>>
>>> Nothrow does not have flag but it is obviously not a concern
 for C++
>>>
>>> s/C++/C?
> 
> Yep for C
>>>
   - ipa-pta (disabled by default, -fno-ipa-pta)
   - ipa-reference (list of accessed/modified global vars), disable by 
 -fno-ipa-reference
   - stack alignment requirements (no flag to disable)
>>>
>>> Would it be possible to add flag for it? Can you please point to a location 
>>> where
>>> the optimization happen?
> 
> In expand_call
> 
>   /* Figure out the amount to which the stack should be aligned.  */
>   preferred_stack_boundary = PREFERRED_STACK_BOUNDARY;
>   if (fndecl)
> {
>   struct cgraph_rtl_info *i = cgraph_node::rtl_info (fndecl);
>   /* Without automatic stack alignment, we can't increase preferred
>  stack boundary.  With automatic stack alignment, it is
>  unnecessary since unless we can guarantee that all callers will
>  align the outgoing stack properly, callee has to align its
>  stack anyway.  */
>   if (i
>   && i->preferred_incoming_stack_boundary
>   && i->preferred_incoming_stack_boundary < preferred_stack_boundary)
> preferred_stack_boundary = i->preferred_incoming_stack_boundary;
> }

I'm attaching patch candidate for that.
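
As a reading aid, the expand_call logic quoted above boils down to the following sketch. The function name call_stack_boundary and the use_callee_info parameter are hypothetical; the parameter only models the idea of a switch for this behaviour and is not how the attached patch implements it (the ChangeLog places the guard in rest_of_clean_state).

```cpp
// Boundaries are in bits, as with PREFERRED_STACK_BOUNDARY.
// callee_incoming_boundary == 0 means no information about the callee
// has been recorded.
unsigned call_stack_boundary(unsigned preferred_stack_boundary,
                             unsigned callee_incoming_boundary,
                             bool use_callee_info)
{
  // Only ever lower the boundary, and only when the callee is known
  // to need less than the default.
  if (use_callee_info
      && callee_incoming_boundary != 0
      && callee_incoming_boundary < preferred_stack_boundary)
    return callee_incoming_boundary;
  return preferred_stack_boundary;
}
```

For live-patching the concern is that the caller's code depends on a property of the callee's body: if a patched callee needs more alignment than the original, callers compiled against the old information would no longer be correct.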

> 
>>>
   - inter-procedural register allocation (-fno-ipa-ra)

  We perform discovery of functions/variables with no address taken and
  optimizations that are not valid otherwise such as duplicating them
  or doing skipping them for alias analysis (no flag to disable)
>>>
>>> Can you be please more verbose here? What optimizations do you mean?
> 
> See ipa_discover_readonly_nonaddressable_vars. If addressable bit is
> cleared we start analyzing uses of the variable via ipa_reference or so.
> If writeonly bit is set, we start removing writes to the variable and if
> readonly bit is set we skip any analysis about whether the variable changed.

Likewise for this.
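
A rough sketch of the three bits Honza describes, under the assumption that each one is derived from a whole-program summary of how a variable is used. The types var_usage and var_flags and the function discover_flags are hypothetical and far simpler than what ipa.c actually does.

```cpp
// Hypothetical summary of how a variable is used across the program.
struct var_usage
{
  bool address_taken;
  bool read;
  bool written;
};

// The three bits mentioned in the discussion.
struct var_flags
{
  bool addressable;
  bool readonly;
  bool writeonly;
};

// Simplified derivation: nothing can be concluded about reads and
// writes once the address escapes.
var_flags discover_flags(const var_usage &u)
{
  var_flags f;
  f.addressable = u.address_taken;
  f.readonly = !u.address_taken && !u.written;
  f.writeonly = !u.address_taken && !u.read && u.written;
  return f;
}
```

The live-patching problem follows directly: a patch that adds the first read, write or address-of for such a variable invalidates conclusions already baked into unpatched code.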

>>>

  Identical code folding merges function bodies that are semantically 
 equivalent
  and thus one can't patch one without patching another, -fno-ipa-icf
>>>
>>> Agree, I recommend disabling that.
>>>

  Unreachable code/variable removal may be concern too (no flag to disable)
>>>
>>> For functions that should be fine and handled by my script.
>>> For variables can be problem when a variable becomes alive But that
>>> should be extremely rare for live-patching.
>>>

  Write only global variable discovery (no flag to disable)
>>>
>>> Similarly.
>>>

  Visibility changes with -flto and/or -fwhole-program

  We also have profile propagation (discovery of functions used only in cold 
 regions,
  but that I guess is only performance issue not correctness)
  No flag to disable
>>>
>>> Hope these 2 does not happen for current Linux kernel.
> 
> 2 will happen in kernel.  We will try to propagate cold code
> inter-procedurally based on what we think will be undefined effect at
> runtime.  Still i guess it is not big deal as it only affects 
> size optimization.

Then let's ignore it.

Thoughts about the patches?
Martin

> 
> Honza
>>>
>>> Martin
>>>

 Honza

>
> Martin
>
>>
>> thanks.
>>
>> Qing
>>
>
>>>
>>

From ee912514f61ec2c4d126cf6d43b69d01a08886c8 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 7 Nov 2018 13:47:40 +0100
Subject: [PATCH 2/2] Come up with the flag -fipa-stack-alignment.

gcc/ChangeLog:

2018-11-07  Martin Liska  

	* common.opt: Add -fipa-stack-alignment flag.
	* doc/invoke.texi: Document it.
	* final.c (rest_of_clean_state): Guard stack
	shrinking with flag.


Clear more useless flags

2018-11-07 Thread Jan Hubicka
Hi,
this patch enables a bit more merging by clearing more flags that are
unnecessary and differ in practice across different copies of the same ODR
types.

lto-bootstrapped/regtested x86_64-linux, OK?
* tree.c (fld_incomplete_type_of): Clear TREE_ADDRESSABLE flag.
(free_lang_data_in_decl): Set TREE_ADDRESSABLE for public functions
and variables; clear DECL_EXTERNAL, TYPE_DECL_SUPPRESS_DEBUG and
DECL_MODE on TYPE_DECLs.
Index: tree.c
===
--- tree.c  (revision 265875)
+++ tree.c  (working copy)
@@ -5197,6 +5197,7 @@ fld_incomplete_type_of (tree t, struct f
  TYPE_SIZE_UNIT (copy) = NULL;
  TYPE_CANONICAL (copy) = TYPE_CANONICAL (t);
  TYPE_TYPELESS_STORAGE (copy) = 0;
+ TREE_ADDRESSABLE (copy) = 0;
  if (AGGREGATE_TYPE_P (t))
{
  TYPE_FIELDS (copy) = NULL;
@@ -5496,6 +5497,17 @@ free_lang_data_in_decl (tree decl, struc
  if (TREE_CODE (decl) == FUNCTION_DECL)
 {
   struct cgraph_node *node;
+  /* Frontends do not set TREE_ADDRESSABLE on public variables even though
+the address may be taken in other unit, so this flag has no practical
+use for middle-end.
+
+It would make more sense if frontends set TREE_ADDRESSABLE to 0 only
+for public objects that indeed can not be adressed, but it is not
+the case.  Set the flag to true so we do not get merge failures for
+i.e. virtual tables between units that take address of it and
+units that don't.  */
+  if (TREE_PUBLIC (decl))
+   TREE_ADDRESSABLE (decl) = true;
   TREE_TYPE (decl) = fld_simplified_type (TREE_TYPE (decl), fld);
   if (!(node = cgraph_node::get (decl))
  || (!node->definition && !node->clones))
@@ -5551,6 +5563,9 @@ free_lang_data_in_decl (tree decl, struc
 }
   else if (VAR_P (decl))
 {
+  /* See comment above why we set the flag for functions.  */
+  if (TREE_PUBLIC (decl))
+   TREE_ADDRESSABLE (decl) = true;
   if ((DECL_EXTERNAL (decl)
   && (!TREE_STATIC (decl) || !TREE_READONLY (decl)))
  || (decl_function_context (decl) && !TREE_STATIC (decl)))
@@ -5560,8 +5575,12 @@ free_lang_data_in_decl (tree decl, struc
 {
   DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
   DECL_VISIBILITY_SPECIFIED (decl) = 0;
+  /* TREE_PUBLIC is used to tell if type is anonymous.  */
+  DECL_EXTERNAL (decl) = 0;
+  TYPE_DECL_SUPPRESS_DEBUG (decl) = 0;
   DECL_INITIAL (decl) = NULL_TREE;
   DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
+  DECL_MODE (decl) = VOIDmode;
   TREE_TYPE (decl) = void_type_node;
   SET_DECL_ALIGN (decl, 0);
 }


Re: [PATCH] doc: Use @: where needed

2018-11-07 Thread Jeff Law
On 11/6/18 5:56 PM, Segher Boessenkool wrote:
> When an abbreviation ends with a dot followed by whitespace, Texinfo
> thinks the dot ends a sentence, and applies spacing rules etc. based
> on that.  To prevent this, there is the @: macro.
> 
> This patch puts @: after every vs., e.g., and i.e. where it is needed.
> In a few cases there was "@ " already, or "@\n", but @: is slightly
> better, and more consistent.
> 
> I only spot checked the output.
> 
> Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2018-11-06  Segher Boessenkool  
> 
>   * target.def: Put @: after every vs., e.g., and i.e. if it is followed
>   by whitespace.
>   * doc/extend.texi: Ditto.
>   * doc/fragments.texi: Ditto.
>   * doc/gimple.texi: Ditto.
>   * doc/implement-c.texi: Ditto.
>   * doc/install.texi: Ditto.
>   * doc/invoke.texi: Ditto.
>   * doc/md.texi: Ditto.
>   * doc/plugins.texi: Ditto.
>   * doc/rtl.texi: Ditto.
>   * doc/sourcebuild.texi: Ditto.
>   * doc/tm.texi.in: Ditto.
>   * doc/ux.texi: Ditto.
>   * doc/tm.texi: Regenerate.
OK
jeff


Re: Update libquadmath fmaq from glibc, fix nanq issues

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 01:40:11PM +, Joseph Myers wrote:
> On Wed, 7 Nov 2018, Jakub Jelinek wrote:
> 
> > Don't know about the dropping of HAVE_FENV_H/USE_FENV_H stuff, don't we
> > support libquadmath on targets that don't have fenv.h?
> > In other sources, like e.g. expq.c, the USE_FENV_H guards are still kept.
> 
> All those conditionals are now meant to be handled through quadmath-imp.h, 
> which defines various functions as macros in the no-fenv.h case (macros 
> that ignore their arguments, where there might be issues with calls using 
> FE_* macros that themselves aren't defined).
> 
> I haven't tested any systems without fenv.h, but any issues in that regard 
> are intended to be addressed through further quadmath-imp.h, to avoid 
> needing to insert conditionals in individual source files in an automated 
> way.

Ah, ok then.

Jakub


Re: Update libquadmath fmaq from glibc, fix nanq issues

2018-11-07 Thread Joseph Myers
On Wed, 7 Nov 2018, Jakub Jelinek wrote:

> Don't know about the dropping of HAVE_FENV_H/USE_FENV_H stuff, don't we
> support libquadmath on targets that don't have fenv.h?
> In other sources, like e.g. expq.c, the USE_FENV_H guards are still kept.

All those conditionals are now meant to be handled through quadmath-imp.h, 
which defines various functions as macros in the no-fenv.h case (macros 
that ignore their arguments, where there might be issues with calls using 
FE_* macros that themselves aren't defined).

I haven't tested any systems without fenv.h, but any issues in that regard 
are intended to be addressed through further quadmath-imp.h, to avoid 
needing to insert conditionals in individual source files in an automated 
way.
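
A minimal sketch of the approach described above, assuming a configuration macro and wrapper name made up for this example (SKETCH_USE_FENV_H and sketch_raise are hypothetical; the real definitions live in quadmath-imp.h and may look different). The point is that a macro which ignores its argument lets call sites mention FE_* constants even where they are not defined.

```cpp
// SKETCH_USE_FENV_H is a hypothetical configuration macro standing in
// for whatever the build system defines when fenv.h is usable.
#ifdef SKETCH_USE_FENV_H
# include <cfenv>
# define sketch_raise(excepts) std::feraiseexcept (excepts)
#else
// The argument is dropped during preprocessing, so the compiler never
// sees FE_INEXACT and it does not need to be defined.
# define sketch_raise(excepts) ((void) 0)
#endif

// A call site written once, with no conditionals of its own.
double halve(double x)
{
  sketch_raise (FE_INEXACT);
  return x * 0.5;
}
```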

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Free TYPE_VALUES of enums

2018-11-07 Thread Richard Biener
On Wed, Nov 7, 2018 at 1:34 PM Jan Hubicka  wrote:
>
> Hi,
> this patch makes free_lang_data_in_type free TYPE_VALUES of an enum
> unless it is a main variant of an ODR type (in that case we use them to
> produce ODR warnings).  C++ represents enum values as CONST_DECLs that
> are expensive to stream and unused, so I also updated free_lang_data to
> replace them by the C representation that only uses integer constants.
>
> This needs a little change to the type verifier.
>
> In addition to that ipa-devirt now frees the values after ODR warnings.
> This reduces ltrans files from 450 to 370MB.
> I have also updated duplicate printing to be more parseable.
>
> Bootstrapped/regtested x86_64-linux, will commit it after
> lto-bootstrapping unless there are complaints.

LGTM

> Honza
>
> * ipa-devirt.c (odr_types_equivalent_p): Expect constants
> rather than const decls in TREE_VALUE of enum.
> (dump_type_inheritance_graph): Improve duplicate dumping.
> (free_enum_values): New.
> (build_type_inheritance_graph): Use it.
> * tree.c (free_lang_data_in_type): Free TYPE_VALUES of enums
> which are not main variants or not ODR types.
> (verify_type_variant): Expect variants to have no TYPE_VALUES.
> Index: ipa-devirt.c
> ===
> --- ipa-devirt.c(revision 265872)
> +++ ipa-devirt.c(working copy)
> @@ -1328,9 +1328,7 @@ odr_types_equivalent_p (tree t1, tree t2
>" is defined in another translation unit"));
>   return false;
> }
> - if (TREE_VALUE (v1) != TREE_VALUE (v2)
> - && !operand_equal_p (DECL_INITIAL (TREE_VALUE (v1)),
> -  DECL_INITIAL (TREE_VALUE (v2)), 0))
> + if (TREE_VALUE (v1) != TREE_VALUE (v2))
> {
>   warn_odr (t1, t2, NULL, NULL, warn, warned,
> G_("an enum with different values is defined"
> @@ -2191,6 +2189,7 @@ static void
>  dump_type_inheritance_graph (FILE *f)
>  {
>unsigned int i;
> +  unsigned int num_all_types = 0, num_types = 0, num_duplicates = 0;
>if (!odr_types_ptr)
>  return;
>fprintf (f, "\n\nType inheritance graph:\n");
> @@ -2201,26 +2200,70 @@ dump_type_inheritance_graph (FILE *f)
>  }
>for (i = 0; i < odr_types.length (); i++)
>  {
> -  if (odr_types[i] && odr_types[i]->types && odr_types[i]->types->length ())
> -   {
> - unsigned int j;
> - fprintf (f, "Duplicate tree types for odr type %i\n", i);
> - print_node (f, "", odr_types[i]->type, 0);
> - for (j = 0; j < odr_types[i]->types->length (); j++)
> -   {
> - tree t;
> - fprintf (f, "duplicate #%i\n", j);
> - print_node (f, "", (*odr_types[i]->types)[j], 0);
> - t = (*odr_types[i]->types)[j];
> - while (TYPE_P (t) && TYPE_CONTEXT (t))
> -   {
> - t = TYPE_CONTEXT (t);
> - print_node (f, "", t, 0);
> -   }
> - putc ('\n',f);
> +  if (!odr_types[i])
> +   continue;
> +
> +  num_all_types++;
> +  if (!odr_types[i]->types || !odr_types[i]->types->length ())
> +   continue;
> +
> +  /* To aid ODR warnings we also mangle integer constants but do
> +not consinder duplicates there.  */
> +  if (TREE_CODE (odr_types[i]->type) == INTEGER_TYPE)
> +   continue;
> +
> +  /* It is normal to have one duplicate and one normal variant.  */
> +  if (odr_types[i]->types->length () == 1
> + && COMPLETE_TYPE_P (odr_types[i]->type)
> + && !COMPLETE_TYPE_P ((*odr_types[i]->types)[0]))
> +   continue;
> +
> +  num_types ++;
> +
> +  unsigned int j;
> +  fprintf (f, "Duplicate tree types for odr type %i\n", i);
> +  print_node (f, "", odr_types[i]->type, 0);
> +  print_node (f, "", TYPE_NAME (odr_types[i]->type), 0);
> +  putc ('\n',f);
> +  for (j = 0; j < odr_types[i]->types->length (); j++)
> +   {
> + tree t;
> + num_duplicates ++;
> + fprintf (f, "duplicate #%i\n", j);
> + print_node (f, "", (*odr_types[i]->types)[j], 0);
> + t = (*odr_types[i]->types)[j];
> + while (TYPE_P (t) && TYPE_CONTEXT (t))
> +   {
> + t = TYPE_CONTEXT (t);
> + print_node (f, "", t, 0);
> }
> + print_node (f, "", TYPE_NAME ((*odr_types[i]->types)[j]), 0);
> + putc ('\n',f);
> }
>  }
> +  fprintf (f, "Out of %i types there are %i types with duplicates; "
> +  "%i duplicates overall\n", num_all_types, num_types, num_duplicates);
> +}
> +
> +/* Save some WPA->ltrans streaming by freeing enum values.  */
> +
> +static void
> +free_enum_values ()
> +{
> +  static bool enum_values_freed = false;
> +  if (enum_values_freed || !flag_wpa || !odr_types_ptr)
> +return;
> +  

Re: [RFC][PR87528][PR86677] Disable builtin popcount detection when back-end does not define it

2018-11-07 Thread Richard Biener
On Fri, Nov 2, 2018 at 10:02 AM Kugan Vivekanandarajah
 wrote:
>
> Hi Richard,
> Thanks for the review.
> On Tue, 30 Oct 2018 at 01:25, Richard Biener  
> wrote:
> >
> > On Mon, Oct 29, 2018 at 2:06 AM Kugan Vivekanandarajah
> >  wrote:
> > >
> > > Hi Richard and Jeff,
> > >
> > > Thanks for your comments.
> > >
> > > On Fri, 26 Oct 2018 at 19:40, Richard Biener  
> > > wrote:
> > > >
> > > > On Fri, Oct 26, 2018 at 4:55 AM Jeff Law  wrote:
> > > > >
> > > > > On 10/25/18 4:33 PM, Kugan Vivekanandarajah wrote:
> > > > > > Hi,
> > > > > >
> > > > > > PR87528 showed a case where libgcc generated popcount is causing
> > > > > > regression for Skylake.
> > > > > > We also have PR86677 where kernel build is failing because the 
> > > > > > kernel
> > > > > > does not use the libgcc (when backend is not defining popcount
> > > > > > pattern).  While I agree that the kernel should implement its own
> > > > > > functionality when it is not using the libgcc, I am afraid that the
> > > > > > implementation can have the same performance issues reported for
> > > > > > Skylake in PR87528.
> > > > > >
> > > > > > Therefore, I would like to propose that we disable popcount 
> > > > > > detection
> > > > > > when we don't have a pattern for that. The attached patch (based on
> > > > > > previous discussions) does this.
> > > > > >
> > > > > > Bootstrapped and regression tested on x86_64-linux-gnu with no new
> > > > > > regressions. We need to disable the popcount* testcases. I will have
> > > > > > > to define an effective_target_with_popcount in
> > > > > > gcc/testsuite/lib/target-supports.exp if this patch is OK?
> > > > > > Thanks,
> > > > > > Kugan
> > > > > >
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > > 2018-10-25  Kugan Vivekanandarajah  
> > > > > >
> > > > > > * tree-scalar-evolution.c (expression_expensive_p): Make 
> > > > > > BUILTIN POPCOUNT
> > > > > > as expensive when backend does not define it.
> > > > > >
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > > 2018-10-25  Kugan Vivekanandarajah  
> > > > > >
> > > > > > * gcc.target/aarch64/popcount4.c: New test.
> > > > > >
> > > > > FWIW, I've been disabling by checking direct_optab_handler elsewhere
> > > > > (number_of_iterations_popcount) in my tester.  It may in fact be an 
> > > > > old
> > > > > patch from you.
> > > > >
> > > > > Richi argued that it's the kernel team's responsibility to provide a
> > > > > popcount since they don't link with libgcc.  And I'm generally in
> > > > > agreement with that position, though it does tend to generate some
> > > > > friction with the kernel developers.  We also run the real risk of 
> > > > > GCC 9
> > > > > not being able to build the kernel which, IMHO, would be a disaster 
> > > > > from
> > > > > a PR standpoint.
> > > > >
> > > > > I'd like to hear from others here.  I fully realize we're beyond the
> > > > > realm of what is strictly technically correct here from a review 
> > > > > standpoint.
> > > >
> > > > As said final value replacement to a library call is probably not wanted
> > > > for optimization purpose, so adjusting expression_expensive_p is OK with
> > > > me.  It might not fully solve the (non-)issue in case another 
> > > > optimization pass
> > > > chooses to materialize niter computation result.
> > > >
> > > > Few comments on the patch:
> > > >
> > > > +  tree fndecl = get_callee_fndecl (expr);
> > > > +
> > > > +  if (fndecl && DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
> > > > +   {
> > > > > + combined_fn cfn = as_combined_fn (DECL_FUNCTION_CODE (fndecl));
> > > >
> > > >   combined_fn cfn = gimple_call_combined_fn (expr);
> > > >   switch (cfn)
> > > > {
> > >
> > > Did you mean:
> > > combined_fn cfn = get_call_combined_fn (expr);
> >
> > Yes.
> >
> > > > ...
> > > >
> > > > cfn will be CFN_LAST for a non-builtin/internal call.  I know Richard 
> > > > is mostly
> > > > offline but eventually he knows whether there is a better way to query
> > > >
> > > > +   CASE_CFN_POPCOUNT:
> > > > + /* Check if opcode for popcount is available.  */
> > > > > + if (optab_handler (popcount_optab,
> > > > > +TYPE_MODE (TREE_TYPE (CALL_EXPR_ARG (expr, 0))))
> > > > > + == CODE_FOR_nothing)
> > > >
> > > > note that we currently generate builtin calls rather than IFN calls
> > > > (when a direct
> > > > optab is supported).
> > > >
> > > > > Another comment on the patch is that you probably have to adjust existing
> > > > > popcount testcases to add architecture specific flags enabling support for
> > > > > the instructions, otherwise you won't see loop replacement.
> > > Indeed.
> > > In lib/target-supports.exp, I will try to add support for
> > > check_effective_target_popcount_long.
> > > When I grep for the popcount pattern in md files, I see it is defined for:
> > >
> > > 

Re: introduce --enable-mingw-full32 to default to --large-address-aware

2018-11-07 Thread JonY
On 11/07/2018 08:34 AM, Alexandre Oliva wrote:
> On Nov  1, 2018, JonY wrote:
> 
>> Looks like it causes an error on 64bit:
>> /usr/libexec/gcc/x86_64-w64-mingw32/ld: unrecognized option
>> '--large-address-aware'
> 
> What does?  The patch I suggested?  The current trunk?
> 
> What was the command in this case?  How was the toolchain configured?
> 
> 
> I've been looking into this, getting progressively puzzled, though I
> actually managed to duplicate the problem you mentioned, but only on
> x86_64-mingw32, NOT on x86_64-w64-mingw32.
> 
> 

No, it was just a quick test to see how x86_64-w64-mingw32 reacts to
--large-address-aware; it doesn't play well.

> Here's what I found out in my investigation:
> 
> configured for i686-mingw32, GNU ld supports only the i386pe emulation,
> that supports the --large-address-aware flag.
> 
> Configured for x86_64-*-mingw32, it supports i386pe, but it defaults to
> i386pep, the 64-bit binary format, that does NOT support
> --large-address-aware.
> 
> x86_64-w64-mingw32 passes -mi386pe or -mi386pep to the linker, depending
> on -m32 or -m64, so the code to pass --large-address-aware to link -m32
> binaries in mingw-w64.h looks correct to me.  But x86_64-mingw32 does
> NOT use that: it uses the LINK_SPEC from mingw32.h, so it doesn't
> specify the emulation, ever.  That seems awfully broken to me.  If you
> ask for a 32-bit binary, using the default 64-bit linker format is
> unlikely to produce the desired results.
> 
> Is x86_64-mingw32 really supposed to be a usable target name?  It might
> even work as a 64-bit only target, but I don't see how its biarch
> support could possibly be functional.
> 
> If it is to be usable, is it really supposed to be different from
> x86_64-w64-mingw32?  Using mingw-w64.h besides mingw32.h would fix the
> biarch problems, but perhaps that's not desired for other reasons.
> 
> Fixing that is way beyond my knowledge or interest on Windows-based
> platforms, but given clarification as to whether x86_64-mingw32 is
> supposed to support biarch at all, I might be able to fix the
> implementation of --enable-large-address-aware there.
> 
> As for the problem you reported on x86_64-w64-mingw32, I'm afraid I'll
> need some more information to be able to duplicate that and try to fix
> it.
> 
> Thanks,
> 

x86_64-mingw32 is not used as far as I know, only with "w64" or "pc".

The "w64" carries a special meaning to gcc dating back to the early
64bit port. It basically tells gcc to use mingw-w64 specific features
that are not found on the regular mingw.org CRT at the time.

This might be affecting the "pc" vendor build, can you check
x86_64-pc-mingw32 just to see if it is affected?

Thanks.





Free TYPE_VALUES of enums

2018-11-07 Thread Jan Hubicka
Hi,
this patch makes free_lang_data_in_type free the TYPE_VALUES of an enum
unless it is the main variant of an ODR type (in that case we use them to
produce ODR warnings).  C++ represents enum values as CONST_DECLs that
are expensive to stream and unused, so I also updated free_lang_data to
replace them with the C representation, which only uses integer constants.
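
To make the representation change a bit more concrete, here is a toy
model of it.  It is only a sketch: struct node, struct enum_value and
simplify_enum_values are made-up stand-ins for this example and are not
GCC's tree structures or functions.

```c
#include <stddef.h>

/* Toy stand-ins for tree nodes; these are NOT GCC's real structures.  */
enum node_kind { INT_CST, CONST_DECL_NODE };

struct node
{
  enum node_kind kind;
  long value;			/* Used by INT_CST.  */
  struct node *initial;		/* Used by CONST_DECL_NODE.  */
};

/* One entry in an enum's list of (name, value) pairs.  */
struct enum_value
{
  const char *name;
  struct node *value;
  struct enum_value *next;
};

/* Replace every declaration-style value by the integer constant it
   wraps, so only the cheap representation is left to stream.  */
static void
simplify_enum_values (struct enum_value *list)
{
  for (struct enum_value *v = list; v; v = v->next)
    if (v->value && v->value->kind == CONST_DECL_NODE)
      v->value = v->value->initial;
}
```

Once every value is a plain constant, two enum entries can be compared
by looking at the value directly, which is roughly what the simplified
check in odr_types_equivalent_p relies on.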

This needs a small change to the type verifier.

In addition to that, ipa-devirt now frees the values after ODR warnings.
This reduces ltrans files from 450 to 370MB.
I have also updated the duplicate printing to be more parseable.

Bootstrapped/regtested x86_64-linux, will commit it after
lto-bootstrapping unless there are complaints.

Honza

* ipa-devirt.c (odr_types_equivalent_p): Expect constants rather
than const decls in TREE_VALUE of enum.
(dump_type_inheritance_graph): Improve duplicate dumping.
(free_enum_values): New.
(build_type_inheritance_graph): Use it.
* tree.c (free_lang_data_in_type): Free TYPE_VALUES of enums
which are not main variants or not ODR types.
(verify_type_variant): Expect variants to have no TYPE_VALUES.
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 265872)
+++ ipa-devirt.c(working copy)
@@ -1328,9 +1328,7 @@ odr_types_equivalent_p (tree t1, tree t2
   " is defined in another translation unit"));
  return false;
}
- if (TREE_VALUE (v1) != TREE_VALUE (v2)
- && !operand_equal_p (DECL_INITIAL (TREE_VALUE (v1)),
-  DECL_INITIAL (TREE_VALUE (v2)), 0))
+ if (TREE_VALUE (v1) != TREE_VALUE (v2))
{
  warn_odr (t1, t2, NULL, NULL, warn, warned,
G_("an enum with different values is defined"
@@ -2191,6 +2189,7 @@ static void
 dump_type_inheritance_graph (FILE *f)
 {
   unsigned int i;
+  unsigned int num_all_types = 0, num_types = 0, num_duplicates = 0;
   if (!odr_types_ptr)
 return;
   fprintf (f, "\n\nType inheritance graph:\n");
@@ -2201,26 +2200,70 @@ dump_type_inheritance_graph (FILE *f)
 }
   for (i = 0; i < odr_types.length (); i++)
 {
-  if (odr_types[i] && odr_types[i]->types && odr_types[i]->types->length ())
-   {
- unsigned int j;
- fprintf (f, "Duplicate tree types for odr type %i\n", i);
- print_node (f, "", odr_types[i]->type, 0);
- for (j = 0; j < odr_types[i]->types->length (); j++)
-   {
- tree t;
- fprintf (f, "duplicate #%i\n", j);
- print_node (f, "", (*odr_types[i]->types)[j], 0);
- t = (*odr_types[i]->types)[j];
- while (TYPE_P (t) && TYPE_CONTEXT (t))
-   {
- t = TYPE_CONTEXT (t);
- print_node (f, "", t, 0);
-   }
- putc ('\n',f);
+  if (!odr_types[i])
+   continue;
+
+  num_all_types++;
+  if (!odr_types[i]->types || !odr_types[i]->types->length ())
+   continue;
+
+  /* To aid ODR warnings we also mangle integer constants but do
+not consinder duplicates there.  */
+  if (TREE_CODE (odr_types[i]->type) == INTEGER_TYPE)
+   continue;
+
+  /* It is normal to have one duplicate and one normal variant.  */
+  if (odr_types[i]->types->length () == 1
+ && COMPLETE_TYPE_P (odr_types[i]->type)
+ && !COMPLETE_TYPE_P ((*odr_types[i]->types)[0]))
+   continue;
+
+  num_types ++;
+
+  unsigned int j;
+  fprintf (f, "Duplicate tree types for odr type %i\n", i);
+  print_node (f, "", odr_types[i]->type, 0);
+  print_node (f, "", TYPE_NAME (odr_types[i]->type), 0);
+  putc ('\n',f);
+  for (j = 0; j < odr_types[i]->types->length (); j++)
+   {
+ tree t;
+ num_duplicates ++;
+ fprintf (f, "duplicate #%i\n", j);
+ print_node (f, "", (*odr_types[i]->types)[j], 0);
+ t = (*odr_types[i]->types)[j];
+ while (TYPE_P (t) && TYPE_CONTEXT (t))
+   {
+ t = TYPE_CONTEXT (t);
+ print_node (f, "", t, 0);
}
+ print_node (f, "", TYPE_NAME ((*odr_types[i]->types)[j]), 0);
+ putc ('\n',f);
}
 }
+  fprintf (f, "Out of %i types there are %i types with duplicates; "
+  "%i duplicates overall\n", num_all_types, num_types, num_duplicates);
+}
+
+/* Save some WPA->ltrans streaming by freeing enum values.  */
+
+static void
+free_enum_values ()
+{
+  static bool enum_values_freed = false;
+  if (enum_values_freed || !flag_wpa || !odr_types_ptr)
+return;
+  enum_values_freed = true;
+  unsigned int i;
+  for (i = 0; i < odr_types.length (); i++)
+if (odr_types[i] && TREE_CODE (odr_types[i]->type) == ENUMERAL_TYPE)
+  {
+   TYPE_VALUES (odr_types[i]->type) = NULL;
+   if (odr_types[i]->types)
+  for (unsigned int j = 0; j < 

Re: [PATCH] Reduce number of sreal operator* calls

2018-11-07 Thread Jan Hubicka
> 
> This reduces the number of $subject calls by computing big_speedup_p
> lazily.  This caller accounts for roughly a quarter of all operator*
> calls for PR38474 and operator* is top of the profile of the whole
> compilation.
> 
> Next offenders (callers) are compute_inlined_call_time and
> edge_badness.  profile_count::to_sreal_scale is also quite
> bad in performance btw (probably due to the sreal division).
> 
> Bootstrap/regtest in progress.
> 
> OK?

OK, thanks!
Honza
> 
> Thanks,
> Richard.
> 
> 2018-11-07  Richard Biener  
> 
>   * ipa-inline.c (want_inline_small_function_p): Compute
>   big_speedup_p lazily and last.
> 
> Index: gcc/ipa-inline.c
> ===
> --- gcc/ipa-inline.c  (revision 265860)
> +++ gcc/ipa-inline.c  (working copy)
> @@ -779,7 +779,7 @@ want_inline_small_function_p (struct cgr
>  {
>int growth = estimate_edge_growth (e);
>ipa_hints hints = estimate_edge_hints (e);
> -  bool big_speedup = big_speedup_p (e);
> +  int big_speedup = -1; /* compute this lazily */
>  
>if (growth <= 0)
>   ;
> @@ -787,13 +787,13 @@ want_inline_small_function_p (struct cgr
>hints suggests that inlining given function is very profitable.  */
>else if (DECL_DECLARED_INLINE_P (callee->decl)
>  && growth >= MAX_INLINE_INSNS_SINGLE
> -&& ((!big_speedup
> - && !(hints & (INLINE_HINT_indirect_call
> +&& (growth >= MAX_INLINE_INSNS_SINGLE * 16
> +|| (!(hints & (INLINE_HINT_indirect_call
> | INLINE_HINT_known_hot
> | INLINE_HINT_loop_iterations
> | INLINE_HINT_array_index
> -   | INLINE_HINT_loop_stride)))
> -|| growth >= MAX_INLINE_INSNS_SINGLE * 16))
> +   | INLINE_HINT_loop_stride))
> +&& !(big_speedup = big_speedup_p (e)))))
>   {
>e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
> want_inline = false;
> @@ -813,7 +813,6 @@ want_inline_small_function_p (struct cgr
>Upgrade it to MAX_INLINE_INSNS_SINGLE when hints suggests that
>inlining given function is very profitable.  */
>else if (!DECL_DECLARED_INLINE_P (callee->decl)
> -&& !big_speedup
>  && !(hints & INLINE_HINT_known_hot)
>  && growth >= ((hints & (INLINE_HINT_indirect_call
>  | INLINE_HINT_loop_iterations
> @@ -821,7 +820,8 @@ want_inline_small_function_p (struct cgr
>  | INLINE_HINT_loop_stride))
>? MAX (MAX_INLINE_INSNS_AUTO,
>   MAX_INLINE_INSNS_SINGLE)
> -  : MAX_INLINE_INSNS_AUTO))
> +  : MAX_INLINE_INSNS_AUTO)
> +&& !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
>   {
> /* growth_likely_positive is expensive, always test it last.  */
>if (growth >= MAX_INLINE_INSNS_SINGLE


[PATCH] Reduce number of sreal operator* calls

2018-11-07 Thread Richard Biener


This reduces the number of $subject calls by computing big_speedup_p
lazily.  This caller accounts for roughly a quarter of all operator*
calls for PR38474 and operator* is top of the profile of the whole
compilation.
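
As a rough illustration of the idea (not the patch itself), the sketch
below keeps an expensive predicate in an int that starts at -1 and is
only filled in when short-circuit evaluation actually reaches it.  The
names expensive_check, call_count, wants_limit and SIZE_LIMIT are made
up for this example; only the -1/0/1 tri-state mirrors the patch.

```c
#include <stdbool.h>

/* Made-up limit for this sketch; it is not a GCC parameter.  */
#define SIZE_LIMIT 10

/* Counts calls so the laziness can be observed.  */
static int call_count = 0;

/* Hypothetical stand-in for an expensive predicate such as
   big_speedup_p.  */
static bool
expensive_check (int x)
{
  call_count++;
  return x > 100;
}

static bool
wants_limit (int growth, int hints, int x)
{
  int cached = -1;	/* -1 = not computed yet, otherwise 0 or 1.  */

  if (growth <= 0)
    return false;
  if (growth >= SIZE_LIMIT * 16)
    return true;
  /* Cheap hint test first; the expensive call runs only if needed.  */
  if (hints == 0 && !(cached = expensive_check (x)))
    return true;
  /* Reuse the cached result when an earlier test already computed it.  */
  return growth >= SIZE_LIMIT * 2
	 && !(cached == -1 ? expensive_check (x) : cached);
}
```

With this ordering a call such as wants_limit (200, 0, 5) is decided by
the size test alone and never runs expensive_check, and no call runs it
more than once.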

Next offenders (callers) are compute_inlined_call_time and
edge_badness.  profile_count::to_sreal_scale is also quite
bad in performance btw (probably due to the sreal division).

Bootstrap/regtest in progress.

OK?

Thanks,
Richard.

2018-11-07  Richard Biener  

* ipa-inline.c (want_inline_small_function_p): Compute
big_speedup_p lazily and last.

Index: gcc/ipa-inline.c
===
--- gcc/ipa-inline.c(revision 265860)
+++ gcc/ipa-inline.c(working copy)
@@ -779,7 +779,7 @@ want_inline_small_function_p (struct cgr
 {
   int growth = estimate_edge_growth (e);
   ipa_hints hints = estimate_edge_hints (e);
-  bool big_speedup = big_speedup_p (e);
+  int big_speedup = -1; /* compute this lazily */
 
   if (growth <= 0)
;
@@ -787,13 +787,13 @@ want_inline_small_function_p (struct cgr
 hints suggests that inlining given function is very profitable.  */
   else if (DECL_DECLARED_INLINE_P (callee->decl)
   && growth >= MAX_INLINE_INSNS_SINGLE
-  && ((!big_speedup
-   && !(hints & (INLINE_HINT_indirect_call
+  && (growth >= MAX_INLINE_INSNS_SINGLE * 16
+  || (!(hints & (INLINE_HINT_indirect_call
  | INLINE_HINT_known_hot
  | INLINE_HINT_loop_iterations
  | INLINE_HINT_array_index
- | INLINE_HINT_loop_stride)))
-  || growth >= MAX_INLINE_INSNS_SINGLE * 16))
+ | INLINE_HINT_loop_stride))
+  && !(big_speedup = big_speedup_p (e)))))
{
   e->inline_failed = CIF_MAX_INLINE_INSNS_SINGLE_LIMIT;
  want_inline = false;
@@ -813,7 +813,6 @@ want_inline_small_function_p (struct cgr
 Upgrade it to MAX_INLINE_INSNS_SINGLE when hints suggests that
 inlining given function is very profitable.  */
   else if (!DECL_DECLARED_INLINE_P (callee->decl)
-  && !big_speedup
   && !(hints & INLINE_HINT_known_hot)
   && growth >= ((hints & (INLINE_HINT_indirect_call
   | INLINE_HINT_loop_iterations
@@ -821,7 +820,8 @@ want_inline_small_function_p (struct cgr
   | INLINE_HINT_loop_stride))
 ? MAX (MAX_INLINE_INSNS_AUTO,
MAX_INLINE_INSNS_SINGLE)
-: MAX_INLINE_INSNS_AUTO))
+: MAX_INLINE_INSNS_AUTO)
+  && !(big_speedup == -1 ? big_speedup_p (e) : big_speedup))
{
  /* growth_likely_positive is expensive, always test it last.  */
   if (growth >= MAX_INLINE_INSNS_SINGLE


Re: introduce --enable-mingw-full32 to default to --large-address-aware

2018-11-07 Thread Alexandre Oliva
On Nov  1, 2018, JonY <10wa...@gmail.com> wrote:

> Looks like it causes an error on 64bit:
> /usr/libexec/gcc/x86_64-w64-mingw32/ld: unrecognized option
> '--large-address-aware'

What does?  The patch I suggested?  The current trunk?

What was the command in this case?  How was the toolchain configured?


I've been looking into this, getting progressively puzzled, though I
actually managed to duplicate the problem you mentioned, but only on
x86_64-mingw32, NOT on x86_64-w64-mingw32.


Here's what I found out in my investigation:

configured for i686-mingw32, GNU ld supports only the i386pe emulation,
that supports the --large-address-aware flag.

Configured for x86_64-*-mingw32, it supports i386pe, but it defaults to
i386pep, the 64-bit binary format, that does NOT support
--large-address-aware.

x86_64-w64-mingw32 passes -mi386pe or -mi386pep to the linker, depending
on -m32 or -m64, so the code to pass --large-address-aware to link -m32
binaries in mingw-w64.h looks correct to me.  But x86_64-mingw32 does
NOT use that: it uses the LINK_SPEC from mingw32.h, so it doesn't
specify the emulation, ever.  That seems awfully broken to me.  If you
ask for a 32-bit binary, using the default 64-bit linker format is
unlikely to produce the desired results.
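
To spell out the behaviour described above for the x86_64 biarch
configurations, here is a small model of it.  This is only a sketch
under my reading of the two headers: struct link_flags and
choose_link_flags are hypothetical names, and the code is not the
actual LINK_SPEC machinery.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the link flags discussed above; not GCC's LINK_SPEC.  */
struct link_flags
{
  const char *emulation;	/* "-mi386pe", "-mi386pep", or NULL.  */
  bool large_address_aware;	/* Pass --large-address-aware?  */
};

/* W64_STYLE models the mingw-w64.h behaviour (pick the emulation from
   -m32/-m64); otherwise no emulation is passed, as with mingw32.h, and
   the linker falls back to its 64-bit default.  */
static struct link_flags
choose_link_flags (bool w64_style, bool m32, bool want_laa)
{
  struct link_flags f = { NULL, false };

  if (w64_style)
    f.emulation = m32 ? "-mi386pe" : "-mi386pep";

  /* Only the 32-bit i386pe emulation accepts the flag, so request it
     only when that emulation was explicitly selected.  */
  f.large_address_aware = want_laa && w64_style && m32;
  return f;
}
```

In this model the x86_64-mingw32 case (w64_style false) never gets an
emulation, so a 32-bit link cannot safely be given
--large-address-aware there.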

Is x86_64-mingw32 really supposed to be a usable target name?  It might
even work as a 64-bit only target, but I don't see how its biarch
support could possibly be functional.

If it is to be usable, is it really supposed to be different from
x86_64-w64-mingw32?  Using mingw-w64.h besides mingw32.h would fix the
biarch problems, but perhaps that's not desired for other reasons.

Fixing that is way beyond my knowledge or interest on Windows-based
platforms, but given clarification as to whether x86_64-mingw32 is
supposed to support biarch at all, I might be able to fix the
implementation of --enable-large-address-aware there.

As for the problem you reported on x86_64-w64-mingw32, I'm afraid I'll
need some more information to be able to duplicate that and try to fix
it.

Thanks,

-- 
Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
Be the change, be Free! FSF Latin America board member
GNU Toolchain EngineerFree Software Evangelist
Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe


Re: [PATCH v4 3/6,Committed] [MIPS] Add Loongson EXTensions R2 (EXT2) instructions support

2018-11-07 Thread Paul Hua
Sorry, I committed the wrong version of the patch.  I have committed the
attached patch to fix the typos and the convoluted logic.
On Wed, Nov 7, 2018 at 5:14 PM Paul Hua  wrote:
>
> On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
> >
> >
From 16a357d8f844e4bdc45bf385e98b8dc6c0723720 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Wed, 7 Nov 2018 18:15:03 +0800
Subject: [PATCH] Fix some typo and brain twister logical.

gcc/
	* config/mips/mips.c: Fix typo in documentation of
	mips_loongson_ext2_prefetch_cookie.
	(mips_option_override): fix brain twister logical.
	* config/mips/mips.h: Fix typo in documentation of
	ISA_HAS_CTZ_CTO and define pattern.
	* config/mips/mips.md (prefetch): Hoist EXT2 above
	the 2EF/EXT block.
	(prefetch_indexed): Hoist EXT2 above the EXT block.

gcc/testsuite/
	* gcc.target/mips/loongson-ctz.c: Fix typo.
	* gcc.target/mips/loongson-dctz.c: Fix typo.
---
 gcc/config/mips/mips.c|  4 +--
 gcc/config/mips/mips.h|  2 +-
 gcc/config/mips/mips.md   | 34 +--
 gcc/testsuite/gcc.target/mips/loongson-ctz.c  |  2 +-
 gcc/testsuite/gcc.target/mips/loongson-dctz.c |  2 +-
 5 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 2b83e4ec679..d78e2056ec2 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -15151,7 +15151,7 @@ mips_prefetch_cookie (rtx write, rtx locality)
   return GEN_INT (INTVAL (write) + 6);
 }
 
-/* Loongson EXT2 only implements perf hint=0 (prefetch for load) and hint=1
+/* Loongson EXT2 only implements pref hint=0 (prefetch for load) and hint=1
(prefetch for store), other hint just scale to hint = 0 and hint = 1.  */
 
 rtx
@@ -20202,7 +20202,7 @@ mips_option_override (void)
 	 is true.  If a user explicitly says -mloongson-ext2 -mno-loongson-ext
 	 then that is an error.  */
   if (!TARGET_LOONGSON_EXT
-	  && !((target_flags_explicit & MASK_LOONGSON_EXT) == 0))
+	  && (target_flags_explicit & MASK_LOONGSON_EXT) != 0)
 	error ("%<-mloongson-ext2%> must be used with %<-mloongson-ext%>");
   target_flags |= MASK_LOONGSON_EXT;
 }
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 0a92cf6788a..11ca364d752 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1158,7 +1158,7 @@ struct mips_cpu_info {
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO		(mips_isa_rev >= 1 && !TARGET_MIPS16)
 
-/* ISA has count tailing zeroes/ones instruction.  */
+/* ISA has count trailing zeroes/ones instruction.  */
 #define ISA_HAS_CTZ_CTO		(TARGET_LOONGSON_EXT2)
 
 /* ISA has three operand multiply instructions that put
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 9e222dc0df0..0cb0cb80bcd 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -3153,7 +3153,7 @@
 ;;
 ;;  ...
 ;;
-;;  Count tailing zeroes.
+;;  Count trailing zeroes.
 ;;
 ;;  ...
 ;;
@@ -7157,21 +7157,21 @@
 	 (match_operand 2 "const_int_operand" "n"))]
   "ISA_HAS_PREFETCH && TARGET_EXPLICIT_RELOCS"
 {
-  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || TARGET_LOONGSON_EXT2)
+  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT)
 {
-  /* Loongson ext2 implementation pref insnstructions.  */
-  if (TARGET_LOONGSON_EXT2)
-	{
-  	  operands[1] = mips_loongson_ext2_prefetch_cookie (operands[1],
-			operands[2]);
-	  return "pref\t%1, %a0";
-	}
   /* Loongson 2[ef] and Loongson ext use load to $0 for prefetching.  */
   if (TARGET_64BIT)
 	return "ld\t$0,%a0";
   else
 	return "lw\t$0,%a0";
 }
+  /* Loongson ext2 implementation pref instructions.  */
+  if (TARGET_LOONGSON_EXT2)
+{
+  operands[1] = mips_loongson_ext2_prefetch_cookie (operands[1],
+			operands[2]);
+  return "pref\t%1, %a0";
+}
   operands[1] = mips_prefetch_cookie (operands[1], operands[2]);
   return "pref\t%1,%a0";
 }
@@ -7184,21 +7184,21 @@
 	 (match_operand 3 "const_int_operand" "n"))]
   "ISA_HAS_PREFETCHX && TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT"
 {
-  if (TARGET_LOONGSON_EXT || TARGET_LOONGSON_EXT2)
+  if (TARGET_LOONGSON_EXT)
 {
-  /* Loongson ext2 implementation pref insnstructions.  */
-  if (TARGET_LOONGSON_EXT2)
-	{
-  	  operands[2] = mips_loongson_ext2_prefetch_cookie (operands[2],
-			operands[3]);
-	  return "prefx\t%2,%1(%0)";
-	}
   /* Loongson Loongson ext use index load to $0 for prefetching.  */
   if (TARGET_64BIT)
 	return "gsldx\t$0,0(%0,%1)";
   else
 	return "gslwx\t$0,0(%0,%1)";
 }
+  /* Loongson ext2 implementation pref instructions.  */
+  if (TARGET_LOONGSON_EXT2)
+{
+  operands[2] = mips_loongson_ext2_prefetch_cookie (operands[2],
+			operands[3]);
+  return "prefx\t%2,%1(%0)";
+}
   operands[2] = mips_prefetch_cookie (operands[2], operands[3]);
   return "prefx\t%2,%1(%0)";
 }
diff --git 

Re: [PATCH][AArch64] Add -mcpu/-mtune support for Arm Ares

2018-11-07 Thread Richard Earnshaw (lists)
On 07/11/2018 09:47, Kyrill Tkachov wrote:
> Hi all,
> 
> This adds support for the Arm Ares CPU for AArch64.
> It implements the Armv8.2-A architecture with the optional features
> of statistical profiling, dot product and FP16 on by default.
> 
> Note: Ares is a codename to enable early adopters and in time
> we will add the final product name once it's announced.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> Ok for trunk?
> 

OK.

R.

> Thanks,
> Kyrill
> 
> 2018-11-07  Kyrylo Tkachov  
> 
>     * config/aarch64/aarch64-cores.def (ares): Define.
>     * config/aarch64/aarch64-tune.md: Regenerate.
>     * doc/invoke.texi (AArch64 Options): Document ares value for mtune.
> 
> aarch64-ares.patch
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 
> b1278fc263665e6ef4b175be8ab00502e7d005c4..34062a5e88683d5abe0d3ece64fa0f2a32a17cf6
>  100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -89,6 +89,7 @@ AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 
> 8_1A,  AARCH64_FL_FOR
>  AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
> AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
>  AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
> AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
>  AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  
> AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
> AARCH64_FL_DOTPROD, cortexa72, 0x41, 0xd0b, -1)
> +AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | 
> AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, 
> cortexa72, 0x41, 0xd0c, -1)
>  
>  /* ARMv8.4-A Architecture Processors.  */
>  
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index 
> ad52d89d247329e88a01626d2f03a370e8f75d58..fade1d4430ae3614c93e0f6af53e62064a5c5ef5
>  100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> - 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> + 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> 4253af27ec50d2968bd4168b95acd069302ebb61..5f051ed1acca32a6bd0bb673691a55b72b239c96
>  100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -15147,13 +15147,13 @@ Specify the name of the target processor for which 
> GCC should tune the
>  performance of the code.  Permissible values for this option are:
>  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
>  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
> -@samp{cortex-a76}, @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx},
> -@samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
> -@samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},@samp{tsv110},
> -@samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
> -@samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
> -@samp{cortex-a73.cortex-a53}, @samp{cortex-a75.cortex-a55},
> -@samp{cortex-a76.cortex-a55}
> +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor},
> +@samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
> +@samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, 
> @samp{thunderxt81},
> +@samp{tsv110}, @samp{thunderxt83}, @samp{thunderx2t99},
> +@samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
> +@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
> +@samp{cortex-a75.cortex-a55}, @samp{cortex-a76.cortex-a55}
>  @samp{native}.
>  
>  The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
> 



Re: Simplify function types

2018-11-07 Thread Richard Biener
On Tue, 6 Nov 2018, Jan Hubicka wrote:

> Hi,
> this patch simplifies function types.  For GCC it cuts the number of type
> duplicates in half (to about 500 duplicates).  I need to analyze the
> remaining ones, but I think they are mostly caused by mixing up
> complete/incomplete enums and arrays of pointers; fixing those should be
> the last changes needed to avoid duplicated ODR types in the GCC bootstrap.
> 
> We are now down from 650MB to 450MB of ltrans files since my last
> report https://gcc.gnu.org/ml/gcc-patches/2018-10/msg02034.html
> 
>  phase opt and generate  :  39.37 ( 78%)   0.87 ( 13%)  40.27 ( 70%)       40 kB ( 29%)
>  phase stream in         :  10.34 ( 20%)   0.40 (  6%)  10.74 ( 19%)   980729 kB ( 70%)
>  ipa function summary    :   0.20 (  0%)   0.05 (  1%)   0.24 (  0%)    67974 kB (  5%)
>  ipa cp                  :   0.85 (  2%)   0.05 (  1%)   0.93 (  2%)   126839 kB (  9%)
>  ipa inlining heuristics :  30.71 ( 61%)   0.08 (  1%)  30.81 ( 53%)   119761 kB (  9%)
>  lto stream inflate      :   2.41 (  5%)   0.20 (  3%)   2.51 (  4%)        0 kB (  0%)
>  ipa lto gimple in       :   1.21 (  2%)   0.50 (  8%)   1.65 (  3%)   201610 kB ( 14%)
>  ipa lto gimple out      :   0.04 (  0%)   0.03 (  0%)   0.07 (  0%)        0 kB (  0%)
>  whopr partitioning      :   1.31 (  3%)   0.01 (  0%)   1.32 (  2%)     5338 kB (  0%)
>  ipa icf                 :   2.93 (  6%)   0.07 (  1%)   3.04 (  5%)    12830 kB (  1%)
>  TOTAL                   :  50.54          6.60         57.72         1393898 kB
> 
> Thus even relatively small improvements in type merging still translate
> to large improvements in overall ltrans stream size, because types keep
> duplicating everything else.
> 
> lto-bootstrapped/regtested x86_64-linux before some last minute change.
> Re-testing, OK if it passes?

OK.

Richard.

> Honza
> 
>   * tree.c (free_lang_data_in_type): Add fld parameter; simplify
>   return and parameter types of function and method types.
>   (free_lang_data_in_cgraph): Update.
> Index: tree.c
> ===
> --- tree.c(revision 265848)
> +++ tree.c(working copy)
> @@ -5261,7 +5261,7 @@ free_lang_data_in_binfo (tree binfo)
>  /* Reset all language specific information still present in TYPE.  */
>  
>  static void
> -free_lang_data_in_type (tree type)
> +free_lang_data_in_type (tree type, struct free_lang_data_d *fld)
>  {
>gcc_assert (TYPE_P (type));
>  
> @@ -5280,6 +5280,7 @@ free_lang_data_in_type (tree type)
>  
>if (TREE_CODE (type) == FUNCTION_TYPE)
>  {
> +  TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
>/* Remove the const and volatile qualifiers from arguments.  The
>C++ front end removes them, but the C front end does not,
>leading to false ODR violation errors when merging two
> @@ -5287,6 +5288,7 @@ free_lang_data_in_type (tree type)
>different front ends.  */
>for (tree p = TYPE_ARG_TYPES (type); p; p = TREE_CHAIN (p))
>   {
> +  TREE_VALUE (p) = fld_simplified_type (TREE_VALUE (p), fld);
> tree arg_type = TREE_VALUE (p);
>  
> if (TYPE_READONLY (arg_type) || TYPE_VOLATILE (arg_type))
> @@ -5295,16 +5297,22 @@ free_lang_data_in_type (tree type)
> & ~TYPE_QUAL_CONST
> & ~TYPE_QUAL_VOLATILE;
> TREE_VALUE (p) = build_qualified_type (arg_type, quals);
> -   free_lang_data_in_type (TREE_VALUE (p));
> +   free_lang_data_in_type (TREE_VALUE (p), fld);
>   }
> /* C++ FE uses TREE_PURPOSE to store initial values.  */
> TREE_PURPOSE (p) = NULL;
>   }
>  }
>else if (TREE_CODE (type) == METHOD_TYPE)
> -for (tree p = TYPE_ARG_TYPES (type); p; p = TREE_CHAIN (p))
> -  /* C++ FE uses TREE_PURPOSE to store initial values.  */
> -  TREE_PURPOSE (p) = NULL;
> +{
> +  TREE_TYPE (type) = fld_simplified_type (TREE_TYPE (type), fld);
> +  for (tree p = TYPE_ARG_TYPES (type); p; p = TREE_CHAIN (p))
> + {
> +   /* C++ FE uses TREE_PURPOSE to store initial values.  */
> +   TREE_VALUE (p) = fld_simplified_type (TREE_VALUE (p), fld);
> +   TREE_PURPOSE (p) = NULL;
> + }
> +}
>else if (RECORD_OR_UNION_TYPE_P (type))
>  {
>/* Remove members that are not FIELD_DECLs from the field list
> @@ -5985,7 +5994,7 @@ free_lang_data_in_cgraph (void)
>  
>/* Traverse every type found freeing its language data.  */
>FOR_EACH_VEC_ELT (fld.types, i, t)
> -free_lang_data_in_type (t);
> +free_lang_data_in_type (t, &fld);
>if (flag_checking)
>  {
>FOR_EACH_VEC_ELT (fld.types, i, t)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


[PATCH][arm] Add support for Arm Ares

2018-11-07 Thread Kyrill Tkachov

Hi all,

This adds support for the Arm Ares CPU in the arm port.
It implements the Armv8.2-A architecture with the optional features
of statistical profiling, dot product and FP16 on by default.

Note: Ares is a codename to enable early adopters and in time
we will add the final product name once it's announced.

Bootstrapped and tested on arm-none-linux-gnueabihf.

Will commit to trunk with the aarch64 patch once that is approved.

Thanks,
Kyrill

2018-11-07  Kyrylo Tkachov  

* config/arm/arm-cpus.in (ares): New entry.
* config/arm/arm-tables.opt: Regenerate.
* config/arm/arm-tune.md: Likewise.
* doc/invoke.texi (ARM Options): Document ares.
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index d82e95a226659948e59b317f07e0fd386ed674a2..b3163a90260c66a8df18d00282443434dee96e15 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -1376,6 +1376,17 @@ begin cpu cortex-a76
  part d0b
 end cpu cortex-a76
 
+begin cpu ares
+ cname ares
+ tune for cortex-a57
+ tune flags LDSCHED
+ architecture armv8.2-a+fp16+dotprod+simd
+ option crypto add FP_ARMv8 CRYPTO
+ costs cortex_a57
+ vendor 41
+ part d0c
+end cpu ares
+
 # ARMv8.2 A-profile ARM DynamIQ big.LITTLE implementations
 begin cpu cortex-a75.cortex-a55
  cname cortexa75cortexa55
diff --git a/gcc/config/arm/arm-tables.opt b/gcc/config/arm/arm-tables.opt
index eacee746a39912d04aa03c636f9a95e0e72ce43b..ceac4b4be419c9bd27db281e9880948ff5c40d76 100644
--- a/gcc/config/arm/arm-tables.opt
+++ b/gcc/config/arm/arm-tables.opt
@@ -282,6 +282,9 @@ Enum(processor_type) String(cortex-a75) Value( TARGET_CPU_cortexa75)
 EnumValue
 Enum(processor_type) String(cortex-a76) Value( TARGET_CPU_cortexa76)
 
+EnumValue
+Enum(processor_type) String(ares) Value( TARGET_CPU_ares)
+
 EnumValue
 Enum(processor_type) String(cortex-a75.cortex-a55) Value( TARGET_CPU_cortexa75cortexa55)
 
diff --git a/gcc/config/arm/arm-tune.md b/gcc/config/arm/arm-tune.md
index f64c1ef176de6c31659cce35326de8393e9cd886..2bd7e8741166af43f606cee1eb2cc3a0c712af29 100644
--- a/gcc/config/arm/arm-tune.md
+++ b/gcc/config/arm/arm-tune.md
@@ -49,7 +49,7 @@ (define_attr "tune"
 	cortexa72,cortexa73,exynosm1,
 	xgene1,cortexa57cortexa53,cortexa72cortexa53,
 	cortexa73cortexa35,cortexa73cortexa53,cortexa55,
-	cortexa75,cortexa76,cortexa75cortexa55,
-	cortexa76cortexa55,cortexm23,cortexm33,
-	cortexr52"
+	cortexa75,cortexa76,ares,
+	cortexa75cortexa55,cortexa76cortexa55,cortexm23,
+	cortexm33,cortexr52"
 	(const (symbol_ref "((enum attr_tune) arm_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5f051ed1acca32a6bd0bb673691a55b72b239c96..81c6232283b0607703f1f3381f1135ebdda36bfe 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -16693,8 +16693,8 @@ Permissible names are: @samp{arm2}, @samp{arm250},
 @samp{cortex-a9}, @samp{cortex-a12}, @samp{cortex-a15}, @samp{cortex-a17},
 @samp{cortex-a32}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{cortex-a76}, @samp{cortex-r4}, @samp{cortex-r4f}, @samp{cortex-r5},
-@samp{cortex-r7}, @samp{cortex-r8}, @samp{cortex-r52},
+@samp{cortex-a76}, @samp{ares}, @samp{cortex-r4}, @samp{cortex-r4f},
+@samp{cortex-r5}, @samp{cortex-r7}, @samp{cortex-r8}, @samp{cortex-r52},
 @samp{cortex-m33},
 @samp{cortex-m23},
 @samp{cortex-m7},


[PATCH][AArch64] Add -mcpu/-mtune support for Arm Ares

2018-11-07 Thread Kyrill Tkachov

Hi all,

This adds support for the Arm Ares CPU for AArch64.
It implements the Armv8.2-A architecture with the optional features
of statistical profiling, dot product and FP16 on by default.
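
For anyone who wants to check what a given -mcpu value turns on, a
small probe along these lines may help.  It is only a sketch: the
helper names are made up, it relies on the ACLE feature-test macros
__ARM_FEATURE_DOTPROD and __ARM_FEATURE_FP16_SCALAR_ARITHMETIC, and I
am assuming (not asserting) that building with -mcpu=ares defines both.

```c
#include <stdio.h>

/* Returns 1 if the compiler enabled the dot product extension for
   this build.  The macro is simply undefined when the feature (or the
   Arm target) is absent.  */
static int
has_dotprod (void)
{
#ifdef __ARM_FEATURE_DOTPROD
  return 1;
#else
  return 0;
#endif
}

/* Returns 1 if scalar FP16 arithmetic was enabled for this build.  */
static int
has_fp16_arith (void)
{
#ifdef __ARM_FEATURE_FP16_SCALAR_ARITHMETIC
  return 1;
#else
  return 0;
#endif
}

/* Print a one-line summary of the two features probed above.  */
static void
report_features (void)
{
  printf ("dotprod=%d fp16=%d\n", has_dotprod (), has_fp16_arith ());
}
```

Calling report_features () from main in a file compiled once with
-mcpu=cortex-a57 and once with -mcpu=ares should make any difference in
the default feature set visible.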

Note: Ares is a codename to enable early adopters and in time
we will add the final product name once it's announced.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

Thanks,
Kyrill

2018-11-07  Kyrylo Tkachov  

* config/aarch64/aarch64-cores.def (ares): Define.
* config/aarch64/aarch64-tune.md: Regenerate.
* doc/invoke.texi (AArch64 Options): Document ares value for mtune.
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index b1278fc263665e6ef4b175be8ab00502e7d005c4..34062a5e88683d5abe0d3ece64fa0f2a32a17cf6 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -89,6 +89,7 @@ AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR
 AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
 AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
 AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa72, 0x41, 0xd0b, -1)
+AARCH64_CORE("ares",  ares, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, cortexa72, 0x41, 0xd0c, -1)
 
 /* ARMv8.4-A Architecture Processors.  */
 
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
index ad52d89d247329e88a01626d2f03a370e8f75d58..fade1d4430ae3614c93e0f6af53e62064a5c5ef5 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
-	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+	"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
 	(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4253af27ec50d2968bd4168b95acd069302ebb61..5f051ed1acca32a6bd0bb673691a55b72b239c96 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15147,13 +15147,13 @@ Specify the name of the target processor for which GCC should tune the
 performance of the code.  Permissible values for this option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{cortex-a76}, @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx},
-@samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
-@samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},@samp{tsv110},
-@samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
-@samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
-@samp{cortex-a73.cortex-a53}, @samp{cortex-a75.cortex-a55},
-@samp{cortex-a76.cortex-a55}
+@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor},
+@samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
+@samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
+@samp{tsv110}, @samp{thunderxt83}, @samp{thunderx2t99},
+@samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},
+@samp{cortex-a73.cortex-a35}, @samp{cortex-a73.cortex-a53},
+@samp{cortex-a75.cortex-a55}, @samp{cortex-a76.cortex-a55}
 @samp{native}.
 
 The values @samp{cortex-a57.cortex-a53}, @samp{cortex-a72.cortex-a53},


Re: [PATCH v4 0/3] OpenRISC port

2018-11-07 Thread Richard Henderson
On 11/6/18 9:21 PM, Stafford Horne wrote:
> As you can see this is v4 of the OpenRISC port patch series, I just want to
> mention that there are a few things pointed out during the v3 review that I
> have not fixed, and do not plan to fix before pushing upstream.  These are
> either because I didn't feel they made the code easier to read or they were
> things that could wait until after upstreaming.  These include:
> 
> (not changed)
>  - libgcc !cmov 1cyc improvements suggested by Richard
>  - gcc eliminations refactorings suggested by Segher
>  - leaving out empty constraint strings suggested by Segher
>  - implementing TARGET_INSN_COST suggested by Segher
> 
> Please let me know if you have concerns; now onto the patches:
> 
> 
> Changes Since v3:
>  - Fix tabs formatting pointed out by Segher
>  - Fix comment formatting and typos pointed out by Segher
>  - Fix for sign/zero extension logic in md file from Richard
>  - Remove usages of ATTRIBUTE_UNUSED suggested by Segher
>  - Remove need for init/fini, removing crti/n.S files
>  - Add support for -static-pie in LINK_SPEC suggested by Szabolcs

All ok.  I agree that the other improvements you detail above can be handled
via normal development.


r~


Re: [PATCH] Verify that last argument of __builtin_expect_with_probability is a real cst (PR c/87811).

2018-11-07 Thread Martin Liška
On 11/5/18 7:00 PM, Martin Sebor wrote:
> On 11/01/2018 07:45 AM, Martin Liška wrote:
>> On 11/1/18 1:15 PM, Jakub Jelinek wrote:
>>> On Thu, Nov 01, 2018 at 01:09:16PM +0100, Martin Liška wrote:
>>>> -range 0.0 to 1.0, inclusive.
>>>> +range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
>>>> +a compiler time constant.
>>>
>>> When you say must, I think error_at should be used rather than warning_at.
>>> If others disagree I'm open for leaving it as is.
>>
>> Error is fine for me as well.
>>
>>>
>>>> @@ -2474,6 +2481,11 @@ expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>>>>     *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
>>>>     *probability = probi;
>>>>   }
>>>> +  else
>>>> +  warning_at (gimple_location (def), 0,
>>>> +  "probability argument %qE must be a in the "
>>>> +  "range 0.0 to 1.0", prob);
>>>
>>> Wrong indentation.
>>>
>>> And, no diagnostics for -O0 (which should also be covered by a testcase).
>>
>> Test for that added.
>>
>>>
>>>> +/* { dg-options "-O2 -fdump-tree-profile_estimate -frounding-math" } */
>>>
>>> Why the -frounding-math options?
>>
>> I remember I had some issue with:
>>   tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
>>     MULT_EXPR, t, prob, base);
>>
>> on targets with non-IEEE floating-point arithmetic (s390?).
>>
>>> I think test
>>> coverage should handle both that and when that option is not used
>>> if that option makes any difference.
>>
>> It will eventually pop up if we install new tests w/o rounding math.
>>
>>>
>>> Jakub
>>>
>>
>>
>> Martin
>>
> 
> I noticed a few minor issues in the hunks below:
> 
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -12046,7 +12046,8 @@
>  when testing pointer or floating-point values.
> 
>  This function has the same semantics as @code{__builtin_expect},
>  but the caller provides the expected probability that @var{exp} == @var{c}.
>  The last argument, @var{probability}, is a floating-point value in the
> -range 0.0 to 1.0, inclusive.
> +range 0.0 to 1.0, inclusive.  The @var{probability} argument must be
> +a compiler time constant.
> 
> The term is "compile-time constant" but please see below.
> 
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -2467,6 +2467,13 @@
>  expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>    base = build_real_from_int_cst (t, base);
>    tree r = fold_build2_initializer_loc (UNKNOWN_LOCATION,
>  MULT_EXPR, t, prob, base);
> +  if (TREE_CODE (r) != REAL_CST)
> +    {
> +  error_at (gimple_location (def),
> +    "probability argument %qE must be a compile "
> +    "time constant", prob);
> +  return NULL;
>     }
> 
> According to GCC coding conventions, when used as an adjective
> the term "compile-time" should be hyphenated.  But the term used
> in other diagnostics is either "constant integer" or "constant
> integer expressions" so I would suggest to use it instead, here
> and in the manual.
> 
> @@ -2474,6 +2481,11 @@
>  expr_expected_value_1 (tree type, tree op0, enum tree_code code,
>    *predictor = PRED_BUILTIN_EXPECT_WITH_PROBABILITY;
>    *probability = probi;
>  }
> +  else
> +    error_at (gimple_location (def),
> +  "probability argument %qE must be a in the "
> +  "range 0.0 to 1.0", prob);
> +
> 
> There's a stray 'a' in the text of the error.
> 
> But it's not really meaningful to say
> 
>   3.14 must be in the range 0.0 to 1.0
> 
> because that simply cannot happen.  We could say "argument 2 must
> be in the range" but I would instead suggest to rephrase the error
> along the same lines as other similar messages GCC already issues:
> 
>   "probability %qE is outside the range [0.0, 1.0]"
> 
> Martin

Hi Martin.

Thanks for the help with the wording. Please take a look at the attached
patch candidate.

Martin
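
For readers who have not used the builtin under discussion, the following is a minimal sketch of a call that satisfies the constraint the patch diagnoses. The helper name `checked_div` and the 0.9 probability are illustrative assumptions, not part of the patch; the only point is that the last argument is a floating-point constant in the range [0.0, 1.0].

```c
/* Hypothetical helper, not from the patch: return NUM / DEN, or
   FALLBACK when DEN is zero.  */
static int
checked_div (int num, int den, int fallback)
{
  /* The third argument must be a constant in the range [0.0, 1.0];
     here we claim that "den != 0" holds about 90% of the time.  */
  if (__builtin_expect_with_probability (den != 0, 1, 0.9))
    return num / den;
  return fallback;
}
```

Passing a run-time value or something like 3.14 as the last argument is what the reworded diagnostics in the attached patch are meant to reject.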
From 94b61505be171b6b16f7a85c62c722d3c9e13c2f Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 7 Nov 2018 10:27:00 +0100
Subject: [PATCH] Change wording of __builtin_expect_with_probability errors.

gcc/ChangeLog:

2018-11-07  Martin Liska  

	* doc/extend.texi: Reword.
	* predict.c (expr_expected_value_1): Likewise.

gcc/testsuite/ChangeLog:

2018-11-07  Martin Liska  

	* gcc.dg/pr87811.c: Update scanned pattern.
	* gcc.dg/pr87811-2.c: Likewise.
---
 gcc/doc/extend.texi              | 2 +-
 gcc/predict.c                    | 8 ++++----
 gcc/testsuite/gcc.dg/pr87811-2.c | 2 +-
 gcc/testsuite/gcc.dg/pr87811.c   | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7d16129..d6802ad3467 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12047,7 +12047,7 @@ This function has the same semantics as 

Re: [PATCH][RFC] Fix UBSAN in postreload-gcse.c (PR rtl-optimization/87868).

2018-11-07 Thread Martin Liška
On 11/6/18 7:55 PM, Jeff Law wrote:
> On 11/6/18 7:05 AM, Martin Liška wrote:
>> Hi.
>>
>> The patch adds an overflow check in eliminate_partially_redundant_load.
>> The question is whether conditional compilation on __builtin_mul_overflow
>> is fine?
>>
>> Thanks,
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2018-11-06  Martin Liska  
>>
>>  PR rtl-optimization/87868
>>  * postreload-gcse.c (eliminate_partially_redundant_load): Set
>>  threshold to max_count if we would overflow.
>>  * profile-count.h: Make max_count a public constant.
> OK.  Though I do worry about how many of these things we'll have to
> sprinkle over the sources over time.  I suspect there's all kinds of
> overflows just waiting to happen, some are obviously more important than
> others.

Sure! Btw. I've been preparing a patch that will limit some of the --param
values, as they tend to overflow if a user selects a big-enough value.

Martin
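
The pattern discussed above (clamp to a maximum when a multiplication would overflow, and only use the builtin when the host compiler provides it) can be sketched roughly as follows. This is not the code from postreload-gcse.c: `SKETCH_MAX_COUNT` and `scaled_threshold` are hypothetical names, and the operands are assumed to be non-negative.

```c
#include <stdint.h>

/* Hypothetical stand-in for the profile-count maximum; the real constant
   lives in profile-count.h and its value is not shown in this thread.  */
#define SKETCH_MAX_COUNT INT64_MAX

/* Return COUNT * FRACTION, clamped to SKETCH_MAX_COUNT when the product
   would overflow.  Both operands are assumed to be non-negative.  */
static int64_t
scaled_threshold (int64_t count, int64_t fraction)
{
#if defined (__GNUC__) && __GNUC__ >= 5
  /* __builtin_mul_overflow returns true if the product does not fit.  */
  int64_t result;
  if (__builtin_mul_overflow (count, fraction, &result))
    return SKETCH_MAX_COUNT;
  return result;
#else
  /* Fallback for host compilers without the builtin.  */
  if (fraction != 0 && count > SKETCH_MAX_COUNT / fraction)
    return SKETCH_MAX_COUNT;
  return count * fraction;
#endif
}
```

The `#if` guard is the kind of conditional compilation the original question asks about; the fallback branch keeps the same clamping behaviour for non-negative inputs.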

> 
> jeff
> 



Re: [PATCH 2/4] Fix GNU coding style.

2018-11-07 Thread Martin Liška
On 11/7/18 10:17 AM, Jakub Jelinek wrote:
> On Wed, Nov 07, 2018 at 10:12:17AM +0100, Martin Liška wrote:
>>/* Register memory allocation descriptor for container PTR.  ORIGIN identifies
>>   type of container and GGC identifes if the allocation is handled in GGC
>>   memory.  Each location is identified by file NAME, LINE in source code and
>>   FUNCTION name.  */
>> -  T * register_descriptor (const void *ptr, mem_alloc_origin origin,
>> +  T *register_descriptor (const void *ptr, mem_alloc_origin origin,
>> bool ggc, const char *name, int line,
>> const char *function);
> 
> This can't be right, if you move the ( one column to the left, then the
> following lines need to be moved one column to the left too.  Likewise
> below:

Thanks, I noticed that and installed the proper version.

Martin

> 
>> @@ -342,7 +342,7 @@ public:
>>/* Release PTR pointer of SIZE bytes. If REMOVE_FROM_MAP is set to true,
>>   remove the instance from reverse map.  Return memory usage that belongs
>>   to this memory description.  */
>> -  T * release_instance_overhead (void *ptr, size_t size,
>> +  T *release_instance_overhead (void *ptr, size_t size,
>>   bool remove_from_map = false);
>>  
>>/* Release intance object identified by PTR pointer.  */
>> @@ -355,7 +355,7 @@ public:
>>   are filtered by ORIGIN type, LENGTH is return value where we register
>>   the number of elements in the list. If we want to process custom order,
>>   CMP comparator can be provided.  */
>> -  mem_list_t * get_list (mem_alloc_origin origin, unsigned *length,
>> +  mem_list_t *get_list (mem_alloc_origin origin, unsigned *length,
>>   int (*cmp) (const void *first,
>>   const void *second) = NULL);
>>  
> 
>   Jakub
> 



[PATCH v4 6/6,Committed] [MIPS] Add Loongson 2K1000 processor support

2018-11-07 Thread Paul Hua
On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From 7ab0637b28b22bdb00e021692ceb8372855c8a87 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Wed, 7 Nov 2018 09:38:09 +0800
Subject: [PATCH 6/6] Add support for Loongson 2K1000 processor.

gcc/
	* config/mips/gs264e.md: New.
	* config/mips/mips-cpus.def: Define gs264e.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_rtx_cost_data): Add DEFAULT_COSTS for
	gs264e.
	(mips_issue_rate): Add support for gs264e.
	(mips_multipass_dfa_lookahead): Likewise.
	* config/mips/mips.h: Define TARGET_GS264E and TUNE_GS264E.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs264e.
	(MIPS_ASE_MSA_SPEC): New.
	(BASE_DRIVER_SELF_SPECS): march=gs264e implies -mmsa.
	(ISA_HAS_FUSED_MADD4): Enable for TARGET_GS264E.
	(ISA_HAS_UNFUSED_MADD4): Exclude TARGET_GS264E.
	* config/mips/mips.md: Include gs264e.md.
	(processor): Add gs264e.
	* config/mips/mips.opt (MSA): Use Mask instead of Var.
	* doc/invoke.texi: Add gs264e to supported architectures.
---
 gcc/config/mips/gs264e.md   | 133 
 gcc/config/mips/mips-cpus.def   |   1 +
 gcc/config/mips/mips-tables.opt |  19 +++--
 gcc/config/mips/mips.c  |   6 +-
 gcc/config/mips/mips.h  |  23 --
 gcc/config/mips/mips.md |   2 +
 gcc/config/mips/mips.opt|   2 +-
 gcc/doc/invoke.texi |   2 +-
 8 files changed, 171 insertions(+), 17 deletions(-)
 create mode 100644 gcc/config/mips/gs264e.md

diff --git a/gcc/config/mips/gs264e.md b/gcc/config/mips/gs264e.md
new file mode 100644
index 000..8f1f9e17e08
--- /dev/null
+++ b/gcc/config/mips/gs264e.md
@@ -0,0 +1,133 @@
+;; Pipeline model for Loongson gs264e cores.
+
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs264e_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs264e_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs264e_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs264e_alu1" "gs264e_a_alu")
+(define_cpu_unit "gs264e_mem1" "gs264e_a_mem")
+(define_cpu_unit "gs264e_falu1" "gs264e_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs264e_arith" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_branch" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_mfhilo" 1
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs264e_alu1")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs264e_imul3nc" 7
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "imul3nc"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_imul" 7
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "imul,imadd"))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_idiv_si" 12
+  (and (eq_attr "cpu" "gs264e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_idiv_di" 25
+  (and (eq_attr "cpu" "gs264e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs264e_alu1")
+
+(define_insn_reservation "gs264e_load" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "load"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_fpload" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_prefetch" 0
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_store" 0
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs264e_mem1")
+
+(define_insn_reservation "gs264e_fadd" 4
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "fadd,fmul,fmadd"))
+  "gs264e_falu1")
+
+(define_insn_reservation "gs264e_fcmp" 2
+  (and (eq_attr "cpu" "gs264e")
+   (eq_attr "type" "fabs,fcmp,fmove,fneg"))
+  "gs264e_falu1")
+
+(define_insn_reservation 

[PATCH v4 4/6, Committed] [MIPS] Add Loongson 3A1000 processor support

2018-11-07 Thread Paul Hua
On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From ef10d77f03e693299611e6b4eee2ae6375a5841d Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Tue, 6 Nov 2018 21:12:46 +0800
Subject: [PATCH 4/6] Add support for Loongson 3A1000 processor.

gcc/
	* config/mips/loongson3a.md: Rename to ...
	* config/mips/gs464.md: ... here.
	* config/mips/mips-cpus.def: Define gs464; Add loongson3a
	as an alias of gs464 processor.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_issue_rate): Use PROCESSOR_GS464
	instead of PROCESSOR_LOONGSON_3A.
	(mips_multipass_dfa_lookahead): Use TUNE_GS464 instead of
	TUNE_LOONGSON_3A.
	(mips_option_override): Enable MMI and EXT for gs464.
	* config/mips/mips.h: Rename TARGET_LOONGSON_3A to TARGET_GS464;
	Rename TUNE_LOONGSON_3A to TUNE_GS464.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs464.
	(ISA_HAS_ODD_SPREG, ISA_AVOID_DIV_HILO, ISA_HAS_FUSED_MADD4,
	ISA_HAS_UNFUSED_MADD4): Use TARGET_GS464 instead of
	TARGET_LOONGSON_3A.
	* config/mips/mips.md: Include gs464.md instead of loongson3a.md.
	(processor): Add gs464;
	* doc/invoke.texi: Add gs464 to supported architectures.
---
 gcc/config/mips/gs464.md| 137 
 gcc/config/mips/loongson3a.md   | 137 
 gcc/config/mips/mips-cpus.def   |   3 +-
 gcc/config/mips/mips-tables.opt |  19 +++--
 gcc/config/mips/mips.c  |   6 +-
 gcc/config/mips/mips.h  |  17 ++--
 gcc/config/mips/mips.md |   4 +-
 gcc/doc/invoke.texi |   2 +-
 8 files changed, 165 insertions(+), 160 deletions(-)
 create mode 100644 gcc/config/mips/gs464.md
 delete mode 100644 gcc/config/mips/loongson3a.md

diff --git a/gcc/config/mips/gs464.md b/gcc/config/mips/gs464.md
new file mode 100644
index 000..82efb66786f
--- /dev/null
+++ b/gcc/config/mips/gs464.md
@@ -0,0 +1,137 @@
+;; Pipeline model for Loongson gs464 cores.
+
+;; Copyright (C) 2011-2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs464_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs464_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs464_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs464_alu1" "gs464_a_alu")
+(define_cpu_unit "gs464_alu2" "gs464_a_alu")
+(define_cpu_unit "gs464_mem" "gs464_a_mem")
+(define_cpu_unit "gs464_falu1" "gs464_a_falu")
+(define_cpu_unit "gs464_falu2" "gs464_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs464_arith" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs464_alu1 | gs464_alu2")
+
+(define_insn_reservation "gs464_branch" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs464_alu1")
+
+(define_insn_reservation "gs464_mfhilo" 1
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs464_alu2")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs464_imul3nc" 5
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "imul3nc"))
+  "gs464_alu2")
+
+(define_insn_reservation "gs464_imul" 7
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "imul,imadd"))
+  "gs464_alu2 * 7")
+
+(define_insn_reservation "gs464_idiv_si" 12
+  (and (eq_attr "cpu" "gs464")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs464_alu2 * 12")
+
+(define_insn_reservation "gs464_idiv_di" 25
+  (and (eq_attr "cpu" "gs464")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs464_alu2 * 25")
+
+(define_insn_reservation "gs464_load" 3
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "load"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_fpload" 4
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_prefetch" 0
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs464_mem")
+
+(define_insn_reservation "gs464_store" 0
+  (and (eq_attr "cpu" "gs464")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs464_mem")
+
+;; All the fp operations can 

Re: [PATCH 2/4] Fix GNU coding style.

2018-11-07 Thread Jakub Jelinek
On Wed, Nov 07, 2018 at 10:12:17AM +0100, Martin Liška wrote:
>/* Register memory allocation descriptor for container PTR.  ORIGIN identifies
>   type of container and GGC identifes if the allocation is handled in GGC
>   memory.  Each location is identified by file NAME, LINE in source code and
>   FUNCTION name.  */
> -  T * register_descriptor (const void *ptr, mem_alloc_origin origin,
> +  T *register_descriptor (const void *ptr, mem_alloc_origin origin,
>  bool ggc, const char *name, int line,
>  const char *function);

This can't be right, if you move the ( one column to the left, then the
following lines need to be moved one column to the left too.  Likewise
below:

> @@ -342,7 +342,7 @@ public:
>/* Release PTR pointer of SIZE bytes. If REMOVE_FROM_MAP is set to true,
>   remove the instance from reverse map.  Return memory usage that belongs
>   to this memory description.  */
> -  T * release_instance_overhead (void *ptr, size_t size,
> +  T *release_instance_overhead (void *ptr, size_t size,
>bool remove_from_map = false);
>  
>/* Release intance object identified by PTR pointer.  */
> @@ -355,7 +355,7 @@ public:
>   are filtered by ORIGIN type, LENGTH is return value where we register
>   the number of elements in the list. If we want to process custom order,
>   CMP comparator can be provided.  */
> -  mem_list_t * get_list (mem_alloc_origin origin, unsigned *length,
> +  mem_list_t *get_list (mem_alloc_origin origin, unsigned *length,
>int (*cmp) (const void *first,
>const void *second) = NULL);
>  

Jakub
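
To illustrate the alignment rule being pointed out, here is a small sketch in GNU style; `find_first` is a made-up function, not taken from the patch. Continuation lines of a parameter list start in the column right after the opening parenthesis, so whenever the text before the `(` shifts by one column the continuation lines must shift with it.

```c
#include <stddef.h>

/* Declaration: the second line is aligned with the first parameter.  */
static const int *find_first (const int *values, size_t length,
                              int needle);

/* Return a pointer to the first element of VALUES equal to NEEDLE, or
   NULL if none of the LENGTH elements matches.  */
static const int *
find_first (const int *values, size_t length,
            int needle)
{
  for (size_t i = 0; i < length; i++)
    if (values[i] == needle)
      return &values[i];
  return NULL;
}
```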


[PATCH v4 5/6,Committed] [MIPS] Add Loongson 3A2000/3A3000 processor support

2018-11-07 Thread Paul Hua
On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From 51c914e8c2b2e4c7cc93718e563a8f55f0161ff9 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Wed, 7 Nov 2018 09:27:05 +0800
Subject: [PATCH 5/6] Add support for Loongson 3A2000/3A3000 processor.

gcc/
	* config/mips/gs464e.md: New.
	* config/mips/mips-cpus.def: Define gs464e.
	* config/mips/mips-tables.opt: Regenerate.
	* config/mips/mips.c (mips_rtx_cost_data): Add DEFAULT_COSTS for
	gs464e.
	(mips_issue_rate): Add support for gs464e.
	(mips_multipass_dfa_lookahead): Likewise.
	(mips_option_override): Enable MMI, EXT and EXT2 for gs464e.
	* config/mips/mips.h: Define TARGET_GS464E and TUNE_GS464E.
	(MIPS_ISA_LEVEL_SPEC): Infer mips64r2 from gs464e.
	(ISA_HAS_FUSED_MADD4): Enable for TARGET_GS464E.
	(ISA_HAS_UNFUSED_MADD4): Exclude TARGET_GS464E.
	* config/mips/mips.md: Include gs464e.md.
	(processor): Add gs464e.
	* doc/invoke.texi: Add gs464e to supported architectures.
---
 gcc/config/mips/gs464e.md   | 137 
 gcc/config/mips/mips-cpus.def   |   1 +
 gcc/config/mips/mips-tables.opt |  19 +++--
 gcc/config/mips/mips.c  |   6 +-
 gcc/config/mips/mips.h  |  13 ++-
 gcc/config/mips/mips.md |   2 +
 gcc/doc/invoke.texi |   1 +
 7 files changed, 166 insertions(+), 13 deletions(-)
 create mode 100644 gcc/config/mips/gs464e.md

diff --git a/gcc/config/mips/gs464e.md b/gcc/config/mips/gs464e.md
new file mode 100644
index 000..60e0e6b0463
--- /dev/null
+++ b/gcc/config/mips/gs464e.md
@@ -0,0 +1,137 @@
+;; Pipeline model for Loongson gs464e cores.
+
+;; Copyright (C) 2018 Free Software Foundation, Inc.
+;;
+;; This file is part of GCC.
+;;
+;; GCC is free software; you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published
+;; by the Free Software Foundation; either version 3, or (at your
+;; option) any later version.
+;;
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+;; Uncomment the following line to output automata for debugging.
+;; (automata_option "v")
+
+;; Automaton for integer instructions.
+(define_automaton "gs464e_a_alu")
+
+;; Automaton for floating-point instructions.
+(define_automaton "gs464e_a_falu")
+
+;; Automaton for memory operations.
+(define_automaton "gs464e_a_mem")
+
+;; Describe the resources.
+
+(define_cpu_unit "gs464e_alu1" "gs464e_a_alu")
+(define_cpu_unit "gs464e_alu2" "gs464e_a_alu")
+(define_cpu_unit "gs464e_mem1" "gs464e_a_mem")
+(define_cpu_unit "gs464e_mem2" "gs464e_a_mem")
+(define_cpu_unit "gs464e_falu1" "gs464e_a_falu")
+(define_cpu_unit "gs464e_falu2" "gs464e_a_falu")
+
+;; Describe instruction reservations.
+
+(define_insn_reservation "gs464e_arith" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "arith,clz,const,logical,
+			move,nop,shift,signext,slt"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_branch" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "branch,jump,call,condmove,trap"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_mfhilo" 1
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "mfhi,mflo,mthi,mtlo"))
+  "gs464e_alu1 | gs464e_alu2")
+
+;; Operation imul3nc is fully pipelined.
+(define_insn_reservation "gs464e_imul3nc" 5
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "imul3nc"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_imul" 7
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "imul,imadd"))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_idiv_si" 12
+  (and (eq_attr "cpu" "gs464e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "SI")))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_idiv_di" 25
+  (and (eq_attr "cpu" "gs464e")
+   (and (eq_attr "type" "idiv")
+	(eq_attr "mode" "DI")))
+  "gs464e_alu1 | gs464e_alu2")
+
+(define_insn_reservation "gs464e_load" 4
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "load"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_fpload" 5
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "load,mfc,mtc"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_prefetch" 0
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "prefetch,prefetchx"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_store" 0
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "store,fpstore,fpidxstore"))
+  "gs464e_mem1 | gs464e_mem2")
+
+(define_insn_reservation "gs464e_fadd" 4
+  (and (eq_attr "cpu" "gs464e")
+   (eq_attr "type" "fadd,fmul,fmadd"))
+  

[PATCH v4 3/6,Committed] [MIPS] Add Loongson EXTensions R2 (EXT2) instructions support

2018-11-07 Thread Paul Hua
On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From 73a4aac5034307cf7369bb70fa407709502fffbf Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Fri, 31 Aug 2018 11:55:48 +0800
Subject: [PATCH 3/6] Add support for Loongson EXT2 instructions.

gcc/
	* config/mips/mips-protos.h
	(mips_loongson_ext2_prefetch_cookie): New prototype.
	* config/mips/mips.c (mips_loongson_ext2_prefetch_cookie): New.
	(mips_option_override): Enable TARGET_LOONGSON_EXT when
	TARGET_LOONGSON_EXT2 is true.
	* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Define
	__mips_loongson_ext2, __mips_loongson_ext_rev=2.
	(ISA_HAS_CTZ_CTO): New, true if TARGET_LOONGSON_EXT2.
	(ISA_HAS_PREFETCH): Include TARGET_LOONGSON_EXT and
	TARGET_LOONGSON_EXT2.
	(ASM_SPEC): Add mloongson-ext2 and mno-loongson-ext2.
	(define_insn "ctz2"): New insn pattern.
	(define_insn "prefetch"): Include TARGET_LOONGSON_EXT2.
	(define_insn "prefetch_indexed_"): Include
	TARGET_LOONGSON_EXT and TARGET_LOONGSON_EXT2.
	* config/mips/mips.opt (-mloongson-ext2): Add option.
	* gcc/doc/invoke.texi (-mloongson-ext2): Document.

gcc/testsuite/
	* gcc.target/mips/loongson-ctz.c: New test.
	* gcc.target/mips/loongson-dctz.c: Likewise.
	* gcc.target/mips/mips.exp (mips_option_groups): Add
	-mloongson-ext2 option.
---
 gcc/config/mips/mips-protos.h |  1 +
 gcc/config/mips/mips.c| 28 +++
 gcc/config/mips/mips.h| 15 +-
 gcc/config/mips/mips.md   | 47 +--
 gcc/config/mips/mips.opt  |  4 ++
 gcc/doc/invoke.texi   |  7 +++
 gcc/testsuite/gcc.target/mips/loongson-ctz.c  | 11 +
 gcc/testsuite/gcc.target/mips/loongson-dctz.c | 11 +
 gcc/testsuite/gcc.target/mips/mips.exp|  1 +
 9 files changed, 120 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/loongson-ctz.c
 create mode 100644 gcc/testsuite/gcc.target/mips/loongson-dctz.c

diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index 099120db7b4..7cde2424016 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -323,6 +323,7 @@ extern bool mips_linked_madd_p (rtx_insn *, rtx_insn *);
 extern bool mips_store_data_bypass_p (rtx_insn *, rtx_insn *);
 extern int mips_dspalu_bypass_p (rtx, rtx);
 extern rtx mips_prefetch_cookie (rtx, rtx);
+extern rtx mips_loongson_ext2_prefetch_cookie (rtx, rtx);
 
 extern const char *current_section_name (void);
 extern unsigned int current_section_flags (void);
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index b579c3c3a2a..1c2075044d0 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -15142,6 +15142,22 @@ mips_prefetch_cookie (rtx write, rtx locality)
   /* store_retained / load_retained.  */
   return GEN_INT (INTVAL (write) + 6);
 }
+
+/* Loongson EXT2 only implements perf hint=0 (prefetch for load) and hint=1
+   (prefetch for store), other hint just scale to hint = 0 and hint = 1.  */
+
+rtx
+mips_loongson_ext2_prefetch_cookie (rtx write, rtx locality)
+{
+  /* store.  */
+  if (INTVAL (write) == 1)
+return GEN_INT (INTVAL (write));
+
+  /* load.  */
+  if (INTVAL (write) == 0)
+return GEN_INT (INTVAL (write));
+}
+
 
 /* Flags that indicate when a built-in function is available.
 
@@ -20171,6 +20187,18 @@ mips_option_override (void)
   if (TARGET_LOONGSON_MMI &&  !TARGET_HARD_FLOAT_ABI)
 error ("%<-mloongson-mmi%> must be used with %<-mhard-float%>");
 
+  /* If TARGET_LOONGSON_EXT2, enable TARGET_LOONGSON_EXT.  */
+  if (TARGET_LOONGSON_EXT2)
+{
+  /* Make sure that when TARGET_LOONGSON_EXT2 is true, TARGET_LOONGSON_EXT
+	 is true.  If a user explicitly says -mloongson-ext2 -mno-loongson-ext
+	 then that is an error.  */
+  if (!TARGET_LOONGSON_EXT
+	  && !((target_flags_explicit & MASK_LOONGSON_EXT) == 0))
+	error ("%<-mloongson-ext2%> must be used with %<-mloongson-ext%>");
+  target_flags |= MASK_LOONGSON_EXT;
+}
+
   /* .eh_frame addresses should be the same width as a C pointer.
  Most MIPS ABIs support only one pointer size, so the assembler
  will usually know exactly how big an .eh_frame address is.
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 7237c8da8ac..beeb4bcf20d 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -600,6 +600,13 @@ struct mips_cpu_info {
   if (TARGET_LOONGSON_EXT)		\
 	{\
 	  builtin_define ("__mips_loongson_ext");			\
+	  if (TARGET_LOONGSON_EXT2)	\
+	{\
+	  builtin_define ("__mips_loongson_ext2");			\
+	  builtin_define ("__mips_loongson_ext_rev=2");		\
+	}\
+	  else\
+	  builtin_define ("__mips_loongson_ext_rev=1");		\
 	}\
 	\
   /* Historical Octeon macro.  */	\
@@ -1134,6 +1141,9 @@ struct mips_cpu_info {
 /* ISA has count leading zeroes/ones instruction (not implemented).  */
 #define ISA_HAS_CLZ_CLO		

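For readers skimming the mips_option_override hunk above, the EXT2/EXT dependency can be modelled outside GCC. This is only a sketch: `MASK_EXT`, `MASK_EXT2`, `struct flags` and `resolve_ext2` are hypothetical stand-ins invented for illustration, not the real `target_flags` machinery generated from mips.opt.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the MASK_LOONGSON_EXT / MASK_LOONGSON_EXT2
   bits; the real masks are generated from mips.opt.  */
#define MASK_EXT  (1u << 0)
#define MASK_EXT2 (1u << 1)

struct flags
{
  unsigned target;    /* Current option state.  */
  unsigned explicit_; /* Bits the user set or cleared explicitly.  */
};

/* Return false when the hunk would call error (), i.e. when EXT2 is
   requested while EXT was explicitly turned off.  As in the hunk, EXT is
   forced on in every case where EXT2 is set.  */
static bool
resolve_ext2 (struct flags *f)
{
  bool ok = true;

  if (f->target & MASK_EXT2)
    {
      if (!(f->target & MASK_EXT) && (f->explicit_ & MASK_EXT))
	ok = false;
      f->target |= MASK_EXT;
    }
  return ok;
}
```

Under this model, -mloongson-ext2 on its own silently enables EXT, while -mloongson-ext2 -mno-loongson-ext is the rejected combination.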
Re: [PATCH 2/4] Fix GNU coding style.

2018-11-07 Thread Martin Liška
The version I've just installed.

M.
From ff99a2f57f3372c66aa25ed0c8e36697b49fee56 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 7 Nov 2018 10:05:41 +0100
Subject: [PATCH] Fix GNU coding style (V2).

gcc/ChangeLog:

2018-11-07  Martin Liska  

	* mem-stats.h: Fix GNU coding style.
---
 gcc/mem-stats.h | 22 +++---
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/mem-stats.h b/gcc/mem-stats.h
index 10b41651bf3..6ab92211cf4 100644
--- a/gcc/mem-stats.h
+++ b/gcc/mem-stats.h
@@ -316,23 +316,23 @@ public:
   bool contains_descriptor_for_instance (const void *ptr);
 
   /* Return descriptor for instance PTR.  */
-  T * get_descriptor_for_instance (const void *ptr);
+  T *get_descriptor_for_instance (const void *ptr);
 
   /* Register memory allocation descriptor for container PTR which is
  described by a memory LOCATION.  */
-  T * register_descriptor (const void *ptr, mem_location *location);
+  T *register_descriptor (const void *ptr, mem_location *location);
 
   /* Register memory allocation descriptor for container PTR.  ORIGIN identifies
  type of container and GGC identifes if the allocation is handled in GGC
  memory.  Each location is identified by file NAME, LINE in source code and
  FUNCTION name.  */
-  T * register_descriptor (const void *ptr, mem_alloc_origin origin,
-			   bool ggc, const char *name, int line,
-			   const char *function);
+  T *register_descriptor (const void *ptr, mem_alloc_origin origin,
+			  bool ggc, const char *name, int line,
+			  const char *function);
 
   /* Register instance overhead identified by PTR pointer. Allocation takes
  SIZE bytes.  */
-  T * register_instance_overhead (size_t size, const void *ptr);
+  T *register_instance_overhead (size_t size, const void *ptr);
 
   /* For containers (and GGC) where we want to track every instance object,
  we register allocation of SIZE bytes, identified by PTR pointer, belonging
@@ -342,8 +342,8 @@ public:
   /* Release PTR pointer of SIZE bytes. If REMOVE_FROM_MAP is set to true,
  remove the instance from reverse map.  Return memory usage that belongs
  to this memory description.  */
-  T * release_instance_overhead (void *ptr, size_t size,
- bool remove_from_map = false);
+  T *release_instance_overhead (void *ptr, size_t size,
+bool remove_from_map = false);
 
   /* Release intance object identified by PTR pointer.  */
   void release_object_overhead (void *ptr);
@@ -355,9 +355,9 @@ public:
  are filtered by ORIGIN type, LENGTH is return value where we register
  the number of elements in the list. If we want to process custom order,
  CMP comparator can be provided.  */
-  mem_list_t * get_list (mem_alloc_origin origin, unsigned *length,
-			 int (*cmp) (const void *first,
- const void *second) = NULL);
+  mem_list_t *get_list (mem_alloc_origin origin, unsigned *length,
+			int (*cmp) (const void *first,
+const void *second) = NULL);
 
   /* Dump all tracked instances of type ORIGIN. If we want to process custom
  order, CMP comparator can be provided.  */
-- 
2.19.1



[PATCH v4 2/6, Committed] [MIPS] Split Loongson EXTensions (EXT) instructions from loongson3a

2018-11-07 Thread Paul Hua
On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From b1dfcb228934e3cde90f408056192ed7faff4417 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Tue, 6 Nov 2018 17:04:36 +0800
Subject: [PATCH 2/6] Add support for Loongson EXT instructions.

gcc/
	* config/mips/mips.h (TARGET_CPU_CPP_BUILTINS): Add
	__mips_loongson_ext.
	(MIPS_ASE_LOONGSON_EXT_SPEC): New.
	(BASE_DRIVER_SELF_SPECS): march=loongson3a implies
	-mloongson-ext.
	(ASM_SPEC): Add mloongson-ext and mno-loongson-ext.
	* config/mips/mips.md (mul3, mul3_mul3_nohilo,
	div3, mod3, prefetch): Use TARGET_LOONGSON_EXT
	instead of TARGET_LOONGSON_3A.
	* config/mips/mips.opt (-mloongson-ext): Add option.
	* gcc/doc/invoke.texi (-mloongson-ext): Document.

gcc/testsuite/
	* gcc.target/mips/mips.exp (mips_option_groups): Add
	-mloongson-ext option.
	(mips-dg-options): Add mips_option_dependency options
	"-mmicromips" vs "-mno-loongson-ext".
---
 gcc/config/mips/mips.h | 14 +-
 gcc/config/mips/mips.md| 16 
 gcc/config/mips/mips.opt   |  4 
 gcc/doc/invoke.texi|  7 +++
 gcc/testsuite/gcc.target/mips/mips.exp |  2 ++
 5 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 27c0222ee46..7237c8da8ac 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -596,6 +596,12 @@ struct mips_cpu_info {
 	  builtin_define ("__mips_loongson_mmi");			\
 	}\
 	\
+  /* Whether Loongson EXT modes are enabled.  */			\
+  if (TARGET_LOONGSON_EXT)		\
+	{\
+	  builtin_define ("__mips_loongson_ext");			\
+	}\
+	\
   /* Historical Octeon macro.  */	\
   if (TARGET_OCTEON)		\
 	builtin_define ("__OCTEON__");	\
@@ -881,7 +887,8 @@ struct mips_cpu_info {
 #define BASE_DRIVER_SELF_SPECS	\
   MIPS_ISA_NAN2008_SPEC,	\
   MIPS_ASE_DSP_SPEC, 		\
-  MIPS_ASE_LOONGSON_MMI_SPEC
+  MIPS_ASE_LOONGSON_MMI_SPEC,	\
+  MIPS_ASE_LOONGSON_EXT_SPEC
 
 #define MIPS_ASE_DSP_SPEC \
   "%{!mno-dsp: \
@@ -893,6 +900,10 @@ struct mips_cpu_info {
   "%{!mno-loongson-mmi:\
  %{march=loongson2e|march=loongson2f|march=loongson3a: -mloongson-mmi}}"
 
+#define MIPS_ASE_LOONGSON_EXT_SPEC		\
+  "%{!mno-loongson-ext:\
+ %{march=loongson3a: -mloongson-ext}}"
+
 #define DRIVER_SELF_SPECS \
   MIPS_ISA_LEVEL_SPEC,	  \
   BASE_DRIVER_SELF_SPECS
@@ -1367,6 +1378,7 @@ struct mips_cpu_info {
 %{mginv} %{mno-ginv} \
 %{mmsa} %{mno-msa} \
 %{mloongson-mmi} %{mno-loongson-mmi} \
+%{mloongson-ext} %{mno-loongson-ext} \
 %{msmartmips} %{mno-smartmips} \
 %{mmt} %{mno-mt} \
 %{mfix-rm7000} %{mno-fix-rm7000} \
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index a88c1c53134..4b7a627b7a6 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -1599,7 +1599,7 @@
 {
   rtx lo;
 
-  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6MUL)
+  if (TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6MUL)
 emit_insn (gen_mul3_mul3_nohilo (operands[0], operands[1],
 	   operands[2]));
   else if (ISA_HAS_MUL3)
@@ -1622,11 +1622,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=d")
 (mult:GPR (match_operand:GPR 1 "register_operand" "d")
   (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6MUL"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6MUL"
 {
   if (TARGET_LOONGSON_2EF)
 return "multu.g\t%0,%1,%2";
-  else if (TARGET_LOONGSON_3A)
+  else if (TARGET_LOONGSON_EXT)
 return "gsmultu\t%0,%1,%2";
   else
 return "mul\t%0,%1,%2";
@@ -3016,11 +3016,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=")
 	(any_div:GPR (match_operand:GPR 1 "register_operand" "d")
 		 (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6DIV"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6DIV"
   {
 if (TARGET_LOONGSON_2EF)
   return mips_output_division ("div.g\t%0,%1,%2", operands);
-else if (TARGET_LOONGSON_3A)
+else if (TARGET_LOONGSON_EXT)
   return mips_output_division ("gsdiv\t%0,%1,%2", operands);
 else
   return mips_output_division ("div\t%0,%1,%2", operands);
@@ -3032,11 +3032,11 @@
   [(set (match_operand:GPR 0 "register_operand" "=")
 	(any_mod:GPR (match_operand:GPR 1 "register_operand" "d")
 		 (match_operand:GPR 2 "register_operand" "d")))]
-  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_3A || ISA_HAS_R6DIV"
+  "TARGET_LOONGSON_2EF || TARGET_LOONGSON_EXT || ISA_HAS_R6DIV"
   {
 if (TARGET_LOONGSON_2EF)
   return mips_output_division ("mod.g\t%0,%1,%2", operands);
-else if (TARGET_LOONGSON_3A)
+else if (TARGET_LOONGSON_EXT)
   return mips_output_division ("gsmod\t%0,%1,%2", operands);
 else
   return mips_output_division ("mod\t%0,%1,%2", operands);
@@ -7136,7 +7136,7 @@
 	 

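As a usage note for the TARGET_CPU_CPP_BUILTINS change above: user code can test the predefined macros `__mips_loongson_mmi` and `__mips_loongson_ext` that appear in that hunk. The helper name `loongson_ase_summary` below is made up for illustration; this is a minimal sketch, not something taken from the patch.

```c
/* Return a short description of the Loongson ASEs the compiler enabled
   for this translation unit.  On non-Loongson targets neither macro is
   defined and the function returns "none".  */
static const char *
loongson_ase_summary (void)
{
#if defined (__mips_loongson_mmi) && defined (__mips_loongson_ext)
  return "mmi+ext";
#elif defined (__mips_loongson_mmi)
  return "mmi";
#elif defined (__mips_loongson_ext)
  return "ext";
#else
  return "none";
#endif
}
```

With the self specs in the hunk, -march=loongson3a would be expected to select the "mmi+ext" branch unless one of the -mno-loongson-* options is given.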
[PATCH v3 1/6, Committed] [MIPS] Split Loongson (MMI) from loongson3a

2018-11-07 Thread Paul Hua
Hi, Matthew:

I committed the patch. Thanks for your review.

On Tue, Oct 16, 2018 at 10:50 AM Paul Hua  wrote:
>
>
From f0e4191439f1dd212b766ea80852aad1919e4887 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Mon, 5 Nov 2018 16:34:50 +0800
Subject: [PATCH 1/6] Add support for loongson mmi instructions.

gcc/
	* config.gcc (extra_headers): Add loongson-mmiintrin.h.
	* config/mips/loongson.md: Move to ...
	* config/mips/loongson-mmi.md: here; Adjustment.
	* config/mips/loongson.h: Move to ...
	State as deprecated. Include loongson-mmiintrin.h for back
	compatibility and warning.
	* config/mips/loongson-mmiintrin.h: ... here.
	* config/mips/mips.c (mips_hard_regno_mode_ok_uncached,
	mips_vector_mode_supported_p, AVAIL_NON_MIPS16): Use
	TARGET_LOONGSON_MMI instead of TARGET_LOONGSON_VECTORS.
	(mips_option_override): Make sure MMI use hard float;
	(mips_shift_truncation_mask, mips_expand_vpc_loongson_even_odd,
	mips_expand_vpc_loongson_pshufh, mips_expand_vpc_loongson_bcast,
	mips_expand_vector_init): Use TARGET_LOONGSON_MMI instead of
	TARGET_LOONGSON_VECTORS.
	* gcc/config/mips/mips.h (TARGET_LOONGSON_VECTORS): Delete.
	(TARGET_CPU_CPP_BUILTINS): Add __mips_loongson_mmi.
	(MIPS_ASE_DSP_SPEC, MIPS_ASE_LOONGSON_MMI_SPEC): New.
	(BASE_DRIVER_SELF_SPECS): march=loongson2e/2f/3a implies
	-mloongson-mmi.
	(SHIFT_COUNT_TRUNCATED): Use TARGET_LOONGSON_MMI instead of
	TARGET_LOONGSON_VECTORS.
	* gcc/config/mips/mips.md (MOVE64, MOVE128): Use
	TARGET_LOONGSON_MMI instead of TARGET_LOONGSON_VECTORS.
	(Loongson MMI patterns): Include loongson-mmi.md instead of
	loongson.md.
	* gcc/config/mips/mips.opt (-mloongson-mmi): New option.
	* gcc/doc/invoke.texi (-mloongson-mmi): Document.

gcc/testsuite/
	* gcc.target/mips/loongson-shift-count-truncated-1.c
	(dg-options): Run under -mloongson-mmi option.
	Include loongson-mmiintrin.h instead of loongson.h.
	* gcc.target/mips/loongson-simd.c: Likewise.
	* gcc.target/mips/mips.exp (mips_option_groups): Add
	-mloongson-mmi option.
	(mips-dg-options): Add mips_option_dependency options "-mips16" vs
	"-mno-loongson-mmi", "-mmicromips" vs "-mno-loongson-mmi",
	"-msoft-float" vs "-mno-loongson-mmi".
	(mips-dg-init): Add -mloongson-mmi option.
	* lib/target-supports.exp: Rename check_mips_loongson_hw_available
	to check_mips_loongson_mmi_hw_available.
	Rename check_effective_target_mips_loongson_runtime to
	check_effective_target_mips_loongson_mmi_runtime.
	(check_effective_target_vect_int): Use mips_loongson_mmi instead
	of mips_loongson when check et-is-effective-target.
	(add_options_for_mips_loongson_mmi): New proc.
	Rename check_effective_target_mips_loongson to
	check_effective_target_mips_loongson_mmi.
	(check_effective_target_vect_shift,
	check_effective_target_whole_vector_shift,
	check_effective_target_vect_no_int_min_max,
	check_effective_target_vect_no_align,
	check_effective_target_vect_short_mult,
	check_vect_support_and_set_flags): Use mips_loongson_mmi instead
	of mips_loongson when check et-is-effective-target.
---
 gcc/config.gcc|   2 +-
 .../mips/{loongson.md => loongson-mmi.md} | 155 ++--
 gcc/config/mips/loongson-mmiintrin.h  | 691 ++
 gcc/config/mips/loongson.h| 669 +
 gcc/config/mips/mips.c|  27 +-
 gcc/config/mips/mips.h|  36 +-
 gcc/config/mips/mips.md   |  16 +-
 gcc/config/mips/mips.opt  |   4 +
 gcc/doc/invoke.texi   |   7 +
 .../mips/loongson-shift-count-truncated-1.c   |   6 +-
 gcc/testsuite/gcc.target/mips/loongson-simd.c |   4 +-
 gcc/testsuite/gcc.target/mips/mips.exp|  10 +
 gcc/testsuite/lib/target-supports.exp |  47 +-
 13 files changed, 877 insertions(+), 797 deletions(-)
 rename gcc/config/mips/{loongson.md => loongson-mmi.md} (88%)
 create mode 100644 gcc/config/mips/loongson-mmiintrin.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 5e5c328ed4c..e275a673836 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -458,7 +458,7 @@ microblaze*-*-*)
 mips*-*-*)
 	cpu_type=mips
 	d_target_objs="mips-d.o"
-	extra_headers="loongson.h msa.h"
+	extra_headers="loongson.h loongson-mmiintrin.h msa.h"
 	extra_objs="frame-header-opt.o"
 	extra_options="${extra_options} g.opt fused-madd.opt mips/mips-tables.opt"
 	;;
diff --git a/gcc/config/mips/loongson.md b/gcc/config/mips/loongson-mmi.md
similarity index 88%
rename from gcc/config/mips/loongson.md
rename to gcc/config/mips/loongson-mmi.md
index 14794d3671f..b126e625ed5 100644
--- a/gcc/config/mips/loongson.md
+++ b/gcc/config/mips/loongson-mmi.md
@@ -1,5 +1,4 @@
-;; Machine description for Loongson-specific patterns, such as
-;; ST Microelectronics Loongson-2E/2F etc.
+;; Machine description for Loongson MultiMedia extensions Instructions (MMI).
 ;; Copyright (C) 2008-2018 Free Software Foundation, Inc.
 ;; Contributed by CodeSourcery.
 ;;
@@ -102,7 +101,7 @@
 (define_expand "mov"
   
