[PATCH v2 C++] Fix PR70182 -- missing "on" in mangling of unresolved operators

2017-01-11 Thread Markus Trippelsdorf
On 2017.01.11 at 08:21 -0500, Nathan Sidwell wrote:
> On 01/11/2017 08:16 AM, Markus Trippelsdorf wrote:
> 
> > --- a/gcc/cp/mangle.c
> > +++ b/gcc/cp/mangle.c
> > @@ -2813,6 +2813,8 @@ write_template_args (tree args)
> >  static void
> >  write_member_name (tree member)
> >  {
> > +  if (abi_version_at_least (11) && IDENTIFIER_OPNAME_P (member))
> > +write_string ("on");
> 
> It looks like you need to:
> 1) add documentation to doc/invoke.texi (-fabi-version)
> 2) add something like:
>   if (abi_warn_or_compat_version_crosses (11))
>   G.need_abi_warning = 1;
> into that if clause.

Thanks for the review. Here is a new patch:

OK for trunk?


libiberty:

PR c++/70182
* cp-demangle.c (d_unqualified_name): Handle "on" for
operator names.
* testsuite/demangle-expected: Add tests.

gcc/cp:

PR c++/70182
* mangle.c (write_member_name): Add "on" for operator names.

gcc:

PR c++/70182
* doc/invoke.texi (fabi-version): Mention mangling fix for
operator names.


diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index e831deb31405..ef9e8fa71221 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2813,6 +2813,12 @@ write_template_args (tree args)
 static void
 write_member_name (tree member)
 {
+  if (abi_version_at_least (11) && IDENTIFIER_OPNAME_P (member))
+{
+  write_string ("on");
+  if (abi_warn_or_compat_version_crosses (11))
+   G.need_abi_warning = 1;
+}
   if (identifier_p (member))
 write_unqualified_id (member);
   else if (DECL_P (member))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 9c77db25e776..75ef5875c0cb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -2250,7 +2250,7 @@ attributes that affect type identity, such as ia32 calling convention
 attributes (e.g. @samp{stdcall}).
 
 Version 11, which first appeared in G++ 7, corrects the mangling of
-sizeof... expressions.  It also implies
+sizeof... expressions and operator names.  It also implies
 @option{-fnew-inheriting-ctors}.
 
 See also @option{-Wabi}.
diff --git a/gcc/testsuite/g++.dg/abi/mangle13.C b/gcc/testsuite/g++.dg/abi/mangle13.C
index 716c4c36f410..c8822a34039c 100644
--- a/gcc/testsuite/g++.dg/abi/mangle13.C
+++ b/gcc/testsuite/g++.dg/abi/mangle13.C
@@ -1,4 +1,4 @@
-// { dg-options "-fabi-version=0" }
+// { dg-options "-fabi-version=10" }
 
 struct A {
   template <typename T> int f ();
diff --git a/gcc/testsuite/g++.dg/abi/mangle37.C b/gcc/testsuite/g++.dg/abi/mangle37.C
index 691566b384ba..4dd87e84c108 100644
--- a/gcc/testsuite/g++.dg/abi/mangle37.C
+++ b/gcc/testsuite/g++.dg/abi/mangle37.C
@@ -1,5 +1,6 @@
 // Testcase for mangling of expressions involving operator names.
 // { dg-do compile { target c++11 } }
+// { dg-options "-fabi-version=10" }
 // { dg-final { scan-assembler "_Z1fI1AEDTclonplfp_fp_EET_" } }
 // { dg-final { scan-assembler "_Z1gI1AEDTclonplIT_Efp_fp_EES1_" } }
 // { dg-final { scan-assembler "_Z1hI1AEDTcldtfp_miEET_" } }
diff --git a/gcc/testsuite/g++.dg/abi/pr70182.C b/gcc/testsuite/g++.dg/abi/pr70182.C
new file mode 100644
index ..d299362910c1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/pr70182.C
@@ -0,0 +1,28 @@
+// { dg-options "-fabi-version=0" }
+
+struct A {
+  template <typename T> int f ();
+  int operator+();
+  operator int ();
+  template <typename T>
+  int operator-();
+};
+
+typedef int (A::*P)();
+
+template <P> struct S {};
+
+template <typename T> void g (S<&T::template f<int> >) {}
+template <typename T> void g (S<&T::operator+ >) {}
+template <typename T> void g (S<&T::operator int>) {}
+template <typename T> void g (S<&T::template operator- <double> >) {}
+
+template void g<A> (S<&A::f<int> >);
+template void g<A> (S<&A::operator+>);
+template void g<A> (S<&A::operator int>);
+template void g<A> (S<&A::operator-<double> >);
+
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_1fIiEEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_onplEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_oncviEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_onmiIdEEE } }
diff --git a/gcc/testsuite/g++.dg/dfp/mangle-1.C b/gcc/testsuite/g++.dg/dfp/mangle-1.C
index 455d3e4c0ef6..ee9644b27a53 100644
--- a/gcc/testsuite/g++.dg/dfp/mangle-1.C
+++ b/gcc/testsuite/g++.dg/dfp/mangle-1.C
@@ -1,4 +1,5 @@
 // { dg-do compile }
+// { dg-options "-fabi-version=10" }
 
 // Mangling of classes from std::decimal are special-cased.
 // Derived from g++.dg/abi/mangle13.C.
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index d84929eca20d..f0dbf9381c6b 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1594,6 +1594,8 @@ d_unqualified_name (struct d_info *di)
 ret = d_source_name (di);
   else if (IS_LOWER (peek))
 {
+  if (peek == 'o' && d_peek_next_char (di) == 'n')
+   d_advance (di, 2);
   ret = d_operator_name (di);
   if (ret != NULL && ret->type == DEMANGLE_COMPONENT_OPERATOR)
{
diff --git a/libiberty/testsuite/demangle-expected b/libiberty/testsuite/demangle-expected
index 07e258fe58b3..c1cfa1545eca 100644
--- 
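
For readers following the thread, the effect of the mangle.c hunk can be sketched in isolation. This is only an illustration under stated assumptions, not the GCC code: `mangle_member_name`, its parameters, and the explicit `abi_version` argument are hypothetical stand-ins for write_member_name, IDENTIFIER_OPNAME_P, and -fabi-version (with 0 taken to mean the latest version, as in the pr70182.C test options).

```cpp
#include <string>

// Hypothetical stand-in for the mangler's handling of an unresolved member
// name.  ENCODED is the already-encoded name (e.g. "pl" for operator+),
// IS_OPERATOR plays the role of IDENTIFIER_OPNAME_P, and ABI_VERSION plays
// the role of -fabi-version (0 is treated as "latest").
std::string mangle_member_name(const std::string& encoded, bool is_operator,
                               int abi_version) {
  const bool at_least_11 = (abi_version == 0 || abi_version >= 11);
  std::string out;
  // From ABI version 11 on, operator names get the "on" prefix.
  if (at_least_11 && is_operator)
    out += "on";
  out += encoded;
  return out;
}
```

With these assumptions, `mangle_member_name("pl", true, 11)` yields "onpl", which is the fragment the new test expects inside `_Z1gI1AEv1SIXadsrT_onplEE`, while version 10 keeps the old "pl".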

Re: [PATCH C++] Fix PR77489 -- mangling of discriminator >= 10

2017-01-11 Thread Markus Trippelsdorf
On 2017.01.11 at 13:03 +0100, Jakub Jelinek wrote:
> On Wed, Jan 11, 2017 at 12:48:29PM +0100, Markus Trippelsdorf wrote:
> > @@ -1965,7 +1966,11 @@ write_discriminator (const int discriminator)
> >if (discriminator > 0)
> >  {
> >write_char ('_');
> > +  if (abi_version_at_least(11) && discriminator - 1 >= 10)
> > +   write_char ('_');
> >write_unsigned_number (discriminator - 1);
> > +  if (abi_version_at_least(11) && discriminator - 1 >= 10)
> > +   write_char ('_');
> 
> Formatting nits, there should be space before (11).
> 
> > +// { dg-final { scan-assembler "_ZZ3foovE8localVar__10_" } }
> > +// { dg-final { scan-assembler "_ZZ3foovE8localVar__11_" } }
> 
> Would be nice to also
> // { dg-final { scan-assembler "_ZZ3foovE8localVar_9" } }
> 
> Otherwise, I defer to Jason (primarily whether this doesn't need
> ABI version 12).

Thanks for the review. I will fix these issues.
Jason said on IRC that he is fine with ABI version 11.

Ok for trunk?

-- 
Markus
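
The quoted write_discriminator hunk can be sketched standalone as follows. This is a simplified illustration, not the GCC function: `mangle_discriminator` and its explicit `abi_version` parameter are hypothetical, and the string return value replaces the mangler's write_char/write_unsigned_number calls.

```cpp
#include <string>

// Hypothetical standalone version of the discriminator encoding discussed
// above.  DISCRIMINATOR is the entity's ordinal (0 means "no discriminator");
// ABI_VERSION stands in for -fabi-version, with 0 treated as "latest".
std::string mangle_discriminator(int discriminator, int abi_version) {
  std::string out;
  if (discriminator > 0) {
    const int n = discriminator - 1;
    // From ABI version 11 on, values >= 10 are wrapped as "__<n>_".
    const bool wrap = (abi_version == 0 || abi_version >= 11) && n >= 10;
    out += '_';
    if (wrap)
      out += '_';
    out += std::to_string(n);
    if (wrap)
      out += '_';
  }
  return out;
}
```

Under these assumptions, discriminators 10, 11 and 12 produce "_9", "__10_" and "__11_", matching the suffixes in the scan-assembler patterns quoted in the review.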


Re: [RFA][PATCH 3/4] Trim mem* calls in DSE

2017-01-11 Thread Jeff Law

On 01/04/2017 07:04 AM, Richard Biener wrote:
> Didn't see a re-post of this one so reviewing the old.

Didn't figure mem* trimming was suitable for gcc-7 as I couldn't justify
it as a bugfix, so I didn't ping it.

I don't think it changed materially.  All your comments are still
applicable to the version in my tree.

* tree-ssa-dse.c (need_ssa_update): New file scoped boolean.
(decrement_count): New function.
(increment_start_addr, trim_memstar_call): Likewise.
(trim_partially_dead_store): Call trim_memstar_call.
(pass_dse::execute): Initialize need_ssa_update.  If set, then
return TODO_ssa_update.

* gcc.dg/tree-ssa/ssa-dse-25.c: New test.

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 1482c7f..b21b9b5 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -79,6 +80,10 @@ static bitmap need_eh_cleanup;
It is always safe to return FALSE.  But typically better optimziation
can be achieved by analyzing more statements.  */

+/* If trimming stores requires insertion of new statements, then we
+   will need an SSA update.  */
+static bool need_ssa_update;
+


> huh?  You set this to true after inserting a POINTER_PLUS_EXPR, I don't see
> how you need an SSA update for this.

I'll go back and re-investigate.  I could easily have goof'd the
in-place update and be papering over that with the ssa update.

 static bool
 initialize_ao_ref_for_dse (gimple *stmt, ao_ref *write)
 {
@@ -309,6 +314,113 @@ trim_constructor_store (bitmap orig, bitmap live, gimple *stmt)
 }
 }

+/* STMT is a memcpy, memmove or memset.  Decrement the number of bytes
+   copied/set by DECREMENT.  */
+static void
+decrement_count (gimple *stmt, int decrement)
+{
+  tree *countp = gimple_call_arg_ptr (stmt, 2);
+  gcc_assert (TREE_CODE (*countp) == INTEGER_CST);
+  tree x = fold_build2 (MINUS_EXPR, TREE_TYPE (*countp), *countp,
+   build_int_cst (TREE_TYPE (*countp), decrement));
+  *countp = x;


> thanks to wide-int the following should work
>
>    *countp = wide_int_to_tree (TREE_TYPE (*countp), *countp - decrement);

Sweet.  I like that much better.

> (if not please use int_const_binop rather than fold_build2 here and
> below as well)

+}
+
+static void
+increment_start_addr (gimple *stmt ATTRIBUTE_UNUSED, tree *where, int increment)
+{
+  /* If the address wasn't initially a MEM_REF, make it a MEM_REF.  */
+  if (TREE_CODE (*where) == ADDR_EXPR
+  && TREE_CODE (TREE_OPERAND (*where, 0)) != MEM_REF)
+{
+  tree t = TREE_OPERAND (*where, 0);
+  t = build_ref_for_offset (EXPR_LOCATION (t), t,
+   increment * BITS_PER_UNIT, false,
+   ptr_type_node, NULL, false);


> please don't use build_ref_for_offset for this.  Simply only handle the SSA_NAME
> case here and below ...

I think build_ref_for_offset was what spurred the tree-sra.h inclusion.
IIRC I was seeing a goodly number of cases where the argument wasn't a
MEM_REF or SSA_NAME at this point.  But I'll double-check.

If we don't need build_ref_for_offset, do you still want me to pull its
prototype into the new tree-sra.h, or just leave it as-is?

+  *where = build_fold_addr_expr (t);
+  return;
+}
+  else if (TREE_CODE (*where) == SSA_NAME)
+{
+  tree tem = make_ssa_name (TREE_TYPE (*where));
+  gassign *newop
+= gimple_build_assign (tem, POINTER_PLUS_EXPR, *where,
+  build_int_cst (sizetype, increment));
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  gsi_insert_before (&gsi, newop, GSI_SAME_STMT);
+  need_ssa_update = true;
+  *where = tem;
+  update_stmt (gsi_stmt (gsi));
+  return;
+}
+
+  /* We can just adjust the offset in the MEM_REF expression.  */
+  tree x1 = TREE_OPERAND (TREE_OPERAND (*where, 0), 1);
+  tree x = fold_build2 (PLUS_EXPR, TREE_TYPE (x1), x1,
+   build_int_cst (TREE_TYPE (x1), increment));
+  TREE_OPERAND (TREE_OPERAND (*where, 0), 1) = x;

> ...
>
> re-fold the thing as MEM_REF which will do all the magic for you:
>
>   *where = build_fold_addr_expr (fold_build2 (MEM_REF, char_type_node,
> *where, build_int_cst (ptr_type_node, increment)));
>
> that handles [] and  just fine and avoids adding magic here.

And that () is likely what I was looking to handle with the
first if clause above where I called build_ref_for_offset.

> Otherwise looks ok.  I think I'd like to see this in GCC 7 given it's
> so much similar to the constructor pruning.

OK.  I'll sort through the issues noted above and get this one reposted
as well.


jeff
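
As a rough picture of what trimming a mem* call amounts to, here is a minimal sketch. It is not the tree-ssa-dse.c code: `Trim`, `MemCall`, `compute_trims` and `trim_call` are hypothetical names, the live-byte set is a plain `std::vector<bool>` rather than a GCC bitmap, and any alignment or rounding of the trims is ignored.

```cpp
#include <cstddef>
#include <vector>

// Number of dead bytes at the head and tail of a store.
struct Trim {
  std::size_t head;
  std::size_t tail;
};

// Simplified view of a memset/memcpy/memmove call: where it starts and how
// many bytes it touches.
struct MemCall {
  std::size_t offset;
  std::size_t count;
};

// LIVE[i] is true if byte i of the store is still needed by a later read.
Trim compute_trims(const std::vector<bool>& live) {
  std::size_t head = 0;
  while (head < live.size() && !live[head])
    ++head;
  std::size_t tail = 0;
  while (tail < live.size() - head && !live[live.size() - 1 - tail])
    ++tail;
  return {head, tail};
}

// Advance the start (the increment_start_addr step) and shrink the length
// (the decrement_count step) by the dead head and tail.
MemCall trim_call(MemCall call, Trim t) {
  call.offset += t.head;
  call.count -= t.head + t.tail;
  return call;
}
```

For a 5-byte store whose first two bytes and last byte are dead, this gives a head trim of 2 and a tail trim of 1, so the call shrinks to 2 bytes starting at offset 2.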



Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE - v3

2017-01-11 Thread Jeff Law

On 01/04/2017 06:23 AM, Richard Biener wrote:

On Wed, Jan 4, 2017 at 2:22 PM, Richard Biener
 wrote:

On Thu, Dec 22, 2016 at 7:26 AM, Jeff Law  wrote:

This is the first of the 4 part patchkit to address deficiencies in our DSE
implementation.

This patch addresses the P2 regression 33562 which has been a low priority
regression since gcc-4.3.  To summarize, DSE no longer has the ability to
detect an aggregate store as dead if subsequent stores are done in a
piecemeal fashion.

I originally tackled this by changing how we lower complex objects. That was
sufficient to address 33562, but was reasonably rejected.

This version attacks the problem by improving DSE to track stores to memory
at a byte level.  That allows us to determine if a series of stores
completely covers an earlier store (thus making the earlier store dead).

A useful side effect of this is we can detect when parts of a store are dead
and potentially rewrite the store.  This patch implements that for complex
object initializations.  While not strictly part of 33562, it's so closely
related that I felt it belongs as part of this patch.

This originally limited the size of the tracked memory space to 64 bytes.  I
bumped the limit after working through the CONSTRUCTOR and mem* trimming
patches.  The 256 byte limit is still fairly arbitrary and I wouldn't lose
sleep if we throttled back to 64 or 128 bytes.

Later patches in the kit will build upon this patch.  So if pieces look like
skeleton code, that's because it is.

The changes since the V2 patch are:

1. Using sbitmaps rather than bitmaps.
2. Returning a tri-state from dse_classify_store (renamed from
dse_possible_dead_store_p)
3. More efficient trim computation
4. Moving trimming code out of dse_classify_store
5. Refactoring code to delete dead calls/assignments
6. dse_optimize_stmt moves into the dse_dom_walker class

Not surprisingly, this patch has most of the changes based on prior feedback
as it includes the raw infrastructure.

Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?


New functions in sbitmap.c lack function comments.

bitmap_count_bits fails to guard against GCC_VERSION >= 3400 (the version
is not important, but non-GCC host compilers are).  See bitmap.c for a
fallback.

Both bitmap_clear_range and bitmap_set_range look rather inefficient...
(it's not likely anybody will clean this up after you)

I'd say split out the sbitmap.[ch] changes.

+DEFPARAM(PARAM_DSE_MAX_OBJECT_SIZE,
+"dse-max-object-size",
+"Maximum size (in bytes) of objects tracked by dead store elimination.",
+256, 0, 0)

the docs suggest that DSE doesn't handle larger stores but it does (just in
the original limited way).  Maybe "tracked bytewise" is better.


Oh, and new --params need documenting in invoke.texi.

Fixed.

jeff
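
The byte-level tracking idea described in the patch summary can be illustrated with a small sketch. This is an assumption-laden model, not the pass itself: `Store`, `store_is_dead` and `kMaxTrackedBytes` are hypothetical names, stores are reduced to (offset, size) pairs on a single object, and intervening reads are not modelled at all.

```cpp
#include <cstddef>
#include <vector>

// A store described only by its byte offset and size within one object.
struct Store {
  std::size_t offset;
  std::size_t size;
};

// Mirrors the 256-byte default mentioned for dse-max-object-size.
constexpr std::size_t kMaxTrackedBytes = 256;

// Returns true if the LATER stores together overwrite every byte written by
// EARLIER, which would make EARLIER dead (assuming no read in between).
bool store_is_dead(const Store& earlier, const std::vector<Store>& later) {
  if (earlier.size > kMaxTrackedBytes)
    return false;  // too large to track bytewise; stay conservative
  std::vector<bool> live(earlier.size, true);
  for (const Store& s : later) {
    for (std::size_t i = 0; i < s.size; ++i) {
      const std::size_t byte = s.offset + i;
      if (byte >= earlier.offset && byte - earlier.offset < earlier.size)
        live[byte - earlier.offset] = false;
    }
  }
  for (bool b : live)
    if (b)
      return false;
  return true;
}
```

A 16-byte aggregate store followed by two 8-byte stores at offsets 0 and 8 is reported dead; with only the first 8-byte store, half of it is still live and a trim would be the most one could hope for.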



Re: [RFA] [PR tree-optimization/33562] [PATCH 1/4] Byte tracking in DSE - v3

2017-01-11 Thread Jeff Law

On 01/04/2017 06:22 AM, Richard Biener wrote:



Bootstrapped and regression tested on x86_64-linux-gnu.  OK for the trunk?


New functions in sbitmap.c lack function comments.

Bah.  Sophomoric on my part.  Fixed.



> bitmap_count_bits fails to guard against GCC_VERSION >= 3400 (the version
> is not important, but non-GCC host compilers are).  See bitmap.c for a
> fallback.

Mistake on my part.  I keep thinking we support starting the bootstrap
process with the most recently released GCC, but we support 3.4 as well
as other C++98/C++03 compilers.  Fixed in the next update (and tested by
forcing the fallback method).




> Both bitmap_clear_range and bitmap_set_range look rather inefficient...
> (it's not likely anybody will clean this up after you)

They were, but not anymore.  Now they build a mask to deal with any
partial clearing/setting in the first word, then a single memset for any
whole words in the middle, then another masking operation on residuals
in the last word.  Verified behavior by keeping two bitmaps, one with
the old slow approach and one with the faster implementation and 
checking for equality.  Obviously those verification bits won't be in 
the final patch.
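
The mask / memset / mask scheme described above can be sketched as follows. This is not the sbitmap.c code: `Word`, `kWordBits`, `set_range` and `set_range_slow` are hypothetical names, the bitmap is a plain `std::vector` of 64-bit words, and the caller is assumed to pass a range that fits inside it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using Word = std::uint64_t;
constexpr std::size_t kWordBits = 64;

// Set COUNT bits starting at bit START: mask the first word, memset whole
// words in the middle, mask the last word.
void set_range(std::vector<Word>& words, std::size_t start, std::size_t count) {
  if (count == 0)
    return;
  const std::size_t last_bit = start + count - 1;
  const std::size_t first_word = start / kWordBits;
  const std::size_t last_word = last_bit / kWordBits;
  const Word first_mask = ~Word{0} << (start % kWordBits);
  const Word last_mask = ~Word{0} >> (kWordBits - 1 - last_bit % kWordBits);
  if (first_word == last_word) {
    words[first_word] |= first_mask & last_mask;
    return;
  }
  words[first_word] |= first_mask;
  if (last_word > first_word + 1)
    std::memset(&words[first_word + 1], 0xff,
                (last_word - first_word - 1) * sizeof(Word));
  words[last_word] |= last_mask;
}

// Bit-at-a-time reference used only to cross-check the fast version.
void set_range_slow(std::vector<Word>& words, std::size_t start,
                    std::size_t count) {
  for (std::size_t i = start; i < start + count; ++i)
    words[i / kWordBits] |= Word{1} << (i % kWordBits);
}
```

Comparing the two functions over many (start, count) pairs is the same kind of check as the two-bitmap verification mentioned above.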



I'd say split out the sbitmap.[ch] changes.

Sure.  That's easy enough.



+DEFPARAM(PARAM_DSE_MAX_OBJECT_SIZE,
+"dse-max-object-size",
+"Maximum size (in bytes) of objects tracked by dead store elimination.",
+256, 0, 0)

the docs suggest that DSE doesn't handle larger stores but it does (just in
the original limited way).  Maybe "tracked bytewise" is better.

Agreed and fixed.




+static bool
+valid_ao_ref_for_dse (ao_ref *ref)
+{
+  return (ao_ref_base (ref)
+ && ref->max_size != -1
+ && (ref->offset % BITS_PER_UNIT) == 0
+ && (ref->size % BITS_PER_UNIT) == 0
+ && (ref->size / BITS_PER_UNIT) > 0);

I think the last test is better written as ref->size != -1.

Seems reasonable.  Fixed.




> Btw, seeing you discount non-byte size/offset stores this somehow asks
> for store-merging being done before the last DSE (it currently runs after).
> Sth to keep in mind for GCC 8.

Yea, probably.  Of course it may also be the case that DSE enables store
merging.  Worth some experimentation.





+/* Delete a dead store STMT, which is mem* call of some kind.  */

call STMT

Fixed.



+static void
+delete_dead_call (gimple *stmt)
+{
+
excess vertical space

Likewise.


..
+  if (lhs)
+{
+  tree ptr = gimple_call_arg (stmt, 0);
+  gimple *new_stmt = gimple_build_assign (lhs, ptr);
+  unlink_stmt_vdef (stmt);
+  if (gsi_replace (, new_stmt, true))
+bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);

  release_ssa_name (gimple_vdef (stmt));


+  { m_live_bytes = sbitmap_alloc (PARAM_VALUE
(PARAM_DSE_MAX_OBJECT_SIZE));m_byte_tracking_enabled = false; }

formatting.

Yea. Fixed.



The DSE parts look good to me with the nits above fixed.  Just re-spin
the sbitmap.[ch] part please.

Will repost the sbitmap.c bits after retesting the series.

jeff




[PATCH/AARCH64] Add scheduler for Thunderx2t99

2017-01-11 Thread Hurugalawadi, Naveen
Hi James,

The scheduling patch for vulcan was posted at the following link:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01205.html

We have been working on the patch and have addressed the comments for thunderx2t99.

>> I tried lowering the repeat expressions as so:
Done.

>>split off the AdvSIMD/FP model from the main pipeline
Done.

>> A change like wiring the vulcan_f0 and vulcan_f1 reservations
>> to be cpu_units of a new define_automaton "vulcan_advsimd"
Done.

>> simplifying some of the remaining large expressions
>> (vulcan_asimd_load*_mult, vulcan_asimd_load*_elts) can bring the size down
I did not fully understand this comment.
Could you please explain which simplification you have in mind?

Please find attached the patch, modified as per your suggestions and comments.
Please review it and let us know if it is okay.

Thanks,
Naveen

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index a7a4b33..4d39673 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -75,7 +75,7 @@ AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, xge
 
 /* Broadcom ('B') cores. */
 AARCH64_CORE("thunderx2t99",  thunderx2t99, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
-AARCH64_CORE("vulcan",  vulcan, cortexa57, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
+AARCH64_CORE("vulcan",  vulcan, vulcan, 8_1A,  AARCH64_FL_FOR_ARCH8_1 | AARCH64_FL_CRYPTO, thunderx2t99, 0x42, 0x516, -1)
 
 /* V8 big.LITTLE implementations.  */
 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index bde4231..063559c 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -220,6 +220,7 @@
 (include "../arm/exynos-m1.md")
 (include "thunderx.md")
 (include "../arm/xgene1.md")
+(include "thunderx2t99.md")
 
 ;; ---
 ;; Jumps and other miscellaneous insns
diff --git a/gcc/config/aarch64/thunderx2t99.md b/gcc/config/aarch64/thunderx2t99.md
new file mode 100644
index 000..00d40f8
--- /dev/null
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -0,0 +1,513 @@
+;; Cavium ThunderX 2 CN99xx pipeline description
+;; Copyright (C) 2016-2017 Free Software Foundation, Inc.
+;;
+;; Contributed by Cavium, Broadcom and Mentor Embedded.
+
+;; This file is part of GCC.
+
+;; GCC is free software; you can redistribute it and/or modify
+;; it under the terms of the GNU General Public License as published by
+;; the Free Software Foundation; either version 3, or (at your option)
+;; any later version.
+
+;; GCC is distributed in the hope that it will be useful,
+;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+;; GNU General Public License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING3.  If not see
+;; <http://www.gnu.org/licenses/>.
+
+(define_automaton "thunderx2t99, thunderx2t99_advsimd, thunderx2t99_ldst")
+(define_automaton "thunderx2t99_mult")
+
+(define_cpu_unit "thunderx2t99_i0" "thunderx2t99")
+(define_cpu_unit "thunderx2t99_i1" "thunderx2t99")
+(define_cpu_unit "thunderx2t99_i2" "thunderx2t99")
+
+(define_cpu_unit "thunderx2t99_ls0" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_sd" "thunderx2t99_ldst")
+
+; Pseudo-units for multiply pipeline.
+
+(define_cpu_unit "thunderx2t99_i1m1" "thunderx2t99_mult")
+(define_cpu_unit "thunderx2t99_i1m2" "thunderx2t99_mult")
+(define_cpu_unit "thunderx2t99_i1m3" "thunderx2t99_mult")
+
+; Pseudo-units for load delay (assuming dcache hit).
+
+(define_cpu_unit "thunderx2t99_ls0d1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls0d2" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls0d3" "thunderx2t99_ldst")
+
+(define_cpu_unit "thunderx2t99_ls1d1" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1d2" "thunderx2t99_ldst")
+(define_cpu_unit "thunderx2t99_ls1d3" "thunderx2t99_ldst")
+
+; Make some aliases for f0/f1.
+(define_cpu_unit "thunderx2t99_f0" "thunderx2t99_advsimd")
+(define_cpu_unit "thunderx2t99_f1" "thunderx2t99_advsimd")
+
+(define_reservation "thunderx2t99_i012" "thunderx2t99_i0|thunderx2t99_i1|thunderx2t99_i2")
+(define_reservation "thunderx2t99_ls01" "thunderx2t99_ls0|thunderx2t99_ls1")
+(define_reservation "thunderx2t99_f01" "thunderx2t99_f0|thunderx2t99_f1")
+
+(define_reservation "thunderx2t99_ls_both" "thunderx2t99_ls0+thunderx2t99_ls1")
+
+; A load with delay in the ls0/ls1 pipes.
+(define_reservation "thunderx2t99_l0delay" "thunderx2t99_ls0,\
+  thunderx2t99_ls0d1,thunderx2t99_ls0d2,\
+  thunderx2t99_ls0d3")
+(define_reservation "thunderx2t99_l1delay" "thunderx2t99_ls1,\
+  thunderx2t99_ls1d1,thunderx2t99_ls1d2,\
+  thunderx2t99_ls1d3")

[PATCH 1/6] RISC-V Port: gcc/config/riscv/riscv.c

2017-01-11 Thread Palmer Dabbelt
This is split from the rest of the gcc submission so I can fit this
patch on the mailing list's 200KiB limit.
---
 gcc/config/riscv/riscv.c | 4157 ++
 1 file changed, 4157 insertions(+)
 create mode 100644 gcc/config/riscv/riscv.c

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
new file mode 100644
index 000..f4911d3
--- /dev/null
+++ b/gcc/config/riscv/riscv.c
@@ -0,0 +1,4157 @@
+/* Subroutines used for code generation for RISC-V.
+   Copyright (C) 2011-2017 Free Software Foundation, Inc.
+   Contributed by Andrew Waterman (and...@sifive.com).
+   Based on MIPS target for GNU compiler.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "rtl.h"
+#include "regs.h"
+#include "hard-reg-set.h"
+#include "insn-config.h"
+#include "conditions.h"
+#include "insn-attr.h"
+#include "recog.h"
+#include "output.h"
+#include "hash-set.h"
+#include "machmode.h"
+#include "vec.h"
+#include "double-int.h"
+#include "input.h"
+#include "alias.h"
+#include "symtab.h"
+#include "wide-int.h"
+#include "inchash.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "varasm.h"
+#include "stringpool.h"
+#include "stor-layout.h"
+#include "calls.h"
+#include "function.h"
+#include "hashtab.h"
+#include "flags.h"
+#include "statistics.h"
+#include "real.h"
+#include "fixed-value.h"
+#include "expmed.h"
+#include "dojump.h"
+#include "explow.h"
+#include "memmodel.h"
+#include "emit-rtl.h"
+#include "stmt.h"
+#include "expr.h"
+#include "insn-codes.h"
+#include "optabs.h"
+#include "libfuncs.h"
+#include "reload.h"
+#include "tm_p.h"
+#include "ggc.h"
+#include "gstab.h"
+#include "hash-table.h"
+#include "debug.h"
+#include "target.h"
+#include "target-def.h"
+#include "common/common-target.h"
+#include "langhooks.h"
+#include "dominance.h"
+#include "cfg.h"
+#include "cfgrtl.h"
+#include "cfganal.h"
+#include "lcm.h"
+#include "cfgbuild.h"
+#include "cfgcleanup.h"
+#include "predict.h"
+#include "basic-block.h"
+#include "bitmap.h"
+#include "regset.h"
+#include "df.h"
+#include "sched-int.h"
+#include "tree-ssa-alias.h"
+#include "internal-fn.h"
+#include "gimple-fold.h"
+#include "tree-eh.h"
+#include "gimple-expr.h"
+#include "is-a.h"
+#include "gimple.h"
+#include "gimplify.h"
+#include "diagnostic.h"
+#include "target-globals.h"
+#include "opts.h"
+#include "tree-pass.h"
+#include "context.h"
+#include "hash-map.h"
+#include "plugin-api.h"
+#include "ipa-ref.h"
+#include "cgraph.h"
+#include "builtins.h"
+#include "rtl-iter.h"
+#include 
+
+/* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF.  */
+#define UNSPEC_ADDRESS_P(X)\
+  (GET_CODE (X) == UNSPEC  \
+   && XINT (X, 1) >= UNSPEC_ADDRESS_FIRST  \
+   && XINT (X, 1) < UNSPEC_ADDRESS_FIRST + NUM_SYMBOL_TYPES)
+
+/* Extract the symbol or label from UNSPEC wrapper X.  */
+#define UNSPEC_ADDRESS(X) \
+  XVECEXP (X, 0, 0)
+
+/* Extract the symbol type from UNSPEC wrapper X.  */
+#define UNSPEC_ADDRESS_TYPE(X) \
+  ((enum riscv_symbol_type) (XINT (X, 1) - UNSPEC_ADDRESS_FIRST))
+
+/* True if bit BIT is set in VALUE.  */
+#define BITSET_P(VALUE, BIT) (((VALUE) & (1ULL << (BIT))) != 0)
+
+/* Classifies an address.
+
+   ADDRESS_REG
+   A natural register + offset address.  The register satisfies
+   riscv_valid_base_register_p and the offset is a const_arith_operand.
+
+   ADDRESS_LO_SUM
+   A LO_SUM rtx.  The first operand is a valid base register and
+   the second operand is a symbolic address.
+
+   ADDRESS_CONST_INT
+   A signed 16-bit constant address.
+
+   ADDRESS_SYMBOLIC:
+   A constant symbolic address.  */
+enum riscv_address_type {
+  ADDRESS_REG,
+  ADDRESS_LO_SUM,
+  ADDRESS_CONST_INT,
+  ADDRESS_SYMBOLIC
+};
+
+/* Information about a function's frame layout.  */
+struct GTY(())  riscv_frame_info {
+  /* The size of the frame in bytes.  */
+  HOST_WIDE_INT total_size;
+
+  /* Bit X is set if the function saves or restores GPR X.  */
+  unsigned int mask;
+
+  /* Likewise FPR X.  */
+  unsigned int fmask;
+
+  /* How much the GPR save/restore routines adjust sp (or 0 if unused).  */
+  unsigned save_libcall_adjustment;
+
+  /* Offsets of fixed-point and floating-point save areas from frame bottom 

[PATCH 4/6] RISC-V Port: libsanitizer

2017-01-11 Thread Palmer Dabbelt
---
 libsanitizer/sanitizer_common/sanitizer_linux.cc | 5 +
 libsanitizer/sanitizer_common/sanitizer_platform.h   | 4 ++--
 libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc | 2 +-
 libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h  | 7 +--
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_linux.cc b/libsanitizer/sanitizer_common/sanitizer_linux.cc
index 806fcd5..4de9d16 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_linux.cc
@@ -1369,6 +1369,11 @@ void GetPcSpBp(void *context, uptr *pc, uptr *sp, uptr *bp) {
   *pc = ucontext->uc_mcontext.pc;
   *bp = ucontext->uc_mcontext.gregs[30];
   *sp = ucontext->uc_mcontext.gregs[29];
+#elif defined(__riscv)
+  ucontext_t *ucontext = (ucontext_t*)context;
+  *pc = ucontext->uc_mcontext.gregs[REG_PC];
+  *bp = ucontext->uc_mcontext.gregs[REG_S0];
+  *sp = ucontext->uc_mcontext.gregs[REG_SP];
 #elif defined(__s390__)
   ucontext_t *ucontext = (ucontext_t*)context;
 # if defined(__s390x__)
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform.h b/libsanitizer/sanitizer_common/sanitizer_platform.h
index 428709d..5519bc6 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform.h
@@ -188,9 +188,9 @@
 
 // The AArch64 linux port uses the canonical syscall set as mandated by
 // the upstream linux community for all new ports. Other ports may still
-// use legacy syscalls.
+// use legacy syscalls.  The RISC-V port also does this.
 #ifndef SANITIZER_USES_CANONICAL_LINUX_SYSCALLS
-# if defined(__aarch64__) && SANITIZER_LINUX
+# if (defined(__aarch64__) || defined(__riscv)) && SANITIZER_LINUX
 # define SANITIZER_USES_CANONICAL_LINUX_SYSCALLS 1
 # else
 # define SANITIZER_USES_CANONICAL_LINUX_SYSCALLS 0
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc b/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
index 23a0148..11a3850 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_linux.cc
@@ -64,7 +64,7 @@ namespace __sanitizer {
 
 #if !defined(__powerpc64__) && !defined(__x86_64__) && !defined(__aarch64__)\
 && !defined(__mips__) && !defined(__s390__)\
-&& !defined(__sparc__)
+&& !defined(__sparc__) && !defined(__riscv)
 COMPILER_CHECK(struct___old_kernel_stat_sz == sizeof(struct __old_kernel_stat));
 #endif
 
diff --git a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
index c139322..dddcef2 100644
--- a/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
+++ b/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h
@@ -83,6 +83,9 @@ namespace __sanitizer {
  SANITIZER_ANDROID ? FIRST_32_SECOND_64(104, 128) :
  FIRST_32_SECOND_64(144, 216);
   const unsigned struct_kernel_stat64_sz = 104;
+#elif defined(__riscv)
+  const unsigned struct_kernel_stat_sz = 128;
+  const unsigned struct_kernel_stat64_sz = 128;
 #elif defined(__s390__) && !defined(__s390x__)
   const unsigned struct_kernel_stat_sz = 64;
   const unsigned struct_kernel_stat64_sz = 104;
@@ -117,7 +120,7 @@ namespace __sanitizer {
 
 #if SANITIZER_LINUX || SANITIZER_FREEBSD
 
-#if defined(__powerpc64__) || defined(__s390__)
+#if defined(__powerpc64__) || defined(__riscv) || defined(__s390__)
   const unsigned struct___old_kernel_stat_sz = 0;
 #elif !defined(__sparc__)
   const unsigned struct___old_kernel_stat_sz = 32;
@@ -540,7 +543,7 @@ namespace __sanitizer {
   typedef long __sanitizer___kernel_off_t;
 #endif
 
-#if defined(__powerpc__) || defined(__mips__)
+#if defined(__powerpc__) || defined(__mips__) || defined(__riscv)
   typedef unsigned int __sanitizer___kernel_old_uid_t;
   typedef unsigned int __sanitizer___kernel_old_gid_t;
 #else
-- 
2.10.2



[PATCH 3/6] RISC-V Port: libgcc

2017-01-11 Thread Palmer Dabbelt
From: Andrew Waterman 

---
 libgcc/config.host |  12 ++
 libgcc/config/riscv/atomic.c   | 111 +
 libgcc/config/riscv/crti.S |   1 +
 libgcc/config/riscv/crtn.S |   1 +
 libgcc/config/riscv/div.S  | 146 ++
 libgcc/config/riscv/linux-unwind.h |  89 ++
 libgcc/config/riscv/muldi3.S   |  46 +++
 libgcc/config/riscv/multi3.S   |  81 
 libgcc/config/riscv/save-restore.S | 245 +
 libgcc/config/riscv/sfp-machine.h  | 156 +++
 libgcc/config/riscv/t-elf  |   6 +
 libgcc/config/riscv/t-elf32|   1 +
 libgcc/config/riscv/t-elf64|   1 +
 libgcc/config/riscv/t-softfp32 |   3 +
 libgcc/config/riscv/t-softfp64 |   4 +
 15 files changed, 903 insertions(+)
 create mode 100644 libgcc/config/riscv/atomic.c
 create mode 100644 libgcc/config/riscv/crti.S
 create mode 100644 libgcc/config/riscv/crtn.S
 create mode 100644 libgcc/config/riscv/div.S
 create mode 100644 libgcc/config/riscv/linux-unwind.h
 create mode 100644 libgcc/config/riscv/muldi3.S
 create mode 100644 libgcc/config/riscv/multi3.S
 create mode 100644 libgcc/config/riscv/save-restore.S
 create mode 100644 libgcc/config/riscv/sfp-machine.h
 create mode 100644 libgcc/config/riscv/t-elf
 create mode 100644 libgcc/config/riscv/t-elf32
 create mode 100644 libgcc/config/riscv/t-elf64
 create mode 100644 libgcc/config/riscv/t-softfp32
 create mode 100644 libgcc/config/riscv/t-softfp64

diff --git a/libgcc/config.host b/libgcc/config.host
index 6f2e458..bb6d5370e 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -167,6 +167,9 @@ powerpc*-*-*)
;;
 rs6000*-*-*)
;;
+riscv*)
+   cpu_type=riscv
+   ;;
 sparc64*-*-*)
cpu_type=sparc
;;
@@ -1091,6 +1094,15 @@ powerpcle-*-eabi*)
tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-crtstuff 
t-crtstuff-pic t-fdpbit"
extra_parts="$extra_parts crtbegin.o crtend.o crtbeginS.o crtendS.o 
crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
;;
+riscv*-*-linux*)
+   tmake_file="${tmake_file} t-softfp-sfdf riscv/t-softfp${host_address} 
t-softfp riscv/t-elf riscv/t-elf${host_address}"
+   extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o crtendS.o 
crtbeginT.o"
+   md_unwind_header=riscv/linux-unwind.h
+   ;;
+riscv*-*-*)
+   tmake_file="${tmake_file} t-softfp-sfdf riscv/t-softfp${host_address} 
t-softfp riscv/t-elf riscv/t-elf${host_address}"
+   extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
+   ;;
 rs6000-ibm-aix4.[3456789]* | powerpc-ibm-aix4.[3456789]*)
md_unwind_header=rs6000/aix-unwind.h
tmake_file="t-fdpbit rs6000/t-ppc64-fp rs6000/t-slibgcc-aix 
rs6000/t-ibm-ldouble"
diff --git a/libgcc/config/riscv/atomic.c b/libgcc/config/riscv/atomic.c
new file mode 100644
index 000..448b0e5
--- /dev/null
+++ b/libgcc/config/riscv/atomic.c
@@ -0,0 +1,111 @@
+/* Legacy sub-word atomics for RISC-V.
+ 
+   Copyright (C) 2016-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+#ifdef __riscv_atomic
+
+#include 
+
+#define INVERT "not %[tmp1], %[tmp1]\n\t"
+#define DONT_INVERT""
+
+#define GENERATE_FETCH_AND_OP(type, size, opname, insn, invert, cop)   \
+  type __sync_fetch_and_ ## opname ## _ ## size (type *p, type v)  \
+  {\
+unsigned long aligned_addr = ((unsigned long) p) & ~3UL;   \
+int shift = (((unsigned long) p) & 3) * 8; \
+unsigned mask = ((1U << ((sizeof v) * 8)) - 1) << shift;   \
+unsigned old, tmp1, tmp2;  \
+   \
+asm volatile ("1:\n\t" \
+ "lr.w.aq %[old], %[mem]\n\t"  \
+ #insn " %[tmp1], %[old], %[value]\n\t"\
+
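
The address arithmetic at the top of GENERATE_FETCH_AND_OP can be looked at
in isolation.  What follows is only a sketch of that arithmetic, assuming a
little-endian target and 1- or 2-byte objects; the struct and function names
are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: where a sub-word object sits inside its
   naturally aligned 32-bit word.  */
struct subword_pos
{
  uintptr_t aligned_addr; /* address of the containing 32-bit word */
  unsigned shift;         /* bit offset of the object in that word */
  uint32_t mask;          /* bits of the word occupied by the object */
};

/* Mirror the computation in the macro for an object of SIZE bytes
   (1 or 2) at address ADDR.  Assumes little-endian byte order.  */
static struct subword_pos
locate_subword (uintptr_t addr, size_t size)
{
  struct subword_pos pos;

  assert (size == 1 || size == 2);
  pos.aligned_addr = addr & ~(uintptr_t) 3;
  pos.shift = (unsigned) (addr & 3) * 8;
  pos.mask = (((uint32_t) 1 << (size * 8)) - 1) << pos.shift;
  return pos;
}
```

For a one-byte object at address 0x1003 this gives the word at 0x1000, a
shift of 24 and a mask of 0xFF000000, which is the part of the word that the
LR/SC loop is allowed to change.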

[PATCH 5/6] RISC-V Port: libatomic

2017-01-11 Thread Palmer Dabbelt
From: Andrew Waterman 

---
 libatomic/configure.tgt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libatomic/configure.tgt b/libatomic/configure.tgt
index 6d77c94..b8af3ab 100644
--- a/libatomic/configure.tgt
+++ b/libatomic/configure.tgt
@@ -37,6 +37,7 @@ case "${target_cpu}" in
ARCH=alpha
;;
   rs6000 | powerpc*)   ARCH=powerpc ;;
+  riscv*)  ARCH=riscv ;;
   sh*) ARCH=sh ;;
 
   arm*)
-- 
2.10.2



[PATCH 6/6] RISC-V Port: gcc/testsuite

2017-01-11 Thread Palmer Dabbelt
From: Kito Cheng 

---
 gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C| 2 +-
 gcc/testsuite/gcc.c-torture/execute/20101011-1.c  | 3 +++
 gcc/testsuite/gcc.dg/20020312-2.c | 2 ++
 gcc/testsuite/gcc.dg/builtin-apply2.c | 1 +
 gcc/testsuite/gcc.dg/ifcvt-4.c| 2 +-
 gcc/testsuite/gcc.dg/loop-8.c | 2 +-
 gcc/testsuite/gcc.dg/sibcall-9.c  | 2 ++
 gcc/testsuite/gcc.dg/stack-usage-1.c  | 2 ++
 gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c| 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-3.c | 2 +-
 gcc/testsuite/lib/target-supports.exp | 1 +
 13 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
index 80a571a..2e0ef68 100644
--- a/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-rom.C
@@ -2,7 +2,7 @@
 // { dg-do compile { target c++11 } }
 // { dg-additional-options -G0 { target { { alpha*-*-* frv*-*-* ia64-*-* 
lm32*-*-* m32r*-*-* microblaze*-*-* mips*-*-* nios2-*-* powerpc*-*-* 
rs6000*-*-* } && { ! { *-*-darwin* *-*-aix* alpha*-*-*vms* } } } } }
 // { dg-final { scan-assembler "\\.rdata" { target mips*-*-* } } }
-// { dg-final { scan-assembler "rodata" { target { { *-*-linux-gnu *-*-gnu* 
*-*-elf } && { ! mips*-*-* } } } } }
+// { dg-final { scan-assembler "rodata" { target { { *-*-linux-gnu *-*-gnu* 
*-*-elf } && { ! { mips*-*-* riscv*-*-* } } } } } }
 
 struct Data
 {
diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c 
b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index 744763f..899a401 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -6,6 +6,9 @@
 #elif defined (__powerpc__) || defined (__PPC__) || defined (__ppc__) || 
defined (__POWERPC__) || defined (__ppc)
   /* On PPC division by zero does not trap.  */
 # define DO_TEST 0
+#elif defined (__riscv)
+  /* On RISC-V division by zero does not trap.  */
+# define DO_TEST 0
 #elif defined (__SPU__)
   /* On SPU division by zero does not trap.  */
 # define DO_TEST 0
diff --git a/gcc/testsuite/gcc.dg/20020312-2.c 
b/gcc/testsuite/gcc.dg/20020312-2.c
index 5fce50d..f5929e0 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -67,6 +67,8 @@ extern void abort (void);
 # else
 #  define PIC_REG  "30"
 # endif
+#elif defined(__riscv)
+/* No pic register.  */
 #elif defined(__RX__)
 /* No pic register.  */
 #elif defined(__s390__)
diff --git a/gcc/testsuite/gcc.dg/builtin-apply2.c 
b/gcc/testsuite/gcc.dg/builtin-apply2.c
index b6cbe39..ad61d3b 100644
--- a/gcc/testsuite/gcc.dg/builtin-apply2.c
+++ b/gcc/testsuite/gcc.dg/builtin-apply2.c
@@ -1,6 +1,7 @@
 /* { dg-do run } */
 /* { dg-require-effective-target untyped_assembly } */
 /* { dg-skip-if "Variadic funcs have all args on stack. Normal funcs have args 
in registers." { "avr-*-* nds32*-*-*" } { "*" } { "" } } */
+/* { dg-skip-if "Variadic funcs use different argument passing from normal 
funcs." { "riscv*-*-*" } { "*" } { "" } } */
 /* { dg-skip-if "Variadic funcs use Base AAPCS.  Normal funcs use VFP 
variant." { arm*-*-* && arm_hf_eabi } { "*" } { "" } } */
 
 /* PR target/12503 */
diff --git a/gcc/testsuite/gcc.dg/ifcvt-4.c b/gcc/testsuite/gcc.dg/ifcvt-4.c
index 0d1671c..466ad15 100644
--- a/gcc/testsuite/gcc.dg/ifcvt-4.c
+++ b/gcc/testsuite/gcc.dg/ifcvt-4.c
@@ -1,6 +1,6 @@
 /* { dg-options "-fdump-rtl-ce1 -O2 --param max-rtl-if-conversion-insns=3 
--param max-rtl-if-conversion-unpredictable-cost=100" } */
 /* { dg-additional-options "-misel" { target { powerpc*-*-* } } } */
-/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" 
{ "arm*-*-* hppa*64*-*-* visium-*-*" } }  */
+/* { dg-skip-if "Multiple set if-conversion not guaranteed on all subtargets" 
{ "arm*-*-* hppa*64*-*-* visium-*-*" riscv*-*-* } }  */
 
 typedef int word __attribute__((mode(word)));
 
diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
index 8a4b209..fd4fa62 100644
--- a/gcc/testsuite/gcc.dg/loop-8.c
+++ b/gcc/testsuite/gcc.dg/loop-8.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
-/* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-*" } { "*" } { 
"" } } */
+/* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-* riscv*-*-*" 
} { "*" } { "" } } */
 
 void
 f (int *a, int *b)
diff --git a/gcc/testsuite/gcc.dg/sibcall-9.c b/gcc/testsuite/gcc.dg/sibcall-9.c
index 34e7053..8e30952 100644
--- a/gcc/testsuite/gcc.dg/sibcall-9.c
+++ b/gcc/testsuite/gcc.dg/sibcall-9.c
@@ -8,6 +8,8 @@
 /* { dg-do run { xfail { { 

New Port for RISC-V

2017-01-11 Thread Palmer Dabbelt
We'd like to submit for inclusion in GCC a port for the RISC-V architecture.
The port suffices to build a substantial body of software (including Linux and
some 2,000 Fedora packages) and passes most of the gcc and g++ test suites; so,
while it is doubtlessly not complete, we think it is far enough along to start
the upstreaming process.  It is our understanding that it is OK to submit this
port during stage 3 because it does not touch any shared code.  Our binutils
port has already been accepted for the 2.28 release, and we plan on submitting
glibc and Linux patch sets soon.

This port targets Version 2.0 of the RV32I and RV64I base user ISAs, and the
five standard extensions M, A, F, D, and C, all of which are frozen and will
not change over time.  The RISC-V community and the 50-some member companies of
the RISC-V Foundation are quite eager to have a single, standard GCC port.  We
thank you in advance for your help in this process and for your feedback on the
software contribution itself.

These patches build on top of cac3398e5f378549d84bc2ebb6af97cfd0189b25, the
latest commit in the GCC git mirror as of last night.

Andrew and I will volunteer to maintain this port if it's OK with everyone.
Our understanding is that the GCC steering committee decides this, and this is
the correct place to contact them.

We'd like to thank the various members of the RISC-V software community who
have helped us with the port.  Specifically we'd like to thank Kito Cheng for
his work getting the GCC test suite running (and running correctly).

Thanks!

[PATCH 1/6] RISC-V Port: gcc/config/riscv/riscv.c
[PATCH 2/6] RISC-V Port: gcc
[PATCH 3/6] RISC-V Port: libgcc
[PATCH 4/6] RISC-V Port: libsanitizer
[PATCH 5/6] RISC-V Port: libatomic
[PATCH 6/6] RISC-V Port: gcc/testsuite


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-11 Thread Joseph Myers
On Wed, 11 Jan 2017, Maciej W. Rozycki wrote:

> > > In any case, the soft-fp change is relevant in the hard-float case as
> > > well, to make software TFmode behave consistently with hardware SFmode
> > > and DFmode regarding NaN payload preservation.
> 
>  Is mixing TFmode, DFmode and SFmode operations with the latter two
> handled in hardware and the former deferred to soft-fp a supported 
> configuration?  Do we have any MIPS ABI which provides for using all these 

Yes.

> data types?  AFAIK all MIPS/Linux ABIs use DFmode for `long double' and 

n32 and n64 use TFmode (unconditionally; unlike on some architectures, 
there is no -mlong-double-64 option).  From GCC 4.9 onwards this uses 
soft-fp rather than fp-bit, with integration with hardware exceptions and 
rounding modes.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH, gcc, wwwdocs] Document upcoming Qualcomm Falkor processor support

2017-01-11 Thread Andrew Pinski
On Wed, Jan 11, 2017 at 8:29 AM, Richard Earnshaw (lists)
 wrote:
> On 06/01/17 12:11, Siddhesh Poyarekar wrote:
>> Hi,
>>
>> This patch documents the newly added flag in gcc 7 for the upcoming
>> Qualcomm Falkor processor core.
>>
>> Siddhesh
>>
>> Index: htdocs/gcc-7/changes.html
>> ===
>> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
>> retrieving revision 1.33
>> diff -u -r1.33 changes.html
>> --- htdocs/gcc-7/changes.html 3 Jan 2017 10:55:03 -   1.33
>> +++ htdocs/gcc-7/changes.html 6 Jan 2017 12:09:53 -
>> @@ -390,7 +390,8 @@
>>   
>> Support has been added for the following processors
>> (GCC identifiers in parentheses): ARM Cortex-A73
>> -   (cortex-a73) and Broadcom Vulcan (vulcan).
>> +   (cortex-a73), Broadcom Vulcan (vulcan) and
>> +   Qualcomm Falkor (falkor).
>> The GCC identifiers can be used
>> as arguments to the -mcpu or -mtune 
>> options,
>> for example: -mcpu=cortex-a73 or
>>
>
> Thanks.  The file had changed again, but I've merged this in.

Yes that was my fault, I was just adding the Cavium ThunderX SOCs that
are now supported by GCC.

Thanks,
Andrew

>
> R.


Re: [PATCH, rs6000] Fix PR79044 (ICE in swap optimization)

2017-01-11 Thread Segher Boessenkool
On Tue, Jan 10, 2017 at 02:18:38PM -0600, Bill Schmidt wrote:
> PR79044 reports a situation where swap optimization ICEs in GCC 6 and in
> trunk.  The problem is that swap optimization doesn't properly recognize
> that element-reversing loads and stores (e.g., lxvw4x) cannot be treated
> as "swappable" instructions.  These arise from the __builtin_vec_xl and
> __builtin_vec_xst interfaces that were added in GCC 6.  The surrounding
> code is slightly different, so the fix is slightly different for the two
> releases.
> 
> The fix is obvious, and bootstraps on powerpc64le-unknown-linux-gnu with
> no regressions.  Are these patches ok for trunk and GCC 6, respectively?

Okay for both.  Is this needed for GCC 5 as well?

Thanks,


Segher


Re: [PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-11 Thread Joseph Myers
On Wed, 11 Jan 2017, Richard Biener wrote:

> As you can see I adjusted dumping of pointer constants (we can't
> parse the B suffix and large unsigned numbers get a warning so
> add 'U').  There's the general issue that we dump
> 
>   short x;
>   x = 1;
> 
> and then lex the '1' as type int and there's no suffixes for integer
> types smaller than int which means we can't write those constants
> type correct :/  Suggestions welcome (currently we ICE with type
> mismatches in those cases, we can avoid that by auto-fixing during
> parsing but I'd like to be explicit somehow).

You could always dump as ((short) 1); that's valid C.
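
As a small illustration of the difference (a sketch only, assuming a C11
compiler; TYPE_NAME and same_type_name are made-up names, not anything in
the GIMPLE FE): an unsuffixed 1 has type int, while the cast form carries
the narrower type.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: name the type of an expression via C11 _Generic.
   The controlling expression is not integer-promoted.  */
#define TYPE_NAME(e) \
  _Generic ((e), short: "short", int: "int", default: "other")

/* Return nonzero if the two type names are the same.  */
static int
same_type_name (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  return strcmp (a, b) == 0;
}
```

With these, TYPE_NAME (1) names int and TYPE_NAME (((short) 1)) names short,
so the cast spelling keeps the type information that the bare constant loses.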

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] combine ignoring a check

2017-01-11 Thread Segher Boessenkool
On Tue, Jan 10, 2017 at 09:12:38AM -0500, Nathan Sidwell wrote:
> Segher commented on IRC that a single loop would be slower.  I disagree. 

Slower *and less readable*, which is the main point.  Oh well.

> -  /* Make sure this PARALLEL is not an asm.  We do not allow combining
> +  Neither can this PARALLEL be not an asm.  We do not allow combining

Delete the first "not" here please.  Okay for trunk with that change.
Thanks for the patch!


Segher


Build failure cris-elf, gcc-5 backport of PR rtl-optimization/78255 fix, gcc/postreload.c:reload_cse_simplify

2017-01-11 Thread Hans-Peter Nilsson
For cris-elf on the gcc-5-branch at r244321:

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings   -DHAVE_CONFIG_H 
-I. -I. -I/tmp/hpautotest-gcc5/gcc/gcc -I/tmp/hpautotest-gcc5/gcc/gcc/. 
-I/tmp/hpautotest-gcc5/gcc/gcc/../include 
-I/tmp/hpautotest-gcc5/gcc/gcc/../libcpp/include 
-I/tmp/hpautotest-gcc5/cris-elf/gccobj/./gmp -I/tmp/hpautotest-gcc5/gcc/gmp 
-I/tmp/hpautotest-gcc5/cris-elf/gccobj/./mpfr -I/tmp/hpautotest-gcc5/gcc/mpfr 
-I/tmp/hpautotest-gcc5/gcc/mpc/src  
-I/tmp/hpautotest-gcc5/gcc/gcc/../libdecnumber 
-I/tmp/hpautotest-gcc5/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/tmp/hpautotest-gcc5/gcc/gcc/../libbacktrace   -o postreload.o -MT 
postreload.o -MMD -MP -MF ./.deps/postreload.TPo 
/tmp/hpautotest-gcc5/gcc/gcc/postreload.c
/tmp/hpautotest-gcc5/gcc/gcc/postreload.c: In function 'bool 
reload_cse_simplify(rtx_insn*, rtx)':
/tmp/hpautotest-gcc5/gcc/gcc/postreload.c:125:7: error: 'NO_FUNCTION_CSE' was 
not declared in this scope
make[2]: *** [postreload.o] Error 1

This seems to be due to your backport:

2017-01-11  Andre Vieira 

Backport from mainline
2016-12-09  Andre Vieira 

PR rtl-optimization/78255
* gcc/postreload.c (reload_cse_simplify): Do not CSE a function if
NO_FUNCTION_CSE is true.

brgds, H-P


Re: [PATCH] MIPS: Fix generation of DIV.G and MOD.G for Loongson targets.

2017-01-11 Thread Maciej W. Rozycki
On Mon, 9 Jan 2017, Toma Tabacu wrote:

> The expand_DIVMOD function, introduced in r241660, will pick the divmod4
> (or the udivmod4) pattern when it checks for the presence of hardware
> div/mod instructions, which results in the generation of the old DIV
> instruction.
> 
> Unfortunately, this interferes with the generation of DIV.G and MOD.G
> (the div3 and mod3 patterns) for Loongson targets, which
> causes test failures.

 What test failures?  Details please.

> This patch prevents the selection of divmod4 and udivmod4 when
> targeting Loongson by adding !ISA_HAS_DIV3 to the match condition.
> ISA_HAS_DIV3 checks for the presence of the 3-operand Loongson-specific DIV.G
> and MOD.G instructions.
> 
> Tested with mips-mti-elf.

 And Loongson hardware presumably, right?

> This solution might be excessive, however, as it effectively forbids the
> generation of the old DIV instruction for Loongson targets, which actually do
> support it.

 What's the purpose of this change other than "fixing test failures"?  
Can you please demonstrate a technical justification of this change?  Has 
there been a code quality regression which this patch addresses for 
example?  What about source code which did emit `divmod4' and 
`udivmod4' patterns on Loongson targets before r241660?

 Given that the DIV.G, MOD.G and accumulator DIV instructions (and their 
unsigned counterparts) are all available the compiler should have freedom 
to choose whichever hardware operation is the most suitable for the 
calculations required according to code generation options selected and 
artificially disabling some hardware instructions does not appear to be a 
move in that direction to me.

  Maciej


Re: [PATCH] PR target/79004, Fix char/short -> _Float128 on PowerPC -mcpu=power9

2017-01-11 Thread Segher Boessenkool
On Mon, Jan 09, 2017 at 07:32:27PM -0500, Michael Meissner wrote:
> This patch fixes PR target/79004 by eliminating the optimization of avoiding
> direct move if we are converting an 8/16-bit integer value from memory to IEEE
> 128-bit floating point.
> 
> I opened a new bug (PR target/79038) to address the underlying issue that the
> IEEE 128-bit floating point integer conversions were written before small
> integers were allowed in the traditional Altivec registers.  This meant that
> we had to use UNSPEC and explicit temporaries to get the integers into the
> appropriate registers.
> 
> I have tested this bug by doing a bootstrap build and make check on a little
> endian power8 system and using an assembler that knows about ISA 3.0
> instructions.  I added a new test to verify the results.  Can I check this
> into the trunk?  This is not an issue on GCC 6.x.

Okay, thanks!  Two comments:

> +/* { dg-final { scan-assembler-not " bl __"} } */
> +/* { dg-final { scan-assembler "xscvdpqp"  } } */
> +/* { dg-final { scan-assembler "xscvqpdp"  } } */

This line always matches if ...

> +/* { dg-final { scan-assembler "xscvqpdpo" } } */

... this one does.  I recommend \m \M .

> +/* { dg-final { scan-assembler "xscvqpsdz" } } */
> +/* { dg-final { scan-assembler "xscvqpswz" } } */
> +/* { dg-final { scan-assembler "xscvsdqp"  } } */
> +/* { dg-final { scan-assembler "xscvudqp"  } } */
> +/* { dg-final { scan-assembler "lxsd"  } } */
> +/* { dg-final { scan-assembler "lxsiwax"   } } */
> +/* { dg-final { scan-assembler "lxsiwzx"   } } */
> +/* { dg-final { scan-assembler "lxssp" } } */
> +/* { dg-final { scan-assembler "stxsd" } } */
> +/* { dg-final { scan-assembler "stxsiwx"   } } */
> +/* { dg-final { scan-assembler "stxssp"} } */

There are many more than 14 instructions generated; maybe you want
scan-assembler-times?
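
The substring problem can be shown outside DejaGnu.  The sketch below is in
C and only emulates what the \m and \M word-boundary escapes do for a
mnemonic made of word characters; it is not how scan-assembler is
implemented, and the function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Word characters in the regex sense: letters, digits, underscore.  */
static int
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Plain substring search, like an unanchored scan-assembler pattern.  */
static int
matches_anywhere (const char *text, const char *mnemonic)
{
  assert (text != NULL && mnemonic != NULL);
  return strstr (text, mnemonic) != NULL;
}

/* Match MNEMONIC only as a whole word, emulating \mMNEMONIC\M.  */
static int
matches_whole_word (const char *text, const char *mnemonic)
{
  size_t len;
  const char *p;

  assert (text != NULL && mnemonic != NULL && *mnemonic != '\0');
  len = strlen (mnemonic);
  for (p = strstr (text, mnemonic); p != NULL; p = strstr (p + 1, mnemonic))
    {
      int starts_word = (p == text) || !is_word_char (p[-1]);
      int ends_word = !is_word_char (p[len]);
      if (starts_word && ends_word)
        return 1;
    }
  return 0;
}
```

On a line containing only xscvqpdpo, matches_anywhere still reports a hit
for xscvqpdp while matches_whole_word does not, which is the distinction the
anchored pattern would give the test.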


Segher


Re: [PATCH] PR 78534 Change character length from int to size_t

2017-01-11 Thread Janne Blomqvist
On Sun, Jan 8, 2017 at 4:29 PM, Dominique d'Humières  wrote:
>> r244027 reverts r244011. Sorry for the breakage. It seems to affect
>> all i686 as well in addition to power, maybe all 32-bit hosts.
>
> For the record, I see the following failures with an instrumented r244026
> (as in pr78672)

[snip]

I was finally able to get a 32-bit i686 compiler going (my attempts to
do this on a x86_64-pc-linux-gnu host failed, in the end I resorted to
running 32-bit builds/tests on an i686 container). At least on i686,
the patch below on top of the big charlen->size_t patch fixes the
failures:

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index be63038..82319ed 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -4726,7 +4726,7 @@ gfc_resolve_substring_charlen (gfc_expr *e)
   /* Length = (end - start + 1).  */
   e->ts.u.cl->length = gfc_subtract (end, start);
   e->ts.u.cl->length = gfc_add (e->ts.u.cl->length,
-   gfc_get_int_expr (gfc_default_integer_kind,
+   gfc_get_int_expr (gfc_charlen_int_kind,
  NULL, 1));

   /* F2008, 6.4.1:  Both the starting point and the ending point shall
@@ -11420,9 +11420,10 @@ resolve_charlen (gfc_charlen *cl)

   /* F2008, 4.4.3.2:  If the character length parameter value evaluates to
  a negative value, the length of character entities declared is zero.  */
-  if (cl->length && mpz_sgn (cl->length->value.integer) < 0)
+  if (cl->length && cl->length->expr_type == EXPR_CONSTANT
+  && mpz_sgn (cl->length->value.integer) < 0)
 gfc_replace_expr (cl->length,
- gfc_get_int_expr (gfc_default_integer_kind, NULL, 0));
+ gfc_get_int_expr (gfc_charlen_int_kind, NULL, 0));

   /* Check that the character length is not too large.  */
   k = gfc_validate_kind (BT_INTEGER, gfc_charlen_int_kind, false);


So what happened was that without the EXPR_CONSTANT check, I was
accessing uninitialized memory (for some reason probably due to memory
layout or such, this didn't cause failures on x86_64-pc-linux-gnu).
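
The failure mode is the usual one for tagged unions: the integer value is
meaningful only when the tag says the expression is a constant.  A simplified
sketch follows; the toy_* types are hypothetical stand-ins, not gfortran's
gfc_expr, and a plain long replaces the mpz_t value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfortran's expression kinds.  */
enum toy_expr_type { TOY_EXPR_CONSTANT, TOY_EXPR_VARIABLE };

struct toy_expr
{
  enum toy_expr_type expr_type;
  union
  {
    long integer;       /* valid only for TOY_EXPR_CONSTANT */
    const char *symbol; /* valid only for TOY_EXPR_VARIABLE */
  } value;
};

/* Return nonzero only if LENGTH is a constant with a negative value.
   The tag is tested before the union member is read.  */
static int
is_negative_constant (const struct toy_expr *length)
{
  return length != NULL
         && length->expr_type == TOY_EXPR_CONSTANT
         && length->value.integer < 0;
}

/* Replace a negative constant length by zero; anything that is not a
   constant is left untouched.  */
static void
clamp_negative_length (struct toy_expr *length)
{
  if (is_negative_constant (length))
    length->value.integer = 0;
  assert (!is_negative_constant (length));
}
```

Without the tag test, the comparison would read whatever bytes happen to be
in the union for a non-constant expression, which matches the kind of
layout-dependent failure described above.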

Also, I found a couple of additional places where gfc_charlen_int_kind
should be used instead of gfc_default_integer_kind which is included
in the patch above, although AFAICT they have nothing to do with the
testcase failures.

Unless there are objections, I'll commit the fixed patch in a few days.

-- 
Janne Blomqvist


[PATCH, i386]: Increase memory address length only when rip_relative_addr_p returns false.

2017-01-11 Thread Uros Bizjak
Hello!

Revision 204369 [1] inadvertently reversed the detection of
RIP-relative address, resulting in the incorrect calculation of the
insn length. Attached patch fixes this problem.
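
For readers unfamiliar with the encoding, here is a rough model of the
displacement-only case in 64-bit mode.  It is a sketch under that assumption
only, with a hypothetical function name, and is not the code in i386.c: in
64-bit mode ModRM mod=00 r/m=101 encodes disp32(%rip), so an absolute disp32
needs an extra SIB byte.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: bytes of address encoding beyond the ModRM byte
   for an operand that has a displacement but no base or index register,
   in 64-bit mode.  */
static int
disp32_address_length_64bit (bool rip_relative)
{
  int len = 4; /* the 32-bit displacement itself */

  /* An absolute disp32 cannot use the short form, which is taken by
     disp32(%rip), so it costs one SIB byte.  */
  if (!rip_relative)
    len++;

  assert (len == 4 || len == 5);
  return len;
}
```

With the reversed test, the RIP-relative form was charged the extra byte and
the absolute form was not, which is the miscalculation described above.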

2017-01-11  Uros Bizjak  

* config/i386/i386.c (memory_address_length): Increase len
only when rip_relative_addr_p returns false.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN and release branches.

[1] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204369

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e03dadd..bc4325a 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -28738,7 +28738,7 @@ memory_address_length (rtx addr, bool lea)
   else if (disp && !base && !index)
 {
   len += 4;
-  if (rip_relative_addr_p ())
+  if (!rip_relative_addr_p ())
len++;
 }
   else


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-11 Thread Matthew Fortune
Maciej Rozycki  writes:
> On Thu, 5 Jan 2017, Matthew Fortune wrote:
> > It is true to say that users are discouraged from using 2008-NaN with
> > soft-float for pre-R6 architectures simply to avoid further fragmentation
> > of software for no real gain. However, for R6 then soft-float is 2008-NaN
> > otherwise we are stuck with legacy-NaN forever.
> 
>  What's the actual issue you have with legacy NaN, and how does soft-float
> relate to R6?  It's not like hardware, R6 or otherwise, limits soft-float
> in any way.

What about floating-point data stored in binary form, written by a hard-float
app and read by a soft-float one? From R6 onwards it would be poor not to
address this and make it the same. I assume people do this sometimes!
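
To make the binary-data concern concrete: the two NaN encodings disagree
about the top fraction bit, so the same stored pattern is quiet under one
and signaling under the other.  The following is a sketch for binary32 only,
with hypothetical names; it is not code from soft-fp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define F32_EXP_MASK      UINT32_C (0x7F800000)
#define F32_FRAC_MASK     UINT32_C (0x007FFFFF)
#define F32_TOP_FRAC_BIT  UINT32_C (0x00400000)

/* True if BITS encodes any binary32 NaN.  */
static bool
f32_is_nan (uint32_t bits)
{
  return (bits & F32_EXP_MASK) == F32_EXP_MASK
         && (bits & F32_FRAC_MASK) != 0;
}

/* True if the NaN in BITS is quiet.  With NAN2008 the IEEE 754-2008
   convention applies (top fraction bit set means quiet); otherwise the
   legacy MIPS convention applies (top fraction bit set means signaling).  */
static bool
f32_nan_is_quiet (uint32_t bits, bool nan2008)
{
  bool top_bit_set = (bits & F32_TOP_FRAC_BIT) != 0;

  assert (f32_is_nan (bits));
  return nan2008 ? top_bit_set : !top_bit_set;
}
```

The pattern 0x7FC00000, for instance, is a quiet NaN to a 2008-NaN reader
and a signaling NaN to a legacy-NaN reader.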

For R5 I agree there is no useful reason to do soft-float in nan2008 but
someone may wish to do that in a closed environment if perhaps they have
a requirement like the one I describe above for R6.
 
> > If someone did want to build a system from source with soft-float as
> > 2008-NaN then I see no reason to stop them but I doubt they would and I
> > don't expect the --with-nan GCC configure option to be used in conjunction
> > with --with-float=soft for the same reason. The most likely use of
> > --with-nan is to build a distribution specifically to target an MSA capable
> > system like P5600 or perhaps an M5150 with an FPU. The NaN interlinking
> > work will make these use-cases less important still though I think.
> 
>  You can have GCC configured with `--with-nan=2008' and equipped with a

You 'can' but I don't think you would... unless that is actually what you
wanted which is really the premise of permitting nan2008 soft-float.

> soft-float multilib.  IMHO you ought to be able to just use `-msoft-float'
> then to select the soft-float multilib and have it implicitly use the
> legacy NaN encoding rather than having to pass `-msoft-float -mnan=legacy'
> to get the intended semantics.

This is something a vendor configuration could handle or the addition of
a spec to do it but I believe we have currently got a reasonably clean
separation of options in the generic/unknown configuration such that use
of one option does not interfere with others (at least in terms of the options
which affect which ABI variant is in use). The less x implies y type
options we have the less mental trauma we have in understanding what the
effective behaviour is of a given set of options. That's not to say it is
in any way easy to figure out currently from an arbitrarily selected MIPS
GCC toolchain.

>  There shouldn't be a need for NaN interlinking for soft-float objects,
> that's just unnecessary burden IMO.

Indeed. I can't see anyone needing that as a soft-float nan2008 version is
highly unlikely.
 
>  MSA is irrelevant for soft-float operations, we don't have a soft-float
> MSA ABI.  If we ever define one, then we could well choose the 2008-NaN
> encoding for compatibility with hard-float code; there's no issue with
> backwards compatibility here as no legacy-NaN MSA hardware has been ever
> allowed.

I wasn't meaning to imply MSA makes sense with soft-float but rather making
the point that there are few scenarios where --with-nan configure time
option is likely to be used but one is a hard-float MSA native toolchain.

>  Have I missed anything?

Summary:

* No technical need to prohibit nan2008 soft-float
* Benefit to R6 onwards such that we don't have to track two different
  floating point formats forever in tools that may not bother with pre-r6
  in the future.
* Marginal benefit in sharing floating point data in binary format between
  soft and hard float programs.

Hope that explains my thinking on this.

Matthew


Re: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Uros Bizjak
On Wed, Jan 11, 2017 at 8:59 PM, Koval, Julia  wrote:
> Hi, I rebased the patch onto latest trunk version and changed specification
> according to ICC:
> _enclu_u32 (const int __L, size_t *__D)  -->  _enclu_u32 (const int __L,
> size_t __D[])

I have committed the patch with additional testsuite changes and
removal of libgcc part. The latter is wrong, since it changes order of
bits (please see the comment above the enum), and is generally not
needed, since it doesn't implement ABI addition, like SSE, AVX, or
similar.

2017-01-11  Julia Koval  

* common/config/i386/i386-common.c (OPTION_MASK_ISA_SGX_UNSET): New.
(OPTION_MASK_ISA_SGX_SET): New.
(ix86_handle_option): Handle OPT_msgx.
* config.gcc: Added sgxintrin.h.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
* config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
* config/i386/i386.c (ix86_target_string): Add -msgx.
(PTA_SGX): New.
(ix86_option_override_internal): Handle new options.
(ix86_valid_target_attribute_inner_p): Add sgx.
* config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
* config/i386/i386.opt: Add msgx.
* config/i386/sgxintrin.h: New file.
* config/i386/x86intrin.h: Add sgxintrin.h.

testsuite/ChangeLog:

2017-01-11  Julia Koval  
Uros Bizjak  

* gcc.target/i386/sgx.c: New test.
* gcc.target/i386/sse-12.c: Add -msgx.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Ditto.
* gcc.target/i386/sse-22.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.
* g++.dg/other/i386-2.C: Ditto.
* g++.dg/other/i386-3.C: Ditto.

Uros.
Index: common/config/i386/i386-common.c
===
--- common/config/i386/i386-common.c(revision 244335)
+++ common/config/i386/i386-common.c(working copy)
@@ -116,6 +116,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_ABM_SET \
   (OPTION_MASK_ISA_ABM | OPTION_MASK_ISA_POPCNT)
 
+#define OPTION_MASK_ISA_SGX_SET OPTION_MASK_ISA_SGX
 #define OPTION_MASK_ISA_BMI_SET OPTION_MASK_ISA_BMI
 #define OPTION_MASK_ISA_BMI2_SET OPTION_MASK_ISA_BMI2
 #define OPTION_MASK_ISA_LZCNT_SET OPTION_MASK_ISA_LZCNT
@@ -214,6 +215,7 @@ along with GCC; see the file COPYING3.  If not see
 #define OPTION_MASK_ISA_SHA_UNSET OPTION_MASK_ISA_SHA
 #define OPTION_MASK_ISA_PCLMUL_UNSET OPTION_MASK_ISA_PCLMUL
 #define OPTION_MASK_ISA_ABM_UNSET OPTION_MASK_ISA_ABM
+#define OPTION_MASK_ISA_SGX_UNSET OPTION_MASK_ISA_SGX
 #define OPTION_MASK_ISA_BMI_UNSET OPTION_MASK_ISA_BMI
 #define OPTION_MASK_ISA_BMI2_UNSET OPTION_MASK_ISA_BMI2
 #define OPTION_MASK_ISA_LZCNT_UNSET OPTION_MASK_ISA_LZCNT
@@ -500,6 +502,19 @@ ix86_handle_option (struct gcc_options *opts,
}
   return true;
 
+case OPT_msgx:
+  if (value)
+   {
+ opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_SGX_SET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_SGX_SET;
+   }
+  else
+   {
+ opts->x_ix86_isa_flags2 &= ~OPTION_MASK_ISA_SGX_UNSET;
+ opts->x_ix86_isa_flags2_explicit |= OPTION_MASK_ISA_SGX_UNSET;
+   }
+  return true;
+
 case OPT_mavx512dq:
   if (value)
{
Index: config/i386/cpuid.h
===
--- config/i386/cpuid.h (revision 244335)
+++ config/i386/cpuid.h (working copy)
@@ -74,6 +74,7 @@
 /* Extended Features (%eax == 7) */
 /* %ebx */
 #define bit_FSGSBASE   (1 << 0)
+#define bit_SGX (1 << 2)
 #define bit_BMI(1 << 3)
 #define bit_HLE(1 << 4)
 #define bit_AVX2   (1 << 5)
Index: config/i386/driver-i386.c
===
--- config/i386/driver-i386.c   (revision 244335)
+++ config/i386/driver-i386.c   (working copy)
@@ -404,7 +404,7 @@ const char *host_detect_local_cpu (int argc, const
   unsigned int has_pclmul = 0, has_abm = 0, has_lwp = 0;
   unsigned int has_fma = 0, has_fma4 = 0, has_xop = 0;
   unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0;
-  unsigned int has_hle = 0, has_rtm = 0;
+  unsigned int has_hle = 0, has_rtm = 0, has_sgx = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
   unsigned int has_osxsave = 0, has_fxsr = 0, has_xsave = 0, has_xsaveopt = 0;
@@ -480,6 +480,7 @@ const char *host_detect_local_cpu (int argc, const
   __cpuid_count (7, 0, eax, ebx, ecx, edx);
 
   has_bmi = ebx & bit_BMI;
+  has_sgx = ebx & bit_SGX;
   has_hle = ebx & bit_HLE;
   has_rtm = ebx & bit_RTM;
   has_avx2 = ebx & bit_AVX2;
@@ -993,6 +994,7 @@ const char *host_detect_local_cpu (int argc, const
   const char *fma4 = has_fma4 ? " -mfma4" : " -mno-fma4";
   const char *xop = has_xop ? " -mxop" : " -mno-xop";
   const char *bmi = has_bmi 

C++ PATCH for c++/78337 (ICE with invalid generic lambda)

2017-01-11 Thread Jason Merrill
We instantiate the return type of the lambda outside of the function
context, at which point trying to walk from the template instantiation
context up to the context of 'f' hits NULL_TREE.  So we should handle
that.

There was also a SFINAE issue whereby we skipped the error in SFINAE
context, but still gave the inform.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 1742e12992e0df29546bde5b3714a07ec61e14c7
Author: Jason Merrill 
Date:   Wed Jan 11 07:45:01 2017 -0500

PR c++/78337 - ICE on invalid with generic lambda

* semantics.c (process_outer_var_ref): Check if containing_function
is null.  Move inform call under complain test.

diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 342b671..4202475 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -3278,6 +3278,8 @@ process_outer_var_ref (tree decl, tsubst_flags_t complain)
2. a non-lambda function, or
3. a non-default capturing lambda function.  */
 while (context != containing_function
+  /* containing_function can be null with invalid generic lambdas.  */
+  && containing_function
   && LAMBDA_FUNCTION_P (containing_function))
   {
tree closure = DECL_CONTEXT (containing_function);
@@ -3365,10 +3367,13 @@ process_outer_var_ref (tree decl, tsubst_flags_t complain)
   else
 {
   if (complain & tf_error)
-   error (VAR_P (decl)
-  ? G_("use of local variable with automatic storage from containing function")
-  : G_("use of parameter from containing function"));
-  inform (DECL_SOURCE_LOCATION (decl), "%q#D declared here", decl);
+   {
+ error (VAR_P (decl)
+? G_("use of local variable with automatic storage from "
+ "containing function")
+: G_("use of parameter from containing function"));
+ inform (DECL_SOURCE_LOCATION (decl), "%q#D declared here", decl);
+   }
   return error_mark_node;
 }
   return decl;
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-ice5.C b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-ice5.C
new file mode 100644
index 000..473e412
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-ice5.C
@@ -0,0 +1,27 @@
+// PR c++/78337
+// { dg-do compile { target c++14 } }
+
+struct X {
+  static constexpr int foo (int b) {
+return b;
+  }
+};
+
+template
+using Void = void;
+
+template
+auto
+bar(F f, A a) -> decltype( ( f(a) , 0 ) ) // { dg-error "no match" }
+{ return {}; }
+
+
+int main() {
+  //constexpr
+  int f = 3;
+  (void)f;
+  auto l = [](auto of_type_X)->
+Void<(decltype(of_type_X)::foo(f), 0)> // { dg-error "variable" }
+{return;};
+  bar(l , X{});// { dg-error "no match" }
+}


Re: [PATCH] Optimize n + 1 for automatic n array (PR c++/71537)

2017-01-11 Thread Jakub Jelinek
On Wed, Jan 11, 2017 at 10:27:23PM +0100, John Tytgat wrote:
> On 01/10/2017 11:40 PM, Jakub Jelinek wrote:
> > +constexpr bool
> > +foo ()
> > +{
> > +  constexpr int n[42] = { 1 };
> > +  constexpr int o = n ? 1 : 0;
> > +  constexpr int p = n + 1 ? 1 : 0;
> > +  constexpr int q = "abc" + 1 ? 1 : 0;
> > +  return p + p + q == 3;
> > +}
> 
> Not o + p + q ?

Oops, fixed.

Jakub
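
For reference, a self-contained sketch of the test with the return expression corrected as discussed above.  It assumes a C++14 (or later) compiler that folds the address of an automatic array in a constant expression, which is what the patch in this thread is about; the static_assert is an addition for illustration and is not part of the quoted hunk.

```cpp
// Sketch of the corrected test; needs -std=c++14 or later.
constexpr bool
foo ()
{
  constexpr int n[42] = { 1 };
  // An array decays to a non-null pointer, so each condition is true.
  constexpr int o = n ? 1 : 0;
  constexpr int p = n + 1 ? 1 : 0;
  constexpr int q = "abc" + 1 ? 1 : 0;
  // Corrected: use all three variables, not p twice.
  return o + p + q == 3;
}

// Illustration only: check the result at compile time.
static_assert (foo (), "o + p + q should be 3");
```

Compilers may warn that the address conditions are always true; that is expected here.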


Re: [PATCH] Optimize n + 1 for automatic n array (PR c++/71537)

2017-01-11 Thread John Tytgat

On 01/10/2017 11:40 PM, Jakub Jelinek wrote:

+constexpr bool
+foo ()
+{
+  constexpr int n[42] = { 1 };
+  constexpr int o = n ? 1 : 0;
+  constexpr int p = n + 1 ? 1 : 0;
+  constexpr int q = "abc" + 1 ? 1 : 0;
+  return p + p + q == 3;
+}


Not o + p + q ?

John.



Re: [PATCH, rs6000] Add vec_nabs builtin support

2017-01-11 Thread Segher Boessenkool
Hi Carl,

On Mon, Jan 09, 2017 at 10:02:40AM -0800, Carl E. Love wrote:
>   * config/rs6000/rs6000-c: Add support for built-in functions

rs6000-c.c

>   vector signed char vec_nabs (vector signed char)
>   vector signed short vec_nabs (vector signed short)
>   vector signed int vec_nabs (vector signed int)
>   vector signed long long vec_nabs (vector signed long long)
>   vector float vec_nabs (vector float)
>   vector double vec_nabs (vector double)

You should mention the name of the function or data etc. you modified here:

rs6000-c.c (altivec_overloaded_builtins): Blabla.

or something like that.

>   * config/rs6000/rs6000-builtin.def: Add definitions for NABS functions
>   and NABS overload.
>   * config/rs6000/altivec.md: Add define to expand nabs2 types
>   * config/rs6000/altivec.h: Add define for vec_nabs built-in function.
>   * doc/extend.texi: Update the built-in documentation file for the
>   new built-in functions.

Here, too.

> +  int i, n_elt = GET_MODE_NUNITS (mode);

Two lines for this please, two separate declarations.  I realise you just
copied this code ;-)


Segher


gcc-5 branch broken

2017-01-11 Thread Nathan Sidwell

Andre,
this patch:
2017-01-11  Andre Vieira 

Backport from mainline
2016-12-09  Andre Vieira 

PR rtl-optimization/78255
* gcc/postreload.c (reload_cse_simplify): Do not CSE a function if
NO_FUNCTION_CSE is true.

breaks gcc 5 builds on targets that do:
#define NO_FUNCTION_CSE

(defined to blank, not 0 or 1).  There are a bunch of them, and i686 is one.

Please fix.

nathan
--
Nathan Sidwell


[C++ PATCH] PR/77812 struct stat hack fix

2017-01-11 Thread Nathan Sidwell
Issue 77812 turned out to be a struct stat hack problem; the bisected 
commit was a red herring (it just made the problem noisy for the enum case).


We wrap using decls and template functions in a singleton overload, and 
this can confuse set_namespace_binding into thinking we're augmenting an 
existing OVERLOAD binding.  We need to check for this case and use 
supplement_binding in that case, just as if it was a non-overload.


Applied to trunk (and will backport to 5 & 6)

nathan
--
Nathan Sidwell
2017-01-11  Nathan Sidwell  

	cp/
	PR c++/77812
	* name-lookup.c (set_namespace_binding_1): An overload of 1 decl
	is a new overload.

	testsuite/
	PR c++/77812
	* g++.dg/pr77812.C: New.

Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 244334)
+++ cp/name-lookup.c	(working copy)
@@ -3496,7 +3496,12 @@ set_namespace_binding_1 (tree name, tree
   if (scope == NULL_TREE)
 scope = global_namespace;
   b = binding_for_name (NAMESPACE_LEVEL (scope), name);
-  if (!b->value || TREE_CODE (val) == OVERLOAD || val == error_mark_node)
+  if (!b->value
+  /* For templates and using we create a single element OVERLOAD.
+	 Look for the chain to know whether this is really augmenting
+	 an existing overload.  */
+  || (TREE_CODE (val) == OVERLOAD && OVL_CHAIN (val))
+  || val == error_mark_node)
 b->value = val;
   else
 supplement_binding (b, val);
Index: testsuite/g++.dg/pr77812.C
===
--- testsuite/g++.dg/pr77812.C	(revision 0)
+++ testsuite/g++.dg/pr77812.C	(working copy)
@@ -0,0 +1,18 @@
+// PR77812
+// struct-stat hack failure when first overload is a template
+
+enum f {};
+
+template 
+void f ()
+{
+}
+enum f F;
+
+struct g {};
+
+template 
+void g ()
+{
+}
+struct g G;
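
For readers unfamiliar with the term, here is a minimal sketch of the classic (non-template) struct stat hack.  The names are made up for illustration; the PR itself concerns the variant where the function is a template or comes from a using declaration.

```cpp
// A type and a function may share one name in the same scope.
struct entry { int id; };

// For ordinary lookup, the function now hides the type name.
constexpr int entry (int x) { return x + 1; }

// The type is still reachable through an elaborated-type-specifier.
constexpr struct entry E = { 41 };

// Plain "entry (...)" is a call to the function, not a conversion.
static_assert (entry (E.id) == 42, "function found by ordinary lookup");
```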


[PATCH] Fix DW_AT_data_member_location/DW_AT_bit_offset handling (PR debug/78839)

2017-01-11 Thread Jakub Jelinek
Hi!

In GCC 5 and earlier, field_byte_offset had code for
PCC_BITFIELD_TYPE_MATTERS target that figured out what
DW_AT_data_member_location value to use vs. DW_AT_bit_offset.

That code is still in there, but due to several bugs added in r231762
it never triggers anymore.
One is that the is_{bit,byte}_offset_cst variables are set to the opposite
of how they are named (they are true if the corresponding offset is not
an INTEGER_CST).  Another one is that the PCC_BITFIELD_TYPE_MATTERS block,
that has been in GCC 5 and earlier guarded just with
if (PCC_BITFIELD_TYPE_MATTERS),
is guarded with what the var names imply, but the earlier bail-out is
according to what the var is set to, so for variable bit size we return
early, and the PCC_BITFIELD_TYPE_MATTERS is then only invoked for variable
size while it was meant to be used only for constant size.
Even with that fixed, the large computation in that block produces a result
that is completely ignored.

The following patch fixes all this and restores the GCC 5 behavior.
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

For -gdwarf-5 (and after 2 years maybe also for -gdwarf-4) we should use
DW_AT_data_bit_offset instead, which gdb just gained support for recently.
I'll work on that tomorrow.
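
As a rough illustration of how the three attributes relate, here is a simplified model.  It is not the dwarf2out.c code: the struct and function names are hypothetical, and it assumes the bit-field lives entirely inside one naturally aligned storage unit.

```cpp
// Hypothetical result record for one bit-field.
struct BitfieldAttrs
{
  long data_member_location;  // byte offset of the containing storage unit
  long bit_offset;            // DW_AT_bit_offset: counted from the unit's MSB
  long data_bit_offset;       // DW_AT_data_bit_offset: from start of the struct
};

// bitpos: bit position of the field from the start of the struct.
// bit_size: width of the field.  unit_bytes: size of the storage unit.
constexpr BitfieldAttrs
describe_bitfield (long bitpos, long bit_size, long unit_bytes,
                   bool big_endian)
{
  return BitfieldAttrs {
    (bitpos / (unit_bytes * 8)) * unit_bytes,
    big_endian
      ? bitpos % (unit_bytes * 8)
      : unit_bytes * 8 - (bitpos % (unit_bytes * 8) + bit_size),
    bitpos
  };
}

// struct { int a : 3; int b : 5; } on a little-endian target: b starts at
// bit 3 of a 4-byte unit, so its MSB is 32 - (3 + 5) = 24 bits from the top.
static_assert (describe_bitfield (3, 5, 4, false).bit_offset == 24, "");
```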

2017-01-11  Jakub Jelinek  

PR debug/78839
* dwarf2out.c (field_byte_offset): Restore the
PCC_BITFIELD_TYPE_MATTERS behavior for INTEGER_CST DECL_FIELD_OFFSET
and DECL_FIELD_BIT_OFFSET.  Use fold_build2 instead of build2 + fold.
(analyze_variants_discr, gen_variant_part): Use fold_build2 instead
of build2 + fold.

--- gcc/dwarf2out.c.jj  2017-01-11 17:45:47.0 +0100
+++ gcc/dwarf2out.c 2017-01-11 18:49:37.184605527 +0100
@@ -17980,10 +17980,6 @@ static dw_loc_descr_ref
 field_byte_offset (const_tree decl, struct vlr_context *ctx,
   HOST_WIDE_INT *cst_offset)
 {
-  offset_int object_offset_in_bits;
-  offset_int object_offset_in_bytes;
-  offset_int bitpos_int;
-  bool is_byte_offset_cst, is_bit_offset_cst;
   tree tree_result;
   dw_loc_list_ref loc_result;
 
@@ -17994,20 +17990,21 @@ field_byte_offset (const_tree decl, stru
   else
 gcc_assert (TREE_CODE (decl) == FIELD_DECL);
 
-  is_bit_offset_cst = TREE_CODE (DECL_FIELD_BIT_OFFSET (decl)) != INTEGER_CST;
-  is_byte_offset_cst = TREE_CODE (DECL_FIELD_OFFSET (decl)) != INTEGER_CST;
-
   /* We cannot handle variable bit offsets at the moment, so abort if it's the
  case.  */
-  if (is_bit_offset_cst)
+  if (TREE_CODE (DECL_FIELD_BIT_OFFSET (decl)) != INTEGER_CST)
 return NULL;
 
 #ifdef PCC_BITFIELD_TYPE_MATTERS
   /* We used to handle only constant offsets in all cases.  Now, we handle
  properly dynamic byte offsets only when PCC bitfield type doesn't
  matter.  */
-  if (PCC_BITFIELD_TYPE_MATTERS && is_byte_offset_cst && is_bit_offset_cst)
+  if (PCC_BITFIELD_TYPE_MATTERS
+  && TREE_CODE (DECL_FIELD_OFFSET (decl)) == INTEGER_CST)
 {
+  offset_int object_offset_in_bits;
+  offset_int object_offset_in_bytes;
+  offset_int bitpos_int;
   tree type;
   tree field_size_tree;
   offset_int deepest_bitpos;
@@ -18102,13 +18099,22 @@ field_byte_offset (const_tree decl, stru
  object_offset_in_bits
= round_up_to_align (object_offset_in_bits, decl_align_in_bits);
}
+
+  object_offset_in_bytes
+   = wi::lrshift (object_offset_in_bits, LOG2_BITS_PER_UNIT);
+  if (ctx->variant_part_offset == NULL_TREE)
+   {
+ *cst_offset = object_offset_in_bytes.to_shwi ();
+ return NULL;
+   }
+  tree_result = wide_int_to_tree (sizetype, object_offset_in_bytes);
 }
+  else
 #endif /* PCC_BITFIELD_TYPE_MATTERS */
-
-  tree_result = byte_position (decl);
+tree_result = byte_position (decl);
   if (ctx->variant_part_offset != NULL_TREE)
-tree_result = fold (build2 (PLUS_EXPR, TREE_TYPE (tree_result),
-   ctx->variant_part_offset, tree_result));
+tree_result = fold_build2 (PLUS_EXPR, TREE_TYPE (tree_result),
+  ctx->variant_part_offset, tree_result);
 
   /* If the byte offset is a constant, it's simplier to handle a native
  constant rather than a DWARF expression.  */
@@ -23727,14 +23733,12 @@ analyze_variants_discr (tree variant_par
 
  if (!lower_cst_included)
lower_cst
- = fold (build2 (PLUS_EXPR, TREE_TYPE (lower_cst),
- lower_cst,
- build_int_cst (TREE_TYPE (lower_cst), 1)));
+ = fold_build2 (PLUS_EXPR, TREE_TYPE (lower_cst), lower_cst,
+build_int_cst (TREE_TYPE (lower_cst), 1));
  if (!upper_cst_included)
upper_cst
- = fold (build2 (MINUS_EXPR, TREE_TYPE (upper_cst),
- upper_cst,
- 

[C++ PATCH] Overload pushing cleanup

2017-01-11 Thread Nathan Sidwell
In fixing 77812, a name lookup bug, I got confused by the overload 
creation code, which seemed a bit complex.  This simplifies it by 
clearly separating the optional wrapping of a single decl from the 
subsequent prepending of the new decl.


applied to trunk

nathan
--
Nathan Sidwell
2017-01-11  Nathan Sidwell  

	* name-lookup.c (push_overloaded_decl_1): Refactor OVERLOAD creation.

Index: cp/name-lookup.c
===
--- cp/name-lookup.c	(revision 244314)
+++ cp/name-lookup.c	(working copy)
@@ -2454,9 +2454,11 @@ push_overloaded_decl_1 (tree decl, int f
   || (flags & PUSH_USING))
 {
   if (old && TREE_CODE (old) != OVERLOAD)
-	new_binding = ovl_cons (decl, ovl_cons (old, NULL_TREE));
+	/* Wrap the existing single decl in an overload.  */
+	new_binding = ovl_cons (old, NULL_TREE);
   else
-	new_binding = ovl_cons (decl, old);
+	new_binding = old;
+  new_binding = ovl_cons (decl, new_binding);
   if (flags & PUSH_USING)
 	OVL_USED (new_binding) = 1;
 }
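
The two-step shape of the refactored code (optionally wrap the old single decl, then always prepend the new one) can be modelled outside the compiler.  This is only a sketch: Node, make_decl, cons and push_overloaded are hypothetical stand-ins for tree, ovl_cons and push_overloaded_decl_1, not GCC code.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Node;
using NodePtr = std::shared_ptr<Node>;

// Hypothetical model: a node is either a bare decl or an overload cell.
struct Node
{
  bool is_overload;
  std::string decl;   // name of the decl held by this node
  NodePtr chain;      // next overload cell, or null
};

NodePtr make_decl (const std::string &name)
{
  return std::make_shared<Node> (Node { false, name, nullptr });
}

// Stand-in for ovl_cons: put DECL in front of CHAIN.
NodePtr cons (const NodePtr &decl, const NodePtr &chain)
{
  assert (decl != nullptr);
  return std::make_shared<Node> (Node { true, decl->decl, chain });
}

NodePtr push_overloaded (const NodePtr &decl, const NodePtr &old)
{
  NodePtr binding = old;
  if (old && !old->is_overload)
    binding = cons (old, nullptr);   // step 1: wrap the single old decl
  return cons (decl, binding);       // step 2: prepend the new decl
}

// Number of cells in an overload chain.
int length (NodePtr n)
{
  int count = 0;
  for (; n; n = n->chain)
    ++count;
  return count;
}
```

Pushing onto an empty binding yields a chain of one, pushing onto a bare decl yields a chain of two, and pushing onto an existing chain reuses that chain as the tail.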


RE: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Koval, Julia
Hi, I rebased the patch onto the latest trunk and changed the specification 
to match ICC:
_enclu_u32 (const int __L, size_t *__D)  -->
_enclu_u32 (const int __L, size_t __D[])

The Changelogs remained the same:

gcc/
* common/config/i386/i386-common.c
   (OPTION_MASK_ISA_SGX_UNSET, OPTION_MASK_ISA_SGX_SET): New.
   (ix86_handle_option): Handle OPT_msgx.
* config.gcc: Added sgxintrin.h.
* config/i386/cpuid.h (bit_SGX): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
* config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
* config/i386/i386.c
   (ix86_target_string): Add -msgx.
   (PTA_SGX): New.
   (ix86_option_override_internal): Handle new options.
   (ix86_valid_target_attribute_inner_p): Add sgx.
* config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
* config/i386/i386.opt: Add msgx.
* config/i386/sgxintrin.h: New file.
* config/i386/x86intrin.h: Add sgxintrin.h.

gcc/testsuite/
* testsuite/gcc.target/i386/sgx.c: New test.

libgcc/
config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
config/i386/cpuinfo.h (FEATURE_SGX): New.

Thanks,
Julia
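
For anyone wanting to check the detection part by hand, here is a small sketch of the same CPUID query the driver performs.  It assumes GCC's (or a compatible) <cpuid.h>; the function name and the local bit constant are made up for the example, the latter so it also builds with headers that predate bit_SGX.

```cpp
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

// CPUID.(EAX=7,ECX=0):EBX bit 2 reports SGX support.
constexpr unsigned int kBitSgx = 1u << 2;

// Returns true if the host CPU advertises SGX; false otherwise,
// including on non-x86 hosts.
bool host_has_sgx ()
{
#if defined(__i386__) || defined(__x86_64__)
  unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
  // Leaf 7 must exist before it can be queried.
  if (__get_cpuid_max (0, nullptr) < 7)
    return false;
  __cpuid_count (7, 0, eax, ebx, ecx, edx);
  return (ebx & kBitSgx) != 0;
#else
  return false;
#endif
}
```

Note that this only reports what CPUID advertises; whether SGX is actually usable also depends on firmware and OS settings.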

-Original Message-
From: Koval, Julia 
Sent: Wednesday, January 11, 2017 1:08 PM
To: 'Uros Bizjak' 
Cc: Andrew Senkevich ; GCC Patches 
; vaalfr...@gmail.com; kirill.yuk...@gmail.com; Jakub 
Jelinek 
Subject: RE: [PATCH] Enable SGX intrinsics

Here it is.

gcc/
* common/config/i386/i386-common.c
   (OPTION_MASK_ISA_SGX_UNSET, OPTION_MASK_ISA_SGX_SET): New.
   (ix86_handle_option): Handle OPT_msgx.
* config.gcc: Added sgxintrin.h.
* config/i386/cpuid.h (bit_SGX): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
* config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
* config/i386/i386.c
   (ix86_target_string): Add -msgx.
   (PTA_SGX): New.
   (ix86_option_override_internal): Handle new options.
   (ix86_valid_target_attribute_inner_p): Add sgx.
* config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
* config/i386/i386.opt: Add msgx.
* config/i386/sgxintrin.h: New file.
* config/i386/x86intrin.h: Add sgxintrin.h.
* testsuite/gcc.target/i386/sgx.c: New test.

libgcc/
config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
config/i386/cpuinfo.h (FEATURE_SGX): New.

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Wednesday, January 11, 2017 1:02 PM
To: Koval, Julia 
Cc: Andrew Senkevich ; GCC Patches 
; vaalfr...@gmail.com; kirill.yuk...@gmail.com; Jakub 
Jelinek 
Subject: Re: [PATCH] Enable SGX intrinsics

On Wed, Jan 11, 2017 at 12:40 PM, Koval, Julia  wrote:
> OK, fixed it.  Could you please commit it for me, since I don't have commit 
> rights?

OK, but please send me updated ChangeLogs.

Uros.

> Thanks,
> Julia
>
> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, January 11, 2017 12:11 PM
> To: Koval, Julia 
> Cc: Andrew Senkevich ; GCC Patches 
> ; vaalfr...@gmail.com; kirill.yuk...@gmail.com; 
> Jakub Jelinek 
> Subject: Re: [PATCH] Enable SGX intrinsics
>
> On Wed, Jan 11, 2017 at 11:31 AM, Koval, Julia  wrote:
>> Ok. I fixed the enum formatting and the enums remain internal.
>
> @@ -7023,7 +7029,6 @@ ix86_can_inline_p (tree caller, tree callee)
>bool ret = false;
>tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> -
>/* If callee has no option attributes, then it is ok to inline.  */
>if (!callee_tree)
>  ret = true;
>
>
> No need for the above whitespace change.
>
> OK for mainline with the above part reverted.
>
> Thanks,
> Uros.


0001-SGX-intrinsiks.patch
Description: 0001-SGX-intrinsiks.patch


Re: [PATCH 2/2] IPA ICF: make algorithm stable to survive -fcompare-debug

2017-01-11 Thread Jakub Jelinek
On Wed, Jan 11, 2017 at 11:48:03AM +0100, Martin Liška wrote:
> gcc/testsuite/ChangeLog:
> 
> 2017-01-11  Martin Liska  
> 
>   * gcc.dg/tree-ssa/flatten-3.c: Add -fno-ipa-icf to dg-options.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
> index a1edb910e9d..153165c72e3 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options -O2 } */
> +/* { dg-options -O2 -fno-ipa-icf } */
>  
>  extern void do_something_usefull();
>  /* Check that we finish compiling even if instructed to

The test still fails, now with:
ERROR: gcc.dg/tree-ssa/flatten-3.c: syntax error in target selector 
"-fno-ipa-icf" for " dg-options 2 -O2 -fno-ipa-icf "

Fixed thusly, committed as obvious:

2017-01-11  Jakub Jelinek  

* gcc.dg/tree-ssa/flatten-3.c: Add quotation marks around dg-options
argument.

--- gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c   (revision 244331)
+++ gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c   (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 -fno-ipa-icf } */
+/* { dg-options "-O2 -fno-ipa-icf" } */
 
 extern void do_something_usefull();
 /* Check that we finish compiling even if instructed to


Jakub
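
For reference, dg-options takes the whole option string as one argument, optionally followed by a target selector, which is why the unquoted second word above was parsed as a selector.  A minimal hypothetical test file showing the quoted form (the function is a placeholder, not taken from flatten-3.c):

```cpp
/* { dg-do compile } */
/* Quoted: both flags form the single options argument.  */
/* { dg-options "-O2 -fno-ipa-icf" } */

/* Placeholder body; any translation unit would do.  */
constexpr int twice (int x) { return 2 * x; }
```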


RE: Make MIPS soft-fp preserve NaN payloads for NAN2008

2017-01-11 Thread Maciej W. Rozycki
On Thu, 5 Jan 2017, Matthew Fortune wrote:

> > >  AFAIR we deliberately decided not to define a 2008-NaN soft-float
> > > ABI, and chose to require all soft-float binaries to use the legacy
> > encoding.
> > 
> > Soft-float and 2008-NaN are naturally completely orthogonal and the
> > combination works fine (of course, it doesn't need any special kernel or
> > hardware support).  There's no need to disallow the combination.

 True, however unlike with hard-float we are not constrained by hardware 
here and IMO the advantage of using 2008-NaN semantics (i.e. payload 
preservation) does not outweigh the drawback of having two incompatible 
ABIs and the issue of interlinking between them -- a solution for which, 
as has been shown, is controversial enough that it still has not been 
agreed upon.  Why not avoid it altogether then, as originally chosen, 
and stick to legacy NaN for all soft-float?

> > In any case, the soft-fp change is relevant in the hard-float case as
> > well, to make software TFmode behave consistently with hardware SFmode
> > and DFmode regarding NaN payload preservation.

 Is mixing TFmode, DFmode and SFmode operations, with the latter two 
handled in hardware and the former deferred to soft-fp, a supported 
configuration?  Do we have any MIPS ABI which provides for using all these 
data types?  AFAIK all MIPS/Linux ABIs use DFmode for `long double' and 
IRIX support (which did have a TFmode data type, though used an unusual 
format and didn't support 2008 NaN anyway) has been dropped.  Any bare 
metal targets?  I dare say none of the MIPS/*BSD targets supports 2008 
NaN; no idea offhand about their TFmode support.

> It is true to say that users are discouraged from using 2008-NaN with
> soft-float for pre-R6 architectures simply to avoid further fragmentation
> of software for no real gain. However, for R6 then soft-float is 2008-NaN
> otherwise we are stuck with legacy-NaN forever.

 What's the actual issue you have with legacy NaN, and how does soft-float 
relate to R6?  It's not like hardware, R6 or otherwise, limits soft-float 
in any way.

> If someone did want to build a system from source with soft-float as
> 2008-NaN then I see no reason to stop them but I doubt they would and I
> don't expect the --with-nan GCC configure option to be used in conjunction
> with --with-float=soft for the same reason. The most likely use of
> --with-nan is to build a distribution specifically to target an MSA capable
> system like P5600 or perhaps an M5150 with an FPU. The NaN interlinking
> work will make these use-cases less important still though I think.

 You can have GCC configured with `--with-nan=2008' and equipped with a 
soft-float multilib.  IMHO you ought to be able to just use `-msoft-float' 
then to select the soft-float multilib and have it implicitly use the 
legacy NaN encoding rather than having to pass `-msoft-float -mnan=legacy' 
to get the intended semantics.

 There shouldn't be a need for NaN interlinking for soft-float objects, 
that's just unnecessary burden IMO.

 MSA is irrelevant for soft-float operations, we don't have a soft-float 
MSA ABI.  If we ever define one, then we could well choose the 2008-NaN 
encoding for compatibility with hard-float code; there's no issue with 
backwards compatibility here as no legacy-NaN MSA hardware has ever been 
allowed.

 Have I missed anything?

  Maciej
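
For readers who do not have the two encodings in their head: they differ in the meaning of the most significant fraction bit of a NaN.  A small sketch for binary32 follows; the helper names are made up, and it works on raw bit patterns so it does not depend on the host's own NaN convention.

```cpp
#include <cstdint>

constexpr std::uint32_t kExpMask  = 0x7f800000u;  // all-ones exponent
constexpr std::uint32_t kFracMask = 0x007fffffu;  // fraction field
constexpr std::uint32_t kFracMsb  = 0x00400000u;  // top fraction bit

// NaN: exponent all ones and a non-zero fraction.
constexpr bool is_nan (std::uint32_t bits)
{
  return (bits & kExpMask) == kExpMask && (bits & kFracMask) != 0;
}

// IEEE 754-2008 encoding: top fraction bit set means quiet.
constexpr bool is_quiet_2008 (std::uint32_t bits)
{
  return is_nan (bits) && (bits & kFracMsb) != 0;
}

// Legacy MIPS encoding: top fraction bit clear means quiet.
constexpr bool is_quiet_legacy (std::uint32_t bits)
{
  return is_nan (bits) && (bits & kFracMsb) == 0;
}

// The same pattern is quiet under one convention, signalling under the other.
static_assert (is_quiet_2008 (0x7fc00000u) && !is_quiet_legacy (0x7fc00000u), "");
static_assert (is_quiet_legacy (0x7fbfffffu) && !is_quiet_2008 (0x7fbfffffu), "");
```

This is why objects built for the two encodings cannot simply be mixed: a quiet NaN produced by one side is a signalling NaN to the other.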


[PATCH] Avoid in gcc.target/powerpc/ -m32 or -m64 in dg-options (PR target/77416)

2017-01-11 Thread Jakub Jelinek
Hi!

The pr77416.c test fails in some configurations, where the -m32 option
is not supported.  The following patch fixes that by guarding the test
with ilp32 effective target and removing the -m32 and adjusts a couple of
tests that already have ilp32 or lp64 guard, but have useless -m32 or -m64
in dg-options in addition to that.

Bootstrapped/regtested on powerpc64le-linux, ok for trunk?

2017-01-11  Jakub Jelinek  

PR target/77416
* gcc.target/powerpc/pr77416.c: Guard the test only for ilp32 effective
target.  Use powerpc* instead of powerpc64* in targets.  Remove -m32
from dg-options.
* gcc.target/powerpc/pr64205.c: Remove -m32 from dg-options of ilp32
guarded test.
* gcc.target/powerpc/fusion4.c: Likewise.
* gcc.target/powerpc/pr63491.c: Remove -m64 from dg-options of lp64
guarded test.
* gcc.target/powerpc/pr58673-1.c: Likewise.
* gcc.target/powerpc/pr58673-2.c: Likewise.
* gcc.target/powerpc/pr59054.c: Likewise.

--- gcc/testsuite/gcc.target/powerpc/pr77416.c.jj   2016-09-20 
10:25:59.0 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr77416.c  2017-01-11 14:18:32.565621295 
+0100
@@ -1,7 +1,7 @@
-/* { dg-do compile { target { powerpc64*-*-*} } } */
-/* { dg-skip-if "" { powerpc64-*-aix* } { "*" } { "" } } */
-/* { dg-skip-if "do not override -mcpu" { powerpc64*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -O2 -m32" } */
+/* { dg-do compile { target { { powerpc*-*-* } && ilp32 } } } */
+/* { dg-skip-if "" { powerpc*-*-aix* } { "*" } { "" } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
+/* { dg-options "-mcpu=power7 -O2" } */
 /* { dg-final { scan-assembler-times "addze" 1 } } */
 
 extern int fn2 ();
--- gcc/testsuite/gcc.target/powerpc/pr64205.c.jj   2015-02-16 
11:19:03.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr64205.c  2017-01-11 14:20:22.696200110 
+0100
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* && ilp32 } } } */
 /* { dg-skip-if "" { powerpc*-*-aix* } { "*" } { "" } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=G5" } } */
-/* { dg-options "-O2 -mcpu=G5 -maltivec -m32" } */
+/* { dg-options "-O2 -mcpu=G5 -maltivec" } */
 
 union ieee754r_Decimal32
 {
--- gcc/testsuite/gcc.target/powerpc/fusion4.c.jj   2016-11-18 
20:04:20.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/fusion4.c  2017-01-11 14:19:59.929493904 
+0100
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -mtune=power9 -O3 -msoft-float -m32" } */
+/* { dg-options "-mcpu=power7 -mtune=power9 -O3 -msoft-float" } */
 
 #define LARGE 0x12345
 
--- gcc/testsuite/gcc.target/powerpc/pr63491.c.jj   2015-03-20 
12:34:48.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr63491.c  2017-01-11 14:21:59.880953359 
+0100
@@ -1,5 +1,5 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
-/* { dg-options "-O1 -m64 -mcpu=power8 -mlra" } */
+/* { dg-options "-O1 -mcpu=power8 -mlra" } */
 
 typedef __int128_t __attribute__((__vector_size__(16))) vector_128_t;
 typedef unsigned long long scalar_64_t;
--- gcc/testsuite/gcc.target/powerpc/pr58673-1.c.jj 2014-11-11 
00:05:43.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr58673-1.c2017-01-11 
14:21:07.224627429 +0100
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -m64 -O1" } */
+/* { dg-options "-mcpu=power8 -O1" } */
 
 enum typecode
 {
--- gcc/testsuite/gcc.target/powerpc/pr58673-2.c.jj 2014-11-11 
00:05:43.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr58673-2.c2017-01-11 
14:21:22.336433978 +0100
@@ -2,7 +2,7 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_p8vector_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power8" } } */
-/* { dg-options "-mcpu=power8 -O3 -m64 -funroll-loops" } */
+/* { dg-options "-mcpu=power8 -O3 -funroll-loops" } */
 
 #include 
 #include 
--- gcc/testsuite/gcc.target/powerpc/pr59054.c.jj   2014-11-11 
00:05:43.0 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr59054.c  2017-01-11 14:21:39.076219687 
+0100
@@ -2,6 +2,6 @@
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
"-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -O0 -m64" } */
+/* { dg-options 

Re: [patch,avr] PR78883: Implement a dummy scheduler

2017-01-11 Thread Georg-Johann Lay

Segher Boessenkool wrote:

On Wed, Jan 04, 2017 at 12:29:49PM -0700, Jeff Law wrote:

We should split off a new "SUBREGS_OF_MEM_ALLOWED" from !INSN_SCHEDULING,
and then probably even default it to false.
That would work for me :-)  The question in my mind would be unexpected 
fallout at this point in the release process.  Maybe default it to 
!INSN_SCHEDULING to minimize such fallout now, then to false for gcc-8?


Do we want to change anything right now?  AVR has a workaround for now.


None of which got approval...



Segher





Go patch committed: add Bfunction to conditional_expression backend method

2017-01-11 Thread Ian Lance Taylor
This patch by Than McIntosh modifies the conditional_expression method
in the Backend interface to take a Bfunction.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

2017-01-11  Than McIntosh  

* go-gcc.cc (conditional_expression): Add Bfunction parameter.
Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc(revision 244166)
+++ gcc/go/go-gcc.cc(working copy)
@@ -325,8 +325,8 @@ class Gcc_backend : public Backend
   compound_expression(Bstatement*, Bexpression*, Location);
 
   Bexpression*
-  conditional_expression(Btype*, Bexpression*, Bexpression*, Bexpression*,
- Location);
+  conditional_expression(Bfunction*, Btype*, Bexpression*, Bexpression*,
+ Bexpression*, Location);
 
   Bexpression*
   unary_expression(Operator, Bexpression*, Location);
@@ -1546,7 +1546,8 @@ Gcc_backend::compound_expression(Bstatem
 // ELSE_EXPR otherwise.
 
 Bexpression*
-Gcc_backend::conditional_expression(Btype* btype, Bexpression* condition,
+Gcc_backend::conditional_expression(Bfunction*, Btype* btype,
+Bexpression* condition,
 Bexpression* then_expr,
 Bexpression* else_expr, Location location)
 {
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 244327)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-6be46149636c3533389e62c6dc76f0a7ff461080
+153f7b68c0c4d3cf3da0becf82eb1a3eb8b47d6e
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/backend.h
===
--- gcc/go/gofrontend/backend.h (revision 244166)
+++ gcc/go/gofrontend/backend.h (working copy)
@@ -324,12 +324,12 @@ class Backend
   compound_expression(Bstatement* bstat, Bexpression* bexpr, Location) = 0;
 
   // Return an expression that executes THEN_EXPR if CONDITION is true, or
-  // ELSE_EXPR otherwise and returns the result as type BTYPE.  ELSE_EXPR
-  // may be NULL.  BTYPE may be NULL.
+  // ELSE_EXPR otherwise and returns the result as type BTYPE, within the
+  // specified function FUNCTION.  ELSE_EXPR may be NULL.  BTYPE may be NULL.
   virtual Bexpression*
-  conditional_expression(Btype* btype, Bexpression* condition,
- Bexpression* then_expr, Bexpression* else_expr,
- Location) = 0;
+  conditional_expression(Bfunction* function, Btype* btype,
+ Bexpression* condition, Bexpression* then_expr,
+ Bexpression* else_expr, Location) = 0;
 
   // Return an expression for the unary operation OP EXPR.
   // Supported values of OP are (from operators.h):
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 244327)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -4390,7 +4390,9 @@ Unary_expression::do_get_backend(Transla
 Bexpression* crash =
  gogo->runtime_error(RUNTIME_ERROR_NIL_DEREFERENCE,
  loc)->get_backend(context);
-bexpr = gogo->backend()->conditional_expression(btype, compare,
+Bfunction* bfn = context->function()->func_value()->get_decl();
+bexpr = gogo->backend()->conditional_expression(bfn, btype,
+compare,
 crash, bexpr,
 loc);
 
@@ -5992,6 +5994,7 @@ Binary_expression::do_get_backend(Transl
   Bexpression* zero_expr =
   gogo->backend()->integer_constant_expression(left_btype, zero);
   overflow = zero_expr;
+  Bfunction* bfn = context->function()->func_value()->get_decl();
   if (this->op_ == OPERATOR_RSHIFT
  && !left_type->integer_type()->is_unsigned())
{
@@ -6000,11 +6003,12 @@ Binary_expression::do_get_backend(Transl
  zero_expr, loc);
   Bexpression* neg_one_expr =
   gogo->backend()->integer_constant_expression(left_btype, 
neg_one);
-  overflow = gogo->backend()->conditional_expression(btype, neg_expr,
+  overflow = gogo->backend()->conditional_expression(bfn,
+ btype, neg_expr,
  neg_one_expr,
  zero_expr, loc);
}
-  ret = gogo->backend()->conditional_expression(btype, compare, ret,
+  ret 

Re: [RFC] combine: Handle zero_extend without subreg in change_zero_ext.

2017-01-11 Thread Segher Boessenkool
On Thu, Jan 05, 2017 at 05:46:51PM +0100, Dominik Vogt wrote:
> The attached patch deals with another type of zero_extend that is
> not yet handled in change_zero_ext, i.e. (zero_extend
> (pseudoreg)), without a "subreg" in between.  What do you think?
> (Mostly untested yet.)

My main question is: where is this useful?  Can you show some example
please?


Segher


Re: [C++ PATCH] Fix ICE on invalid (PR c++/78341)

2017-01-11 Thread Jason Merrill
OK.

On Tue, Jan 10, 2017 at 5:35 PM, Jakub Jelinek  wrote:
> Hi!
>
> As mentioned in the PR, cp_parser_parse_definitely may fail even when
> alignas_expr actually is meaningful, e.g. when the error is due to the
> missing closing paren.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-01-10  Jakub Jelinek  
>
> PR c++/78341
> * parser.c (cp_parser_std_attribute_spec): Remove over-eager
> assertion.  Formatting fix.
>
> * g++.dg/cpp0x/pr78341.C: New test.
>
> --- gcc/cp/parser.c.jj  2017-01-10 08:12:46.0 +0100
> +++ gcc/cp/parser.c 2017-01-10 11:09:04.217456830 +0100
> @@ -24931,11 +24931,7 @@ cp_parser_std_attribute_spec (cp_parser
>
>if (!cp_parser_parse_definitely (parser))
> {
> - gcc_assert (alignas_expr == error_mark_node
> - || alignas_expr == NULL_TREE);
> -
> - alignas_expr =
> -   cp_parser_assignment_expression (parser);
> + alignas_expr = cp_parser_assignment_expression (parser);
>   if (alignas_expr == error_mark_node)
> cp_parser_skip_to_end_of_statement (parser);
>   if (alignas_expr == NULL_TREE
> --- gcc/testsuite/g++.dg/cpp0x/pr78341.C.jj 2017-01-10 11:11:10.368843803 
> +0100
> +++ gcc/testsuite/g++.dg/cpp0x/pr78341.C2017-01-10 11:10:52.0 
> +0100
> @@ -0,0 +1,4 @@
> +// PR c++/78341
> +// { dg-do compile { target c++11 } }
> +
> +alignas (alignas double // { dg-error "" }
>
> Jakub


Re: [PATCH] Avoid generating code when writing PCH (PR c++/72813)

2017-01-11 Thread Jason Merrill
OK.

On Tue, Jan 10, 2017 at 5:33 PM, Jakub Jelinek  wrote:
> Hi!
>
> The comments in both the C and C++ FEs say that after writing PCH file
> when --output-pch= is used, we don't want to do anything else and the
> routines return to the caller early, especially for C++ FE skipping lots of
> needed handling for code generation.  But, nothing is signalled to the
> callers, so we actually continue with the full optimization pipeline and
> generate assembly.  Because some important parts have been skipped, we
> can generate errors though.
>
> Normally, the *.s file is thrown away and not used further, only when
> -S or -save-temps is used, it is preserved.
>
> One option would be (patch in the PR) not to skip anything and continue
> writing, but as Richard mentioned on IRC, emitting assembly for the header
> makes really no sense.  So this patch just does what the comments say:
> by setting flag_syntax_only after writing the PCH file, it tells callers
> not to perform cgraph finalization.
>
> In addition to that, the patch also extends the r237955 fix to -S
> -save-temps, so that we don't error out on that.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-01-10  Jakub Jelinek  
>
> PR c++/72813
> * gcc.c (default_compilers): Don't add -o %g.s for -S -save-temps
> of c-header.
>
> * c-decl.c (pop_file_scope): Set flag_syntax_only to 1 after writing
> PCH file.
>
> * decl2.c (c_parse_final_cleanups): Set flag_syntax_only to 1 after
> writing PCH file.
>
> --- gcc/gcc.c.jj2017-01-09 17:22:24.0 +0100
> +++ gcc/gcc.c   2017-01-10 10:42:52.893462294 +0100
> @@ -1328,7 +1328,7 @@ static const struct compiler default_com
> %(cpp_options) -o %{save-temps*:%b.i} %{!save-temps*:%g.i} \n\
> cc1 -fpreprocessed %{save-temps*:%b.i} 
> %{!save-temps*:%g.i} \
> %(cc1_options)\
> -   %{!fsyntax-only:-o %g.s \
> +   %{!fsyntax-only:%{!S:-o %g.s} \
> %{!fdump-ada-spec*:%{!o*:--output-pch=%i.gch}\
>%W{o*:--output-pch=%*}}%V}}\
>   %{!save-temps*:%{!traditional-cpp:%{!no-integrated-cpp:\
> --- gcc/c/c-decl.c.jj   2017-01-01 12:45:46.0 +0100
> +++ gcc/c/c-decl.c  2017-01-10 10:42:07.387043153 +0100
> @@ -1420,6 +1420,8 @@ pop_file_scope (void)
>if (pch_file)
>  {
>c_common_write_pch ();
> +  /* Ensure even the callers don't try to finalize the CU.  */
> +  flag_syntax_only = 1;
>return;
>  }
>
> --- gcc/cp/decl2.c.jj   2017-01-08 17:41:18.0 +0100
> +++ gcc/cp/decl2.c  2017-01-10 10:41:47.539296496 +0100
> @@ -4461,6 +4461,8 @@ c_parse_final_cleanups (void)
>   DECL_ASSEMBLER_NAME (node->decl);
>c_common_write_pch ();
>dump_tu ();
> +  /* Ensure even the callers don't try to finalize the CU.  */
> +  flag_syntax_only = 1;
>return;
>  }
>
>
> Jakub


Re: Go patch committed: drop size arguments to hash/equal functions

2017-01-11 Thread Ian Lance Taylor
On Tue, Jan 10, 2017 at 11:45 AM, Rainer Orth
 wrote:
>
>> Drop the size arguments for the hash/equal functions stored in type
>> descriptors.  Types know what size they are.  To make this work,
>> generate hash/equal functions for types that can use an identity
>> comparison but are not a standard size and alignment.
>>
>> Drop the multiplications by 33 in the generated hash code and the
>> reflect package hash code.  They are not necessary since we started
>> passing a seed value around, as the seed includes the hash of the
>> earlier values.
>>
>> Copy the algorithms for standard types from the Go 1.7 runtime,
>> replacing the C functions.
>>
>> Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
>> to mainline.
>
> this patch broke Solaris/SPARC bootstrap, it seems: building
> debug/dwarf.lo ICEs like this:
>
> go1: internal compiler error: in write_specific_type_functions, at 
> go/gofrontend/types.cc:1993
> 0x4bc50b Type::write_specific_type_functions(Gogo*, Named_type*, long long, 
> std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Function_type*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, Function_type*)
> /vol/gcc/src/hg/trunk/local/gcc/go/gofrontend/types.cc:1993

Thanks.  Looks like 32-bit SPARC requires a padding field in the
bucket struct created for some map type.  The compiler then tries to
generate type functions for the type descriptor, and fails due to a
phase ordering problem--at that point all type functions are expected
to have been created.  It worked previously because it used the
standard functions for types that can use an identity comparison, but
that is no longer the case for types with unusual sizes.  Fixed by
this patch, which marks compiler-created arrays and structs
non-comparable, meaning that no type functions are created.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 244291)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-d3725d876496f2cca3d6ce538e98b58c85d90bfb
+6be46149636c3533389e62c6dc76f0a7ff461080
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 244256)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -6741,8 +6741,9 @@ Bound_method_expression::create_thunk(Go
   sfl->push_back(Struct_field(Typed_identifier("val.1",
   orig_fntype->receiver()->type(),
   loc)));
-  Type* closure_type = Type::make_struct_type(sfl, loc);
-  closure_type = Type::make_pointer_type(closure_type);
+  Struct_type* st = Type::make_struct_type(sfl, loc);
+  st->set_is_struct_incomparable();
+  Type* closure_type = Type::make_pointer_type(st);
 
   Function_type* new_fntype = orig_fntype->copy_with_names();
 
@@ -6896,6 +6897,7 @@ Bound_method_expression::do_flatten(Gogo
  loc)));
   fields->push_back(Struct_field(Typed_identifier("val.1", val->type(), loc)));
   Struct_type* st = Type::make_struct_type(fields, loc);
+  st->set_is_struct_incomparable();
 
   Expression_list* vals = new Expression_list();
   vals->push_back(Expression::make_func_code_reference(thunk, loc));
@@ -9683,6 +9685,7 @@ Call_expression::do_flatten(Gogo* gogo,
 }
 
   Struct_type* st = Type::make_struct_type(sfl, loc);
+  st->set_is_struct_incomparable();
   this->call_temp_ = Statement::make_temporary(st, NULL, loc);
   inserter->insert(this->call_temp_);
 }
@@ -11565,7 +11568,8 @@ Field_reference_expression::do_lower(Gog
   Expression* length_expr = Expression::make_integer_ul(s.length(), NULL, loc);
 
   Type* byte_type = gogo->lookup_global("byte")->type_value();
-  Type* array_type = Type::make_array_type(byte_type, length_expr);
+  Array_type* array_type = Type::make_array_type(byte_type, length_expr);
+  array_type->set_is_array_incomparable();
 
   Expression_list* bytes = new Expression_list();
   for (std::string::const_iterator p = s.begin(); p != s.end(); p++)
@@ -11843,8 +11847,9 @@ Interface_field_reference_expression::cr
   Type* vt = Type::make_pointer_type(Type::make_void_type());
   sfl->push_back(Struct_field(Typed_identifier("fn.0", vt, loc)));
   sfl->push_back(Struct_field(Typed_identifier("val.1", type, loc)));
-  Type* closure_type = Type::make_struct_type(sfl, loc);
-  closure_type = Type::make_pointer_type(closure_type);
+  Struct_type* st = Type::make_struct_type(sfl, loc);
+  st->set_is_struct_incomparable();
+  Type* closure_type = Type::make_pointer_type(st);
 
   Function_type* 
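
A note on the quoted commit message above: it says the multiplications by 33 became unnecessary once a seed value is passed around, because the seed already carries the hash of the earlier values. The following is only a rough C sketch of that idea, not the gofrontend or Go runtime code. The names mem_hash, hash_pair_mul33 and hash_pair_seeded, the struct, and the FNV-1a style mixing are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a style byte hash that starts from a caller-supplied seed.
   Stand-in only; not the algorithm used by the Go runtime.  */
static uintptr_t
mem_hash (const void *p, size_t n, uintptr_t seed)
{
  const unsigned char *bytes = (const unsigned char *) p;
  uintptr_t h = seed ^ (uintptr_t) 2166136261u;
  for (size_t i = 0; i < n; i++)
    {
      h ^= bytes[i];
      h *= (uintptr_t) 16777619u;
    }
  return h;
}

/* Hypothetical two-field type used only for this illustration.  */
struct pair
{
  int32_t a;
  int64_t b;
};

/* Assumed older style: hash each field on its own and combine the
   partial results with a multiplication by 33.  */
static uintptr_t
hash_pair_mul33 (const struct pair *p)
{
  uintptr_t h = mem_hash (&p->a, sizeof p->a, 0);
  return h * 33 + mem_hash (&p->b, sizeof p->b, 0);
}

/* Seeded style: the hash so far is the seed for the next field, so
   no separate combining step is needed.  */
static uintptr_t
hash_pair_seeded (const struct pair *p, uintptr_t seed)
{
  uintptr_t h = mem_hash (&p->a, sizeof p->a, seed);
  return mem_hash (&p->b, sizeof p->b, h);
}
```

In the seeded variant the order of the fields still influences the result, since the second field is hashed starting from a state that depends on the first.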

Re: [Patch ,gcc/MIPS] add an build-time/runtime option to disable madd.fmt

2017-01-11 Thread Maciej W. Rozycki
On Fri, 23 Dec 2016, Yunqiang Su wrote:

> > 3, kernel: the emulation when a float exception is taken.
> 
> The big problem is that Loongson uses the same encoding for (unfused) 
> madd.fmt and (fused) madd.fmt. We cannot trap this in the kernel.

 Why is that a problem?  Just add a setting like `cpu_has_fused_madd' to 
, switch processing in the emulator accordingly and 
initialise the setting early on in bootstrap as processor features are 
determined.  We'd have to do it likewise for the R8000 MIPS IV CPU Richard 
referred to previously if we had full support for this processor (unlikely 
to happen).

 Cc-ing  in case you want to discuss it further 
with the kernel developers; arguably it's a kernel bug already that the 
FPU as perceived by user software has different semantics on FPU Loongson 
hardware depending on whether or not the `nofpu' kernel parameter has been 
set (I don't know if denormals are handled by Loongson FPU hardware or 
trap with an Unimplemented Operation exception, but if the former, then 
this inconsistency applies for such data in the regular FPU mode as well).

 HTH,

  Maciej


[PATCH, bugfix] builtin expansion of strcmp for rs6000

2017-01-11 Thread Aaron Sawdey
This expands on the previous patch. For strcmp and for strncmp with N
larger than 64, the first 64 bytes of comparison are expanded inline and
then a call to strcmp or strncmp is emitted to compare the remainder if
the strings are found to be equal at that point. 

-mstring-compare-inline-limit=N determines how many block comparisons
are emitted. With the default 8, and 64-bit code, you get 64 bytes. 

Performance testing on a power8 system shows that the code is anywhere
from 2-8 times faster than RHEL7.2 glibc strcmp/strncmp depending on
alignment and length.

In the process of adding this I discovered that the expansion being
generated for strncmp had a bug in that it was not testing for a zero
byte to terminate the comparison. As a result inputs like
strncmp("AB\0CDEFGX", "AB\0CDEFGY", 9) would compare as not equal. The
initial comparison of a doubleword would be equal so a second one would
be fetched and compared, ignoring the zero byte that should have
terminated comparison. The fix is to use a cmpb to check for zero bytes
in the equality case before comparing the next chunk. I updated the
strncmp-1.c test case to check for this, and also added a new 
strcmp-1.c test case to check strcmp expansion. Both now also have
length-100 tests to check the transition from the inline comparison to
the library call for the remainder.
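
For readers who want to see the zero-byte issue in isolation, here is a minimal portable C sketch. It is not the rs6000 expansion: chunked_strncmp and has_zero_byte are hypothetical names, the bit trick only stands in for what cmpb is used for, and the sketch assumes both buffers have at least n readable bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Nonzero if any byte of V is zero.  Portable stand-in for the
   zero-byte test that the patch performs with cmpb.  */
static int
has_zero_byte (uint64_t v)
{
  return ((v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL) != 0;
}

/* Hypothetical helper: compare 8 bytes at a time, but stop chunking
   as soon as a chunk differs or contains a terminating zero byte.
   Assumes at least N readable bytes in both A and B.  */
static int
chunked_strncmp (const char *a, const char *b, size_t n)
{
  size_t i = 0;
  while (n - i >= 8)
    {
      uint64_t wa, wb;
      memcpy (&wa, a + i, 8);
      memcpy (&wb, b + i, 8);
      /* Without the has_zero_byte test, equal chunks that contain a
	 terminator would wrongly let the comparison continue.  */
      if (wa != wb || has_zero_byte (wa))
	break;
      i += 8;
    }
  /* Resolve the current chunk (or the tail) byte by byte.  */
  for (; i < n; i++)
    {
      unsigned char ca = (unsigned char) a[i];
      unsigned char cb = (unsigned char) b[i];
      if (ca != cb)
	return ca < cb ? -1 : 1;
      if (ca == 0)
	return 0;
    }
  return 0;
}
```

With the example from the text, the first doubleword is equal in both inputs but contains the terminator, so the trailing X/Y difference is never looked at and the result is 0.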

ChangeLog
2017-01-11  Aaron Sawdey  
* config/rs6000/rs6000-protos.h (expand_strn_compare): Add arg.
* config/rs6000/rs6000.c (expand_strn_compare): Add ability to expand
strcmp. Fix bug where comparison didn't stop with zero byte.
* config/rs6000/rs6000.md (cmpstrnsi): Args to expand_strn_compare.
(cmpstrsi): Add pattern.

gcc.dg/ChangeLog
2017-01-11  Aaron Sawdey  
* gcc.dg/strcmp-1.c: New.
* gcc.dg/strncmp-1.c: Add test for a bug that escaped.




-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h	(revision 244322)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -78,7 +78,7 @@
 extern int expand_block_clear (rtx[]);
 extern int expand_block_move (rtx[]);
 extern bool expand_block_compare (rtx[]);
-extern bool expand_strn_compare (rtx[]);
+extern bool expand_strn_compare (rtx[], int);
 extern const char * rs6000_output_load_multiple (rtx[]);
 extern bool rs6000_is_valid_mask (rtx, int *, int *, machine_mode);
 extern bool rs6000_is_valid_and_mask (rtx, machine_mode);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c	(revision 244322)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -19635,22 +19635,36 @@
OPERANDS[0] is the target (result).
OPERANDS[1] is the first source.
OPERANDS[2] is the second source.
+   If NO_LENGTH is zero, then:
OPERANDS[3] is the length.
-   OPERANDS[4] is the alignment in bytes.  */
+   OPERANDS[4] is the alignment in bytes.
+   If NO_LENGTH is nonzero, then:
+   OPERANDS[3] is the alignment in bytes.  */
 bool
-expand_strn_compare (rtx operands[])
+expand_strn_compare (rtx operands[], int no_length)
 {
   rtx target = operands[0];
   rtx orig_src1 = operands[1];
   rtx orig_src2 = operands[2];
-  rtx bytes_rtx = operands[3];
-  rtx align_rtx = operands[4];
+  rtx bytes_rtx, align_rtx;
+  if (no_length)
+{
+  bytes_rtx = NULL;
+  align_rtx = operands[3];
+}
+  else
+{
+  bytes_rtx = operands[3];
+  align_rtx = operands[4];
+}
   HOST_WIDE_INT cmp_bytes = 0;
   rtx src1 = orig_src1;
   rtx src2 = orig_src2;
 
-  /* If this is not a fixed size compare, just call strncmp.  */
-  if (!CONST_INT_P (bytes_rtx))
+  /* If we have a length, it must be constant. This simplifies things
+ a bit as we don't have to generate code to check if we've exceeded
+ the length. Later this could be expanded to handle this case.  */
+  if (!no_length && !CONST_INT_P (bytes_rtx))
 return false;
 
   /* This must be a fixed size alignment.  */
@@ -19668,8 +19682,6 @@
 
   gcc_assert (GET_MODE (target) == SImode);
 
-  HOST_WIDE_INT bytes = INTVAL (bytes_rtx);
-
   /* If we have an LE target without ldbrx and word_mode is DImode,
  then we must avoid using word_mode.  */
   int word_mode_ok = !(!BYTES_BIG_ENDIAN && !TARGET_LDBRX
@@ -19678,16 +19690,37 @@
   int word_mode_size = GET_MODE_SIZE (word_mode);
 
   int offset = 0;
+  HOST_WIDE_INT bytes; /* N from the strncmp args if available.  */
+  HOST_WIDE_INT compare_length; /* How much we are going to compare inline.  */
+  if (no_length)
+/* Use this as a standin to determine the mode to use.  */
+bytes = rs6000_string_compare_inline_limit * word_mode_size;
+  else
+bytes = INTVAL (bytes_rtx);
+
   machine_mode 

Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2017-01-11 Thread Maciej W. Rozycki
On Wed, 11 Jan 2017, James Cowgill wrote:

> >  From this consideration I gather you have a program source which can be 
> > used as a test case to reproduce the issue, so can you please file a 
> > problem report and include the source and a recipe to reproduce it?  Is 
> > this a GCC issue with generated assembly or a GAS issue with interpreting
> > it?
> 
> Yes I had a testcase which I used to debug this, but it was pretty huge
> (and written in ADA). When debugging I just diffed it with the debug
> information toggled to see if the patch fixed it. I'll see if I can find
> it anyway.

 A test case being huge is not an issue, we can always try to reduce it if 
needed.  What is key is reproducibility.  Please make sure it's 
self-contained, i.e. doesn't depend on system-installed components (for 
C/C++ I'd ask for a preprocessed source, but I'm not sure offhand what the 
requirements are for ADA -- it's been a while since I did anything about 
ADA, and even then not a huge lot -- just a GCC 3.4 MIPS port).

> The issue was about the assembly GCC generated.

 A GCC bug then.

  Maciej


Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2017-01-11 Thread Bernd Edlinger
On 01/11/17 17:55, Richard Earnshaw (lists) wrote:
>
> Sorry for the delay getting around to this.
>
> I just tried this patch and found that it doesn't apply.  Furthermore,
> there's not enough context in the rejected hunks for me to be certain
> which patterns you're trying to fix up.
>
> Could you do an update please?
>

Sure, I just gave up pinging, as we are rather late in stage 3
already...

So the current status is this:

I have the invalid code issue here; it is independent of the
optimization issues:

[PATCH, ARM] correctly encode the CC reg data flow
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01562.html

Then I have the patch for splitting the most important
64bit patterns here:

[PATCH, ARM] Further improve stack usage on sha512 (PR 77308)
https://gcc.gnu.org/ml/gcc-patches/2016-11/msg02796.html

and the follow-up patch that triggered the invalid code here:

[PATCH, ARM] Further improve stack usage in sha512, part 2 (PR 77308)
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01563.html

In the last part I had initially this hunk,
-operands[2] = gen_lowpart (SImode, operands[2]);
+if (can_create_pseudo_p ())
+  operands[2] = gen_reg_rtx (SImode);
+else
+  operands[2] = gen_lowpart (SImode, operands[2]);

As Wilco pointed out that the else part is superfluous,
I already removed the gen_lowpart stuff locally.

All three parts should apply to trunk, only the last part
depends on both earlier patches.


Thanks
Bernd.


Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2017-01-11 Thread James Cowgill
Hi,

On 11/01/17 16:49, Maciej W. Rozycki wrote:
> On Mon, 19 Dec 2016, James Cowgill wrote:
>> This bug causes the ADA bootstrap comparison failure in a-except.o
>> because the branch delay scheduling operates slightly differently for
>> that file if debug information is turned on.
> 
>  This looks like a bug to me -- actual code produced is supposed to be the 
> same regardless of the amount of debug information requested.  This is 
> important for some debugging scenarios, such as when a binary originally 
> produced with no debug information crashes and a core file is obtained so 
> that the program can be rebuilt with debug information included and then 
> matched to the core file previously produced, which may not be easy to 
> recreate with the rebuilt binary.
> 
>  From this consideration I gather you have a program source which can be 
> used as a test case to reproduce the issue, so can you please file a 
> problem report and include the source and a recipe to reproduce it?  Is 
> this a GCC issue with generated assembly or a GAS issue with interpreting
> it?

Yes I had a testcase which I used to debug this, but it was pretty huge
(and written in ADA). When debugging I just diffed it with the debug
information toggled to see if the patch fixed it. I'll see if I can find
it anyway.

The issue was about the assembly GCC generated.

James


Re: [PATCH] [PR rtl-optimization/65618] Fix MIPS ADA bootstrap failure

2017-01-11 Thread Maciej W. Rozycki
On Mon, 19 Dec 2016, James Cowgill wrote:

> This bug causes the ADA bootstrap comparison failure in a-except.o
> because the branch delay scheduling operates slightly differently for
> that file if debug information is turned on.

 This looks like a bug to me -- actual code produced is supposed to be the 
same regardless of the amount of debug information requested.  This is 
important for some debugging scenarios, such as when a binary originally 
produced with no debug information crashes and a core file is obtained so 
that the program can be rebuilt with debug information included and then 
matched to the core file previously produced, which may not be easy to 
recreate with the rebuilt binary.

 From this consideration I gather you have a program source which can be 
used as a test case to reproduce the issue, so can you please file a 
problem report and include the source and a recipe to reproduce it?  Is 
this a GCC issue with generated assembly or a GAS issue with interpreting
it?

  Maciej


Re: [PATCH, ARM] Further improve stack usage on sha512 (PR 77308)

2017-01-11 Thread Richard Earnshaw (lists)
On 08/12/16 19:50, Bernd Edlinger wrote:
> Hi Wilco,
> 
> On 11/30/16 18:01, Bernd Edlinger wrote:
>> I attached the completely untested follow-up patch now, but I would
>> like to post that one again for review, after I applied my current
>> patch, which is still waiting for final review (please feel pinged!).
>>
>>
>> This is really exciting...
>>
>>
> 
> 
> when testing the follow-up patch I discovered a single regression
> in gcc.dg/fixed-point/convert-sat.c that was caused by a mis-compilation
> of the libgcc function __gnu_satfractdasq.
> 
> I think it triggered a latent bug in the carryin_compare patterns.
> 
> everything is as expected until reload.  First what is left over
> of a split cmpdi_insn followed by a former cmpdi_unsigned, if the
> branch is not taken.
> 
> (insn 109 10 110 2 (set (reg:CC 100 cc)
>  (compare:CC (reg:SI 0 r0 [orig:124 _10 ] [124])
>  (const_int 0 [0]))) 
> "../../../gcc-trunk/libgcc/fixed-bit.c":785 196 {*arm_cmpsi_insn}
>   (nil))
> (insn 110 109 13 2 (parallel [
>  (set (reg:CC 100 cc)
>  (compare:CC (reg:SI 1 r1 [orig:125 _10+4 ] [125])
>  (const_int -1 [0xffffffffffffffff])))
>  (set (reg:SI 3 r3 [123])
>  (minus:SI (plus:SI (reg:SI 1 r1 [orig:125 _10+4 ] [125])
>  (const_int -1 [0xffffffffffffffff]))
>  (ltu:SI (reg:CC_C 100 cc)
>  (const_int 0 [0]))))
>  ]) "../../../gcc-trunk/libgcc/fixed-bit.c":785 32 
> {*subsi3_carryin_compare_const}
>   (nil))
> (jump_insn 13 110 31 2 (set (pc)
>  (if_then_else (ge (reg:CC_NCV 100 cc)
>  (const_int 0 [0]))
>  (label_ref:SI 102)
>  (pc))) "../../../gcc-trunk/libgcc/fixed-bit.c":785 204 
> {arm_cond_branch}
>   (int_list:REG_BR_PROB 6400 (nil))
> 
> (note 31 13 97 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
> (note 97 31 114 3 NOTE_INSN_DELETED)
> (insn 114 97 113 3 (set (reg:SI 2 r2 [orig:127+4 ] [127])
>  (const_int -1 [0xffffffffffffffff])) 
> "../../../gcc-trunk/libgcc/fixed-bit.c":831 630 {*arm_movsi_vfp}
>   (expr_list:REG_EQUIV (const_int -1 [0xffffffffffffffff])
>  (nil)))
> (insn 113 114 107 3 (set (reg:SI 3 r3 [126])
>  (const_int 2147483647 [0x7fffffff])) 
> "../../../gcc-trunk/libgcc/fixed-bit.c":831 630 {*arm_movsi_vfp}
>   (expr_list:REG_EQUIV (const_int 2147483647 [0x7fffffff])
>  (nil)))
> (insn 107 113 108 3 (set (reg:CC 100 cc)
>  (compare:CC (reg:SI 1 r1 [orig:125 _10+4 ] [125])
>  (reg:SI 2 r2 [orig:127+4 ] [127]))) 
> "../../../gcc-trunk/libgcc/fixed-bit.c":831 196 {*arm_cmpsi_insn}
>   (nil))
> 
> 
> Note that the CC register is not really set as implied by insn 110,
> because the C flag depends on the comparison of r1, 0xffffffff and the
> carry flag from insn 109.  Therefore in the postreload pass the
> insn 107 appears to be unnecessary, as it should compute
> exactly the same CC flag as insn 110, i.e. not dependent on
> previous CC flag.  I think all carryin_compare patterns are
> wrong because they do not describe the true value of the CC reg.
> 
> I think the CC reg is actually dependent on left, right and CC-in
> value, so it must be a computation in DI mode without overflow,
> as in my new version of the patch.
> 
> I attached an update of the followup patch which is not yet adjusted
> to your pending negdi patch.  Reg-testing is not yet done, but the
> mis-compilation on libgcc is fixed at least.
> 
> What do you think?

Sorry for the delay getting around to this.

I just tried this patch and found that it doesn't apply.  Furthermore,
there's not enough context in the rejected hunks for me to be certain
which patterns you're trying to fix up.

Could you do an update please?

R.

> 
> 
> Thanks
> Bernd.
> 
> 
> patch-pr77308-5.diff
> 
> 
> 2016-12-08  Bernd Edlinger  
> 
>   PR target/77308
>   * config/arm/arm.md (subdi3_compare1, subsi3_carryin_compare,
>   subsi3_carryin_compare_const, negdi2_compare): Fix the CC reg dataflow.
>   (*arm_negdi2, *arm_cmpdi_unsigned): Split early except for
> TARGET_NEON and TARGET_IWMMXT.
>   (*arm_cmpdi_insn): Split early except for
>   TARGET_NEON and TARGET_IWMMXT.  Fix the CC reg dataflow.
>   * config/arm/thumb2.md (*thumb2_negdi2): Split early except for
>   TARGET_NEON and TARGET_IWMMXT.
> 
> testsuite:
> 2016-12-08  Bernd Edlinger  
> 
>   PR target/77308
>   * gcc.target/arm/pr77308-2.c: New test.
> 
> --- gcc/config/arm/arm.md.orig2016-12-08 16:01:43.290595127 +0100
> +++ gcc/config/arm/arm.md 2016-12-08 19:04:22.251065848 +0100
> @@ -1086,8 +1086,8 @@
>  })
>  
>  (define_insn_and_split "subdi3_compare1"
> -  [(set (reg:CC CC_REGNUM)
> - (compare:CC
> +  [(set (reg:CC_NCV CC_REGNUM)
> + (compare:CC_NCV
> 
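
To make the quoted analysis concrete: the point is that the C flag produced by a subtract-with-carry depends on the carry coming in, so it cannot be described as a plain compare of the two operands. Below is a small C model of that, assuming the usual ARM convention that C after a subtraction means "no borrow". The function names are hypothetical and this is not the arm.md pattern itself.

```c
#include <stdint.h>

/* C flag after a plain CMP a, b: set when a - b does not borrow.  */
static int
cmp_carry_out (uint32_t a, uint32_t b)
{
  return a >= b;
}

/* C flag after SBCS a, b with the given carry-in.  The operation is
   a - b - (1 - carry_in), evaluated here in a wider type so that the
   borrow can be observed without overflow.  */
static int
sbcs_carry_out (uint32_t a, uint32_t b, int carry_in)
{
  int64_t wide = (int64_t) a - (int64_t) b - (carry_in ? 0 : 1);
  return wide >= 0;
}
```

For equal operands the plain compare always yields C = 1, while the carry-in variant yields C = 0 when the incoming carry is clear. A pass that treats both as the same computation can therefore delete a compare that is actually needed.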

Re: Use a specfile that actually allows building programs on NetBSD

2017-01-11 Thread coypu
On Wed, Jan 11, 2017 at 04:41:44PM +0100, Krister Walfridsson wrote:
> On Mon, 9 Jan 2017, co...@sdf.org wrote:
> 
> >3 month ping, 1 week ping (trying again), etc...
> 
> Apologies for not getting back to you sooner.
> 
> 
> >Like most operating systems, NetBSD has a libc which contains
> >stuff it needs for most programs to work, and people expect
> >it to be linked without explicitly specifying -lc.
> 
> Well, most programs already get -lc automatically -- it is only when you
> pass -shared that it fails... :-)
> 
> But I'll fix this, together with some other SPEC issues, in a few days.
> 
>/Krister

yay! thanks

netbsd patches stuff a lot more here, but I try not to upstream
stuff I don't understand and did not need. I've done this to get
gcc from pkgsrc working more, and it does for x86, but not for
any other archs yet.


Re: [PATCH] Remove padding from DWARF5 headers

2017-01-11 Thread Jason Merrill
OK.

On Tue, Jan 3, 2017 at 6:09 PM, Jakub Jelinek  wrote:
> Hi!
>
> http://dwarfstd.org/ShowIssue.php?issue=161031.2
> got approved today, so DWARF5 is changing and the various DW_UT_* kinds
> will no longer have the same size of the headers.  So,
> DW_UT_compile/DW_UT_partial shrinks by 12/16 bytes (padding 1 and padding 2
> is removed; 16 bytes for 64-bit DWARF), DW_UT_type remains the same,
> DW_UT_skeleton/DW_UT_split_compile shrink by 4/8 bytes (padding 2 is
> removed).  For DW_UT_* kinds consumers don't understand, the first 3 fields
> (length, version and ut kind) are required to be present and the only
> sensible action is to skip the whole unit (using length field).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Jan/Mark, are you going to adjust GDB/elfutils etc. correspondingly?
>
> 2017-01-03  Jakub Jelinek  
>
> * dwarf2out.c (DWARF_COMPILE_UNIT_HEADER_SIZE): For DWARF5 decrease
> by 12.
> (DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE): Always
> DWARF_COMPILE_UNIT_HEADER_SIZE plus 12.
> (DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE): Define.
> (calc_base_type_die_sizes): Use 
> DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
> for initial die_offset if dwarf_split_debug_info.
> (output_comp_unit): Use DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE for
> initial next_die_offset if dwo_id is non-NULL.  Don't emit padding
> fields.
> (output_skeleton_debug_sections): Formatting fix.  Use
> DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE instead of
> DWARF_COMPILE_UNIT_HEADER_SIZE.  Don't emit padding.
>
> --- gcc/dwarf2out.c.jj  2017-01-03 16:04:17.0 +0100
> +++ gcc/dwarf2out.c 2017-01-03 19:41:45.526194592 +0100
> @@ -2996,14 +2996,16 @@ skeleton_chain_node;
>  /* Fixed size portion of the DWARF compilation unit header.  */
>  #define DWARF_COMPILE_UNIT_HEADER_SIZE \
>(DWARF_INITIAL_LENGTH_SIZE + DWARF_OFFSET_SIZE   \
> -   + (dwarf_version >= 5   \
> -  ? 4 + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE : 3))
> +   + (dwarf_version >= 5 ? 4 : 3))
>
>  /* Fixed size portion of the DWARF comdat type unit header.  */
>  #define DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE \
>(DWARF_COMPILE_UNIT_HEADER_SIZE  \
> -   + (dwarf_version >= 5   \
> -  ? 0 : DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE))
> +   + DWARF_TYPE_SIGNATURE_SIZE + DWARF_OFFSET_SIZE)
> +
> +/* Fixed size portion of the DWARF skeleton compilation unit header.  */
> +#define DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE \
> +  (DWARF_COMPILE_UNIT_HEADER_SIZE + (dwarf_version >= 5 ? 8 : 0))
>
>  /* Fixed size portion of public names info.  */
>  #define DWARF_PUBNAMES_HEADER_SIZE (2 * DWARF_OFFSET_SIZE + 2)
> @@ -9044,7 +9046,9 @@ calc_die_sizes (dw_die_ref die)
>  static void
>  calc_base_type_die_sizes (void)
>  {
> -  unsigned long die_offset = DWARF_COMPILE_UNIT_HEADER_SIZE;
> +  unsigned long die_offset = (dwarf_split_debug_info
> + ? DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
> + : DWARF_COMPILE_UNIT_HEADER_SIZE);
>unsigned int i;
>dw_die_ref base_type;
>  #if ENABLE_ASSERT_CHECKING
> @@ -10302,7 +10306,9 @@ output_comp_unit (dw_die_ref die, int ou
>delete extern_map;
>
>/* Initialize the beginning DIE offset - and calculate sizes/offsets.  */
> -  next_die_offset = DWARF_COMPILE_UNIT_HEADER_SIZE;
> +  next_die_offset = (dwo_id
> +? DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
> +: DWARF_COMPILE_UNIT_HEADER_SIZE);
>calc_die_sizes (die);
>
>oldsym = die->die_id.die_symbol;
> @@ -10330,12 +10336,6 @@ output_comp_unit (dw_die_ref die, int ou
>if (dwo_id != NULL)
> for (int i = 0; i < 8; i++)
>   dw2_asm_output_data (1, dwo_id[i], i == 0 ? "DWO id" : NULL);
> -  else
> -   /* Hope all the padding will be removed for DWARF 5 final for
> -  DW_AT_compile and DW_AT_partial.  */
> -   dw2_asm_output_data (8, 0, "Padding 1");
> -
> -  dw2_asm_output_data (DWARF_OFFSET_SIZE, 0, "Padding 2");
>  }
>output_die (die);
>
> @@ -10430,10 +10430,11 @@ output_skeleton_debug_sections (dw_die_r
>   header.  */
>if (DWARF_INITIAL_LENGTH_SIZE - DWARF_OFFSET_SIZE == 4)
>  dw2_asm_output_data (4, 0xffffffff,
> -  "Initial length escape value indicating 64-bit DWARF extension");
> +"Initial length escape value indicating 64-bit "
> +"DWARF extension");
>
>dw2_asm_output_data (DWARF_OFFSET_SIZE,
> -   DWARF_COMPILE_UNIT_HEADER_SIZE
> +  DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE
> - DWARF_INITIAL_LENGTH_SIZE
> + size_of_die (comp_unit),
>   
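
As a cross-check of the sizes mentioned in the quoted description, the header macros after the patch can be modelled with a few C functions. This is only a sketch: the function names are made up for illustration, and it assumes the usual values of 4 or 12 bytes for the initial length, 4 or 8 bytes for an offset, and 8 bytes for a type signature.

```c
#include <stdbool.h>

/* Initial length field: 4 bytes, or 12 with the 64-bit DWARF escape.  */
static unsigned
initial_length_size (bool dwarf64)
{
  return dwarf64 ? 12 : 4;
}

/* Section offset size: 4 bytes, or 8 for 64-bit DWARF.  */
static unsigned
offset_size (bool dwarf64)
{
  return dwarf64 ? 8 : 4;
}

/* Models DWARF_COMPILE_UNIT_HEADER_SIZE after the patch.  */
static unsigned
compile_unit_header_size (int dwarf_version, bool dwarf64)
{
  return initial_length_size (dwarf64) + offset_size (dwarf64)
	 + (dwarf_version >= 5 ? 4 : 3);
}

/* Models DWARF_COMDAT_TYPE_UNIT_HEADER_SIZE: the compile unit header
   plus an 8-byte type signature and a type offset.  */
static unsigned
type_unit_header_size (int dwarf_version, bool dwarf64)
{
  return compile_unit_header_size (dwarf_version, dwarf64)
	 + 8 + offset_size (dwarf64);
}

/* Models DWARF_COMPILE_UNIT_SKELETON_HEADER_SIZE: for DWARF 5 the
   8-byte DWO id follows the common header.  */
static unsigned
skeleton_unit_header_size (int dwarf_version, bool dwarf64)
{
  return compile_unit_header_size (dwarf_version, dwarf64)
	 + (dwarf_version >= 5 ? 8 : 0);
}
```

For DWARF 5 this gives 12/24 bytes for a compile unit (32-bit/64-bit DWARF), 24/40 for a type unit and 20/32 for a skeleton unit. Compared with the old 24/40-byte headers, that matches the "shrinks by 12/16", "remains the same" and "shrink by 4/8" figures in the description.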

[PATCH][AArch64][GCC 6] PR target/79041: Correct -mpc-relative-literal-loads logic in aarch64_classify_symbol

2017-01-11 Thread Kyrill Tkachov

Hi all,

In this PR we generated ADRP/ADD instructions with :lo12: relocations on 
symbols even though -mpc-relative-literal-loads
is used. This is due to the confusing double-negative logic of the
nopcrelative_literal_loads aarch64 variable and its relation to the 
aarch64_nopcrelative_literal_loads global variable
in the GCC 6 branch.

Wilco fixed this on trunk as part of a larger patch (r237607) and parts of that 
patch were backported, but other parts weren't and
that patch now doesn't apply cleanly to the branch.

The actual bug here is that aarch64_classify_symbol uses 
nopcrelative_literal_loads instead of the correct 
aarch64_nopcrelative_literal_loads.
nopcrelative_literal_loads gets set to 1 if the user specifies 
-mpc-relative-literal-loads(!) whereas aarch64_nopcrelative_literal_loads gets
set to false, so that is the variable we want to check.

So this is the minimal patch that fixes this.

Bootstrapped and tested on aarch64-none-linux-gnu on the GCC 6 branch.

Ok for the branch?

Thanks,
Kyrill

2017-01-11  Kyrylo Tkachov  

PR target/79041
* config/aarch64/aarch64.c (aarch64_classify_symbol): Use
aarch64_nopcrelative_literal_loads instead of
nopcrelative_literal_loads.

2017-01-11  Kyrylo Tkachov  

PR target/79041
* gcc.target/aarch64/pr79041.c: New test.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 83dbd57..fa61289 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9324,7 +9324,7 @@ aarch64_classify_symbol (rtx x, rtx offset)
 	  /* This is alright even in PIC code as the constant
 	 pool reference is always PC relative and within
 	 the same translation unit.  */
-	  if (nopcrelative_literal_loads
+	  if (aarch64_nopcrelative_literal_loads
 	  && CONSTANT_POOL_ADDRESS_P (x))
 	return SYMBOL_SMALL_ABSOLUTE;
 	  else
diff --git a/gcc/testsuite/gcc.target/aarch64/pr79041.c b/gcc/testsuite/gcc.target/aarch64/pr79041.c
new file mode 100644
index 0000000..a23b1ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr79041.c
@@ -0,0 +1,26 @@
+/* PR target/79041.  Check that we don't generate the LO12 relocations
+   for -mpc-relative-literal-loads.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcmodel=large -mpc-relative-literal-loads" } */
+
+extern int strcmp (const char *, const char *);
+extern char *strcpy (char *, const char *);
+
+static struct
+{
+  char *b;
+  char *c;
+} d[] = {
+  {"0", "000"}, {"1", "111"},
+};
+
+void
+e (const char *b, char *c)
+{
+  int i;
+  for (i = 0; i < 1; ++i)
+if (!strcmp (d[i].b, b))
+  strcpy (c, d[i].c);
+}
+
+/* { dg-final { scan-assembler-not ":lo12:" } } */


Re: [PATCH, gcc, wwwdocs] Document upcoming Qualcomm Falkor processor support

2017-01-11 Thread Richard Earnshaw (lists)
On 06/01/17 12:11, Siddhesh Poyarekar wrote:
> Hi,
> 
> This patch documents the newly added flag in gcc 7 for the upcoming
> Qualcomm Falkor processor core.
> 
> Siddhesh
> 
> Index: htdocs/gcc-7/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/changes.html,v
> retrieving revision 1.33
> diff -u -r1.33 changes.html
> --- htdocs/gcc-7/changes.html 3 Jan 2017 10:55:03 -0000   1.33
> +++ htdocs/gcc-7/changes.html 6 Jan 2017 12:09:53 -0000
> @@ -390,7 +390,8 @@
>   
> Support has been added for the following processors
> (GCC identifiers in parentheses): ARM Cortex-A73
> -   (cortex-a73) and Broadcom Vulcan (vulcan).
> +   (cortex-a73), Broadcom Vulcan (vulcan) and
> +   Qualcomm Falkor (falkor).
> The GCC identifiers can be used
> as arguments to the -mcpu or -mtune options,
> for example: -mcpu=cortex-a73 or
> 

Thanks.  The file had changed again, but I've merged this in.

R.


Re: [ARM] PR 78253 do not resolve weak ref locally

2017-01-11 Thread Richard Earnshaw (lists)
On 11/01/17 16:14, Christophe Lyon wrote:
> On 11 January 2017 at 17:13, Christophe Lyon  
> wrote:
>> On 11 January 2017 at 16:48, Richard Earnshaw (lists)
>>  wrote:
>>> On 01/12/16 14:27, Christophe Lyon wrote:
 Hi,


 On 10 November 2016 at 15:10, Christophe Lyon
  wrote:
> On 10 November 2016 at 11:05, Richard Earnshaw
>  wrote:
>> On 09/11/16 21:29, Christophe Lyon wrote:
>>> Hi,
>>>
>>> PR 78253 shows that the handling of weak references has changed for
>>> ARM with gcc-5.
>>>
>>> When r220674 was committed, default_binds_local_p_2 gained a new
>>> parameter (weak_dominate), which, when true, implies that a reference
>>> to a weak symbol defined locally will be resolved locally, even though
>>> it could be overridden by a strong definition in another object file.
>>>
>>> With r220674, default_binds_local_p forces weak_dominate=true,
>>> effectively changing the previous behavior.
>>>
>>> The attached patch introduces default_binds_local_p_4 which is a copy
>>> of default_binds_local_p_2, but using weak_dominate=false, and updates
>>> the ARM target to call default_binds_local_p_4 instead of
>>> default_binds_local_p_2.
>>>
>>> I ran cross-tests on various arm* configurations with no regression,
>>> and checked that the test attached to the original bugzilla now works
>>> as expected.
>>>
>>> I am not sure why weak_dominate defaults to true, and I couldn't
>>> really understand why by reading the threads related to r220674 and
>>> following updates to default_binds_local_p_* which all deal with other
>>> corner cases and do not discuss the weak_dominate parameter.
>>>
>>> Or should this patch be made more generic?
>>>
>>
>> I certainly don't think it should be ARM specific.
> That was my feeling too.
>
>>
>> The questions I have are:
>>
>> 1) What do other targets do today.  Are they the same, or different?
>
> arm, aarch64, s390 use default_binds_local_p_2 since PR 65780, and
> default_binds_local_p before that. Both have weak_dominate=true
> i386 has its own version, calling default_binds_local_p_3 with true
> for weak_dominate
>
> But the behaviour of default_binds_local_p changed with r220674 as I said 
> above.
> See https://gcc.gnu.org/viewcvs/gcc?view=revision=220674 and
> notice how weak_dominate was introduced
>
> The original bug report is about a different case:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32219
>
> The original patch submission is
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html
> and the 1st version with weak_dominate is in:
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00469.html
> but it's not clear to me why this was introduced
>
>> 2) If different why?
> on aarch64, although binds_local_p returns true, the relocations used when
> building the function pointer is still the same (still via the GOT).
>
> aarch64 has different logic than arm when accessing a symbol
> (eg aarch64_classify_symbol)
>
>> 3) Is the current behaviour really what was intended by the patch?  ie.
>> Was the old behaviour actually wrong?
>>
> That's what I was wondering.
> Before r220674, calling a weak function directly or via a function
> pointer had the same effect (in other words, the function pointer
> points to the actual implementation: the strong one if any, the weak
> one otherwise).
>
> After r220674, on arm the function pointer points to the weak
> definition, which seems wrong to me, it should leave the actual
> resolution to the linker.
>
>

 After looking at the aarch64 port, I think that references to weak symbols
 have to be handled carefully, to make sure they cannot be resolved
 by the assembler, since the weak symbol can be overridden by a strong
 definition at link-time.

 Here is a new patch which does that.
 Validated on arm* targets with no regression, and I checked that the
 original testcase now executes as expected.

>>>
>>> This looks sensible, however, I think you should use 'non-weak' rather
>>> than 'strong' in your comments (I've seen ABIs with weak, normal and
>>> strong symbol definitions).
>>>
>>> Also, you're missing a space before each macro/function call.
>>>
>>> OK with those changes.
>>>
>>
>> Thanks, I have attached what I have committed (r244320).
>>
> 
> I forgot to ask: OK to backport to gcc-5 and gcc-6 branches?
> 

Yes.  But give it a couple of days, just in case it throws up any issues.

R.

> 
>> Christophe
>>
>>
>>> R.
>>>
 Christophe


>> R.
>>> Thanks,
>>>
>>> Christophe
>>>
>>
>>
>> pr78253.chlog.txt

Re: [ARM] PR 78253 do not resolve weak ref locally

2017-01-11 Thread Christophe Lyon
On 11 January 2017 at 16:48, Richard Earnshaw (lists)
 wrote:
> On 01/12/16 14:27, Christophe Lyon wrote:
>> Hi,
>>
>>
>> On 10 November 2016 at 15:10, Christophe Lyon
>>  wrote:
>>> On 10 November 2016 at 11:05, Richard Earnshaw
>>>  wrote:
 On 09/11/16 21:29, Christophe Lyon wrote:
> Hi,
>
> PR 78253 shows that the handling of weak references has changed for
> ARM with gcc-5.
>
> When r220674 was committed, default_binds_local_p_2 gained a new
> parameter (weak_dominate), which, when true, implies that a reference
> to a weak symbol defined locally will be resolved locally, even though
> it could be overridden by a strong definition in another object file.
>
> With r220674, default_binds_local_p forces weak_dominate=true,
> effectively changing the previous behavior.
>
> The attached patch introduces default_binds_local_p_4 which is a copy
> of default_binds_local_p_2, but using weak_dominate=false, and updates
> the ARM target to call default_binds_local_p_4 instead of
> default_binds_local_p_2.
>
> I ran cross-tests on various arm* configurations with no regression,
> and checked that the test attached to the original bugzilla now works
> as expected.
>
> I am not sure why weak_dominate defaults to true, and I couldn't
> really understand why by reading the threads related to r220674 and
> following updates to default_binds_local_p_* which all deal with other
> corner cases and do not discuss the weak_dominate parameter.
>
> Or should this patch be made more generic?
>

 I certainly don't think it should be ARM specific.
>>> That was my feeling too.
>>>

 The questions I have are:

 1) What do other targets do today.  Are they the same, or different?
>>>
>>> arm, aarch64, s390 use default_binds_local_p_2 since PR 65780, and
>>> default_binds_local_p before that. Both have weak_dominate=true
>>> i386 has its own version, calling default_binds_local_p_3 with true
>>> for weak_dominate
>>>
>>> But the behaviour of default_binds_local_p changed with r220674 as I said 
>>> above.
>>> See https://gcc.gnu.org/viewcvs/gcc?view=revision=220674 and
>>> notice how weak_dominate was introduced
>>>
>>> The original bug report is about a different case:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32219
>>>
>>> The original patch submission is
>>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html
>>> and the 1st version with weak_dominate is in:
>>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00469.html
>>> but it's not clear to me why this was introduced
>>>
 2) If different why?
>>> on aarch64, although binds_local_p returns true, the relocations used when
>>> building the function pointer is still the same (still via the GOT).
>>>
>>> aarch64 has different logic than arm when accessing a symbol
>>> (eg aarch64_classify_symbol)
>>>
 3) Is the current behaviour really what was intended by the patch?  ie.
 Was the old behaviour actually wrong?

>>> That's what I was wondering.
>>> Before r220674, calling a weak function directly or via a function
>>> pointer had the same effect (in other words, the function pointer
>>> points to the actual implementation: the strong one if any, the weak
>>> one otherwise).
>>>
>>> After r220674, on arm the function pointer points to the weak
>>> definition, which seems wrong to me, it should leave the actual
>>> resolution to the linker.
>>>
>>>
>>
>> After looking at the aarch64 port, I think that references to weak symbols
>> have to be handled carefully, to make sure they cannot be resolved
>> by the assembler, since the weak symbol can be overridden by a strong
>> definition at link-time.
>>
>> Here is a new patch which does that.
>> Validated on arm* targets with no regression, and I checked that the
>> original testcase now executes as expected.
>>
>
> This looks sensible, however, I think you should use 'non-weak' rather
> than 'strong' in your comments (I've seen ABIs with weak, normal and
> strong symbol definitions).
>
> Also, you're missing a space before each macro/function call.
>
> OK with those changes.
>

Thanks, I have attached what I have committed (r244320).

Christophe


> R.
>
>> Christophe
>>
>>
 R.
> Thanks,
>
> Christophe
>


 pr78253.chlog.txt


 gcc/ChangeLog:

 2016-12-01  Christophe Lyon  

 PR target/78253
 * config/arm/arm.c (legitimize_pic_address): Handle reference to
 weak symbol.
 (arm_assemble_integer): Likewise.



 pr78253.patch.txt


 diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 index 74cb64c..258ceb1 100644
 --- a/gcc/config/arm/arm.c
 +++ b/gcc/config/arm/arm.c
 @@ -6923,10 +6923,13 @@ legitimize_pic_address (rtx orig, 

Re: [ARM] PR 78253 do not resolve weak ref locally

2017-01-11 Thread Christophe Lyon
On 11 January 2017 at 17:13, Christophe Lyon  wrote:
> On 11 January 2017 at 16:48, Richard Earnshaw (lists)
>  wrote:
>> On 01/12/16 14:27, Christophe Lyon wrote:
>>> Hi,
>>>
>>>
>>> On 10 November 2016 at 15:10, Christophe Lyon
>>>  wrote:
 On 10 November 2016 at 11:05, Richard Earnshaw
  wrote:
> On 09/11/16 21:29, Christophe Lyon wrote:
>> Hi,
>>
>> PR 78253 shows that the handling of weak references has changed for
>> ARM with gcc-5.
>>
>> When r220674 was committed, default_binds_local_p_2 gained a new
>> parameter (weak_dominate), which, when true, implies that a reference
>> to a weak symbol defined locally will be resolved locally, even though
>> it could be overridden by a strong definition in another object file.
>>
>> With r220674, default_binds_local_p forces weak_dominate=true,
>> effectively changing the previous behavior.
>>
>> The attached patch introduces default_binds_local_p_4 which is a copy
>> of default_binds_local_p_2, but using weak_dominate=false, and updates
>> the ARM target to call default_binds_local_p_4 instead of
>> default_binds_local_p_2.
>>
>> I ran cross-tests on various arm* configurations with no regression,
>> and checked that the test attached to the original bugzilla now works
>> as expected.
>>
>> I am not sure why weak_dominate defaults to true, and I couldn't
>> really understand why by reading the threads related to r220674 and
>> following updates to default_binds_local_p_* which all deal with other
>> corner cases and do not discuss the weak_dominate parameter.
>>
>> Or should this patch be made more generic?
>>
>
> I certainly don't think it should be ARM specific.
 That was my feeling too.

>
> The questions I have are:
>
> 1) What do other targets do today.  Are they the same, or different?

 arm, aarch64, s390 use default_binds_local_p_2 since PR 65780, and
 default_binds_local_p before that. Both have weak_dominate=true
 i386 has its own version, calling default_binds_local_p_3 with true
 for weak_dominate

 But the behaviour of default_binds_local_p changed with r220674 as I said 
 above.
 See https://gcc.gnu.org/viewcvs/gcc?view=revision=220674 and
 notice how weak_dominate was introduced

 The original bug report is about a different case:
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32219

 The original patch submission is
 https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html
 and the 1st version with weak_dominate is in:
 https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00469.html
 but it's not clear to me why this was introduced

> 2) If different why?
 on aarch64, although binds_local_p returns true, the relocations used when
 building the function pointer is still the same (still via the GOT).

 aarch64 has different logic than arm when accessing a symbol
 (eg aarch64_classify_symbol)

> 3) Is the current behaviour really what was intended by the patch?  ie.
> Was the old behaviour actually wrong?
>
 That's what I was wondering.
 Before r220674, calling a weak function directly or via a function
 pointer had the same effect (in other words, the function pointer
 points to the actual implementation: the strong one if any, the weak
 one otherwise).

 After r220674, on arm the function pointer points to the weak
 definition, which seems wrong to me, it should leave the actual
 resolution to the linker.


>>>
>>> After looking at the aarch64 port, I think that references to weak symbols
>>> have to be handled carefully, to make sure they cannot be resolved
>>> by the assembler, since the weak symbol can be overridden by a strong
>>> definition at link-time.
>>>
>>> Here is a new patch which does that.
>>> Validated on arm* targets with no regression, and I checked that the
>>> original testcase now executes as expected.
>>>
>>
>> This looks sensible, however, I think you should use 'non-weak' rather
>> than 'strong' in your comments (I've seen ABIs with weak, normal and
>> strong symbol definitions).
>>
>> Also, you're missing a space before each macro/function call.
>>
>> OK with those changes.
>>
>
> Thanks, I have attached what I have committed (r244320).
>

I forgot to ask: OK to backport to gcc-5 and gcc-6 branches?


> Christophe
>
>
>> R.
>>
>>> Christophe
>>>
>>>
> R.
>> Thanks,
>>
>> Christophe
>>
>
>
> pr78253.chlog.txt
>
>
> gcc/ChangeLog:
>
> 2016-12-01  Christophe Lyon  
>
> PR target/78253
> * config/arm/arm.c (legitimize_pic_address): Handle reference to
> weak symbol.
> 

Re: [PATCH][ARM] [gcc] Add __artificial__ attribute to all NEON intrinsics

2017-01-11 Thread Richard Earnshaw (lists)
On 10/01/17 10:40, Tamar Christina wrote:
> Hi all,
> 
> This patch adds the __artificial__ and __gnu_inline__
> attributes to the intrinsics in arm_neon.h so that
> costs are attributed to the user function during profiling
> and the intrinsics are hidden from traces during debugging.
> 
> A similar patch was already applied to AArch64.
> 
> The artificial attribute does not affect code generation.
> The functions are also changed from static to extern
> so that __gnu_inline__ does not treat the
> intrinsics as standalone functions.
> 
> No new tests are included, since this would require a gdb test,
> but regression tests on arm-none-linux-gnueabi were performed.
> 
> The attribute was added with the following bash script:
> 
> #!/bin/bash
> 
> # first apply to the ones in #define blocks and add extra \ at the end
> sed -i -r 's/(__inline.+)(__attribute__\s*)\(\((.+)\)\)\s*\\/\1\\\n\2 \(\(\3, __gnu_inline__,__artificial__\)\) \\/m' \
>  gcc/config/arm/arm_neon.h
> 
> # Then write all normal functions
> sed -i -r 's/(__inline.+)(__attribute__\s*)\(\((.+)\)\)/\1\n\2 \(\(\3, __gnu_inline__, __artificial__\)\)/m' \
>  gcc/config/arm/arm_neon.h
> 
> # Then correct any trailing whitespaces we might have introduced
> sed -i 's/[ \t]*$//' \
>  gcc/config/arm/arm_neon.h
> 
> # And then finish up by correcting some attribute values which don't fit the patterns above.
> sed -i -r 's/(__attribute__\s+)\(\((__always_inline__)\)\)\s+\\/\1\(\(\2, __gnu_inline__, __artificial__\)\) \\/m' \
>  gcc/config/arm/arm_neon.h
> 
> # Replace static definitions with extern
> sed -i -r 's/(__extension__\s+)static(.+)/\1extern\2/m' \
>  gcc/config/arm/arm_neon.h
> 
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> gcc/
> 2017-01-10  Tamar Christina  
> 
>   * config/arm/arm_neon.h: Add __artificial__ and gnu_inline
>   to all inlined functions, change static to extern.
> 


OK.

R.
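
For readers following the thread, here is a minimal sketch of what one
wrapper looks like once the three attributes are in place.  The function
my_add and its caller use_wrapper are hypothetical stand-ins, not taken
from arm_neon.h; only the attribute spelling follows the patch
description.  The sketch assumes GCC or a compatible compiler.

```cpp
#include <cassert>

// Hypothetical stand-in for an intrinsic wrapper (not from arm_neon.h).
// __always_inline__ : the body is inlined into every direct caller.
// __gnu_inline__    : no standalone out-of-line copy is emitted.
// __artificial__    : debug info attributes the code to the caller.
__extension__ extern __inline int
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
my_add (int a, int b)
{
  return a + b;
}

// A user function; with the attributes above, profiling cost and
// debugger line information should land here rather than in my_add.
int
use_wrapper (int a, int b)
{
  return my_add (a, b);
}

// Small self-check for the sketch.
void
wrapper_example ()
{
  assert (use_wrapper (2, 3) == 5);
}
```

Because no out-of-line copy exists, taking the address of such a
function would probably fail at link time, so the sketch only calls it
directly.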


Re: [patch,avr] PR78883: Implement a dummy scheduler

2017-01-11 Thread Segher Boessenkool
On Wed, Jan 04, 2017 at 12:29:49PM -0700, Jeff Law wrote:
> >We should split off a new "SUBREGS_OF_MEM_ALLOWED" from !INSN_SCHEDULING,
> >and then probably even default it to false.
> That would work for me :-)  The question in my mind would be unexpected 
> fallout at this point in the release process.  Maybe default it to 
> !INSN_SCHEDULING to minimize such fallout now, then to false for gcc-8?

Do we want to change anything right now?  AVR has a workaround for now.


Segher


Re: [PATCH][AArch64] Improve Cortex-A53 scheduling of int/fp transfers

2017-01-11 Thread Richard Earnshaw (lists)
On 10/01/17 17:18, Wilco Dijkstra wrote:
> My previous change to the Cortex-A53 scheduler resulted in a 13% regression
> on a proprietary benchmark.  This turned out to be due to non-optimal
> scheduling of int to float conversions.  This patch separates int to FP
> transfers from int to float conversions based on experiments to determine
> the best schedule.  As a result of these tweaks the performance of the
> benchmark improves by 20%.
> 
> ChangeLog:
> 2017-01-10  Wilco Dijkstra  
> 
>   * config/arm/cortex-a53.md: Add bypasses for
>   cortex_a53_r2f_cvt.
>   (cortex_a53_r2f): Only use for transfers.
>   (cortex_a53_f2r): Likewise.
>   (cortex_a53_r2f_cvt): Add reservation for conversions.
>   (cortex_a53_f2r_cvt): Likewise.
> 

OK.

R.

> --
> 
> diff --git a/gcc/config/arm/cortex-a53.md b/gcc/config/arm/cortex-a53.md
> index 14822ba0ac0532aaf0dd29cff7a87e32e745cbe8..b367ad403a4a641da34521c17669027b87092737 100644
> --- a/gcc/config/arm/cortex-a53.md
> +++ b/gcc/config/arm/cortex-a53.md
> @@ -252,9 +252,18 @@
>"cortex_a53_r2f")
>  
>  (define_bypass 1 "cortex_a53_mul,
> -   cortex_a53_load*"
> +   cortex_a53_load1,
> +   cortex_a53_load2"
>"cortex_a53_r2f")
>  
> +(define_bypass 2 "cortex_a53_alu*"
> +  "cortex_a53_r2f_cvt")
> +
> +(define_bypass 3 "cortex_a53_mul,
> +   cortex_a53_load1,
> +   cortex_a53_load2"
> +  "cortex_a53_r2f_cvt")
> +
>  ;; Model flag forwarding to branches.
>  
>  (define_bypass 0 "cortex_a53_alu*,cortex_a53_shift*"
> @@ -514,16 +523,24 @@
>  ;; Floating-point to/from core transfers.
>  
>  
> -(define_insn_reservation "cortex_a53_r2f" 6
> +(define_insn_reservation "cortex_a53_r2f" 2
>(and (eq_attr "tune" "cortexa53")
> -   (eq_attr "type" "f_mcr,f_mcrr,f_cvti2f,
> - neon_from_gp, neon_from_gp_q"))
> -  "cortex_a53_slot_any,nothing*2,cortex_a53_fp_alu")
> +   (eq_attr "type" "f_mcr,f_mcrr"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
> +
> +(define_insn_reservation "cortex_a53_f2r" 4
> +  (and (eq_attr "tune" "cortexa53")
> +   (eq_attr "type" "f_mrc,f_mrrc"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
> +
> +(define_insn_reservation "cortex_a53_r2f_cvt" 4
> +  (and (eq_attr "tune" "cortexa53")
> +   (eq_attr "type" "f_cvti2f, neon_from_gp, neon_from_gp_q"))
> +  "cortex_a53_slot_any,cortex_a53_fp_alu")
>  
> -(define_insn_reservation "cortex_a53_f2r" 6
> +(define_insn_reservation "cortex_a53_f2r_cvt" 5
>(and (eq_attr "tune" "cortexa53")
> -   (eq_attr "type" "f_mrc,f_mrrc,f_cvtf2i,
> - neon_to_gp, neon_to_gp_q"))
> +   (eq_attr "type" "f_cvtf2i, neon_to_gp, neon_to_gp_q"))
>"cortex_a53_slot_any,cortex_a53_fp_alu")
>  
>  
> 



Re: [PATCH] c++/78771 ICE with inheriting ctor

2017-01-11 Thread Nathan Sidwell

On 01/04/2017 12:53 AM, Jason Merrill wrote:


Hmm, that seems like where the problem is.  We shouldn't try to
instantiate the inheriting constructor until we've already chosen the
base constructor; in the new model the inheriting constructor is just an
implementation detail.


Oh what fun.  This testcase behaves differently for C++17, C++11 
-fnew-inheriting-ctors and C++11 -fno-new-inheriting-ctors compilation 
modes.


Firstly, unpatched G++ is fine in C++17 mode, because:
  /* In C++17, "If the initializer expression is a prvalue and the
     cv-unqualified version of the source type is the same class as the
     class of the destination, the initializer expression is used to
     initialize the destination object."  Handle that here to avoid doing
     overload resolution.  */
and inside that we have:

  /* FIXME P0135 doesn't say how to handle direct initialization from a
 type with a suitable conversion operator.  Let's handle it like
 copy-initialization, but allowing explict conversions.  */

That conversion lookup short-circuits the subsequent overload resolution 
that would otherwise explode.


Otherwise, with -fnew-inheriting-ctors, you are indeed correct.  There 
needs to be a call to strip_inheriting_ctors in deduce_inheriting_ctor.


With -fno-new-inheriting-ctors we need the original patch I posted 
(included herein).  I suppose we might be able to remove the assert from 
strip_inheriting_ctors and always call that from deduce_inheriting_ctor, 
but that seems a bad idea to me.


I was unable to produce a C++17 testcase that triggered this problem by 
avoiding the above-mentioned overload resolution short-circuiting.


As -fnew-inheriting-ctors is a mangling-affecting flag, I guess we're 
stuck with it for the foreseeable future.
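
For context, a minimal sketch of a two-level chain of inheriting
constructors, the language feature under discussion.  It is not the
testcase from the PR and does not trigger the ICE; the types and the
helper build_through_chain are illustrative names only.

```cpp
#include <cassert>

struct Base
{
  int value;
  explicit Base (int v) : value (v) {}
};

struct Middle : Base
{
  using Base::Base;      // Middle (int) comes from Base
};

struct Derived : Middle
{
  using Middle::Middle;  // Derived (int) goes through Middle to Base
};

// Construct the most derived type through the inherited constructor
// and read back the value stored by Base.
int
build_through_chain (int v)
{
  Derived d (v);
  return d.value;
}

// Small self-check for the sketch.
void
inheriting_example ()
{
  assert (build_through_chain (7) == 7);
}
```

With the new model, the constructor ultimately run here is the one
declared in Base, which is why the originating constructor is the
interesting one when deducing properties of the inherited one.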


ok?

nathan

--
Nathan Sidwell
2017-01-11  Nathan Sidwell  

	PR c++/78771
	* pt.c (instantiate_template_1): Check for recursive instantiation
	of inheriting constructor when not new-inheriting-ctor.
	* method.c (deduce_inheriting_ctor): Use originating ctor when
	new-inheriting-ctor.

	PR c++/78771
	* g++.dg/cpp0x/pr78771-old.C: New.
	* g++.dg/cpp0x/pr78771-new.C: New.
	* g++.dg/cpp1z/pr78771.C: New.

Index: cp/method.c
===
--- cp/method.c	(revision 244314)
+++ cp/method.c	(working copy)
@@ -1858,11 +1858,15 @@ deduce_inheriting_ctor (tree decl)
   gcc_assert (DECL_INHERITED_CTOR (decl));
   tree spec;
   bool trivial, constexpr_, deleted;
+
+  tree inherited = DECL_INHERITED_CTOR (decl);
+  if (flag_new_inheriting_ctors)
+inherited = strip_inheriting_ctors (inherited);
+
   synthesized_method_walk (DECL_CONTEXT (decl), sfk_inheriting_constructor,
 			   false, &spec, &trivial, &deleted, &constexpr_,
-			   /*diag*/false,
-			   DECL_INHERITED_CTOR (decl),
-			   FUNCTION_FIRST_USER_PARMTYPE (decl));
+			   /*diag=*/false,
+			   inherited, FUNCTION_FIRST_USER_PARMTYPE (decl));
   if (TREE_CODE (inherited_ctor_binfo (decl)) != TREE_BINFO)
 /* Inherited the same constructor from different base subobjects.  */
 deleted = true;
Index: cp/pt.c
===
--- cp/pt.c	(revision 244314)
+++ cp/pt.c	(working copy)
@@ -17963,10 +17963,22 @@ instantiate_template_1 (tree tmpl, tree
   if (spec == error_mark_node)
 	return error_mark_node;
 
+  /* If this is an inherited ctor, we can recursively clone it
+	 when deducing the validity of the ctor.  But we won't have
+	 cloned the function yet, so do it now.  We'll redo this
+	 later, but any recursive information learnt here can't
+	 change the validity.  */
+  if (!TREE_CHAIN (spec))
+	{
+	  gcc_assert (!flag_new_inheriting_ctors
+		  && DECL_INHERITED_CTOR (spec));
+	  clone_function_decl (spec, /*update_method_vec_p=*/0);
+	}
   /* Look for the clone.  */
   FOR_EACH_CLONE (clone, spec)
 	if (DECL_NAME (clone) == DECL_NAME (tmpl))
 	  return clone;
+
   /* We should always have found the clone by now.  */
   gcc_unreachable ();
   return NULL_TREE;
Index: testsuite/g++.dg/cpp0x/pr78771-new.C
===
--- testsuite/g++.dg/cpp0x/pr78771-new.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr78771-new.C	(working copy)
@@ -0,0 +1,28 @@
+// PR c++/78771
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-fnew-inheriting-ctors" }
+
+// ICE instantiating a deleted inherited ctor
+
+struct Base
+{
+  template <typename U> Base (U);
+
+  Base (int);
+};
+
+struct Derived;
+
+struct Middle : Base
+{
+  using Base::Base;
+
+  Middle (Derived);
+};
+
+struct Derived : Middle
+{
+  using Middle::Middle;
+};
+
+Middle::Middle (Derived) : Middle (0) {}
Index: testsuite/g++.dg/cpp0x/pr78771-old.C
===
--- testsuite/g++.dg/cpp0x/pr78771-old.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr78771-old.C	(working copy)
@@ -0,0 +1,28 @@
+// PR c++/78771
+// 

Re: [ARM] PR 78253 do not resolve weak ref locally

2017-01-11 Thread Richard Earnshaw (lists)
On 01/12/16 14:27, Christophe Lyon wrote:
> Hi,
> 
> 
> On 10 November 2016 at 15:10, Christophe Lyon
>  wrote:
>> On 10 November 2016 at 11:05, Richard Earnshaw
>>  wrote:
>>> On 09/11/16 21:29, Christophe Lyon wrote:
 Hi,

 PR 78253 shows that the handling of weak references has changed for
 ARM with gcc-5.

 When r220674 was committed, default_binds_local_p_2 gained a new
 parameter (weak_dominate), which, when true, implies that a reference
 to a weak symbol defined locally will be resolved locally, even though
 it could be overridden by a strong definition in another object file.

 With r220674, default_binds_local_p forces weak_dominate=true,
 effectively changing the previous behavior.

 The attached patch introduces default_binds_local_p_4 which is a copy
 of default_binds_local_p_2, but using weak_dominate=false, and updates
 the ARM target to call default_binds_local_p_4 instead of
 default_binds_local_p_2.

 I ran cross-tests on various arm* configurations with no regression,
 and checked that the test attached to the original bugzilla now works
 as expected.

 I am not sure why weak_dominate defaults to true, and I couldn't
 really understand why by reading the threads related to r220674 and
 following updates to default_binds_local_p_* which all deal with other
 corner cases and do not discuss the weak_dominate parameter.

 Or should this patch be made more generic?

>>>
>>> I certainly don't think it should be ARM specific.
>> That was my feeling too.
>>
>>>
>>> The questions I have are:
>>>
>>> 1) What do other targets do today.  Are they the same, or different?
>>
>> arm, aarch64, s390 use default_binds_local_p_2 since PR 65780, and
>> default_binds_local_p before that. Both have weak_dominate=true
>> i386 has its own version, calling default_binds_local_p_3 with true
>> for weak_dominate
>>
>> But the behaviour of default_binds_local_p changed with r220674 as I said 
>> above.
>> See https://gcc.gnu.org/viewcvs/gcc?view=revision=220674 and
>> notice how weak_dominate was introduced
>>
>> The original bug report is about a different case:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32219
>>
>> The original patch submission is
>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00410.html
>> and the 1st version with weak_dominate is in:
>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00469.html
>> but it's not clear to me why this was introduced
>>
>>> 2) If different why?
>> on aarch64, although binds_local_p returns true, the relocations used when
>> building the function pointer is still the same (still via the GOT).
>>
>> aarch64 has different logic than arm when accessing a symbol
>> (eg aarch64_classify_symbol)
>>
>>> 3) Is the current behaviour really what was intended by the patch?  ie.
>>> Was the old behaviour actually wrong?
>>>
>> That's what I was wondering.
>> Before r220674, calling a weak function directly or via a function
>> pointer had the same effect (in other words, the function pointer
>> points to the actual implementation: the strong one if any, the weak
>> one otherwise).
>>
>> After r220674, on arm the function pointer points to the weak
>> definition, which seems wrong to me, it should leave the actual
>> resolution to the linker.
>>
>>
> 
> After looking at the aarch64 port, I think that references to weak symbols
> have to be handled carefully, to make sure they cannot be resolved
> by the assembler, since the weak symbol can be overridden by a strong
> definition at link-time.
> 
> Here is a new patch which does that.
> Validated on arm* targets with no regression, and I checked that the
> original testcase now executes as expected.
> 

This looks sensible, however, I think you should use 'non-weak' rather
than 'strong' in your comments (I've seen ABIs with weak, normal and
strong symbol definitions).

Also, you're missing a space before each macro/function call.

OK with those changes.

R.
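
As an illustration of the behaviour discussed above, here is a minimal
single-file sketch.  The function hook and the two callers are
hypothetical names, and the attribute syntax is the GCC/Clang one.  In
a real reproduction a non-weak definition of hook would live in a
second object file; that part is not shown here.

```cpp
#include <cassert>

// Weak default definition.  A non-weak definition in another object
// file would replace it at link time.
extern "C" __attribute__ ((weak)) int
hook (void)
{
  return 1;
}

// Direct call to the (possibly overridden) function.
int
call_direct ()
{
  return hook ();
}

// Call through a function pointer.  The volatile qualifier keeps the
// compiler from folding the pointer back into a direct call.
int
call_via_pointer ()
{
  int (*volatile fp) (void) = hook;
  return fp ();
}

// Both paths are expected to reach the same definition.
void
weak_example ()
{
  assert (call_direct () == call_via_pointer ());
}
```

The point made in the thread is that both calls should agree even when
an override is linked in, which requires the address of hook to be left
for the linker to resolve.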

> Christophe
> 
> 
>>> R.
 Thanks,

 Christophe

>>>
>>>
>>> pr78253.chlog.txt
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> 2016-12-01  Christophe Lyon  
>>>
>>> PR target/78253
>>> * config/arm/arm.c (legitimize_pic_address): Handle reference to
>>> weak symbol.
>>> (arm_assemble_integer): Likewise.
>>>
>>>
>>>
>>> pr78253.patch.txt
>>>
>>>
>>> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> index 74cb64c..258ceb1 100644
>>> --- a/gcc/config/arm/arm.c
>>> +++ b/gcc/config/arm/arm.c
>>> @@ -6923,10 +6923,13 @@ legitimize_pic_address (rtx orig, machine_mode mode, rtx reg)
>>>  same segment as the GOT.  Unfortunately, the flexibility of linker
>>>  scripts means that we can't be sure of that in general, so assume
>>>  that GOTOFF is never valid on VxWorks.  */
>>> +  /* References to weak symbols cannot be resolved 

Re: Use a specfile that actually allows building programs on NetBSD

2017-01-11 Thread Krister Walfridsson

On Mon, 9 Jan 2017, co...@sdf.org wrote:


3 month ping, 1 week ping (trying again), etc...


Apologies for not getting back to you sooner.



Like most operating systems, NetBSD has a libc which contains
stuff it needs for most programs to work, and people expect
it to be linked without explicitly specifying -lc.


Well, most programs already get -lc automatically -- it is only when you 
pass -shared that it fails... :-)


But I'll fix this, together with some other SPEC issues, in a few days.

   /Krister


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-11 Thread Martin Sebor

On 01/11/2017 02:05 AM, Christophe Lyon wrote:

Hi Martin,

On 9 January 2017 at 04:14, Jeff Law  wrote:

On 01/08/2017 02:04 PM, Martin Sebor wrote:


On 01/06/2017 09:45 AM, Jeff Law wrote:


On 01/05/2017 08:52 PM, Martin Sebor wrote:


So Richi asked for removal of the VR_ANTI_RANGE handling, which would
imply removal of operand_signed_p.

What are the implications if we do that?



I just got back to this yesterday.  The implications of the removal
of the anti-range handling are a number of false negatives in the
test suite:


I was thinking more at a higher level.  ie, are the warnings still
useful if we don't have the anti-range handling?  I suspect so, but
would like to hear your opinion.


...


  n = ~[-4, MAX];   (I.e., n is either negative or too big.)
  p = malloc (n);


Understood.  The low level question is do we get these kinds of ranges
often enough in computations leading to allocation sizes?



My intuition tells me that they are likely common enough not to
disregard but I don't have a lot of data to back it up with.  In
a Bash build a full 23% of all checked calls are of this kind (24
out of 106).  In a Binutils build only 4% are (9 out of 228).  In
Glibc, a little under 3%.  My guess is that the number will be
inversely proportional to the quality of the code.


So I think you've made a case that we do want to handle this case.  So
what's left is how best to avoid the infinite recursion and mitigate the
pathological cases.

What you're computing seems to be "this object may have been derived
from a signed type".  Right?  It's a property we can compute for any
given SSA_NAME and it's not context/path specific beyond the
context/path information encoded by the SSA graph.

So just thinking out loud here, could we walk the IL once to identify
call sites and build a worklist of SSA_NAMEs we care about?  Then we
iterate on the worklist much like Aldy's code he's working on for the
unswitching vs uninitialized variable issue?



Thanks for the suggestion.  It occurred to me while working on the fix
for 78973 (the non-bug) that size ranges should be handled the same by
-Wstringop-overflow as by -Walloc-size-larger-than, and that both have
the same problem: missing or incomplete support for anti-ranges.  The
attached patch moves get_size_range() from builtins.c to calls.c and
adds better support for anti-ranges.  That solves the problems and also
lets us get rid of the objectionable operand_positive_p function.

Martin

PS The change to the alloc_max_size function is only needed to make
it possible to specify any argument to the -Walloc-size-larger-than
option, including 0 and -1, so that allocations of any size, including
zero can be flagged.

gcc-78775.diff


PR tree-optimization/78775 - [7 Regression] ICE in
maybe_warn_alloc_args_overflow

gcc/ChangeLog:

PR tree-optimization/78775
* builtins.c (get_size_range): Move...
* calls.c: ...to here.
(alloc_max_size): Accept zero argument.
(operand_signed_p): Remove.
(maybe_warn_alloc_args_overflow): Call get_size_range.
* calls.h (get_size_range): Declare.

gcc/testsuite/ChangeLog:

PR tree-optimization/78775
* gcc.dg/attr-alloc_size-4.c: Add test cases.
* gcc.dg/pr78775.c: New test.
* gcc.dg/pr78973-2.c: New test.
* gcc.dg/pr78973.c: New test.





The new test (gcc.dg/pr78973.c) fails on arm targets (there's no warning).

In addition, I have noticed a new failure:
  gcc.dg/attr-alloc_size-4.c  (test for warnings, line 140)
on target arm-none-linux-gnueabihf --with-cpu=cortex-a5
(works fine --with-cpu=cortex-a9)


Thanks.  I'm tracking the test failure on powerpc64* in bug 79051.
It's caused by a VRP bug/limitation described in bug 79054.  Let
me add arm and m68k to the list of targets.

Martin


[PATCH][GIMPLE FE] Add parsing of MEM_REFs

2017-01-11 Thread Richard Biener

The following fills the gap of missed handling of MEM_REF parsing.
As we want to represent all info that is on a MEM_REF the existing
dumping isn't sufficient so I resorted to

  __MEM '<' type-name [ ',' number ] '>'
'(' [ '(' type-name ')' ] unary-expression
 [ '+' number ] ')'  */

where optional parts are in []s.  So for

  __MEM < int, 16 > ( (char *) &x + 1 )

we access x as 16-bit aligned int at offset 1 with TBAA type 'char'.

Naturally VIEW_CONVERT_EXPR would look like __VCE < int > (x) then,
TARGET_MEM_REF would simply get some additional operands in the
above function-like __MEM (rather than + step * index + index2).
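
As a rough illustration of the proposed __MEM grammar, the sketch below assembles the dumped form from its parts.  format_mem and its parameters are hypothetical helpers, not part of the patch, and the operand string "&x" in the usage note is an assumption based on "we access x".

```cpp
#include <cassert>
#include <string>

// Builds a string following the __MEM grammar quoted above.
//   align_bits == 0 : omit the optional alignment
//   ptr_type empty  : omit the optional pointer-type cast
//   offset == 0     : omit the optional "+ number"
inline std::string format_mem(const std::string& type, unsigned align_bits,
                              const std::string& ptr_type,
                              const std::string& operand, long offset) {
  std::string out = "__MEM <" + type;
  if (align_bits != 0) out += ", " + std::to_string(align_bits);
  out += "> (";
  if (!ptr_type.empty()) out += "(" + ptr_type + ")";
  out += operand;
  if (offset != 0) out += " + " + std::to_string(offset);
  out += ")";
  return out;
}
```

For example, format_mem ("int", 16, "char *", "&x", 1) yields "__MEM <int, 16> ((char *)&x + 1)", and with every optional part left out the result is simply "__MEM <int> (p)".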

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

The testcase parses, dumps and then parses again to the same output.

Any comments / objections to the syntax (of __MEM) and/or suggestions
for VIEW_CONVERT_EXPR or TARGET_MEM_REF?

As you can see I adjusted dumping of pointer constants (we can't
parse the B suffix and large unsigned numbers get a warning so
add 'U').  There's the general issue that we dump

  short x;
  x = 1;

and then lex the '1' as type int and there's no suffixes for integer
types smaller than int which means we can't write those constants
type correct :/  Suggestions welcome (currently we ICE with type
mismatches in those cases, we can avoid that by auto-fixing during
parsing but I'd like to be explicit somehow).
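
The literal-type problem can be seen in ordinary C++ as well (used here only as a proxy for the GIMPLE FE input): an unsuffixed constant is an int, and no suffix produces a short.  one_as_short is a hypothetical helper for illustration.

```cpp
#include <cassert>
#include <type_traits>

// An unsuffixed literal that fits in int has type int.
static_assert(std::is_same<decltype(1), int>::value, "1 is an int");
// Suffixes exist only for int-sized and wider types.
static_assert(std::is_same<decltype(1U), unsigned>::value, "U gives unsigned int");
static_assert(std::is_same<decltype(1L), long>::value, "L gives long");

// Only an explicit conversion produces a short-typed constant expression.
static_assert(std::is_same<decltype(static_cast<short>(1)), short>::value,
              "the cast yields short");

// Hypothetical helper: the value survives, only the type differs.
inline short one_as_short() { return static_cast<short>(1); }
```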

Thanks,
Richard.

2017-01-11  Richard Biener  

* tree-pretty-print.c (dump_generic_node): Provide -gimple
variant for MEM_REF.  Sanitize INTEGER_CST for -gimple.

c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Parse
__MEM.

* gcc.dg/gimplefe-21.c: New testcase.

Index: gcc/tree-pretty-print.c
===
*** gcc/tree-pretty-print.c (revision 244312)
--- gcc/tree-pretty-print.c (working copy)
*** dump_generic_node (pretty_printer *pp, t
*** 1459,1465 
  
  case MEM_REF:
{
!   if (integer_zerop (TREE_OPERAND (node, 1))
/* Dump the types of INTEGER_CSTs explicitly, for we can't
   infer them and MEM_ATTR caching will share MEM_REFs
   with differently-typed op0s.  */
--- 1459,1496 
  
  case MEM_REF:
{
!   if (flags & TDF_GIMPLE)
! {
!   pp_string (pp, "__MEM <");
!   dump_generic_node (pp, TREE_TYPE (node),
!  spc, flags | TDF_SLIM, false);
!   if (TYPE_ALIGN (TREE_TYPE (node))
!   != TYPE_ALIGN (TYPE_MAIN_VARIANT (TREE_TYPE (node
! {
!   pp_string (pp, ", ");
!   pp_decimal_int (pp, TYPE_ALIGN (TREE_TYPE (node)));
! }
!   pp_greater (pp);
!   pp_string (pp, " (");
!   if (TREE_TYPE (TREE_OPERAND (node, 0))
!   != TREE_TYPE (TREE_OPERAND (node, 1)))
! {
!   pp_left_paren (pp);
!   dump_generic_node (pp, TREE_TYPE (TREE_OPERAND (node, 1)),
!  spc, flags | TDF_SLIM, false);
!   pp_right_paren (pp);
! }
!   dump_generic_node (pp, TREE_OPERAND (node, 0),
!  spc, flags | TDF_SLIM, false);
!   if (! integer_zerop (TREE_OPERAND (node, 1)))
! {
!   pp_string (pp, " + ");
!   dump_generic_node (pp, TREE_OPERAND (node, 1),
!  spc, flags | TDF_SLIM, false);
! }
!   pp_right_paren (pp);
! }
!   else if (integer_zerop (TREE_OPERAND (node, 1))
/* Dump the types of INTEGER_CSTs explicitly, for we can't
   infer them and MEM_ATTR caching will share MEM_REFs
   with differently-typed op0s.  */
*** dump_generic_node (pretty_printer *pp, t
*** 1633,1639 
break;
  
  case INTEGER_CST:
!   if (TREE_CODE (TREE_TYPE (node)) == POINTER_TYPE)
{
  /* In the case of a pointer, one may want to divide by the
 size of the pointed-to type.  Unfortunately, this not
--- 1664,1671 
break;
  
  case INTEGER_CST:
!   if (TREE_CODE (TREE_TYPE (node)) == POINTER_TYPE
! && ! (flags & TDF_GIMPLE))
{
  /* In the case of a pointer, one may want to divide by the
 size of the pointed-to type.  Unfortunately, this not
*** dump_generic_node (pretty_printer *pp, t
*** 1661,1667 
else if (tree_fits_shwi_p (node))
pp_wide_integer (pp, tree_to_shwi (node));
else if (tree_fits_uhwi_p (node))
!   pp_unsigned_wide_integer (pp, tree_to_uhwi (node));
else
{
  wide_int val = node;
--- 1693,1703 
else if (tree_fits_shwi_p (node))
pp_wide_integer (pp, tree_to_shwi (node));
else if (tree_fits_uhwi_p (node))
!   {
! 

Re: [PATCH] TS_OPTIMIZATION/TS_TARGET_OPTION need no chain/type

2017-01-11 Thread Richard Biener
On Wed, 11 Jan 2017, Richard Biener wrote:

> 
> LTO bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> (most "gross" are still TS_LIST having a type and TS_VEC having type
> and chain, but that's been hard to fix with the C++ FE in place)

Forgot the tree-core.h part.

Re-bootstrapping and testing on x86_64-unknown-linux-gnu.

Richard.

2017-01-11  Richard Biener  

* tree.c (initialize_tree_contains_struct): Make TS_OPTIMIZATION
and TS_TARGET_OPTION directly derive from TS_BASE.
* tree-core.h (tree_optimization_option): Derive from tree_base.
(tree_target_option): Likewise.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 244309)
+++ gcc/tree.c  (working copy)
@@ -508,6 +508,8 @@ initialize_tree_contains_struct (void)
{
case TS_TYPED:
case TS_BLOCK:
+   case TS_OPTIMIZATION:
+   case TS_TARGET_OPTION:
  MARK_TS_BASE (code);
  break;
 
@@ -532,8 +534,6 @@ initialize_tree_contains_struct (void)
case TS_VEC:
case TS_BINFO:
case TS_OMP_CLAUSE:
-   case TS_OPTIMIZATION:
-   case TS_TARGET_OPTION:
  MARK_TS_COMMON (code);
  break;
 
Index: gcc/tree-core.h
===
--- gcc/tree-core.h (revision 244309)
+++ gcc/tree-core.h (working copy)
@@ -1794,7 +1794,7 @@ struct GTY(()) tree_statement_list
 /* Optimization options used by a function.  */
 
 struct GTY(()) tree_optimization_option {
-  struct tree_common common;
+  struct tree_base base;
 
   /* The optimization options used by the user.  */
   struct cl_optimization *opts;
@@ -1815,7 +1815,7 @@ struct GTY(()) target_globals;
 /* Target options used by a function.  */
 
 struct GTY(()) tree_target_option {
-  struct tree_common common;
+  struct tree_base base;
 
   /* Target globals for the corresponding target option.  */
   struct target_globals *globals;


Re: [PATCH] PR78255: Make postreload aware of NO_FUNCTION_CSE

2017-01-11 Thread Andre Vieira (lists)
On 06/01/17 15:47, Jeff Law wrote:
> On 01/06/2017 03:53 AM, Andre Vieira (lists) wrote:
>> On 09/12/16 16:31, Bernd Schmidt wrote:
>>> On 12/09/2016 05:16 PM, Andre Vieira (lists) wrote:
>>>
 Regardless, 'reload_cse_simplify' would never perform the opposite
 transformation.  It checks whether it can replace anything within the
 first argument INSN, with the second argument TESTREG. As the name
 implies this will always be a register. I double checked, the function
 is only called in 'reload_cse_regs' and 'testreg' is created using
 'gen_rtx_REG'.
>>>
>>> Ok, let's go ahead with it.
>>>
>>>
>>> Bernd
>>>
>> Hello,
>>
>> Is it OK to backport this (including the testcase fix) to gcc-6-branch?
>>
>> Patches apply cleanly and full bootstrap and regression tests for
>> aarch64- and arm-none-linux-gnueabihf. Regression tested for
>> arm-none-eabi.
> Yes, that should be fine to backport to the active release branches.
> 
> jeff
OK, I have committed the backports to gcc-5 and gcc-6 branches.

Cheers,
Andre


[PATCH] PR78134 fix return types of heterogeneous lookup functions

2017-01-11 Thread Jonathan Wakely

As with PR 68190 I was returning the _Rb_tree iterator types, not
converting them to the container's iterator types.
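
A small check of the intended behaviour, assuming C++14 and a transparent comparator (std::less<>): the heterogeneous overloads should hand back the container's own iterator types.  The Map alias and lookup_demo are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

using Map = std::map<std::string, int, std::less<>>;  // std::less<> is transparent

// A const char* argument selects the heterogeneous (template) overloads.
static_assert(
    std::is_same<decltype(std::declval<Map&>().lower_bound(
                     std::declval<const char*>())),
                 Map::iterator>::value,
    "non-const lower_bound returns iterator");
static_assert(
    std::is_same<decltype(std::declval<const Map&>().lower_bound(
                     std::declval<const char*>())),
                 Map::const_iterator>::value,
    "const lower_bound returns const_iterator");
static_assert(
    std::is_same<decltype(std::declval<Map&>().equal_range(
                     std::declval<const char*>())),
                 std::pair<Map::iterator, Map::iterator>>::value,
    "equal_range returns a pair of iterators");

inline int lookup_demo() {
  Map m{{"a", 1}, {"b", 2}, {"c", 3}};
  Map::iterator it = m.lower_bound("b");  // no std::string temporary is needed
  return it->second;
}
```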

PR libstdc++/78134
* include/bits/stl_map.h (map::lower_bound, map::upper_bound)
(map::equal_range): Fix return type of heterogeneous overloads.
* include/bits/stl_multimap.h (multimap::lower_bound)
(multimap::upper_bound, multimap::equal_range): Likewise.
* include/bits/stl_multiset.h (multiset::lower_bound)
(multiset::upper_bound, multiset::equal_range): Likewise.
* include/bits/stl_set.h (set::lower_bound, set::upper_bound)
(set::equal_range): Likewise.
* testsuite/23_containers/map/operations/2.cc
* testsuite/23_containers/multimap/operations/2.cc
* testsuite/23_containers/multiset/operations/2.cc
* testsuite/23_containers/set/operations/2.cc

Tested powerpc64le-linux, committed to trunk.

I'll backport this to the branches too.

commit c07b13cef46d2ca88891dc5bb91539fdde5eb5e4
Author: Jonathan Wakely 
Date:   Wed Jan 11 14:32:44 2017 +

PR78134 fix return types of heterogeneous lookup functions

PR libstdc++/78134
* include/bits/stl_map.h (map::lower_bound, map::upper_bound)
(map::equal_range): Fix return type of heterogeneous overloads.
* include/bits/stl_multimap.h (multimap::lower_bound)
(multimap::upper_bound, multimap::equal_range): Likewise.
* include/bits/stl_multiset.h (multiset::lower_bound)
(multiset::upper_bound, multiset::equal_range): Likewise.
* include/bits/stl_set.h (set::lower_bound, set::upper_bound)
(set::equal_range): Likewise.
* testsuite/23_containers/map/operations/2.cc
* testsuite/23_containers/multimap/operations/2.cc
* testsuite/23_containers/multiset/operations/2.cc
* testsuite/23_containers/set/operations/2.cc

diff --git a/libstdc++-v3/include/bits/stl_map.h 
b/libstdc++-v3/include/bits/stl_map.h
index 91b80d9..194ce42 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -1218,8 +1218,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
lower_bound(const _Kt& __x)
-   -> decltype(_M_t._M_lower_bound_tr(__x))
-   { return _M_t._M_lower_bound_tr(__x); }
+   -> decltype(iterator(_M_t._M_lower_bound_tr(__x)))
+   { return iterator(_M_t._M_lower_bound_tr(__x)); }
 #endif
   //@}
 
@@ -1243,8 +1243,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
lower_bound(const _Kt& __x) const
-   -> decltype(_M_t._M_lower_bound_tr(__x))
-   { return _M_t._M_lower_bound_tr(__x); }
+   -> decltype(const_iterator(_M_t._M_lower_bound_tr(__x)))
+   { return const_iterator(_M_t._M_lower_bound_tr(__x)); }
 #endif
   //@}
 
@@ -1263,8 +1263,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
upper_bound(const _Kt& __x)
-   -> decltype(_M_t._M_upper_bound_tr(__x))
-   { return _M_t._M_upper_bound_tr(__x); }
+   -> decltype(iterator(_M_t._M_upper_bound_tr(__x)))
+   { return iterator(_M_t._M_upper_bound_tr(__x)); }
 #endif
   //@}
 
@@ -1283,8 +1283,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
upper_bound(const _Kt& __x) const
-   -> decltype(_M_t._M_upper_bound_tr(__x))
-   { return _M_t._M_upper_bound_tr(__x); }
+   -> decltype(const_iterator(_M_t._M_upper_bound_tr(__x)))
+   { return const_iterator(_M_t._M_upper_bound_tr(__x)); }
 #endif
   //@}
 
@@ -1312,8 +1312,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
equal_range(const _Kt& __x)
-   -> decltype(_M_t._M_equal_range_tr(__x))
-   { return _M_t._M_equal_range_tr(__x); }
+   -> decltype(pair(_M_t._M_equal_range_tr(__x)))
+   { return pair(_M_t._M_equal_range_tr(__x)); }
 #endif
   //@}
 
@@ -1341,8 +1341,12 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
equal_range(const _Kt& __x) const
-   -> decltype(_M_t._M_equal_range_tr(__x))
-   { return _M_t._M_equal_range_tr(__x); }
+   -> decltype(pair(
+ _M_t._M_equal_range_tr(__x)))
+   {
+ return pair(
+ _M_t._M_equal_range_tr(__x));
+   }
 #endif
   //@}
 
diff --git a/libstdc++-v3/include/bits/stl_multimap.h 
b/libstdc++-v3/include/bits/stl_multimap.h
index 98af1ba..8b37de9 100644
--- a/libstdc++-v3/include/bits/stl_multimap.h
+++ b/libstdc++-v3/include/bits/stl_multimap.h
@@ -887,8 +887,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
lower_bound(const _Kt& __x)
-   -> decltype(_M_t._M_lower_bound_tr(__x))
-   { return _M_t._M_lower_bound_tr(__x); }
+   -> decltype(iterator(_M_t._M_lower_bound_tr(__x)))
+   { return 

[PATCH] PR78273 fix count to work with partitioning function

2017-01-11 Thread Jonathan Wakely

I thought it would be an optimization to use _M_find_tr(k) != end()
for the unique associative containers, but as the PR points out the
heterogeneous version of count() can find multiple matches even in a
unique container. We need to use _M_count_tr(k) to find all matches.
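
To see why more than one element can match, here is a sketch along the lines of the new test: with a comparator that merely partitions the keys, the equivalent elements form a range.  PartitionCmp and count_in_partition are illustrative names, not library code.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <set>

struct PartitionCmp {
  struct Partition {};
  using is_transparent = void;
  bool operator()(int l, int r) const { return l < r; }
  // Elements below 2 sort before the partition, elements above 4 after it;
  // 2, 3 and 4 are therefore all equivalent to it.
  bool operator()(int l, Partition) const { return l < 2; }
  bool operator()(Partition, int r) const { return 4 < r; }
};

// Counts the matches as the length of the equal_range for the partition.
inline std::size_t count_in_partition(const std::set<int, PartitionCmp>& s) {
  auto range = s.equal_range(PartitionCmp::Partition{});
  return static_cast<std::size_t>(std::distance(range.first, range.second));
}
```

A find-based count would report at most one match here, whereas the range holds three elements for the set { 1, 2, 3, 4, 5 }.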

PR libstdc++/78273
* include/bits/stl_map.h (map::count<_Kt>(const _Kt&)): Don't assume
the heterogeneous comparison can only find one match.
* include/bits/stl_set.h (set::count<_Kt>(const _Kt&)): Likewise.
* testsuite/23_containers/map/operations/2.cc: Test count works with
comparison function that just partitions rather than sorting.
* testsuite/23_containers/set/operations/2.cc: Likewise.

Tested powerpc64le-linux, committed to trunk.

I'll also backport this to the branches.

commit 8e5f512a435d4ccc7b592a7c4872418943a5c9c7
Author: Jonathan Wakely 
Date:   Wed Jan 11 13:49:02 2017 +

PR78273 fix count to work with partitioning function

PR libstdc++/78273
* include/bits/stl_map.h (map::count<_Kt>(const _Kt&)): Don't assume
the heterogeneous comparison can only find one match.
* include/bits/stl_set.h (set::count<_Kt>(const _Kt&)): Likewise.
* testsuite/23_containers/map/operations/2.cc: Test count works with
comparison function that just partitions rather than sorting.
* testsuite/23_containers/set/operations/2.cc: Likewise.

diff --git a/libstdc++-v3/include/bits/stl_map.h 
b/libstdc++-v3/include/bits/stl_map.h
index f2a0ffa..91b80d9 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -1194,7 +1194,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   template
auto
count(const _Kt& __x) const -> decltype(_M_t._M_count_tr(__x))
-   { return _M_t._M_find_tr(__x) == _M_t.end() ? 0 : 1; }
+   { return _M_t._M_count_tr(__x); }
 #endif
   //@}
 
diff --git a/libstdc++-v3/include/bits/stl_set.h 
b/libstdc++-v3/include/bits/stl_set.h
index 66560a7..ab960f1 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -739,7 +739,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
auto
count(const _Kt& __x) const
-> decltype(_M_t._M_count_tr(__x))
-   { return _M_t._M_find_tr(__x) == _M_t.end() ? 0 : 1; }
+   { return _M_t._M_count_tr(__x); }
 #endif
   //@}
 
diff --git a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc 
b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
index 6509084..ef4e76b 100644
--- a/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/operations/2.cc
@@ -133,6 +133,27 @@ test05()
   VERIFY( Cmp::count == 0);
 }
 
+void
+test06()
+{
+  // PR libstdc++/78273
+
+  struct C {
+bool operator()(int l, int r) const { return l < r; }
+
+struct Partition { };
+
+bool operator()(int l, Partition) const { return l < 2; }
+bool operator()(Partition, int r) const { return 4 < r; }
+
+using is_transparent = void;
+  };
+
+  std::map<int, int, C> m{ {1,0}, {2,0}, {3,0}, {4, 0}, {5, 0} };
+
+  auto n = m.count(C::Partition{});
+  VERIFY( n == 3 );
+}
 
 int
 main()
@@ -142,4 +163,5 @@ main()
   test03();
   test04();
   test05();
+  test06();
 }
diff --git a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc 
b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
index aa71ae5..aef808d 100644
--- a/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
+++ b/libstdc++-v3/testsuite/23_containers/set/operations/2.cc
@@ -150,6 +150,28 @@ test06()
   s.find(i);
 }
 
+void
+test07()
+{
+  // PR libstdc++/78273
+
+  struct C {
+bool operator()(int l, int r) const { return l < r; }
+
+struct Partition { };
+
+bool operator()(int l, Partition) const { return l < 2; }
+bool operator()(Partition, int r) const { return 4 < r; }
+
+using is_transparent = void;
+  };
+
+  std::set<int, C> s{ 1, 2, 3, 4, 5 };
+
+  auto n = s.count(C::Partition{});
+  VERIFY( n == 3 );
+}
+
 int
 main()
 {
@@ -159,4 +181,5 @@ main()
   test04();
   test05();
   test06();
+  test07();
 }


[arm] Replace command-line option .def files with single definition file

2017-01-11 Thread Richard Earnshaw (lists)
The files arm-cores.def, arm-fpus.def and arm-arches.def are parsed and
used in several places and the format is slightly awkward to maintain
as they must be parsable in C and by certain scripts.  Furthermore,
changes to the content that affect every entry are particularly awkward
when dealing with merges.

This patch replaces all three files with a single file that specifies
all the command-line related definitions in a new format that allows for
better consistency checking as well as (hopefully) changes that are
easier to merge.

The awk script used to parse it is relatively complicated, but should be
pretty portable.  It works by parsing in all the data and then running
one of a number of possible sub-commands to generate the desired output.

The new method picked up one error.  The CPU descriptions referred to an
architecture ARMv5tej which was not supported by -march.  This has been
fixed by adding the relevant entry to the architecture list.
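
The kind of cross-check that caught this can be sketched as follows.  This is not the awk script; CpuEntry and cpus_with_unknown_arch are hypothetical, and the real arm-cpus.in format is not reproduced here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified record: a CPU and the architecture it refers to.
struct CpuEntry {
  std::string name;
  std::string arch;
};

// Returns the CPUs whose architecture is missing from the architecture list.
inline std::vector<std::string> cpus_with_unknown_arch(
    const std::vector<CpuEntry>& cpus, const std::set<std::string>& arches) {
  std::vector<std::string> bad;
  for (const CpuEntry& c : cpus)
    if (arches.count(c.arch) == 0) bad.push_back(c.name);
  return bad;
}
```

Once all definitions live in a single parsed data set, a check like this is a simple loop, which is harder to arrange when the data is spread over several macro-style files.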

gcc:

* config.gcc: Use new awk script to check CPU, FPU and architecture
parameters for --with-... options.
* config/arm/parsecpu.awk: New file.
* config/arm/arm-cpus.in: New file.
* config/arm/arm-opts.h: Include arm-cpu.h instead of processing .def
files.
* config/arm/arm.c: Include arm-cpu-data.h instead of processing .def
files.
* config/arm/t-arm: Update dependency rules.
* common/config/arm/arm-common.c: Include arm-cpu-cdata.h instead
of processing .def files.
* config/arm/genopt.sh: Deleted.
* config/arm/gentune.sh: Deleted.
* config/arm/arm-cores.def: Deleted.
* config/arm/arm-arches.def: Deleted.
* config/arm/arm-fpus.def: Deleted.
* config/arm/arm-tune.md: Regenerated.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm-cpu.h: New generated file.
* config/arm/arm-cpu-data.h: New generated file.
* config/arm/arm-cpu-cdata.h: New generated file.

contrib:
* gcc_update: Adjust touch list.
diff --git a/contrib/gcc_update b/contrib/gcc_update
index 2df9da4..ab5b33d 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -80,8 +80,11 @@ gcc/cstamp-h.in: gcc/configure.ac
 gcc/config.in: gcc/cstamp-h.in
 gcc/fixinc/fixincl.x: gcc/fixinc/fixincl.tpl gcc/fixinc/inclhack.def
 gcc/config/aarch64/aarch64-tune.md: gcc/config/aarch64/aarch64-cores.def 
gcc/config/aarch64/gentune.sh
-gcc/config/arm/arm-tune.md: gcc/config/arm/arm-cores.def 
gcc/config/arm/gentune.sh
-gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-arches.def 
gcc/config/arm/arm-cores.def gcc/config/arm/arm-fpus.def 
gcc/config/arm/genopt.sh
+gcc/config/arm/arm-tune.md: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecpu.awk
+gcc/config/arm/arm-tables.opt: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecpu.awk
+gcc/config/arm/arm-cpu.h: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecpu.awk
+gcc/config/arm/arm-cpu-data.h: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecpu.awk
+gcc/config/arm/arm-cpu-cdata.h: gcc/config/arm/arm-cpus.in 
gcc/config/arm/parsecpu.awk
 gcc/config/avr/avr-tables.opt: gcc/config/avr/avr-mcus.def 
gcc/config/avr/genopt.sh
 gcc/config/avr/t-multilib: gcc/config/avr/avr-mcus.def 
gcc/config/avr/genmultilib.awk
 gcc/config/c6x/c6x-tables.opt: gcc/config/c6x/c6x-isas.def 
gcc/config/c6x/genopt.sh
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 4103c67..7ecc68d 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -104,19 +104,7 @@ struct arm_arch_core_flag
   const enum isa_feature isa_bits[isa_num_bits];
 };
 
-static const struct arm_arch_core_flag arm_arch_core_flags[] =
-{
-#undef ARM_CORE
-#define ARM_CORE(NAME, X, IDENT, TUNE_FLAGS, ARCH, ISA, COSTS) \
-  {NAME, {ISA isa_nobit}},
-#include "config/arm/arm-cores.def"
-#undef ARM_CORE
-#undef ARM_ARCH
-#define ARM_ARCH(NAME, CORE, TUNE_FLAGS, ARCH, ISA)\
-  {NAME, {ISA isa_nobit}},
-#include "config/arm/arm-arches.def"
-#undef ARM_ARCH
-};
+#include "config/arm/arm-cpu-cdata.h"
 
 /* Scan over a raw feature array BITS checking for BIT being present.
This is slower than the normal bitmask checks, but we would spend longer
diff --git a/gcc/config.gcc b/gcc/config.gcc
index bb25d54..69fddda 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3660,41 +3660,24 @@ case "${target}" in
 
arm*-*-*)
supported_defaults="arch cpu float tune fpu abi mode tls"
-   for which in cpu tune; do
-   # See if it matches any of the entries in arm-cores.def
+   for which in cpu tune arch; do
+   # See if it matches a supported value
eval "val=\$with_$which"
-   if [ x"$val" = x ] \
-   || grep "^ARM_CORE(\"$val\"," \
-   ${srcdir}/config/arm/arm-cores.def \
-

Re: [patch,avr] PR78883: Implement a dummy scheduler

2017-01-11 Thread Georg-Johann Lay

On 04.01.2017 20:29, Jeff Law wrote:

On 01/04/2017 12:18 PM, Segher Boessenkool wrote:

On Wed, Jan 04, 2017 at 06:42:23PM +, Richard Sandiford wrote:

1. reload has a bug that no-one really wants to fix (understandable)
2. the bug is triggered by paradoxical subregs of mems
3. those subregs are normally disabled on targets that support insn
   scheduling
4. therefore, define an insn scheduler
5. we don't actually want insn scheduling, so either
   (a) make sure the insn scheduler is never actually used for insn
   scheduling, or
   (b) allow the insn scheduler to run anyway but encourage it to do
nothing
   (other than take compile time)

(4) and (5) feel like too much of a hack to me.  They're going to have
other consequences, e.g. we'll no longer give the warning:

  instruction scheduling not supported on this target machine

if users try to use -fschedule-insns.  And since we don't support
meaningful insn scheduling even after this patch, giving the warning
seems more user-friendly than dropping it.

I think the consensus is that we don't want these subregs for AVR
regardless of whether scheduling is used, and probably wouldn't want
them even without this bug.


Right, and the same is true for most targets.  Subregs of memory are not
something you want.  As rtl.texi says:


@item mem
@code{subreg}s of @code{mem} were common in earlier versions of GCC and
are still supported.  During the reload pass these are replaced by plain
@code{mem}s.  On machines that do not do instruction scheduling, use of
@code{subreg}s of @code{mem} are still used, but this is no longer
recommended.  Such @code{subreg}s are considered to be
@code{register_operand}s rather than @code{memory_operand}s before and
during reload.  Because of this, the scheduling passes cannot properly
schedule instructions with @code{subreg}s of @code{mem}, so for machines
that do scheduling, @code{subreg}s of @code{mem} should never be used.
To support this, the combine and recog passes have explicit code to
inhibit the creation of @code{subreg}s of @code{mem} when
@code{INSN_SCHEDULING} is defined.



So why not instead change the condition
used by general_operand, like we were talking about yesterday?
It seems simpler and more direct.


We should split off a new "SUBREGS_OF_MEM_ALLOWED" from !INSN_SCHEDULING,
and then probably even default it to false.

That would work for me :-)  The question in my mind would be unexpected
fallout at this point in the release process.  Maybe default it to
!INSN_SCHEDULING to minimize such fallout now, then to false for gcc-8?
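
A sketch of what that default could look like.  SUBREGS_OF_MEM_ALLOWED is only the name floated in this thread, not an existing macro, and paradoxical_subreg_of_mem_ok is a simplified, hypothetical stand-in for the check in general_operand.

```cpp
#include <cassert>

// Proposed (hypothetical) macro: a target may define it explicitly;
// otherwise it defaults to !INSN_SCHEDULING to keep today's behaviour.
#ifndef SUBREGS_OF_MEM_ALLOWED
# ifdef INSN_SCHEDULING
#  define SUBREGS_OF_MEM_ALLOWED 0
# else
#  define SUBREGS_OF_MEM_ALLOWED 1
# endif
#endif

// Simplified stand-in for the predicate: paradoxical subregs of mem are
// accepted if the target allows them, or once reload has completed.
inline bool paradoxical_subreg_of_mem_ok(bool reload_completed) {
  return SUBREGS_OF_MEM_ALLOWED || reload_completed;
}
```

A target like AVR could then define the macro to 0 without having to pretend it has an insn scheduler.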


jeff


But if we disable it, what's the point of introducing changes to combine
which come up with even more of such subregs?

For targets with scheduling, which applies to most of the targets, the
"optimization" in combine will be void as rejected by general_operand,
hence a target would have explicit paradoxical subregs in the back end
or use some home-brew predicates that allow that stuff and use internal
knowledge of what combine does.

Moreover I have some problems in explaining what the new hook macro is
supposed to do:

"Disable/enable paradoxical SUBREGs of MEM in general_operands before
register allocation.  Use this hook if your back end has trouble with
paradoxical subregs of mem.  Enabled per default iff the target
provides an insn scheduler."

Who would understand this and infer from the docs whether this macro
should be used?

And if such subregs are forbidden, why are they generated in the first
place? Shouldn't combine also respect that hook?

Johann




Re: [PATCH] PR77528 add default constructors for container adaptors

2017-01-11 Thread Tim Song
On Wed, Jan 11, 2017 at 8:30 AM, Jonathan Wakely  wrote:

>>> Re the new DMI, my brain compiler says that _Sequence c = _Sequence();
>>> breaks anything with an explicit copy/move constructor pre-C++17, but
>>> I also don't think we care about those, right?
>>
>>
>> I dislike them,
>
>
> I meant to add "but we try to support them where plausible".
>
> If the standard requires CopyConstructible then we don't need to care
> about explicit copy constructors. But if it only requires
> is_copy_constructible then that does work with explicit copy ctors.
> And if it says one thing, but means the other, then we just have to
> guess what's intended! :-)
>
>

Clause 23 is ... not a model of clarity. It depends on how strongly
you read the "any sequence container" phrasing, I suppose. Table 83
requires X u = a; to work for containers, but it also requires a == b
to work.
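
For reference, the difference between the two requirements can be checked directly.  ExplicitCopy is a hypothetical type with an explicit copy constructor; the traits show that direct-initialization works while copy-initialization (the "X u = a;" form) does not.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type whose copy constructor is explicit.
struct ExplicitCopy {
  ExplicitCopy() = default;
  explicit ExplicitCopy(const ExplicitCopy&) = default;
};

// Direct-initialization may use an explicit constructor, so this holds.
static_assert(std::is_copy_constructible<ExplicitCopy>::value,
              "direct-initialization from an lvalue works");
// Copy-initialization may not, which is what is_convertible tests.
static_assert(!std::is_convertible<const ExplicitCopy&, ExplicitCopy>::value,
              "copy-initialization from an lvalue is rejected");

inline bool direct_copy_works() {
  ExplicitCopy a;
  ExplicitCopy b(a);  // fine: direct-initialization
  (void)b;
  return true;
}
```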

There's also the problem of Compare (for which I don't see any requirement
w/r/t CopyConstructible and the like). It does say things like
"initializes comp with x", but doesn't say what kind of
initialization...


Re: [PATCH] PR77528 add default constructors for container adaptors

2017-01-11 Thread Jonathan Wakely

On 11/01/17 13:25 +, Jonathan Wakely wrote:

On 11/01/17 08:04 -0500, Tim Song wrote:

On Wed, Jan 11, 2017 at 7:21 AM, Jonathan Wakely  wrote:

This patch uses the _Enable_default_constructor mixin to properly
delete the default constructors. It's a bit cumbersome, because we
have to add an initializer for the base class to every
ctor-initializer-list, but I think I prefer this to making the default
constructor a constrained template.



I'm happy with either approach - my primary concern is making sure
that is_constructible and friends work and don't lie, in a world where
increasing numbers of library components depend on it. Though I'm a


Yes, it's important that we give the right answer.


bit curious as to why you found this approach more preferable.


I dislike making functions into templates when they aren't "supposed"
to be. But I'm in two minds for this case. It's certainly a smaller,
more self-contained change to just add a default constructor template
and not mess about with a new base class and DMIs and all those
mem-initializers.
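
For comparison, the constrained-template alternative mentioned above can be sketched like this.  adaptor_sketch and NoDefault are hypothetical names, and this is a simplified illustration rather than the libstdc++ code.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <type_traits>

// Hypothetical sequence-like type without a default constructor.
struct NoDefault {
  explicit NoDefault(int) {}
};

template <typename Seq>
class adaptor_sketch {
public:
  // Participates in overload resolution only if Seq is default constructible.
  template <typename S = Seq,
            typename = typename std::enable_if<
                std::is_default_constructible<S>::value>::type>
  adaptor_sketch() : c() {}

  explicit adaptor_sketch(const Seq& s) : c(s) {}

  std::size_t size() const { return c.size(); }

protected:
  Seq c;
};

// The traits now give the right answer in both directions.
static_assert(
    std::is_default_constructible<adaptor_sketch<std::deque<int>>>::value,
    "default constructible when the sequence is");
static_assert(!std::is_default_constructible<adaptor_sketch<NoDefault>>::value,
              "not default constructible otherwise");
static_assert(
    std::is_constructible<adaptor_sketch<NoDefault>, const NoDefault&>::value,
    "still constructible from a sequence");

inline std::size_t default_size() {
  adaptor_sketch<std::deque<int>> a;
  return a.size();
}
```

The cost is the one described above: the default constructor becomes a template even though it is not "supposed" to be one.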


Re the new DMI, my brain compiler says that _Sequence c = _Sequence();
breaks anything with an explicit copy/move constructor pre-C++17, but
I also don't think we care about those, right?


I dislike them,


I meant to add "but we try to support them where plausible".

If the standard requires CopyConstructible then we don't need to care
about explicit copy constructors. But if it only requires
is_copy_constructible then that does work with explicit copy ctors.
And if it says one thing, but means the other, then we just have to
guess what's intended! :-)


but maybe the fact they won't work here is a strong
enough reason to get over my dislike of the original approach and just
do it that way instead.




Re: [PATCH] PR77528 add default constructors for container adaptors

2017-01-11 Thread Jonathan Wakely

On 11/01/17 08:04 -0500, Tim Song wrote:

On Wed, Jan 11, 2017 at 7:21 AM, Jonathan Wakely  wrote:

This patch uses the _Enable_default_constructor mixin to properly
delete the default constructors. It's a bit cumbersome, because we
have to add an initializer for the base class to every
ctor-initializer-list, but I think I prefer this to making the default
constructor a constrained template.



I'm happy with either approach - my primary concern is making sure
that is_constructible and friends work and don't lie, in a world where
increasing numbers of library components depend on it. Though I'm a


Yes, it's important that we give the right answer.


bit curious as to why you found this approach more preferable.


I dislike making functions into templates when they aren't "supposed"
to be. But I'm in two minds for this case. It's certainly a smaller,
more self-contained change to just add a default constructor template
and not mess about with a new base class and DMIs and all those
mem-initializers.


Re the new DMI, my brain compiler says that _Sequence c = _Sequence();
breaks anything with an explicit copy/move constructor pre-C++17, but
I also don't think we care about those, right?


I dislike them, but maybe the fact they won't work here is a strong
enough reason to get over my dislike of the original approach and just
do it that way instead.




Re: [PATCH C++] Fix PR70182 -- missing "on" in mangling of unresolved operators

2017-01-11 Thread Nathan Sidwell

On 01/11/2017 08:16 AM, Markus Trippelsdorf wrote:


--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -2813,6 +2813,8 @@ write_template_args (tree args)
 static void
 write_member_name (tree member)
 {
+  if (abi_version_at_least (11) && IDENTIFIER_OPNAME_P (member))
+write_string ("on");


It looks like you need to:
1) add documentation to doc/invoke.texi (-fabi-version)
2) add something like:
  if (abi_warn_or_compat_version_crosses (11))
G.need_abi_warning = 1;
into that if clause.
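
A simplified model of the behaviour being discussed, for illustration only: mangle_member_name and strip_on_prefix are hypothetical helpers that handle bare operator codes such as "pl", not GCC's or libiberty's functions.  Treating version 0 as "latest" is an assumption based on the -fabi-version=0 option used in the new test.

```cpp
#include <cassert>
#include <string>

// Mangler side: prefix operator names with "on" for ABI version 11 and later.
// Version 0 is assumed to select the latest ABI.
inline std::string mangle_member_name(const std::string& code, bool is_operator,
                                      int abi_version) {
  bool at_least_11 = (abi_version == 0 || abi_version >= 11);
  if (is_operator && at_least_11) return "on" + code;
  return code;
}

// Demangler side: accept an operator code with or without the "on" prefix,
// so names produced under both old and new ABI versions still demangle.
inline std::string strip_on_prefix(const std::string& mangled) {
  if (mangled.compare(0, 2, "on") == 0) return mangled.substr(2);
  return mangled;
}
```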

nathan
--
Nathan Sidwell


[PATCH C++] Fix PR70182 -- missing "on" in mangling of unresolved operators

2017-01-11 Thread Markus Trippelsdorf
The ABI says:


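<unresolved-name> ::= [gs] <base-unresolved-name>
                  ::= sr <unresolved-type> <base-unresolved-name>
                  ::= srN <unresolved-type> <unresolved-qualifier-level>+ E <base-unresolved-name>
                  ::= [gs] sr <unresolved-qualifier-level>+ E <base-unresolved-name>

<base-unresolved-name> ::= <simple-id>
                       ::= on <operator-name>
                       ::= on <operator-name> <template-args>
                       ::= dn <destructor-name>
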
diff --git a/gcc/testsuite/g++.dg/abi/mangle37.C 
b/gcc/testsuite/g++.dg/abi/mangle37.C
index 691566b384ba..4dd87e84c108 100644
--- a/gcc/testsuite/g++.dg/abi/mangle37.C
+++ b/gcc/testsuite/g++.dg/abi/mangle37.C
@@ -1,5 +1,6 @@
 // Testcase for mangling of expressions involving operator names.
 // { dg-do compile { target c++11 } }
+// { dg-options "-fabi-version=10" }
 // { dg-final { scan-assembler "_Z1fI1AEDTclonplfp_fp_EET_" } }
 // { dg-final { scan-assembler "_Z1gI1AEDTclonplIT_Efp_fp_EES1_" } }
 // { dg-final { scan-assembler "_Z1hI1AEDTcldtfp_miEET_" } }
diff --git a/gcc/testsuite/g++.dg/abi/pr70182.C 
b/gcc/testsuite/g++.dg/abi/pr70182.C
new file mode 100644
index ..d299362910c1
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/pr70182.C
@@ -0,0 +1,28 @@
+// { dg-options "-fabi-version=0" }
+
+struct A {
+  template <typename T> int f ();
+  int operator+();
+  operator int ();
+  template <typename T> 
+  int operator-();
+};
+
+typedef int (A::*P)();
+
+template <P> struct S {};
+
+template <typename T> void g (S<&T::template f<int> >) {}
+template <typename T> void g (S<&T::operator+ >) {}
+template <typename T> void g (S<&T::operator int>) {}
+template <typename T> void g (S<&T::template operator- <double> >) {}
+
+template void g<A> (S<&A::f<int> >);
+template void g<A> (S<&A::operator+>);
+template void g<A> (S<&A::operator int>);
+template void g<A> (S<&A::operator-<double> >);
+
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_1fIiEEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_onplEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_oncviEE } }
+// { dg-final { scan-assembler _Z1gI1AEv1SIXadsrT_onmiIdEEE } }
diff --git a/gcc/testsuite/g++.dg/dfp/mangle-1.C 
b/gcc/testsuite/g++.dg/dfp/mangle-1.C
index 455d3e4c0ef6..ee9644b27a53 100644
--- a/gcc/testsuite/g++.dg/dfp/mangle-1.C
+++ b/gcc/testsuite/g++.dg/dfp/mangle-1.C
@@ -1,4 +1,5 @@
 // { dg-do compile }
+// { dg-options "-fabi-version=10" }
 
 // Mangling of classes from std::decimal are special-cased.
 // Derived from g++.dg/abi/mangle13.C.
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index d84929eca20d..f0dbf9381c6b 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -1594,6 +1594,8 @@ d_unqualified_name (struct d_info *di)
 ret = d_source_name (di);
   else if (IS_LOWER (peek))
 {
+  if (peek == 'o' && d_peek_next_char (di) == 'n')
+   d_advance (di, 2);
   ret = d_operator_name (di);
   if (ret != NULL && ret->type == DEMANGLE_COMPONENT_OPERATOR)
{
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index 07e258fe58b3..c1cfa1545eca 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4682,3 +4682,10 @@ _ZZ3foovE8localVar__9_
 
 _ZZ3foovE8localVar__12
 _ZZ3foovE8localVar__12
+
+# PR 70182
+_Z1gI1AEv1SIXadsrT_onplEE
+void g<A>(S<&A::operator+>)
+
+_Z1gI1AEv1SIXadsrT_plEE
+void g<A>(S<&A::operator+>)
-- 
Markus


[PATCH] TS_OPTIMIZATION/TS_TARGET_OPTION need no chain/type

2017-01-11 Thread Richard Biener

LTO bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

(most "gross" are still TS_LIST having a type and TS_VEC having type
and chain, but that's been hard to fix with the C++ FE in place)

Richard.

2017-01-11  Richard Biener  

* tree.c (initialize_tree_contains_struct): Make TS_OPTIMIZATION
and TS_TARGET_OPTION directly derive from TS_BASE.

Index: gcc/tree.c
===
--- gcc/tree.c  (revision 244309)
+++ gcc/tree.c  (working copy)
@@ -508,6 +508,8 @@ initialize_tree_contains_struct (void)
{
case TS_TYPED:
case TS_BLOCK:
+   case TS_OPTIMIZATION:
+   case TS_TARGET_OPTION:
  MARK_TS_BASE (code);
  break;
 
@@ -532,8 +534,6 @@ initialize_tree_contains_struct (void)
case TS_VEC:
case TS_BINFO:
case TS_OMP_CLAUSE:
-   case TS_OPTIMIZATION:
-   case TS_TARGET_OPTION:
  MARK_TS_COMMON (code);
  break;
 


Re: [PATCH] PR77528 add default constructors for container adaptors

2017-01-11 Thread Tim Song
On Wed, Jan 11, 2017 at 7:21 AM, Jonathan Wakely  wrote:
> This patch uses the _Enable_default_constructor mixin to properly
> delete the default constructors. It's a bit cumbersome, because we
> have to add an initializer for the base class to every
> ctor-initializer-list, but I think I prefer this to making the default
> constructor a constrained template.
>

I'm happy with either approach - my primary concern is making sure
that is_constructible and friends work and don't lie, in a world where
increasing numbers of library components depend on it. Though I'm a
bit curious as to why you found this approach preferable.

Re the new DMI, my brain compiler says that _Sequence c = _Sequence();
breaks anything with an explicit copy/move constructor pre-C++17, but
I also don't think we care about those, right?


Re: [PATCH] PR77528 add default constructors for container adaptors

2017-01-11 Thread Jonathan Wakely

On 10/01/17 13:15 -0500, Tim Song wrote:
> On Tue, Jan 10, 2017 at 12:33 PM, Jonathan Wakely  wrote:
>> The standard says that the container adaptors have a constructor with
>> a default argument, which serves as a default constructor. That
>> involves default-constructing the underlying sequence as the default
>> argument and then move-constructing the member variable from that
>> argument. Because std::deque allocates memory in its move constructor
>> this means the default constructor of an adaptor using std::deque
>> will allocate twice, which is wasteful and expensive.
>>
>> This change adds a separate default constructor, defined as defaulted
>> (and adding default member-initializers to ensure the member variables
>> get value-initialized). This avoids the move-construction, so we only
>> allocate once when using std::deque.
>
> The new default member initializers use {}, and it's not too hard to find
> test cases where {} and value-initialization do different things, including
> cases where {} doesn't compile but () does. (So much for "uniform"
> initialization.)

OK that's easily fixed.

>> Because the default constructor is defined as defaulted it will be
>> deleted when the underlying sequence isn't default constructible,
>
> That's not correct. There's no implicit deletion due to the presence of DMIs.
> The reason the explicit instantiation works is that constructors explicitly
> defaulted at their first declaration are not implicitly defined until ODR-used.
>
> So, unlike the SFINAE-based approach outlined in the bugzilla issue, this
> patch causes is_default_constructible>
> to trigger a hard error (though the standard's version isn't SFINAE-friendly
> either).

Oops, thanks.

> Also, the change to .../priority_queue/requirements/explicit_instantiation/1.cc
> adds a Cmp class but doesn't actually use it in the explicit instantiation.

Fixed.

This patch uses the _Enable_default_constructor mixin to properly
delete the default constructors. It's a bit cumbersome, because we
have to add an initializer for the base class to every
ctor-initializer-list, but I think I prefer this to making the default
constructor a constrained template.

This also includes tests for is_default_constructible to ensure we
don't get the same hard errors as the previous version.

I'm still testing this, and could be persuaded to go with the
constrained templates if there's a good reason to.
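
For readers unfamiliar with the mixin technique, here is a minimal sketch of the idea. The names enable_default_ctor, enable_tag and adaptor are hypothetical stand-ins, not the libstdc++ internals; the sketch only assumes that a deleted default constructor in a base class makes the derived class's defaulted default constructor deleted as well.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for the internal mixin; names are illustrative.
struct enable_tag {};

template <bool Enabled>
struct enable_default_ctor {
  enable_default_ctor() = default;
  explicit enable_default_ctor(enable_tag) {}
};

template <>
struct enable_default_ctor<false> {
  enable_default_ctor() = delete;              // no default construction
  explicit enable_default_ctor(enable_tag) {}  // other ctors go through the tag
};

template <typename Seq>
class adaptor
    : enable_default_ctor<std::is_default_constructible<Seq>::value> {
  using base = enable_default_ctor<std::is_default_constructible<Seq>::value>;

 public:
  adaptor() = default;  // deleted whenever the base's default ctor is deleted
  explicit adaptor(const Seq& s) : base(enable_tag()), c(s) {}
  explicit adaptor(Seq&& s) : base(enable_tag()), c(std::move(s)) {}

 protected:
  Seq c = Seq();
};

struct no_default {
  no_default(int) {}
};

// The trait gives an answer in both cases instead of a hard error.
static_assert(std::is_default_constructible<adaptor<int>>::value, "");
static_assert(!std::is_default_constructible<adaptor<no_default>>::value, "");
```

The cost mentioned above is visible here: every non-default constructor has to name the base with the tag in its initializer list.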

commit 2bd10a76508336d92decfefe858fc23c5bc272f0
Author: Jonathan Wakely 
Date:   Wed Jan 11 12:15:27 2017 +

PR77528 conditionally delete default constructors

	PR libstdc++/77528
	* include/bits/stl_queue.h (queue, priority_queue): Use derivation
	from _Enable_default_constructor mixin to conditionally disable
	default construction.
	* include/bits/stl_stack.h (stack): Likewise.
	* testsuite/23_containers/priority_queue/requirements/constructible.cc:
	New.
	* testsuite/23_containers/priority_queue/requirements/
	explicit_instantiation/1.cc: Test more instantiations.
	* testsuite/23_containers/priority_queue/requirements/
	explicit_instantiation/1_c++98.cc: Likewise.
	* testsuite/23_containers/queue/requirements/constructible.cc: New.
	* testsuite/23_containers/stack/requirements/constructible.cc: New.

diff --git a/libstdc++-v3/include/bits/stl_queue.h b/libstdc++-v3/include/bits/stl_queue.h
index 6417b30..ebd6089 100644
--- a/libstdc++-v3/include/bits/stl_queue.h
+++ b/libstdc++-v3/include/bits/stl_queue.h
@@ -60,6 +60,7 @@
 #include 
 #if __cplusplus >= 201103L
 # include 
+# include 
 #endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -94,6 +95,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   */
   template >
 class queue
+#if __cplusplus >= 201103L
+: _Enable_default_constructor<is_default_constructible<_Sequence>::value,
+  queue<_Tp, _Sequence>>
+#endif
 {
   // concept requirements
   typedef typename _Sequence::value_type _Sequence_value_type;
@@ -114,6 +119,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 	using _Uses = typename
 	  enable_if::value>::type;
+
+  using _Tag = _Enable_default_constructor_tag;
+  using _Base = _Enable_default_constructor<
+	is_default_constructible<_Sequence>::value, queue>;
 #endif
 
 public:
@@ -133,7 +142,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   /// @c c is the underlying container.
 #if __cplusplus >= 201103L
-  _Sequence c{};
+  _Sequence c = _Sequence();
 #else
   _Sequence c;
 #endif
@@ -151,32 +160,32 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   explicit
   queue(const _Sequence& __c)
-  : c(__c) { }
+  : _Base(_Tag()), c(__c) { }
 
   explicit
   queue(_Sequence&& __c)
-  : c(std::move(__c)) { }
+  : _Base(_Tag()), c(std::move(__c)) { }
 
   template>
 	explicit
 	queue(const _Alloc& __a)
-	: c(__a) { }
+	: _Base(_Tag()), c(__a) { }
 
   template>
 	queue(const 

RE: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Koval, Julia
Here it is.

gcc/
* common/config/i386/i386-common.c
   (OPTION_MASK_ISA_SGX_UNSET, OPTION_MASK_ISA_SGX_SET): New.
   (ix86_handle_option): Handle OPT_msgx.
* config.gcc: Added sgxintrin.h.
* config/i386/cpuid.h (bit_SGX): New.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect sgx.
* config/i386/i386-c.c (ix86_target_macros_internal): Define __SGX__.
* config/i386/i386.c
   (ix86_target_string): Add -msgx.
   (PTA_SGX): New.
   (ix86_option_override_internal): Handle new options.
   (ix86_valid_target_attribute_inner_p): Add sgx.
* config/i386/i386.h (TARGET_SGX, TARGET_SGX_P): New.
* config/i386/i386.opt: Add msgx.
* config/i386/sgxintrin.h: New file.
* config/i386/x86intrin.h: Add sgxintrin.h.
* testsuite/gcc.target/i386/sgx.c: New test.

libgcc/
* config/i386/cpuinfo.c (get_available_features): Handle FEATURE_SGX.
* config/i386/cpuinfo.h (FEATURE_SGX): New.

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Wednesday, January 11, 2017 1:02 PM
To: Koval, Julia 
Cc: Andrew Senkevich ; GCC Patches 
; vaalfr...@gmail.com; kirill.yuk...@gmail.com; Jakub 
Jelinek 
Subject: Re: [PATCH] Enable SGX intrinsics

On Wed, Jan 11, 2017 at 12:40 PM, Koval, Julia  wrote:
> Ok, fixed it. Can you please commit it for me, cause I don't have rights to 
> commit?

OK, but please send me updated ChangeLogs.

Uros.

> Thanks,
> Julia
>
> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, January 11, 2017 12:11 PM
> To: Koval, Julia 
> Cc: Andrew Senkevich ; GCC Patches 
> ; vaalfr...@gmail.com; kirill.yuk...@gmail.com; 
> Jakub Jelinek 
> Subject: Re: [PATCH] Enable SGX intrinsics
>
> On Wed, Jan 11, 2017 at 11:31 AM, Koval, Julia  wrote:
>> Ok. I fixed the enum formatting and the enums remain internal.
>
> @@ -7023,7 +7029,6 @@ ix86_can_inline_p (tree caller, tree callee)
>bool ret = false;
>tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> -
>/* If callee has no option attributes, then it is ok to inline.  */
>if (!callee_tree)
>  ret = true;
>
>
> No need for the above whitespace change.
>
> OK for mainline with the above part reverted.
>
> Thanks,
> Uros.


Re: [PATCH C++] Fix PR77489 -- mangling of discriminator >= 10

2017-01-11 Thread Jakub Jelinek
On Wed, Jan 11, 2017 at 12:48:29PM +0100, Markus Trippelsdorf wrote:
> @@ -1965,7 +1966,11 @@ write_discriminator (const int discriminator)
>if (discriminator > 0)
>  {
>write_char ('_');
> +  if (abi_version_at_least(11) && discriminator - 1 >= 10)
> + write_char ('_');
>write_unsigned_number (discriminator - 1);
> +  if (abi_version_at_least(11) && discriminator - 1 >= 10)
> + write_char ('_');

Formatting nits, there should be space before (11).

> +// { dg-final { scan-assembler "_ZZ3foovE8localVar__10_" } }
> +// { dg-final { scan-assembler "_ZZ3foovE8localVar__11_" } }

Would be nice to also
// { dg-final { scan-assembler "_ZZ3foovE8localVar_9" } }

Otherwise, I defer to Jason (primarily whether this doesn't need
ABI version 12).

Jakub


Re: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Uros Bizjak
On Wed, Jan 11, 2017 at 12:40 PM, Koval, Julia  wrote:
> Ok, fixed it. Can you please commit it for me, cause I don't have rights to 
> commit?

OK, but please send me updated ChangeLogs.

Uros.

> Thanks,
> Julia
>
> -Original Message-
> From: Uros Bizjak [mailto:ubiz...@gmail.com]
> Sent: Wednesday, January 11, 2017 12:11 PM
> To: Koval, Julia 
> Cc: Andrew Senkevich ; GCC Patches 
> ; vaalfr...@gmail.com; kirill.yuk...@gmail.com; 
> Jakub Jelinek 
> Subject: Re: [PATCH] Enable SGX intrinsics
>
> On Wed, Jan 11, 2017 at 11:31 AM, Koval, Julia  wrote:
>> Ok. I fixed the enum formatting and the enums remain internal.
>
> @@ -7023,7 +7029,6 @@ ix86_can_inline_p (tree caller, tree callee)
>bool ret = false;
>tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
>tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
> -
>/* If callee has no option attributes, then it is ok to inline.  */
>if (!callee_tree)
>  ret = true;
>
>
> No need for the above whitespace change.
>
> OK for mainline with the above part reverted.
>
> Thanks,
> Uros.


[PATCH C++] Fix PR77489 -- mangling of discriminator >= 10

2017-01-11 Thread Markus Trippelsdorf
Currently gcc mangles symbols wrongly when the discriminator is ten or
greater. The fix is straightforward. The demangler now handles both the
old and the new correct mangling.
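
As a rough illustration of the two forms, the following sketch encodes a discriminator the way the patched mangler is meant to, and checks which suffixes a tolerant reader would accept. The function names are hypothetical and the code is not taken from mangle.c or cp-demangle.c; it only assumes the rule "_<number>" below ten and "__<number>_" from ten upwards, with the old "_<number>" form still accepted on input.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Illustrative helpers only; these are not the GCC or libiberty functions.

// Suffix for a discriminator value under the corrected rule
// (a value of 0 or less means no discriminator is written).
std::string encode_discriminator(int discriminator) {
  if (discriminator <= 0)
    return "";
  const int number = discriminator - 1;
  if (number >= 10)
    return "__" + std::to_string(number) + "_";
  return "_" + std::to_string(number);
}

// True if the whole of s is an acceptable discriminator suffix:
// "_<n>" for any n (this covers the old mangling) or "__<n>_" for n >= 10.
bool accepts_discriminator(const std::string& s) {
  if (s.empty())
    return true;  // no discriminator at all
  std::size_t pos = 0;
  if (s[pos] != '_')
    return false;
  ++pos;
  bool doubled = false;
  if (pos < s.size() && s[pos] == '_') {
    doubled = true;
    ++pos;
  }
  if (pos >= s.size() || !std::isdigit(static_cast<unsigned char>(s[pos])))
    return false;
  long number = 0;
  while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
    number = number * 10 + (s[pos] - '0');
    ++pos;
  }
  if (doubled && number >= 10) {
    // The new form needs a closing underscore.
    if (pos >= s.size() || s[pos] != '_')
      return false;
    ++pos;
  }
  return pos == s.size();  // trailing characters mean a malformed suffix
}
```

Under these assumptions the twelfth occurrence of a name (discriminator 11) gets "__10_", while "__9_" and "__12" are rejected as malformed.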

Tested on ppc64le. OK for trunk?

Thanks.

libiberty:

PR c++/77489
* cp-demangle.c (d_discriminator): Handle discriminator >= 10.
* testsuite/demangle-expected: Add tests for discriminator.

gcc/cp:

PR c++/77489
* mangle.c (write_discriminator): Handle discriminator >= 10.

diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index 5f2fa35d29e8..ee75f4a25621 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -1952,7 +1952,8 @@ discriminator_for_string_literal (tree /*function*/,
   return 0;
 }
 
-/*  <discriminator> := _ <number>
+/*  <discriminator> := _ <number>    # when number < 10
+                    := __ <number> _ # when number >= 10
 
    The discriminator is used only for the second and later occurrences
    of the same name within a single function. In this case <number> is
@@ -1965,7 +1966,11 @@ write_discriminator (const int discriminator)
   if (discriminator > 0)
 {
   write_char ('_');
+  if (abi_version_at_least(11) && discriminator - 1 >= 10)
+   write_char ('_');
   write_unsigned_number (discriminator - 1);
+  if (abi_version_at_least(11) && discriminator - 1 >= 10)
+   write_char ('_');
 }
 }
 
diff --git a/gcc/testsuite/g++.dg/abi/pr77489.C 
b/gcc/testsuite/g++.dg/abi/pr77489.C
new file mode 100644
index ..f640bb6f0676
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/pr77489.C
@@ -0,0 +1,62 @@
+// { dg-options -fabi-version=11 }
+
+extern void bar(int*);
+
+void foo()
+{
+  {
+    static int localVar = 0;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 1;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 2;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 3;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 4;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 5;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 6;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 7;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 8;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 9;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 10;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 11;
+    bar(&localVar);
+  }
+  {
+    static int localVar = 12;
+    bar(&localVar);
+  }
+}
+
+// { dg-final { scan-assembler "_ZZ3foovE8localVar__10_" } }
+// { dg-final { scan-assembler "_ZZ3foovE8localVar__11_" } }
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 15ef3b48785f..d84929eca20d 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -3609,7 +3609,11 @@ d_local_name (struct d_info *di)
 }
 }
 
-/* <discriminator> ::= _ <(non-negative) number>
+/* <discriminator> ::= _ <number>    # when number < 10
+                   ::= __ <number> _ # when number >= 10
+
+   <discriminator> ::= _ <number>    # when number >=10
+   is also accepted to support gcc versions that wrongly mangled that way.
 
We demangle the discriminator, but we don't print it out.  FIXME:
We should print it out in verbose mode.  */
@@ -3617,14 +3621,28 @@ d_local_name (struct d_info *di)
 static int
 d_discriminator (struct d_info *di)
 {
-  int discrim;
+  int discrim, num_underscores = 1;
 
   if (d_peek_char (di) != '_')
 return 1;
   d_advance (di, 1);
+  if (d_peek_char (di) == '_')
+{
+  ++num_underscores;
+  d_advance (di, 1);
+}
+
   discrim = d_number (di);
   if (discrim < 0)
 return 0;
+  if (num_underscores > 1 && discrim >= 10)
+{
+  if (d_peek_char (di) == '_')
+   d_advance (di, 1);
+  else
+   return 0;
+}
+
   return 1;
 }
 
diff --git a/libiberty/testsuite/demangle-expected 
b/libiberty/testsuite/demangle-expected
index b65dcd3450e9..07e258fe58b3 100644
--- a/libiberty/testsuite/demangle-expected
+++ b/libiberty/testsuite/demangle-expected
@@ -4666,3 +4666,19 @@ void eat(int*&, Foo()::{lambda(auto:1
 
 _Z3eatIPiZ3BarIsEvvEUlPsPT_PT0_E0_EvRS3_RS5_
 void eat(int*&, 
void Bar()::{lambda(short*, auto:1*, auto:2*)#2}&)
+
+# PR 77489
+_ZZ3foovE8localVar_9
+foo()::localVar
+
+_ZZ3foovE8localVar_10
+foo()::localVar
+

+_ZZ3foovE8localVar__10_
+foo()::localVar
+
+_ZZ3foovE8localVar__9_
+_ZZ3foovE8localVar__9_
+
+_ZZ3foovE8localVar__12
+_ZZ3foovE8localVar__12

-- 
Markus


[committed] Small tweak for the decomp4.C testcase

2017-01-11 Thread Jakub Jelinek
Hi!

When compiling this testcase with trunk clang++, I've noticed the error
is different, because there are in fact 2 errors, one that a struct
has 2 non-static data members and the decomposition just one identifier,
the other that one of those non-static data members is private.

I've committed as obvious a change which makes the number of identifiers
correct, so there is just a single reason to diagnose on that line.
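
For context, a well-formed decomposition of a plain struct needs one identifier per non-static data member, and all of those members must be public. A minimal C++17 sketch with a hypothetical two-member type (not taken from decomp4.C):

```cpp
// Hypothetical type with two public non-static data members.
struct Pair {
  int first;
  int second;
};

int sum_members(const Pair& p) {
  // Two identifiers for two members; using one or three would be
  // diagnosed, as would a private member.
  auto [a, b] = p;
  return a + b;
}
```

Keeping the identifier count correct in the testcase leaves the access violation as the only reason for a diagnostic on that line.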

2017-01-11  Jakub Jelinek  

* g++.dg/cpp1z/decomp4.C (test): Use 2 identifier decomposition
instead of just 1 for the decomposition from struct C.

--- gcc/testsuite/g++.dg/cpp1z/decomp4.C.jj 2016-11-14 08:52:18.0 +0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp4.C 2017-01-11 12:03:54.244784029 +0100
@@ -18,7 +18,7 @@ test (A , B , C , D , E , F 
                        // { dg-warning "decomposition declaration only available with -std=c..1z or -std=gnu..1z" "" { target c++14_down } .-1 }
   auto [ k ] { b };    // { dg-error "cannot decompose class type 'B' because it has an anonymous union member" }
                        // { dg-warning "decomposition declaration only available with -std=c..1z or -std=gnu..1z" "" { target c++14_down } .-1 }
-  auto [ l ] = c;      // { dg-error "cannot decompose non-public member 'C::b' of 'C'" }
+  auto [ l, l2 ] = c;  // { dg-error "cannot decompose non-public member 'C::b' of 'C'" }
                        // { dg-warning "decomposition declaration only available with -std=c..1z or -std=gnu..1z" "" { target c++14_down } .-1 }
   auto [ m ] = d;      // { dg-warning "decomposition declaration only available with -std=c..1z or -std=gnu..1z" "" { target c++14_down } }
   auto [ n ] { e };    // { dg-error "cannot decompose non-public member 'E::a' of 'E'" }

Jakub


RE: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Koval, Julia
Ok, fixed it. Can you please commit it for me, cause I don't have rights to 
commit?

Thanks,
Julia

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Wednesday, January 11, 2017 12:11 PM
To: Koval, Julia 
Cc: Andrew Senkevich ; GCC Patches 
; vaalfr...@gmail.com; kirill.yuk...@gmail.com; Jakub 
Jelinek 
Subject: Re: [PATCH] Enable SGX intrinsics

On Wed, Jan 11, 2017 at 11:31 AM, Koval, Julia  wrote:
> Ok. I fixed the enum formatting and the enums remain internal.

@@ -7023,7 +7029,6 @@ ix86_can_inline_p (tree caller, tree callee)
   bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
-
   /* If callee has no option attributes, then it is ok to inline.  */
   if (!callee_tree)
 ret = true;


No need for the above whitespace change.

OK for mainline with the above part reverted.

Thanks,
Uros.


0001-Enable-SGX.PATCH
Description: 0001-Enable-SGX.PATCH


Re: C++ PATCH for C++17 NB comment FI20 (parenthesized initialization with decomposition)

2017-01-11 Thread Jakub Jelinek
On Tue, Jan 10, 2017 at 02:03:14PM -0500, Jason Merrill wrote:
> The FI20 comment on the decomposition declarations proposal complained
> that the syntax unnecessarily excluded parenthesized initialization.
> This patch implements the resolution.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

> commit 749ec367e50b356a40fd41a3daae10d9d948062b
> Author: Jason Merrill 
> Date:   Mon Jan 9 17:33:42 2017 -0500
> 
> FI 20, decomposition declaration with parenthesized initializer.
>
> * parser.c (cp_parser_decomposition_declaration): Use
> cp_parser_initializer.

I had a testcase for this from the time when I was trying to implement
FI 20 myself.  Here is just the testcase change, but it fails:
instead of invoking the explicit A (const A &) ctor I was expecting
for auto [b,c,d,e,f,g] ( a );, that is the ctor invoked also for
auto [b,c,d,e,f,g] { a };, it is invoking the
template <typename T> A (const T &) ctor that is invoked for
auto [b,c,d,e,f,g] = a;
So, is auto [b,c,d,e,f,g] ( a ); direct initialization like
auto [b,c,d,e,f,g] { a }; or is it copy initialization like
auto [b,c,d,e,f,g] = a; ?
Note, I've tried to compile the decomp6.C testcase (and all other decomp*
testcases) with svn trunk clang++ too, and it fails on the
  auto [b,c,d,e,f,g] { a };
on line 62 already:
decomp6.C:12:24: error: member reference base type 'A const[6]' is not a 
structure or union
  A (const T ) : a (x.a) { tccnt++; }
  ~^~
decomp6.C:62:28: note: in instantiation of function template specialization 
'A::A' requested here
  auto [b,c,d,e,f,g] { a }; // { dg-warning "decomposition 
declaration only available with" "" { target c++14_down } }
   ^
That is a clang++ bug, right?  That said, when I change that line 62 to:
auto [b,c,d,e,f,g] ( a );, the test passes with clang++ even when patched
with the patch, so (a) seems to be treated as direct-initialization.
clang++ ICEd on a bunch of testcases, but when it didn't, apart from decomp6.C, it
matched my expectations on where errors should be reported and where not.
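
The constructor selection at stake can be reproduced with ordinary variables, leaving decompositions aside. The sketch below is a reduced illustration modelled loosely on the constructors described above, with hypothetical counter names: direct initialization may use the explicit copy constructor, while copy initialization cannot and falls back to the constructor template.

```cpp
// Counters are illustrative; they are not the ones from decomp6.C.
static int copy_count = 0;  // explicit A (const A &) calls
static int tmpl_count = 0;  // template A (const T &) calls

struct A {
  A() {}
  explicit A(const A&) { ++copy_count; }
  template <typename T>
  A(const T&) { ++tmpl_count; }
};

void exercise() {
  A a;
  A b(a);   // direct-initialization: explicit copy ctor is a candidate and wins
  A c{a};   // direct-list-initialization: same choice
  A d = a;  // copy-initialization: explicit ctor excluded, template is used
  (void)b;
  (void)c;
  (void)d;
}
```

After one call to exercise() the explicit constructor has run twice and the template once, which is the distinction the counters in the testcase are probing.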

Lastly, I have a testcase I'm not sure about:
int a[3];
struct S { int b, c, d; } s;
void
foo ()
{
  auto [ b, c, d ] = a;
  auto [ e, f, g ] = s;
  auto [ h, i, j ] { s };
  auto [ k, l, m ] { s, };
  auto [ n, o, p ] { a };   // { dg-error "invalid conversion from 'int.' to 'int'" }
  auto [ q, r, t ] ( s );
  auto [ u, v, w ] ( s, );  // { dg-error "expected primary-expression before '.' token" }
  auto [ x, y, z ] ( a );   // { dg-error "expression list treated as compound expression in initializer" "" { target *-*-* } .-1 }
}
but where clang++ actually matches the current g++ behavior.
In the
https://github.com/cplusplus/draft/commit/6f3920a66311ec2893a9e30ce2b54cecba4951ba
wording, it talks only about assignment-expression in between () or {}, does
that mean:
auto [ k, l, m ] { s, };
should be invalid?  And is the auto [ n, o, p ] { a }; error a bug or not
(though clang++ agrees on that with g++).  We don't error on
auto [ x, y, z ] ( a );

2017-01-11  Jakub Jelinek  

FI 20, decomposition declaration with parenthesized initializer.
* g++.dg/cpp1z/decomp6.C (main): Add further tests.

--- gcc/testsuite/g++.dg/cpp1z/decomp6.C.jj 2016-11-14 08:52:27.0 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/decomp6.C2016-11-29 18:11:33.0 
+0100
@@ -89,4 +89,40 @@ main ()
   }
   if (ccnt != 12 || dcnt != 24 || cccnt != 6 || tccnt != 6)
 __builtin_abort ();
+
+  {
+A a[6];
+if (ccnt != 18 || dcnt != 24 || cccnt != 6 || tccnt != 6)
+  __builtin_abort ();
+{
+  auto [b,c,d,e,f,g] ( a );// { dg-warning "decomposition 
declaration only available with" "" { target c++14_down } }
+  if (ccnt != 18 || dcnt != 24 || cccnt != 12 || tccnt != 6)
+   __builtin_abort ();
+  b.a++;
+  c.a += 2;
+  f.a += 3;
+  if (b.a != 7 || c.a != 8 || d.a != 6 || e.a != 6 || f.a != 9 || g.a != 6)
+   __builtin_abort ();
+  if (&b == &a[0] || &c == &a[1] || &d == &a[2] || &e == &a[3] || &f == &a[4] || &g == &a[5])
+   __builtin_abort ();
+  {
+   auto&[ h, i, j, k, l, m ] (a);  // { dg-warning "decomposition 
declaration only available with" "" { target c++14_down } }
+   if (ccnt != 18 || dcnt != 24 || cccnt != 12 || tccnt != 6)
+ __builtin_abort ();
+   j.a += 4;
+   k.a += 5;
+   m.a += 6;
+   if (a[0].a != 6 || a[1].a != 6 || a[2].a != 10 || a[3].a != 11 || 
a[4].a != 6 || a[5].a != 12)
+ __builtin_abort ();
+   if (&h != &a[0] || &i != &a[1] || &j != &a[2] || &k != &a[3] || &l != &a[4] || &m != &a[5])
+ __builtin_abort ();
+  }
+  if (ccnt != 18 || dcnt != 24 || cccnt != 12 || tccnt != 6)
+   __builtin_abort ();
+}
+if (ccnt != 18 || dcnt != 30 || cccnt != 12 || tccnt != 6)
+  __builtin_abort ();
+  }
+  if (ccnt != 18 || dcnt != 36 || cccnt != 12 || tccnt != 6)
+

Re: [PATCH][PR lto/79042] Propagate node->dynamically_initialized bit for varpool node to LTRANS stage.

2017-01-11 Thread Richard Biener
On Wed, Jan 11, 2017 at 12:21 PM, Maxim Ostapenko
 wrote:
> On 11/01/17 14:17, Richard Biener wrote:
>>
>> On Wed, Jan 11, 2017 at 10:00 AM, Maxim Ostapenko
>>  wrote:
>>>
>>> Hi,
>>>
>>> as mentioned in PR, LTO doesn't propagate node->dynamically_initialized
>>> bit
>>> for varpool nodes that leads to ASan fails to detect initialization order
>>> fiasco even for trivial example (e.g. from here:
>>>
>>> https://github.com/google/sanitizers/wiki/AddressSanitizerExampleInitOrderFiasco).
>>> This trivial patch fixes the issue. Regtested on
>>> x86_64-unknown-linux-gnu,
>>> OK for mainline?
>>
>> Ok.  This is also needed on branches, correct?
>
>
> Yes, branches also need this. gcc-5-branch and gcc-6-branch, right?

Yes.  Please bump lto-streamer.h:LTO_minor_version on the branches
with such change
as the LTO binary format will be incompatible.  (not that we always
remember to do that...)

Thanks,
Richard.

> Thanks,
> -Maxim
>
>>
>> Richard.
>>
>>> -Maxim
>>
>>
>


[PATCH] Fix parts of PR79052 (gimplefe)

2017-01-11 Thread Richard Biener

The following hopefully fixes the gimplefe ubsan bootstrap warnings.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2017-01-11  Richard Biener  

PR bootstrap/79052
* gimple-parser.c (c_parser_gimple_switch_stmt): Add missing
returns on parse errors.
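
The shape of the fix, returning as soon as a required token is missing instead of carrying on with partially parsed state, can be sketched on a toy token stream. Everything here (Parser, require, parse_switch_header) is a hypothetical illustration and not the gimple-parser.c API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token stream; it models only the control flow.
struct Parser {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  std::vector<std::string> errors;

  // Consume the expected token, or record an error and report failure.
  bool require(const std::string& expected) {
    if (pos < tokens.size() && tokens[pos] == expected) {
      ++pos;
      return true;
    }
    errors.push_back("expected " + expected);
    return false;
  }
};

// Parses "( <name> ) {" and stores the name in cond.
// Each failed requirement returns immediately, so cond and the rest of
// the state are never used after an error.
bool parse_switch_header(Parser& p, std::string& cond) {
  if (!p.require("("))
    return false;
  if (p.pos >= p.tokens.size()) {
    p.errors.push_back("expected expression");
    return false;
  }
  cond = p.tokens[p.pos++];
  if (!p.require(")"))
    return false;
  if (!p.require("{"))
    return false;
  return true;
}
```

A stream that lacks the opening parenthesis yields exactly one error and leaves cond untouched.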

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 244305)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -1259,118 +1259,120 @@ c_parser_gimple_switch_stmt (c_parser *p
   gimple_seq switch_body = NULL;
   c_parser_consume_token (parser);
 
-  if (c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
-{
-  cond_expr = c_parser_gimple_postfix_expression (parser);
-  if (! c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"))
-   return;
-}
+  if (! c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
+return;
+  cond_expr = c_parser_gimple_postfix_expression (parser);
+  if (! c_parser_require (parser, CPP_CLOSE_PAREN, "expected %<)%>"))
+return;
 
-  if (c_parser_require (parser, CPP_OPEN_BRACE, "expected %<{%>"))
+  if (! c_parser_require (parser, CPP_OPEN_BRACE, "expected %<{%>"))
+return;
+
+  while (c_parser_next_token_is_not (parser, CPP_CLOSE_BRACE))
 {
-  while (c_parser_next_token_is_not (parser, CPP_CLOSE_BRACE))
+  if (c_parser_next_token_is (parser, CPP_EOF))
{
- if (c_parser_next_token_is (parser, CPP_EOF))
-   {
- c_parser_error (parser, "expected statement");
- return;
-   }
+ c_parser_error (parser, "expected statement");
+ return;
+   }
 
- switch (c_parser_peek_token (parser)->keyword)
-   {
-   case RID_CASE:
+  switch (c_parser_peek_token (parser)->keyword)
+   {
+   case RID_CASE:
+ {
+   c_expr exp1;
+   location_t loc = c_parser_peek_token (parser)->location;
+   c_parser_consume_token (parser);
+
+   if (c_parser_next_token_is (parser, CPP_NAME)
+   || c_parser_peek_token (parser)->type == CPP_NUMBER)
+ exp1 = c_parser_gimple_postfix_expression (parser);
+   else
  {
-   c_expr exp1;
-   location_t loc = c_parser_peek_token (parser)->location;
-   c_parser_consume_token (parser);
-
-   if (c_parser_next_token_is (parser, CPP_NAME)
-   || c_parser_peek_token (parser)->type == CPP_NUMBER)
- exp1 = c_parser_gimple_postfix_expression (parser);
-   else
- c_parser_error (parser, "expected expression");
+   c_parser_error (parser, "expected expression");
+   return;
+ }
 
-   if (c_parser_next_token_is (parser, CPP_COLON))
+   if (c_parser_next_token_is (parser, CPP_COLON))
+ {
+   c_parser_consume_token (parser);
+   if (c_parser_next_token_is (parser, CPP_NAME))
  {
+   label = c_parser_peek_token (parser)->value;
c_parser_consume_token (parser);
-   if (c_parser_next_token_is (parser, CPP_NAME))
- {
-   label = c_parser_peek_token (parser)->value;
-   c_parser_consume_token (parser);
-   tree decl = lookup_label_for_goto (loc, label);
-   case_label = build_case_label (exp1.value, NULL_TREE,
-  decl);
-   labels.safe_push (case_label);
-   if (! c_parser_require (parser, CPP_SEMICOLON,
-   "expected %<;%>"))
- return;
- }
-   else if (! c_parser_require (parser, CPP_NAME,
-"expected label"))
+   tree decl = lookup_label_for_goto (loc, label);
+   case_label = build_case_label (exp1.value, NULL_TREE,
+  decl);
+   labels.safe_push (case_label);
+   if (! c_parser_require (parser, CPP_SEMICOLON,
+   "expected %<;%>"))
  return;
  }
-   else if (! c_parser_require (parser, CPP_SEMICOLON,
-   "expected %<:%>"))
+   else if (! c_parser_require (parser, CPP_NAME,
+"expected label"))
  return;
-   break;
  }
-   case RID_DEFAULT:
+   else if (! c_parser_require (parser, CPP_SEMICOLON,
+"expected %<:%>"))
+ return;
+   break;

Re: [PATCH][PR lto/79042] Propagate node->dynamically_initialized bit for varpool node to LTRANS stage.

2017-01-11 Thread Maxim Ostapenko

On 11/01/17 14:17, Richard Biener wrote:
> On Wed, Jan 11, 2017 at 10:00 AM, Maxim Ostapenko
>  wrote:
>> Hi,
>>
>> as mentioned in PR, LTO doesn't propagate node->dynamically_initialized bit
>> for varpool nodes that leads to ASan fails to detect initialization order
>> fiasco even for trivial example (e.g. from here:
>> https://github.com/google/sanitizers/wiki/AddressSanitizerExampleInitOrderFiasco).
>> This trivial patch fixes the issue. Regtested on x86_64-unknown-linux-gnu,
>> OK for mainline?
>
> Ok.  This is also needed on branches, correct?

Yes, branches also need this. gcc-5-branch and gcc-6-branch, right?

Thanks,
-Maxim

> Richard.
>
>> -Maxim





Re: [PATCH 2/2] IPA ICF: make algorithm stable to survive -fcompare-debug

2017-01-11 Thread Richard Biener
On Wed, Jan 11, 2017 at 11:48 AM, Martin Liška  wrote:
> On 01/11/2017 11:28 AM, Jakub Jelinek wrote:
>> On Wed, Jan 11, 2017 at 11:21:08AM +0100, Christophe Lyon wrote:
>>> Since then, I've noticed that
>>>   gcc.dg/tree-ssa/flatten-3.c scan-assembler cycle[123][: \t\n]
>>> now fails on aarch64 and arm targets.
>>
>> It fails on x86_64-linux and i686-linux too.
>>
>>   Jakub
>>
>
> Ok, problem is that we used to merge:
>
> Semantic equality hit:doubleindirect1->subcycle1
> Semantic equality hit:doubleindirect1->doublesubcycle1
> Semantic equality hit:subcycle->doublesubcycle
>
> and after my patch it changed to:
>
> Semantic equality hit:doublesubcycle->subcycle
> Semantic equality hit:doublesubcycle1->subcycle1
> Semantic equality hit:doublesubcycle1->doubleindirect1
>
> As output is grepped for a cycle[123], so of them would be merged.
> Thus, adding -fno-ipa-icf would be the right fix.
>
> Ready to be installed?

Ok.

Richard.

> Thanks,
> Martin


Re: [PATCH][PR lto/79042] Propagate node->dynamically_initialized bit for varpool node to LTRANS stage.

2017-01-11 Thread Richard Biener
On Wed, Jan 11, 2017 at 10:00 AM, Maxim Ostapenko
 wrote:
> Hi,
>
> as mentioned in PR, LTO doesn't propagate node->dynamically_initialized bit
> for varpool nodes that leads to ASan fails to detect initialization order
> fiasco even for trivial example (e.g. from here:
> https://github.com/google/sanitizers/wiki/AddressSanitizerExampleInitOrderFiasco).
> This trivial patch fixes the issue. Regtested on x86_64-unknown-linux-gnu,
> OK for mainline?

Ok.  This is also needed on branches, correct?

Richard.

> -Maxim


Re: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Uros Bizjak
On Wed, Jan 11, 2017 at 11:31 AM, Koval, Julia  wrote:
> Ok. I fixed the enum formatting and the enums remain internal.

@@ -7023,7 +7029,6 @@ ix86_can_inline_p (tree caller, tree callee)
   bool ret = false;
   tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller);
   tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee);
-
   /* If callee has no option attributes, then it is ok to inline.  */
   if (!callee_tree)
 ret = true;


No need for the above whitespace change.

OK for mainline with the above part reverted.

Thanks,
Uros.


Re: [PATCH] avoid infinite recursion in maybe_warn_alloc_args_overflow (pr 78775)

2017-01-11 Thread Andreas Schwab
On Jan 11 2017, Christophe Lyon  wrote:

> The new test (gcc.dg/pr78973.c) fails on arm targets (there's no warning).

Also fails on m68k.

> In addition, I have noticed a new failure:
>   gcc.dg/attr-alloc_size-4.c  (test for warnings, line 140)
> on target arm-none-linux-gnueabihf --with-cpu=cortex-a5
> (works fine --with-cpu=cortex-a9)

Ditto.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [LRA] Fix PR rtl-optimization/79032

2017-01-11 Thread Richard Biener
On Tue, Jan 10, 2017 at 9:31 PM, Eric Botcazou  wrote:
> Hi,
>
> LRA generates an unaligned memory access for 32-bit SPARC on the attached
> testcase when it is compiled with optimization.  It's again the business of
> paradoxical subregs of memory dealt with by simplify_operand_subreg:
>
>   /* If we change the address for a paradoxical subreg of memory, the
>      address might violate the necessary alignment or the access might
>      be slow.  So take this into consideration.  We need not worry
>      about accesses beyond allocated memory for paradoxical memory
>      subregs as we don't substitute such equiv memory (see processing
>      equivalences in function lra_constraints) and because for spilled
>      pseudos we allocate stack memory enough for the biggest
>      corresponding paradoxical subreg.  */
>   if (!(MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (mode)
>         && SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (reg)))
>       || (MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (innermode)
>           && SLOW_UNALIGNED_ACCESS (innermode, MEM_ALIGN (reg))))
>     return true;
>
> However the code contains a small inaccuracy: it tests the old MEM (reg) which
> has mode INNERMODE in the first branch of the condition instead of testing the
> new MEM (subst) which has mode MODE.  That's benign for little-endian targets
> since the offset doesn't change, but not for big-endian ones where it changes
> and thus also can change the alignment.
>
> The attached fix was bootstrapped/regtested on SPARC/Solaris, OK for mainline?

Ok.

Richard.

>
> 2017-01-10  Eric Botcazou  
>
> PR rtl-optimization/79032
> * lra-constraints.c (simplify_operand_subreg): In the MEM case, test
> the alignment of the adjusted memory reference against that of MODE,
> instead of the alignment of the original memory reference.
>
>
> 2017-01-10  Eric Botcazou  
>
> * gcc.c-torture/execute/20170110-1.c: New test.
>
> --
> Eric Botcazou
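
The endianness point in the message above can be sketched with a small model. This is not GCC code: the helper names (paradoxical_offset, widened_address, is_aligned) are hypothetical, and the model only assumes that a wider access to the same value keeps its start address on little-endian targets and starts earlier on big-endian ones.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model, not GCC code: byte offset applied to an address when a
// value of inner_size bytes in memory is re-read through a wider
// (paradoxical) access of outer_size bytes.
constexpr std::int64_t paradoxical_offset(std::int64_t inner_size,
                                          std::int64_t outer_size,
                                          bool big_endian)
{
  // Little-endian: the low-order bytes sit at the lowest address, so the
  // wider access starts at the same address.  Big-endian: they sit at the
  // highest address, so the wider access has to start earlier.
  return big_endian ? inner_size - outer_size : 0;
}

// Address of the adjusted (wider) memory reference in this model.
constexpr std::int64_t widened_address(std::int64_t address,
                                       std::int64_t inner_size,
                                       std::int64_t outer_size,
                                       bool big_endian)
{
  return address + paradoxical_offset(inner_size, outer_size, big_endian);
}

constexpr bool is_aligned(std::int64_t address, std::int64_t alignment)
{
  return address % alignment == 0;
}

// A 4-byte value at 0x1008 widened to 8 bytes.
static_assert(is_aligned(widened_address(0x1008, 4, 8, false), 8),
              "little-endian: address and alignment are unchanged");
static_assert(!is_aligned(widened_address(0x1008, 4, 8, true), 8),
              "big-endian: the adjusted address 0x1004 is only 4-byte aligned");
```

In this model the original reference at 0x1008 happens to be 8-byte aligned, yet the adjusted big-endian reference is not, which is why checking the alignment of the original reference against the wider mode can give the wrong answer there.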


Re: [PATCH][PR tree-optimization/78856] Invalidate cached iteration information when threading across multiple loop headers

2017-01-11 Thread Richard Biener
On Tue, Jan 10, 2017 at 8:16 PM, Jeff Law  wrote:
> On 01/04/2017 05:25 AM, Richard Biener wrote:
>>
>> On Wed, Jan 4, 2017 at 6:31 AM, Jeff Law  wrote:
>>>
>>>
>>> So as noted in the BZ comments the jump threading code has code that
>>> detects
>>> when a jump threading path wants to cross multiple loop headers and
>>> truncates the jump threading path in that case.
>>>
>>> What we should have done instead is invalidate the cached loop
>>> information.
>>>
>>> Additionally, this BZ shows that just looking at loop headers is not
>>> sufficient -- we might cross from a reducible to an irreducible region
>>> which
>>> is equivalent to crossing into another loop in that we need to invalidate
>>> the cached loop iteration information.
>>>
>>> What's so damn funny here is that eventually we take nested loops and
>>> irreducible regions, thread various edges and end up with a nice natural
>>> loop and no irreducible regions in the end :-)  But the cached iteration
>>> information is still bogus.
>>>
>>> Anyway, this patch corrects both issues.  It treats moving between a
>>> reducible and an irreducible region as crossing a loop header and it
>>> invalidates the cached iteration information rather than truncating the
>>> jump
>>> thread path.
>>>
>>> Bootstrapped and regression tested on x86_64-linux-gnu.  That compiler
>>> was
>>> also used to build all the configurations in config-list.mk.
>>>
>>> Installing on the trunk.  I could be convinced to install on the gcc-6
>>> branch as well since it's affected by the same problem.
>>>
>>> Jeff
>>>
>>>
>>> commit 93e3964a4664350446eefe786e3b73eb41d99036
>>> Author: law 
>>> Date:   Wed Jan 4 05:31:23 2017 +
>>>
>>> PR tree-optimization/78856
>>> * tree-ssa-threadupdate.c: Include tree-vectorizer.h.
>>> (mark_threaded_blocks): Remove code to truncate thread paths that
>>> cross multiple loop headers.  Instead invalidate the cached loop
>>> iteration information and handle case of a thread path walking
>>> into an irreducible region.
>>>
>>> PR tree-optimization/78856
>>> * gcc.c-torture/execute/pr78856.c: New test.
>>>
>>> git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@244045
>>> 138bc75d-0d04-0410-961f-82ee72b054a4
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 3114e02..6b2888f 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,12 @@
>>> +2017-01-03  Jeff Law  
>>> +
>>> +   PR tree-optimization/78856
>>> +   * tree-ssa-threadupdate.c: Include tree-vectorizer.h.
>>> +   (mark_threaded_blocks): Remove code to truncate thread paths that
>>> +   cross multiple loop headers.  Instead invalidate the cached loop
>>> +   iteration information and handle case of a thread path walking
>>> +   into an irreducible region.
>>> +
>>>  2016-12-30  Michael Meissner  
>>>
>>> PR target/78900
>>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>>> index cd2a065..cadfbc9 100644
>>> --- a/gcc/testsuite/ChangeLog
>>> +++ b/gcc/testsuite/ChangeLog
>>> @@ -1,3 +1,8 @@
>>> +2017-01-03  Jeff Law  
>>> +
>>> +   PR tree-optimization/78856
>>> +   * gcc.c-torture/execute/pr78856.c: New test.
>>> +
>>>  2017-01-03  Michael Meissner  
>>>
>>> PR target/78953
>>> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr78856.c
>>> b/gcc/testsuite/gcc.c-torture/execute/pr78856.c
>>> new file mode 100644
>>> index 000..80f2317
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.c-torture/execute/pr78856.c
>>> @@ -0,0 +1,25 @@
>>> +extern void exit (int);
>>> +
>>> +int a, b, c, d, e, f[3];
>>> +
>>> +int main()
>>> +{
>>> +  while (d)
>>> +while (1)
>>> +  ;
>>> +  int g = 0, h, i = 0;
>>> +  for (; g < 21; g += 9)
>>> +{
>>> +  int j = 1;
>>> +  for (h = 0; h < 3; h++)
>>> +   f[h] = 1;
>>> +  for (; j < 10; j++) {
>>> +   d = i && (b ? 0 : c);
>>> +   i = 1;
>>> +   if (g)
>>> + a = e;
>>> +  }
>>> +  }
>>> +  exit (0);
>>> +}
>>> +
>>> diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
>>> index adbb6e0..2da93a8 100644
>>> --- a/gcc/tree-ssa-threadupdate.c
>>> +++ b/gcc/tree-ssa-threadupdate.c
>>> @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "cfgloop.h"
>>>  #include "dbgcnt.h"
>>>  #include "tree-cfg.h"
>>> +#include "tree-vectorizer.h"
>>>
>>>  /* Given a block B, update the CFG and SSA graph to reflect redirecting
>>> one or more in-edges to B to instead reach the destination of an
>>> @@ -2084,10 +2085,8 @@ mark_threaded_blocks (bitmap threaded_blocks)
>>>/* Look for jump threading paths which cross multiple loop headers.
>>>
>>>   The code to thread through loop headers will change the CFG in ways
>>> - that break assumptions made by the loop 

Re: [PATCH 2/2] IPA ICF: make algorithm stable to survive -fcompare-debug

2017-01-11 Thread Martin Liška
On 01/11/2017 11:28 AM, Jakub Jelinek wrote:
> On Wed, Jan 11, 2017 at 11:21:08AM +0100, Christophe Lyon wrote:
>> Since then, I've noticed that
>>   gcc.dg/tree-ssa/flatten-3.c scan-assembler cycle[123][: \t\n]
>> now fails on aarch64 and arm targets.
> 
> It fails on x86_64-linux and i686-linux too.
> 
>   Jakub
> 

Ok, problem is that we used to merge:

Semantic equality hit:doubleindirect1->subcycle1
Semantic equality hit:doubleindirect1->doublesubcycle1
Semantic equality hit:subcycle->doublesubcycle

and after my patch it changed to:

Semantic equality hit:doublesubcycle->subcycle
Semantic equality hit:doublesubcycle1->subcycle1
Semantic equality hit:doublesubcycle1->doubleindirect1

As the output is grepped for cycle[123], some of them would now be merged.
Thus, adding -fno-ipa-icf would be the right fix.

Ready to be installed?
Thanks,
Martin
From 2facdc8b5730568ead389e7d4af8a4f6b04e9cbc Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 11 Jan 2017 11:46:14 +0100
Subject: [PATCH] Fix flatten-3.c test-case.

gcc/testsuite/ChangeLog:

2017-01-11  Martin Liska  

	* gcc.dg/tree-ssa/flatten-3.c: Add -fno-ipa-icf to dg-options.
---
 gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c b/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
index a1edb910e9d..153165c72e3 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/flatten-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options -O2 } */
+/* { dg-options -O2 -fno-ipa-icf } */
 
 extern void do_something_usefull();
 /* Check that we finish compiling even if instructed to
-- 
2.11.0
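
As a rough illustration of why a test that scans the assembler output for particular function names can be sensitive to identical code folding, here is a minimal sketch. The functions scale_a and scale_b are hypothetical and do not come from flatten-3.c. The assumption is only that two functions with equal bodies are candidates for merging, and that which of the two keeps its own body may depend on the order in which the pass processes them.

```cpp
#include <cassert>

// Hypothetical functions, not taken from flatten-3.c.  Their bodies are
// semantically equal, so a pass such as -fipa-icf may keep one body and
// turn the other function into an alias or a thunk of it.
int scale_a(int x) { return x * 3 + 1; }
int scale_b(int x) { return x * 3 + 1; }

// Callers see the same behaviour whichever body survives; only the emitted
// symbols and labels can differ, which is what an assembler scan looks at.
int use_both(int x) { return scale_a(x) + scale_b(x); }
```

Building the test with -fno-ipa-icf, as the patch above does, takes this variable out of the picture so that the scan depends only on the flattening behaviour under test.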



Re: [v3 PATCH] Reduce the size of variant, it doesn't need an index of type size_t internally.

2017-01-11 Thread Jonathan Wakely

On 11/01/17 10:29 +, Jonathan Wakely wrote:

On 11/01/17 00:19 +0200, Ville Voutilainen wrote:

@@ -1086,7 +1099,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 { return !this->_M_valid(); }

 constexpr size_t index() const noexcept
-  { return this->_M_index; }
+  {
+   if (this->_M_index ==
+   typename _Base::_Storage::__index_type(variant_npos))
+ return variant_npos;
+   return this->_M_index;


GCC doesn't seem to be smart enough to optimize the branch away here.


But that's only for 32-bit x86. It optimizes well for x86_64, so no
need to obfuscate it.




Re: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Jakub Jelinek
On Wed, Jan 11, 2017 at 10:31:33AM +, Koval, Julia wrote:
> Ok. I fixed the enum formatting and the enums remain internal.

No further objections from me, if Uros acks it, check it in.

> > Sure.  Plus it depends on if users of the APIs should just write the 
> > operands on their own as numbers, or as __SGX_E*, or as E*.
> > In the first case the patch sans formatting is reasonable, in the second 
> > case the enums should be moved to file scope, in the last case we have to 
> > live with the namespace pollution.
> > The pdf you've referenced in the thread doesn't list the _encls_u32 and
> > _enclu_u32 intrinsics, so I think it depends on what ICC does (if it has 
> > been shipped with such a support already, or on coordination with ICC if 
> > not).
> 
> Jakub, it is in accordance with ICC.
> So the first case will be used.

Jakub


RE: [PATCH] Enable SGX intrinsics

2017-01-11 Thread Koval, Julia
Ok. I fixed the enum formatting and the enums remain internal.

-Julia

-----Original Message-----
From: Andrew Senkevich [mailto:andrew.n.senkev...@gmail.com] 
Sent: Tuesday, January 10, 2017 5:48 PM
To: Uros Bizjak 
Cc: Koval, Julia ; GCC Patches 
; vaalfr...@gmail.com
Subject: Re: [PATCH] Enable SGX intrinsics

On Fri, Dec 30, 2016 at 03:37:14PM +0100, Uros Bizjak wrote:
>> As suggested in [1], you should write multi-line enums like:
>>
>> enum foo
>> {
>>   a = ...
>>   b = ...
>> }
>
> Sure.  Plus it depends on if users of the APIs should just write the operands 
> on their own as numbers, or as __SGX_E*, or as E*.
> In the first case the patch sans formatting is reasonable, in the second case 
> the enums should be moved to file scope, in the last case we have to live 
> with the namespace pollution.
> The pdf you've referenced in the thread doesn't list the _encls_u32 and
> _enclu_u32 intrinsics, so I think it depends on what ICC does (if it has been 
> shipped with such a support already, or on coordination with ICC if not).

Jakub, it is in accordance with ICC.
So the first case will be used.


--
WBR,
Andrew


0001-Enable-SGX.PATCH
Description: 0001-Enable-SGX.PATCH


Re: [PATCH 2/2] IPA ICF: make algorithm stable to survive -fcompare-debug

2017-01-11 Thread Martin Liška
On 01/11/2017 11:28 AM, Jakub Jelinek wrote:
> On Wed, Jan 11, 2017 at 11:21:08AM +0100, Christophe Lyon wrote:
>> Since then, I've noticed that
>>   gcc.dg/tree-ssa/flatten-3.c scan-assembler cycle[123][: \t\n]
>> now fails on aarch64 and arm targets.
> 
> It fails on x86_64-linux and i686-linux too.
> 
>   Jakub
> 

Sorry for the breakage, I'm going to fix that.

Martin


Re: [v3 PATCH] Reduce the size of variant, it doesn't need an index of type size_t internally.

2017-01-11 Thread Jonathan Wakely

On 11/01/17 00:19 +0200, Ville Voutilainen wrote:

@@ -1086,7 +1099,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  { return !this->_M_valid(); }

  constexpr size_t index() const noexcept
-  { return this->_M_index; }
+  {
+   if (this->_M_index ==
+   typename _Base::_Storage::__index_type(variant_npos))
+ return variant_npos;
+   return this->_M_index;


GCC doesn't seem to be smart enough to optimize the branch away here.
Something like this would avoid it:

 constexpr size_t index() const noexcept
 {
   using __index_type = typename _Base::_Storage::__index_type;
   return size_t(__index_type(this->_M_index + 1)) - 1;
 }

But we can worry about that later.


+  }

  void
  swap(variant& __rhs)
diff --git a/libstdc++-v3/testsuite/20_util/variant/index_type.cc 
b/libstdc++-v3/testsuite/20_util/variant/index_type.cc
new file mode 100644
index 000..b7f3a7b
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/variant/index_type.cc
@@ -0,0 +1,24 @@
+// { dg-options "-std=gnu++17" }
+// { dg-do compile { target x86_64-*-* powerpc*-*-*} }


Please add a space after the second target (it seems to work anyway,
but still better to add it).

OK for trunk with that space character added, thanks.
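
The branch-free conversion suggested above can be checked in isolation. This is a sketch under stated assumptions, not the libstdc++ implementation: index_type, npos, widen_with_branch and widen_branchless are hypothetical stand-ins for the internal __index_type, variant_npos and the two candidate bodies of index().

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the library internals discussed above.
using index_type = unsigned char;
constexpr std::size_t npos = static_cast<std::size_t>(-1);

// Version with an explicit comparison: map the narrow "no value" marker to
// the wide one, and pass every other index through unchanged.
constexpr std::size_t widen_with_branch(index_type stored)
{
  return stored == static_cast<index_type>(npos) ? npos : stored;
}

// Branch-free version: adding 1 makes the narrow marker wrap to 0 once it is
// converted back to index_type; subtracting 1 after widening then wraps 0 to
// the wide marker and restores every other index.
constexpr std::size_t widen_branchless(index_type stored)
{
  return static_cast<std::size_t>(static_cast<index_type>(stored + 1)) - 1;
}

static_assert(widen_branchless(0) == 0, "first alternative");
static_assert(widen_branchless(3) == 3, "ordinary index is preserved");
static_assert(widen_branchless(static_cast<index_type>(npos)) == npos,
              "narrow marker widens to npos");
static_assert(widen_branchless(7) == widen_with_branch(7),
              "both versions agree");
```

Both functions return the same result for every possible stored value. The second one relies only on unsigned wrap-around, which is well defined, so it needs no comparison.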


