Re: Forward list default default and move constructors

2017-07-12 Thread François Dumont

On 05/07/2017 18:10, Jonathan Wakely wrote:

On 19/06/17 22:48 +0200, François Dumont wrote:

Hi

   Here is the patch to default the default and move constructors on 
the std::forward_list. Putting a move constructor on 
_Fwd_list_node_base helped limiting the code impact of this patch. It 
doesn't have any side effect as iterator types using this base type 
are not defining any move semantic.


I don't understand this comment.

1) The iterators only _Fwd_list_node_base* pointers, so that's why
they aren't affected. It's not because the iterators don't define move
semantics.

2) The iterators *do* have move semantics, they have
implicitly-declared move operations, which are identical to their
implicitly-defined copy operations (because moving a pointer is
identical to copying it).

3) Adding this move constructor has a pretty large side effect because
now its copy constructor and copy assignment operator are defined as
deleted, and it has no move assignment operator. That's OK, because we
never copy or move nodes (except in the new _Fwd_list_impl move ctor
you're adding). But it's a significant side effect. Please consider
adding the following to make those side effects explicit:

 _Fwd_list_node_base(const _Fwd_list_node_base&) = delete;
 _Fwd_list_node_base& operator=(const _Fwd_list_node_base&) = delete;
 _Fwd_list_node_base& operator=(_Fwd_list_node_base&&) = delete;


Yes, sorry, my comment was indeed wrong. I should have limit it to say 
that I was not seeing real side effect considering current code. I added 
those explicit special deleted members.





   I also took the time to optimize the move constructor with 
allocator when allocator is always equal. It avoids initializing an 
empty forward list for nothing.


   I think it is fine but could we have an abi issue because of the 
change in forward_list.tcc ?


Old code with undefined references to that constructor will still find
a definition in new code that explicitly instantiates a forward_list.

New code compiled after your change would not find the new
constructors (the ones with true_type and false_type parameters) in
old code that explicitly instantiated a forward_list.

Could you split that part of the change into a separate patch? The
changes to define constructors as defaulted are OK, so I'd like to
considere the proposed optimisation separately 


Done in attached patch, ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/forward_list.h b/libstdc++-v3/include/bits/forward_list.h
index f319b7f..6a0e54b 100644
--- a/libstdc++-v3/include/bits/forward_list.h
+++ b/libstdc++-v3/include/bits/forward_list.h
@@ -53,6 +53,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   struct _Fwd_list_node_base
   {
 _Fwd_list_node_base() = default;
+_Fwd_list_node_base(_Fwd_list_node_base&& __x) noexcept
+  : _M_next(__x._M_next)
+{ __x._M_next = nullptr; }
+
+_Fwd_list_node_base(const _Fwd_list_node_base&) = delete;
+_Fwd_list_node_base& operator=(const _Fwd_list_node_base&) = delete;
+_Fwd_list_node_base& operator=(_Fwd_list_node_base&&) = delete;
 
 _Fwd_list_node_base* _M_next = nullptr;
 
@@ -283,17 +290,16 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   {
 _Fwd_list_node_base _M_head;
 
-_Fwd_list_impl()
-: _Node_alloc_type(), _M_head()
-{ }
+	_Fwd_list_impl()
+	  noexcept( noexcept(_Node_alloc_type()) )
+	: _Node_alloc_type()
+	{ }
 
-_Fwd_list_impl(const _Node_alloc_type& __a)
-: _Node_alloc_type(__a), _M_head()
-{ }
+	_Fwd_list_impl(_Fwd_list_impl&&) = default;
 
-_Fwd_list_impl(_Node_alloc_type&& __a)
-	: _Node_alloc_type(std::move(__a)), _M_head()
-{ }
+	_Fwd_list_impl(_Node_alloc_type&& __a)
+	: _Node_alloc_type(std::move(__a))
+	{ }
   };
 
   _Fwd_list_impl _M_impl;
@@ -311,20 +317,14 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _M_get_Node_allocator() const noexcept
   { return this->_M_impl; }
 
-  _Fwd_list_base()
-  : _M_impl() { }
+  _Fwd_list_base() = default;
 
   _Fwd_list_base(_Node_alloc_type&& __a)
   : _M_impl(std::move(__a)) { }
 
   _Fwd_list_base(_Fwd_list_base&& __lst, _Node_alloc_type&& __a);
 
-  _Fwd_list_base(_Fwd_list_base&& __lst)
-  : _M_impl(std::move(__lst._M_get_Node_allocator()))
-  {
-	this->_M_impl._M_head._M_next = __lst._M_impl._M_head._M_next;
-	__lst._M_impl._M_head._M_next = 0;
-  }
+  _Fwd_list_base(_Fwd_list_base&&) = default;
 
   ~_Fwd_list_base()
   { _M_erase_after(&_M_impl._M_head, 0); }
@@ -436,10 +436,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   /**
*  @brief  Creates a %forward_list with no elements.
*/
-  forward_list()
-  noexcept(is_nothrow_default_constructible<_Node_alloc_type>::value)
-  : _Base()
-  { }
+  forward_list() = default;
 
   /**
*  @brief  Creates a %forward_list with no elements.
@@ -532,15 +529,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   /**
*  

Re: [RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-12 Thread Andrew Pinski
On Wed, Jul 12, 2017 at 9:10 PM, Marc Glisse  wrote:
> On Wed, 12 Jul 2017, Andrew Pinski wrote:
>
>> Hi,
>>  Unlike most other expressions, BIT_INSERT_EXPR has an implicit
>> operand of the precision/size of the second operand.  This means if we
>> have an integer constant for the second operand and that compares to
>> the same constant value, vn_nary_op_eq would return that these two
>> expressions are the same.  But in the case I was looking into the
>> integer constants had different types, one with 1 bit precision and
>> the other with 2 bit precision which means the BIT_INSERT_EXPR were
>> not equal at all.
>>
>> This patches the problem by checking to see if BIT_INSERT_EXPR's
>> operand 1's (second operand) type  has different precision to return
>> false.
>>
>> Is this the correct location or should we be checking for this
>> differently?  If this is the correct location, is the patch ok?
>> Bootstrapped and tested on aarch64-linux-gnu with no regressions (and
>> also tested with a few extra patches to expose BIT_INSERT_EXPR).
>>
>> Thanks,
>> Andrew Pinski
>>
>> ChangeLog:
>> * tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's operand 1
>> to see if the types precision matches.
>
>
> Hello,
>
> since BIT_INSERT_EXPR is implicitly an expression with 4 arguments, it makes
> sense that we may need a few such special cases. But shouldn't the hash
> function be in sync with the equality comparator? Does operand_equal_p need
> the same?

The hash function does not need to be exactly the same.  The only
requirement there is if vn_nary_op_eq returns true then the hash has
to be the same.  Now we could improve the hash by using the precision
which will allow us not to compare as much in some cases.

Yes operand_equal_p needs the same handling; I did not notice that
until you mention it..
Right now it does:
case BIT_INSERT_EXPR:
  return OP_SAME (0) && OP_SAME (1) && OP_SAME (2);

Thanks,
Andrew Pinski

>
> --
> Marc Glisse


Re: [RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-12 Thread Marc Glisse

On Wed, 12 Jul 2017, Andrew Pinski wrote:


Hi,
 Unlike most other expressions, BIT_INSERT_EXPR has an implicit
operand of the precision/size of the second operand.  This means if we
have an integer constant for the second operand and that compares to
the same constant value, vn_nary_op_eq would return that these two
expressions are the same.  But in the case I was looking into the
integer constants had different types, one with 1 bit precision and
the other with 2 bit precision which means the BIT_INSERT_EXPR were
not equal at all.

This patches the problem by checking to see if BIT_INSERT_EXPR's
operand 1's (second operand) type  has different precision to return
false.

Is this the correct location or should we be checking for this
differently?  If this is the correct location, is the patch ok?
Bootstrapped and tested on aarch64-linux-gnu with no regressions (and
also tested with a few extra patches to expose BIT_INSERT_EXPR).

Thanks,
Andrew Pinski

ChangeLog:
* tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's operand 1
to see if the types precision matches.


Hello,

since BIT_INSERT_EXPR is implicitly an expression with 4 arguments, it 
makes sense that we may need a few such special cases. But shouldn't the 
hash function be in sync with the equality comparator? Does 
operand_equal_p need the same?


--
Marc Glisse


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-12 Thread Jeff Law
On 07/12/2017 06:31 PM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Jul 11, 2017 at 03:19:57PM -0600, Jeff Law wrote:
>> It introduces a new style of stack probing -fstack-check=clash,
>> including documentation of the new option, how it differs from
>> -fstack-check=specific, etc.
>>
>> FWIW -fstack-check=specific is dreadfully named.  I haven't tried to
>> address that.
> 
> -fstack-check=clash is itself not such a super name either.  It's not
> checking stack, and it isn't clashing: it just does a store to every
> page the stack will touch (minus the first and last or something).
Yea.  I don't particularly like it either.  Suggestions?  I considered
"probe" as well, but "specific" also does probing.  In the end I used
the part of the marketing name of the exploits.

I also considered trying to separate the fact that we are doing stack
probing from the exact method of probing.  So something like:

-fstack-check -fstack-check-probe=frob

And the current default would map to

-fstack-check -fstack-check-probe=ahead 

I didn't particularly like that either and using something like
-fstack-check -fstack-check-probe=frob is a bit cumbersome.  It's also
the case that we're changing two key aspects relative to
-fstack-check=specific.

1. We never probe ahead of need.  ie, if a function requests N bytes of
stack space, then we'll never probe beyond N bytes.  In contrast
-fstack-check=specific will tend to probe 2-3 pages beyond the N byte
request.

2. We probe as each page is allocated.  in contrast most
-fstack-check=specific implementations allocate all the space, then
probe into the space.

THe concept I keep coming back to is that we're changing probing policy.
 as-needed vs ahead-of-need for how much to probe.  And "as-allocated"
rather than "after-allocation" for when we emit the probe.

Certainly open to ideas on the interface aspects.

> 
> How does this work for targets that want to enable this by default?  How
> does that interact with checking for stack size overflow?
I don't currently have a way to enable it by default -- for my tests I
just slam the value I want into the initializer in common.opt :-)

It's independent of stack size overflow checking.  They could (in
theory) even co-exist on ports that support stack size overflow
checking, but I didn't test that.

> 
>> However, a sufficiently smart compiler could realize that a call to a
>> noreturn function could be converted into a jump, even if the caller has
>> a frame because that frame need not be torn down.
> 
> Currently GCC will never make a tail call to a noreturn function (see
> calls.c:can_implement_as_sibling_call_p).  It would be nice (for code
> size, if nothing else) if that was changed.  But you know all this :-)
:-)  Initially when I started looking at this issue I fully expected to
need to explicitly reject tail calls to noreturn functions to make
things safe.  Imagine my surprise when I found out that we look at the
preds of the exit block to find opportunities.  Which obviously doesn't
work with noreturn calls.

On the theory that someone might want to change that one day, there is a
test in the suite that will complain if we have a tail optimized call to
a noreturn function.  At the least it will spark a discussion about the
validity of such an optimization in a world where jumping the guard is a
concern.


> 
>> @@ -11333,7 +11334,8 @@ target support in the compiler but comes with the 
>> following drawbacks:
>>  @enumerate
>>  @item
>>  Modified allocation strategy for large objects: they are always
>> -allocated dynamically if their size exceeds a fixed threshold.
>> +allocated dynamically if their size exceeds a fixed threshold.  Note this
>> +may change the semantics of some code.
> 
> How so?  Semantics of valid (portable) code, as well?  It would be nice
> to have some detail here, not just a threat :-)
I'd have to go back and run the testsuite with generic checking enabled
by default.  I didn't dig deeply (because generic testing just isn't
interesting for so many reasons).  But what happened was we had a large
auto object, which gets turned into an alloca'd object with generic
checking.

We created that alloca'd object at the wrong lexical scope which mucked
up its expected persistence.  I'm sure I'd spot it trivially once I set
up the test again.

> 
>> +@samp{clash} is designed to prevent jumping the stack guard page as seen in
>> +stack clash style attacks.  However, it does not guarantee sufficient stack
>> +space to handle a signal in the event that the program hits the stack guard
>> +and is thus incompatible with Ada's needs.
> 
> Is there some way we could have both?
With kernel support, yes.  The kernel would set up a reserved stack area
contiguous with the guard and the two areas would move in unison.  Once
they're unable to move, the kernel could map in the reserved page(s) and
trigger the signal handler.


> 
> In principle stack-clash protection is completely orthogonal to
> -fstack-check.
To some degree,

Re: [gofrontend-dev] [PATCH] PR81393: S/390: libgo: Fix ptrace register set accessors.

2017-07-12 Thread Ian Lance Taylor
On Wed, Jul 12, 2017 at 7:44 AM, Andreas Krebbel
 wrote:
>
> ptrace SETREGS and GETREGS were never supported on S/390.  The macros
> were accidentally defined in the Glibc header though.  A recent Glibc
> change removed them breaking libgo build on S/390.
>
> This patch changes the ptrace calls to use PEEKUSR_AREA/POKEUSR_AREA to
> access the register sets.  That's what GDB does.
>
> Bootstrapped and regression tested on s390x.

Thanks.

Committed to trunk.

Ian


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 02/08

2017-07-12 Thread Segher Boessenkool
On Tue, Jul 11, 2017 at 03:20:12PM -0600, Jeff Law wrote:
>   * conifg/mips/mips.c (mips_expand_prologue): Likewise.

Typo ("conifg").

> --- a/gcc/defaults.h
> +++ b/gcc/defaults.h
> @@ -1408,8 +1408,11 @@ see the files COPYING3 and COPYING.RUNTIME 
> respectively.  If not, see
>  #endif
>  
>  /* The default is not to move the stack pointer.  */
> +/* The default is not to move the stack pointer, unless we are using
> +   stack clash prevention stack checking.  */
>  #ifndef STACK_CHECK_MOVING_SP
> -#define STACK_CHECK_MOVING_SP 0
> +#define STACK_CHECK_MOVING_SP\
> +  (flag_stack_check == STACK_CLASH_BUILTIN_STACK_CHECK)
>  #endif

Missing space before that backslash.

The documentation for STACK_CHECK_CONFIG_SP needs updating (its default
is no longer zero, for one).

I don't really see why this is so complicated, and why the rs6000
target changes (a later patch) are so big.  Why isn't it just simple
patches to allocate_stack (and the prologue thing), that check the
flag and if it is set do some probes?

I'll go sleep now, maybe I'll see it in the morning :-)


Segher


[RFC/SCCVN] Handle BIT_INSERT_EXPR in vn_nary_op_eq

2017-07-12 Thread Andrew Pinski
Hi,
  Unlike most other expressions, BIT_INSERT_EXPR has an implicit
operand of the precision/size of the second operand.  This means if we
have an integer constant for the second operand and that compares to
the same constant value, vn_nary_op_eq would return that these two
expressions are the same.  But in the case I was looking into the
integer constants had different types, one with 1 bit precision and
the other with 2 bit precision which means the BIT_INSERT_EXPR were
not equal at all.

This patches the problem by checking to see if BIT_INSERT_EXPR's
operand 1's (second operand) type  has different precision to return
false.

Is this the correct location or should we be checking for this
differently?  If this is the correct location, is the patch ok?
Bootstrapped and tested on aarch64-linux-gnu with no regressions (and
also tested with a few extra patches to expose BIT_INSERT_EXPR).

Thanks,
Andrew Pinski

ChangeLog:
* tree-ssa-sccvn.c (vn_nary_op_eq): Check BIT_INSERT_EXPR's operand 1
to see if the types precision matches.
Index: tree-ssa-sccvn.c
===
--- tree-ssa-sccvn.c(revision 250159)
+++ tree-ssa-sccvn.c(working copy)
@@ -2636,6 +2636,14 @@ vn_nary_op_eq (const_vn_nary_op_t const
 if (!expressions_equal_p (vno1->op[i], vno2->op[i]))
   return false;
 
+  /* BIT_INSERT_EXPR has an implict operand as the type precision
+ of op1.  Need to check to make sure they are the same.  */
+  if (vno1->opcode == BIT_INSERT_EXPR)
+if (INTEGRAL_TYPE_P (TREE_TYPE (vno1->op[0]))
+   && TYPE_PRECISION (TREE_TYPE (vno1->op[1]))
+   != TYPE_PRECISION (TREE_TYPE (vno2->op[1])))
+  return false;
+
   return true;
 }
 


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08

2017-07-12 Thread Segher Boessenkool
Hi!

On Tue, Jul 11, 2017 at 03:19:57PM -0600, Jeff Law wrote:
> It introduces a new style of stack probing -fstack-check=clash,
> including documentation of the new option, how it differs from
> -fstack-check=specific, etc.
> 
> FWIW -fstack-check=specific is dreadfully named.  I haven't tried to
> address that.

-fstack-check=clash is itself not such a super name either.  It's not
checking stack, and it isn't clashing: it just does a store to every
page the stack will touch (minus the first and last or something).

How does this work for targets that want to enable this by default?  How
does that interact with checking for stack size overflow?

> However, a sufficiently smart compiler could realize that a call to a
> noreturn function could be converted into a jump, even if the caller has
> a frame because that frame need not be torn down.

Currently GCC will never make a tail call to a noreturn function (see
calls.c:can_implement_as_sibling_call_p).  It would be nice (for code
size, if nothing else) if that was changed.  But you know all this :-)

> @@ -11333,7 +11334,8 @@ target support in the compiler but comes with the 
> following drawbacks:
>  @enumerate
>  @item
>  Modified allocation strategy for large objects: they are always
> -allocated dynamically if their size exceeds a fixed threshold.
> +allocated dynamically if their size exceeds a fixed threshold.  Note this
> +may change the semantics of some code.

How so?  Semantics of valid (portable) code, as well?  It would be nice
to have some detail here, not just a threat :-)

> +@samp{clash} is designed to prevent jumping the stack guard page as seen in
> +stack clash style attacks.  However, it does not guarantee sufficient stack
> +space to handle a signal in the event that the program hits the stack guard
> +and is thus incompatible with Ada's needs.

Is there some way we could have both?

In principle stack-clash protection is completely orthogonal to
-fstack-check.

>/* Check the stack and entirely rely on the target configuration
> - files, i.e. do not use the generic mechanism at all.  */
> + files, i.e. do not use the generic mechanism at all.  This
> + does not prevent stack guard jumping and stack clash style
> + attacks.  */
>FULL_BUILTIN_STACK_CHECK
>  };

> +  else if (!strcmp (arg, "clash"))
> + {
> +   /* This is the stack checking method, designed to prevent
> +  stack-clash attacks.  */
> +   if (!STACK_GROWS_DOWNWARD)
> + sorry ("-fstack-check=clash not implemented on this target");
> +   else
> + opts->x_flag_stack_check = (STACK_CHECK_BUILTIN
> + ? FULL_BUILTIN_STACK_CHECK
> + : (STACK_CHECK_STATIC_BUILTIN
> +? STACK_CLASH_BUILTIN_STACK_CHECK
> +: GENERIC_STACK_CHECK));
> + }

So targets that define STACK_CHECK_BUILTIN (spu and alpha) do not get
stack clash protection if you asked for it specifically, without warning,
if I read that correctly?

> +# Return 1 if the target has support for stack probing designed
> +# to avoid stack-clash style attacks
> +#
> +# This is used to restrict the stack-clash mitigation tests to
> +# just those targets that have been explicitly supported

Missing full stop, twice.  More later in the file.

> +proc check_effective_target_stack_clash_protected { } {

The name is maybe not so great: nothing is protected until you actually
use the option.  "supported", maybe?

> +# Return 1 if the target's calling sequence or its ABI
> +# create implicit stack probes at *sp at function
> +# entry.
> +proc check_effective_target_caller_implicit_probes { } {

"At function entry" isn't really true for Power ("when setting up a
stack frame", instead -- and you are required to set one up before
calling anything).

> +# The stack realignment often causes residual stack allocations and
> +# can also make it difficult to elide loops or residual allocations
> +# for dynamic allocations
> +proc check_effective_target_callee_realigns_stack { } {

Does this return 1 if the callee always realigns the stack?  Or maybe
realigns the stack?


Segher


Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-12 Thread Michael Meissner
Hmmm, I didn't realize that gcc 6.x also supported __builtin_cpu_*.  I imagine
we will need backports there as well.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH, rs6000] Add support for vec_extract_fp_from_shorth() and vec_extract_fp_from_short

2017-07-12 Thread Carl Love
GCC Maintainers:

The following patch adds support for the vec_extract_fp_from_shorth()
and vec_extract_fp_from_short builtin functions. The patch has been
tested on powerpc64le-unknown-linux-gnu (Power 8 LE) and
powerpc64le-unknown-linux-gnu (Power 9 LE).  The test generates 1
unsupported test on Power 8 and 2 test passes on Power 9.

Please let me know if the following patch is acceptable.  Thanks.

Carl Love



gcc/ChangeLog:

2017-07-12  Carl Love  

* config/rs6000/rs6000-c.c: Add support for built-in functions
vector float vec_extract_fp32_from_shorth (vector unsigned short);
vector float vec_extract_fp32_from_shortl (vector unsigned short);
* config/rs6000/altivec.h (vec_extract_fp_from_shorth,
vec_extract_fp_from_shortl): Add defines for the two builtins.
* config/rs6000/rs6000-builtin.def (VEXTRACT_FP_FROM_SHORTH,
VEXTRACT_FP_FROM_SHORTL): Add BU_P9V_OVERLOAD_1 and BU_P9V_VSX_1
new builtins.
* config/rs6000/vsx.md(vsx_xvcvhpsp): Add define_insn.
(vextract_fp_from_shorth, vextract_fp_from_shortl): Add define_expands.
* doc/extend.texi: Update the built-in documentation file for the
new built-in function.

gcc/testsuite/ChangeLog:
2017-07-12  Carl Love  

* gcc.target/powerpc/builtins-3-p9-runnable.c: Add new test file for
the new built-ins.
---
 gcc/config/rs6000/altivec.h|  3 +
 gcc/config/rs6000/rs6000-builtin.def   |  5 ++
 gcc/config/rs6000/rs6000-c.c   |  5 ++
 gcc/config/rs6000/vsx.md   | 70 +-
 gcc/doc/extend.texi|  3 +
 .../gcc.target/powerpc/builtins-3-p9-runnable.c| 36 +++
 6 files changed, 121 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/builtins-3-p9-runnable.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 71cdca5..4d34a97 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -449,6 +449,9 @@
 #define vec_insert_exp __builtin_vec_insert_exp
 #define vec_test_data_class __builtin_vec_test_data_class
 
+#define vec_extract_fp_from_shorth __builtin_vec_vextract_fp_from_shorth
+#define vec_extract_fp_from_shortl __builtin_vec_vextract_fp_from_shortl
+
 #define scalar_extract_exp __builtin_vec_scalar_extract_exp
 #define scalar_extract_sig __builtin_vec_scalar_extract_sig
 #define scalar_insert_exp __builtin_vec_scalar_insert_exp
diff --git a/gcc/config/rs6000/rs6000-builtin.def 
b/gcc/config/rs6000/rs6000-builtin.def
index e098e1c..400189e 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -2057,6 +2057,9 @@ BU_P9V_OVERLOAD_1 (VSTDCNSP,  "scalar_test_neg_sp")
 
 BU_P9V_OVERLOAD_1 (REVB,   "revb")
 
+BU_P9V_OVERLOAD_1 (VEXTRACT_FP_FROM_SHORTH, "vextract_fp_from_shorth")
+BU_P9V_OVERLOAD_1 (VEXTRACT_FP_FROM_SHORTL, "vextract_fp_from_shortl")
+
 /* ISA 3.0 vector scalar overloaded 2 argument functions.  */
 BU_P9V_OVERLOAD_2 (VSIEDP, "scalar_insert_exp")
 
@@ -2074,6 +2077,8 @@ BU_P9V_VSX_1 (VEEDP, "extract_exp_dp", CONST, xvxexpdp)
 BU_P9V_VSX_1 (VEESP, "extract_exp_sp", CONST, xvxexpsp)
 BU_P9V_VSX_1 (VESDP, "extract_sig_dp", CONST, xvxsigdp)
 BU_P9V_VSX_1 (VESSP, "extract_sig_sp", CONST, xvxsigsp)
+BU_P9V_VSX_1 (VEXTRACT_FP_FROM_SHORTH, "vextract_fp_from_shorth", CONST, 
vextract_fp_from_shorth)
+BU_P9V_VSX_1 (VEXTRACT_FP_FROM_SHORTL, "vextract_fp_from_shortl", CONST, 
vextract_fp_from_shortl)
 
 /* 2 argument vsx vector functions added in ISA 3.0 (power9).  */
 BU_P9V_VSX_2 (VIEDP, "insert_exp_dp", CONST, xviexpdp)
diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index c769442..a1d09ba 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -5164,6 +5164,11 @@ const struct altivec_builtin_types 
altivec_overloaded_builtins[] = {
   { P9V_BUILTIN_VEC_VEXTRACT4B, P9V_BUILTIN_VEXTRACT4B,
 RS6000_BTI_INTDI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_UINTSI, 0 },
 
+  { P9V_BUILTIN_VEC_VEXTRACT_FP_FROM_SHORTH, 
P9V_BUILTIN_VEXTRACT_FP_FROM_SHORTH,
+RS6000_BTI_V4SF, RS6000_BTI_unsigned_V8HI, 0, 0 },
+  { P9V_BUILTIN_VEC_VEXTRACT_FP_FROM_SHORTL, 
P9V_BUILTIN_VEXTRACT_FP_FROM_SHORTL,
+RS6000_BTI_V4SF, RS6000_BTI_unsigned_V8HI, 0, 0 },
+
   { P9V_BUILTIN_VEC_VEXTULX, P9V_BUILTIN_VEXTUBLX,
 RS6000_BTI_INTQI, RS6000_BTI_UINTSI,
 RS6000_BTI_V16QI, 0 },
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2ddfae5..573eb3f 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -326,6 +326,7 @@
UNSPEC_VSX_CVDPSXWS
UNSPEC_VSX_CVDPUXWS
UNSPEC_VSX_CVSPDP
+   UNSPEC_VSX_CVHPSP
UNSPEC_VSX_CVSPDPN
UNSPEC_VSX_CVDPSPN
UNSPEC_VSX_CVSXWDP
@@ -367,6 +368,8 @@
UNSPEC_VSX_SIEXPDP
UNSPEC_VSX_SCMPEXPDP
UNSPEC_VSX_S

Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

2017-07-12 Thread Segher Boessenkool
On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
> Examples of implicit probes include

>   2. ABI mandates that *sp always contain a backchain pointer (ppc)

In the ELFv2 ABI a backchain is not required.  GCC still always has
one afaik.  I'll find out more.

> To get a sense of overhead, just 1.5% of routines in glibc need probing
> in their prologues (x86) in the testing I performed.  IIRC each and
> every one of those routines needed just 1-4 inlined probes.
> 
> Significantly more functions need alloca space probed (IIRC ~5%), but
> given the amazingly inefficient alloca code, I can't believe anyone will
> ever notice the probing overhead.

That is quite a lot of functions IMO, but it's just one stor per page
(or per alloca), and supposedly you'll store to that stack anyway (or
it is stupid slow code in the first place).  Did you measure any real
timings?


Segher


Re: Fix riscv port breakage.

2017-07-12 Thread Andrew Waterman
Thank you for the fix and the cleanup, Jeff!


Fix riscv port breakage.

2017-07-12 Thread Jeff Law
My tester has been complaining regularly for a little while WRT the
riscv port failing to build due minor header file goof's.

While just reordering the headers is sufficient to fix one of the two
problems, I took the opportunity to remove > 50 unnecessary includes
from riscv.c.

The second problem was just a missed #include which is trivially fixed.

My tester has verified that riscv32 and riscv64 ports build through
libgcc with this change.  Installed on the trunk.



* config/riscv/riscv.c: Remove unnecessary includes.  Reorder
remaining includes slightly.
* config/riscv/riscv-builtins.c: Include profile-count.h.



diff --git a/gcc/config/riscv/riscv-builtins.c 
b/gcc/config/riscv/riscv-builtins.c
index 626a6a33f99..1311fee6f70 100644
--- a/gcc/config/riscv/riscv-builtins.c
+++ b/gcc/config/riscv/riscv-builtins.c
@@ -27,6 +27,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-expr.h"
 #include "memmodel.h"
 #include "expmed.h"
+#include "profile-count.h"
 #include "optabs.h"
 #include "recog.h"
 #include "diagnostic-core.h"
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 220de4b2cba..57b2edbcb43 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -25,86 +25,30 @@ along with GCC; see the file COPYING3.  If not see
 #include "tm.h"
 #include "rtl.h"
 #include "regs.h"
-#include "hard-reg-set.h"
 #include "insn-config.h"
-#include "conditions.h"
 #include "insn-attr.h"
 #include "recog.h"
 #include "output.h"
-#include "hash-set.h"
-#include "machmode.h"
-#include "vec.h"
-#include "double-int.h"
-#include "input.h"
 #include "alias.h"
-#include "symtab.h"
-#include "wide-int.h"
-#include "inchash.h"
 #include "tree.h"
-#include "fold-const.h"
 #include "varasm.h"
-#include "stringpool.h"
 #include "stor-layout.h"
 #include "calls.h"
 #include "function.h"
-#include "hashtab.h"
-#include "flags.h"
-#include "statistics.h"
-#include "real.h"
-#include "fixed-value.h"
-#include "expmed.h"
-#include "dojump.h"
 #include "explow.h"
 #include "memmodel.h"
 #include "emit-rtl.h"
-#include "stmt.h"
-#include "expr.h"
-#include "insn-codes.h"
-#include "optabs.h"
-#include "libfuncs.h"
 #include "reload.h"
 #include "tm_p.h"
-#include "ggc.h"
-#include "gstab.h"
-#include "hash-table.h"
-#include "debug.h"
 #include "target.h"
 #include "target-def.h"
-#include "common/common-target.h"
-#include "langhooks.h"
-#include "dominance.h"
-#include "profile-count.h"
-#include "cfg.h"
-#include "cfgrtl.h"
-#include "cfganal.h"
-#include "lcm.h"
-#include "cfgbuild.h"
-#include "cfgcleanup.h"
-#include "predict.h"
 #include "basic-block.h"
+#include "expr.h"
+#include "optabs.h"
 #include "bitmap.h"
-#include "regset.h"
 #include "df.h"
-#include "sched-int.h"
-#include "tree-ssa-alias.h"
-#include "internal-fn.h"
-#include "gimple-fold.h"
-#include "tree-eh.h"
-#include "gimple-expr.h"
-#include "is-a.h"
-#include "gimple.h"
-#include "gimplify.h"
 #include "diagnostic.h"
-#include "target-globals.h"
-#include "opts.h"
-#include "tree-pass.h"
-#include "context.h"
-#include "hash-map.h"
-#include "plugin-api.h"
-#include "ipa-ref.h"
-#include "cgraph.h"
 #include "builtins.h"
-#include "rtl-iter.h"
 
 /* True if X is an UNSPEC wrapper around a SYMBOL_REF or LABEL_REF.  */
 #define UNSPEC_ADDRESS_P(X)\


Re: [PATCH, rs6000] Do not do gimple-folding of expressions that are missing their LHS

2017-07-12 Thread Segher Boessenkool
On Wed, Jul 12, 2017 at 01:34:01PM -0500, Bill Schmidt wrote:
> Although I said this was invalid code, it really isn't -- it's legal code.  
> It's more of an ice-on-stupid-code situation. :)  So probably you should 
> remove the "invalid" language from the commentary.  Sorry for misleading you.

We could fold this to nothing (if there are no side effects), but it
would be better if we made stupid code slower instead of faster ;-)


Segher


Re: [PATCH, rs6000] Do not do gimple-folding of expressions that are missing their LHS

2017-07-12 Thread Segher Boessenkool
On Wed, Jul 12, 2017 at 11:45:07AM -0500, Will Schmidt wrote:
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 10c5521..e21b56f 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16297,6 +16297,9 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
>  = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
>tree arg0, arg1, lhs;
>  
> +  /*  Generic solution to prevent gimple folding of code without a LHS.  */

Only one space after /* please.

> +  if (!gimple_call_lhs (stmt)) return false;

I know I typed that code, but "return" should be on a new line, indented.
Sorry :-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c
> @@ -0,0 +1,24 @@
> +/* This test is meant to verify that the gimple-folding does not
> +occur when the LHS portion of an expression is missing.
> +Though we would consider this invalid code, this should not generate an ICE.
> +This was noticed during debug of PR81317.  */

We usually indent comments, so three spaces at the start of each of the
last three lines.  Not that testcases really have to follow the GNU coding
style.  This comment is nicely informative in either case, thanks :-)

The patch is fine with the typographical stuff fixed (and Bill's comment).


Segher


Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-12 Thread Michael Meissner
On Wed, Jul 12, 2017 at 04:07:42PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Wed, Jul 12, 2017 at 11:38:11AM -0400, Michael Meissner wrote:
> > I also changed the current warning in target_clones handling to be an error
> > instead of a warning, since it really makes no sense to use target_clones 
> > if we
> > can't generate a resolver function.
> 
> Okay.  Another option is to then always use the default, but this
> certainly is easier; let's hope we can get away with it.

We can always change it, but it needs to make sure the user doesn't get two
warnings then (one for target_clones, and a second one for
__builtin_cpu_supports).

> > (rs6000_get_function_versions_dispatcher): Change the warnging
> 
> Typo.

Thanks.

> > --- gcc/config/rs6000/rs6000-c.c
> > (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
> > (revision 250063)
> > +++ gcc/config/rs6000/rs6000-c.c(.../gcc/config/rs6000) (working copy)
> > @@ -644,6 +644,10 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
> >  builtin_define ("__FLOAT128_HARDWARE__");
> >if (TARGET_LONG_DOUBLE_128 && FLOAT128_IBM_P (TFmode))
> >  builtin_define ("__ibm128=long double");
> > +#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
> > +  builtin_define ("__BUILTIN_CPU_SUPPORTS__");
> > +  builtin_define ("__BUILTIN_CPU_IS__");
> > +#endif
> 
> Is it useful to have two defines?  They always are enabled at the same
> time afaics?  Thinking of a good name isn't so easy of course, but maybe
> just __BUILTIN_CPU_SUPPORTS__ is fine always.

Yeah, I was wondering about that.  I'll delete __BUILTIN_CPU_IS__, since I find
that to be less useful.

> 
> Or do other architectures already have both?

The only other architecture that supports __builtin_cpu_{init,is,supports} is
the X86, and they don't have the macros.  But then they also don't have the
issue that you can't do a lot in the ifunc resolver.

> > +The @code{__builtin_cpu_is} function requires GLIBC 2.23 or newer in
> > +order to export the hardware capability bits.  The
> > +@code{__builtin_cpu_is} function will return 0 if the compiler was
> > +configured for an earlier GCC.  GCC defines the macro
> > +@code{__BUILTIN_CPU_IS__} if you can use the @code{__builtin_cpu_is}
> > +built-in function.
> 
> "if you can use it" is not very clear (the line before you said you
> always can use it, it just returns 0).

I will reword it.

> Okay for trunk and 7 with those things taken care of.  Thanks!

Thanks.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH][RFA/RFC] Stack clash mitigation patch 07/08

2017-07-12 Thread Jeff Law
On 07/12/2017 06:47 AM, Trevor Saunders wrote:
> On Tue, Jul 11, 2017 at 08:02:26PM -0600, Jeff Law wrote:
>> On 07/11/2017 06:20 PM, Wilco Dijkstra wrote:
>>> Jeff Law wrote:
 aarch64 is the first target that does not have any implicit probes in
 the caller.  Thus at prologue entry it must make conservative
 assumptions about the offset of the most recent probed address relative
 to the stack pointer.
>>>
>>> No - like I mentioned before that's not correct nor acceptable as it would 
>>> imply
>>> that ~70% of functions need a probe at entry. I did a quick run across SPEC 
>>> and
>>> found the outgoing argument size is > 1024 in just 9 functions out of 
>>> 147000!
>>> Those 9 were odd special cases due to auto generated code to interface 
>>> between
>>> C and Fortran. This is extremely unlikely to occur anywhere else. So even 
>>> assuming
>>> an unchecked caller, large outgoing arguments are simply not a realistic 
>>> threat.
>> Whether or not such frames exist in SPEC isn't the question.   THere's
>> nothing in the ABI or ISA that allows us to avoid those probes without
>> compromising security.
>>
>> Mixed code scenarios are going to be a fact of life, probably forever
>> unless we can convince every ISV providing software that works on top of
>> RHEL/*BSD/whatever  to turn on probing (based on my experiences, that
>> has exactly a zero chance of occurring).
> 
> On the other hand if probing is fast enough that it can be on by default
> in gcc there should be much less of it.  Even if you did change the ABI
> to require probing it seems unlikely that code violating that
> requirement would hit problems other than this security concern, so I'd
> expect there will be some non compliant asm out there.
Certainly my goal is to enable it by default one day.  Even if that's
ultimately done at the distro level or by a configure time switch.

Certainly if/when we reach that point the amount of unprotected code
will drop dramatically, but based on my experience I fully expect some
folks to turn it off.  They'll say something like "hey, we've audited
our code and we don't have any large stack or allocas, so I'm turning
this thing off to get some un-measurable performance gain."  It'd be an
uber-dumb thing to do, but it's going to happen.

And once that happens, they're open to attack again, even if they were
correct in their audit and their code only calls stack-clash protected
libraries.

Tweaking the ABI to mandate touching *sp in the outgoing args area &
alloca space is better because we likely wouldn't have an option to
avoid that access.   So a well meaning, but clueless, developer couldn't
turn the option off like they could stack checking.

> 
>> In a mixed code scenario a caller may have a large alloca or large
>> outgoing argument area that pushes the stack pointer into the guard page
>> without actually accessing the guard page.  That requires a callee which
>> is compiled with stack checking to make worst case assumptions at
>> function entry to protect itself as much as possible from these attacks.
> 
> It seems to me pretty important to ask how many programs out there have
> a caller that can push the stack into the guard page, but not past it.
I've actually spent a lot of time thing about that precise problem. You
don't need large frames to do that -- you just need controlled heap
leaks and/or controlled recursion.  ie, even if a function has no
allocas and a small frame, it can put the stack pointer into the guard.

THus in the callee we have to make worst case assumptions to be safe on
targets where there is no implicit probe in the caller.



> I'd expect that's not the case for most allocas, either they are safe,
> or can increase the stack arbitrarily.  I'd expect its more likely with
> outgoing arg or large buffer allocations though.
Unfortunately not.  A small alloca is not inherently safe unless you
also touch the allocated space.  That's precisely why the generic code
touches residuals after the loop that handles PROBE_INTERVAL chunks
(consider a small alloca in a loop).

That's also why I suggested a small backwards compatible tweak to the
aarch64 ABI.  Specifically there should be a mandated touch of *sp in
any function where the outgoing args size is greater than some well
defined value or if the function allocates any dynamic stack space.

That mandated touch of *sp at the ABI level would be enough to allow for
a much less conservative assumed state at function start and would allow
the vast majority of functions to need no probing at all.



  I think the largest
> buffer Qualys found was less than 400k? So 1 256k guard page should
> protect 95% of functions, and 1m or 2m  seems like enough to protect
> against all non malicious programs.  I'm not sure, but this is a 64 bit
> arch, so it seems like we should have the adress space for large guard
> pages like that.
I'm all for larger guards, particularly on 64 bit architectures.

We use 64k pages on aarch64 for RHEL 

Re: [PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-12 Thread Segher Boessenkool
Hi Mike,

On Wed, Jul 12, 2017 at 11:38:11AM -0400, Michael Meissner wrote:
> I also changed the current warning in target_clones handling to be an error
> instead of a warning, since it really makes no sense to use target_clones if 
> we
> can't generate a resolver function.

Okay.  Another option is to then always use the default, but this
certainly is easier; let's hope we can get away with it.

>   (rs6000_get_function_versions_dispatcher): Change the warnging

Typo.

> --- gcc/config/rs6000/rs6000-c.c  
> (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
> (revision 250063)
> +++ gcc/config/rs6000/rs6000-c.c  (.../gcc/config/rs6000) (working copy)
> @@ -644,6 +644,10 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfi
>  builtin_define ("__FLOAT128_HARDWARE__");
>if (TARGET_LONG_DOUBLE_128 && FLOAT128_IBM_P (TFmode))
>  builtin_define ("__ibm128=long double");
> +#ifdef TARGET_LIBC_PROVIDES_HWCAP_IN_TCB
> +  builtin_define ("__BUILTIN_CPU_SUPPORTS__");
> +  builtin_define ("__BUILTIN_CPU_IS__");
> +#endif

Is it useful to have two defines?  They always are enabled at the same
time afaics?  Thinking of a good name isn't so easy of course, but maybe
just __BUILTIN_CPU_SUPPORTS__ is fine always.

Or do other architectures already have both?

> +The @code{__builtin_cpu_is} function requires GLIBC 2.23 or newer in
> +order to export the hardware capability bits.  The
> +@code{__builtin_cpu_is} function will return 0 if the compiler was
> +configured for an earlier GCC.  GCC defines the macro
> +@code{__BUILTIN_CPU_IS__} if you can use the @code{__builtin_cpu_is}
> +built-in function.

"if you can use it" is not very clear (the line before you said you
always can use it, it just returns 0).

Okay for trunk and 7 with those things taken care of.  Thanks!


Segher


Re: Default std::list default and move constructors

2017-07-12 Thread François Dumont

On 05/07/2017 17:22, Jonathan Wakely wrote:

It's mostly good, but I'd like to make a few suggestions ...


diff --git a/libstdc++-v3/include/bits/stl_list.h 
b/libstdc++-v3/include/bits/stl_list.h

index 232885a..7e5 100644
--- a/libstdc++-v3/include/bits/stl_list.h
+++ b/libstdc++-v3/include/bits/stl_list.h
@@ -82,6 +82,17 @@ namespace std _GLIBCXX_VISIBILITY(default)
  _List_node_base* _M_next;
  _List_node_base* _M_prev;

+#if __cplusplus >= 201103L
+  _List_node_base() = default;
+#else
+  _List_node_base()
+  { }
+#endif
+
+  _List_node_base(_List_node_base* __next, _List_node_base* __prev)
+: _M_next(__next), _M_prev(__prev)
+  { }
+


I think I'd prefer to leave this struct with no user-defined
constructors, instead of adding these.


My goal was to make sure that as much as possible instances are 
initialized through their initialization list. But it is still possible 
without it so I made the change.





  static void
  swap(_List_node_base& __x, _List_node_base& __y) 
_GLIBCXX_USE_NOEXCEPT;


@@ -99,6 +110,79 @@ namespace std _GLIBCXX_VISIBILITY(default)
  _M_unhook() _GLIBCXX_USE_NOEXCEPT;
};

+/// The %list node header.
+struct _List_node_header : public _List_node_base
+{
+private:
+#if _GLIBCXX_USE_CXX11_ABI
+  std::size_t _M_size;
+#endif


I don't think this needs to be private, because we don't have to worry
about users accessing this member. It's an internal-only type, and the
_M_next and _M_prev members are already public.


It's not internal-only as those members are exposed on std::list through 
inheritance. But I agree that consistency is more important here so I 
made it public.




If it's public then the _List_base::_M_inc_size, _M_dec_size etc.
could access it directly, and we don't need to add duplicates of those
functions to _List_impl.


+
+  _List_node_base* _M_base() { return this; }


Is this function necessary?


It is a nice replacement for calls to __addressof.




+public:
+  _List_node_header() _GLIBCXX_NOEXCEPT
+  : _List_node_base(this, this)
+# if _GLIBCXX_USE_CXX11_ABI
+  , _M_size(0)
+# endif
+  { }


This could be:

 _List_node_header() _GLIBCXX_NOEXCEPT
 { _M_init(); }


Sure, I was only interested in the initialization-list approach. As 
there is no default initialization of the members in _List_node_base for 
_M_prev/_M_next neither for _M_size I guess it will be super easy for 
the compiler to optimize it as if it was an initialization-list.






+#if __cplusplus >= 201103L
+  _List_node_header(_List_node_header&& __x) noexcept
+  : _List_node_base(__x._M_next, __x._M_prev)


And this could use aggregate-initialization:

 : _List_node_base{__x._M_next, __x._M_prev}


Nice.


+#if _GLIBCXX_USE_CXX11_ABI
+  size_t _M_get_size() const { return _M_size; }
+  void _M_set_size(size_t __n) { _M_size = __n; }
+  void _M_inc_size(size_t __n) { _M_size += __n; }
+  void _M_dec_size(size_t __n) { _M_size -= __n; }
+#else
+  // dummy implementations used when the size is not stored
+  size_t _M_get_size() const { return 0; }
+  void _M_set_size(size_t) { }
+  void _M_inc_size(size_t) { }
+  void _M_dec_size(size_t) { }
+#endif


What do you think about only having _M_set_size() here?
The other functions are not needed here (assuming _M_size is public).

We could even get rid of _M_set_size and use #if in _M_init:


+  void
+  _M_init() _GLIBCXX_NOEXCEPT
+  {
+this->_M_next = this->_M_prev = this;

#if _GLIBCXX_USE_CXX11_ABI
   _M_size = 0;
#endif


+_M_set_size(0);
+  }
+};


This replaces a #if and #else and four functions with a #if and no
functions, which seems simpler to me.
_List_impl _M_impl;


-#if _GLIBCXX_USE_CXX11_ABI
-  size_t _M_get_size() const { return 
*_M_impl._M_node._M_valptr(); }

+  size_t
+  _M_get_size() const { return _M_impl._M_node._M_get_size(); }

-  void _M_set_size(size_t __n) { *_M_impl._M_node._M_valptr() = 
__n; }

+  void
+  _M_set_size(size_t __n) { _M_impl._M_node._M_set_size(__n); }

-  void _M_inc_size(size_t __n) { *_M_impl._M_node._M_valptr() += 
__n; }

+  void
+  _M_inc_size(size_t __n) { _M_impl._M_node._M_inc_size(__n); }

-  void _M_dec_size(size_t __n) { *_M_impl._M_node._M_valptr() -= 
__n; }

+  void
+  _M_dec_size(size_t __n) { _M_impl._M_node._M_dec_size(__n); }


These functions could just access _M_impl._M_size directly if it was
public. Introducing new functions to _List_impl to be used here seems
like unnecessary complication. We don't get rid of the #if because
it's still needed for these functions anyway:

Yes, I wanted to manage as much as possible usage of C++11 abi in the 
new _List_node_header type.


So here is the new patch limited to this evolution. Optimization for 
always equal allocator will come after along with another simplification 
to replace _S_distance with std::distanc

Re: [C++ PATCH] ctor predicates

2017-07-12 Thread Nathan Sidwell

On 07/12/2017 03:14 PM, Christophe Lyon wrote:

Hi Nathan,



/gccsrc/libcc1/libcp1plugin.cc: In function ‘gcc_decl
plugin_build_decl(cc1_plugin::connection*,
 const char*, gcc_cp_symbol_kind, gcc_type, const char*, gcc_address,
const char*, unsigned int)’:
/gccsrc/libcc1/libcp1plugin.cc:1422: error: lvalue required as left
operand of assignment
/gccsrc/libcc1/libcp1plugin.cc:1424: error: lvalue required as left
operand of assignment
make[3]: *** [libcp1plugin.lo] Error 1


whoops, forgot that I had changes there too.  Applied the attached.

nathan

--
Nathan Sidwell
2017-07-12  Nathan Sidwell  

* libcp1plugin.cc (plugin_build_decl): Use
DECL_CXX_{CON,DE}STRUCTOR directly.

Index: libcp1plugin.cc
===
--- libcp1plugin.cc (revision 250090)
+++ libcp1plugin.cc (working copy)
@@ -1419,9 +1419,9 @@ plugin_build_decl (cc1_plugin::connectio
   if (ctor || dtor)
{
  if (ctor)
-   DECL_CONSTRUCTOR_P (decl) = 1;
+   DECL_CXX_CONSTRUCTOR_P (decl) = 1;
  if (dtor)
-   DECL_DESTRUCTOR_P (decl) = 1;
+   DECL_CXX_DESTRUCTOR_P (decl) = 1;
}
   else
{


Re: [ping #4][patch] Fix PR80929: Realistic PARALLEL cost in seq_cost.

2017-07-12 Thread Segher Boessenkool
On Wed, Jul 12, 2017 at 03:30:00PM +0200, Georg-Johann Lay wrote:
> On 12.07.2017 14:11, Segher Boessenkool wrote:
> >On Tue, Jul 11, 2017 at 10:47:27AM +0200, Georg-Johann Lay wrote:
> >>This small addition improves costs of PARALLELs in
> >>rtlanal.c:seq_cost().  Up to now, these costs are
> >>assumed to be 1 which gives gross inexact costs for,
> >>e.g. divmod which is represented as PARALLEL.
> >
> >insn_rtx_cost returns 0 ("unknown") for such a PARALLEL, so your
> >current patch does not change this at all?
> 
> Huh?  It returns the costs of 1st SET in a PARALLEL (provided it
> has one), no?  Or even costs for come compares.

No, it returns 0 if there is more than one normal SET (or more than
one compare).

> >>+  else if (INSN_P (seq)
> >>+  && PARALLEL == GET_CODE (PATTERN (seq)))
> >
> >Yoda conditions have we should not.
> 
> hmm, I didn't find something like PARALLEL_P (rtx).
> Is comparing rtx_codes deprecated now?

I meant it should be written

  else if (INSN_P (seq) && GET_CODE (PATTERN (seq)) == PARALLEL)

i.e. constant on the right.

> >This whole thing could be something like
> >
> >   if (INSN_P (seq))
> > {
> >   int t = insn_rtx_cost (PATTERN (seq), speed);
> 
> This will behave differently.

Yes, I know, I even said so.

> >(Why do you need a check for INSN_P here?  Why does it increment the
> 
> >cost for non-insns?  So many questions).
> 
> Again, you'll have to ask the original author for reasoning.

Since you want to change the code, to make it better, I was hoping
you would dig in a bit.  To make it better, not just different :-/


Segher


Re: [C++ PATCH] ctor predicates

2017-07-12 Thread Christophe Lyon
Hi Nathan,


On 12 July 2017 at 19:34, Nathan Sidwell  wrote:
> Now that all the cdtors have special names, we can detect them by looking at
> the name, rather than a collection of other things.
>
> For the DECL_[CD]TOR_P cases we're now comparing identifiers, removing a
> STRIP_TEMPLATE
>
> For the IN_CHARGE case we're replacing a conjunction of 3 checks (2 of which
> contain STRIP_TEMPLATES) with looking at bitflags on the identifier.
>
> That all seems like a win to me.  Applied to trunk.
>

After this commit (r250158), gcc fails to build at least for aarch64 and arm.

The build logs ends with:

/gccsrc/libcc1/libcp1plugin.cc: In function ‘gcc_decl
plugin_build_decl(cc1_plugin::connection*,
 const char*, gcc_cp_symbol_kind, gcc_type, const char*, gcc_address,
const char*, unsigned int)’:
/gccsrc/libcc1/libcp1plugin.cc:1422: error: lvalue required as left
operand of assignment
/gccsrc/libcc1/libcp1plugin.cc:1424: error: lvalue required as left
operand of assignment
make[3]: *** [libcp1plugin.lo] Error 1


Christophe

> nathan
> --
> Nathan Sidwell


Re: [PATCH, rs6000] Do not do gimple-folding of expressions that are missing their LHS

2017-07-12 Thread Bill Schmidt
Although I said this was invalid code, it really isn't -- it's legal code.  
It's more of an ice-on-stupid-code situation. :)  So probably you should remove 
the "invalid" language from the commentary.  Sorry for misleading you.

-- Bill

Bill Schmidt, Ph.D.
GCC for Linux on Power
Linux on Power Toolchain
IBM Linux Technology Center
wschm...@linux.vnet.ibm.com

> On Jul 12, 2017, at 11:45 AM, Will Schmidt  wrote:
> 
> Hi,
> 
> 
> Do not do the gimple-folding optimizations of expressions that are
> missing their LHS.  (Preventing an ICE on invalid code).
> 
> This was noticed during debug of PR81317, but is not a fix for that PR.
> This is based on a patch suggested by Segher.
> 
> (This will need a revisit if/when we get as far as doing early gimple
> folding for expressions without a lhs, but for now, this seems sufficient).
> 
> This seems straightforward.  regtest going on ppc64LE just in case.
> OK for trunk?
> 
> [gcc]
> 
>   2017-07-12  Will Schmidt  
> 
>   * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Return
>   early if there is no lhs.
> 
> [gcc/testsuite]
> 
>   2017-07-12  Will Schmidt  
> 
>   * gcc.target/powerpc/fold-vec-missing-lhs.c: New.
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 10c5521..e21b56f 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -16297,6 +16297,9 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
>   tree arg0, arg1, lhs;
> 
> +  /*  Generic solution to prevent gimple folding of code without a LHS.  */
> +  if (!gimple_call_lhs (stmt)) return false;
> +
>   switch (fn_code)
> {
> /* Flavors of vec_add.  We deliberately don't expand
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c 
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c
> new file mode 100644
> index 000..d62f913
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c
> @@ -0,0 +1,24 @@
> +/* This test is meant to verify that the gimple-folding does not
> +occur when the LHS portion of an expression is missing.
> +Though we would consider this invalid code, this should not generate an ICE.
> +This was noticed during debug of PR81317.  */
> +
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-maltivec" } */
> +
> +#include 
> +
> +vector signed short
> +test1_nolhs (vector bool short x, vector signed short y)
> +{
> +  vec_add (x, y);
> +  return vec_add (x, y);
> +}
> +
> +vector signed short
> +test2_nolhs (vector signed short x, vector bool short y)
> +{
> +  vec_add (x, y);
> +  return vec_add (x, y);
> +}
> 
> 



[C++ PATCH] ctor predicates

2017-07-12 Thread Nathan Sidwell
Now that all the cdtors have special names, we can detect them by looking at the 
name, rather than a collection of other things.


For the DECL_[CD]TOR_P cases we're now comparing identifiers, removing a 
STRIP_TEMPLATE


For the IN_CHARGE case we're replacing a conjunction of 3 checks (2 of which 
contain STRIP_TEMPLATES) with looking at bitflags on the identifier.


That all seems like a win to me.  Applied to trunk.

nathan
--
Nathan Sidwell
2017-07-12  Nathan Sidwell  

* cp-tree.h (DECL_CONSTRUCTOR_P, DECL_MAYBE_IN_CHARGE_CONSTRUCTOR,
DECL_DESTRUCTOR_P, DECL_MAYBE_IN_CHARGE_DESTRCTOR): Look at
identifier flags.
* decl.c (grokfndecl): Set DECL_CXX_CONSTRUCTOR and
DECL_CXX_DESTRUCTOR explicitly.
* decl2.c (grokclassfn): Likewise.
* friend.c (do_friend): Likewise.
* method.c (make_thunk, make_alias_for,
implicitly_declare_fn): Likewise.

Index: cp-tree.h
===
--- cp-tree.h   (revision 250157)
+++ cp-tree.h   (working copy)
@@ -2706,7 +2706,7 @@ struct GTY(()) lang_decl {
 /* For FUNCTION_DECLs and TEMPLATE_DECLs: nonzero means that this function
is a constructor.  */
 #define DECL_CONSTRUCTOR_P(NODE) \
-  DECL_CXX_CONSTRUCTOR_P (STRIP_TEMPLATE (NODE))
+  IDENTIFIER_CTOR_P (DECL_NAME (NODE))
 
 /* Nonzero if NODE (a FUNCTION_DECL) is a constructor for a complete
object.  */
@@ -2722,8 +2722,7 @@ struct GTY(()) lang_decl {
specialized in-charge constructor or the specialized not-in-charge
constructor.  */
 #define DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P(NODE)   \
-  (DECL_DECLARES_FUNCTION_P (NODE) && DECL_CONSTRUCTOR_P (NODE) \
-   && !DECL_CLONED_FUNCTION_P (NODE))
+  (DECL_NAME (NODE) == ctor_identifier)
 
 /* Nonzero if NODE (a FUNCTION_DECL) is a copy constructor.  */
 #define DECL_COPY_CONSTRUCTOR_P(NODE) \
@@ -2736,14 +2735,13 @@ struct GTY(()) lang_decl {
 /* Nonzero if NODE (a FUNCTION_DECL or TEMPLATE_DECL)
is a destructor.  */
 #define DECL_DESTRUCTOR_P(NODE)\
-  DECL_CXX_DESTRUCTOR_P (STRIP_TEMPLATE (NODE))
+  IDENTIFIER_DTOR_P (DECL_NAME (NODE))
 
 /* Nonzero if NODE (a FUNCTION_DECL) is a destructor, but not the
specialized in-charge constructor, in-charge deleting constructor,
or the base destructor.  */
 #define DECL_MAYBE_IN_CHARGE_DESTRUCTOR_P(NODE)\
-  (DECL_DECLARES_FUNCTION_P (NODE) && DECL_DESTRUCTOR_P (NODE) \
-   && !DECL_CLONED_FUNCTION_P (NODE))
+  (DECL_NAME (NODE) == dtor_identifier)
 
 /* Nonzero if NODE (a FUNCTION_DECL) is a destructor for a complete
object.  */
Index: decl.c
===
--- decl.c  (revision 250157)
+++ decl.c  (working copy)
@@ -8513,11 +8513,11 @@ grokfndecl (tree ctype,
 case sfk_constructor:
 case sfk_copy_constructor:
 case sfk_move_constructor:
-  DECL_CONSTRUCTOR_P (decl) = 1;
+  DECL_CXX_CONSTRUCTOR_P (decl) = 1;
   DECL_NAME (decl) = ctor_identifier;
   break;
 case sfk_destructor:
-  DECL_DESTRUCTOR_P (decl) = 1;
+  DECL_CXX_DESTRUCTOR_P (decl) = 1;
   DECL_NAME (decl) = dtor_identifier;
   break;
 default:
Index: decl2.c
===
--- decl2.c (revision 250157)
+++ decl2.c (working copy)
@@ -342,7 +342,7 @@ grokclassfn (tree ctype, tree function,
   DECL_CONTEXT (function) = ctype;
 
   if (flags == DTOR_FLAG)
-DECL_DESTRUCTOR_P (function) = 1;
+DECL_CXX_DESTRUCTOR_P (function) = 1;
 
   if (flags == DTOR_FLAG || DECL_CONSTRUCTOR_P (function))
 maybe_retrofit_in_chrg (function);
Index: friend.c
===
--- friend.c(revision 250157)
+++ friend.c(working copy)
@@ -529,7 +529,7 @@ do_friend (tree ctype, tree declarator,
 
   /* A method friend.  */
   if (flags == NO_SPECIAL && declarator == cname)
-   DECL_CONSTRUCTOR_P (decl) = 1;
+   DECL_CXX_CONSTRUCTOR_P (decl) = 1;
 
   grokclassfn (ctype, decl, flags);
 
Index: method.c
===
--- method.c(revision 250157)
+++ method.c(working copy)
@@ -137,8 +137,8 @@ make_thunk (tree function, bool this_adj
   DECL_SAVED_FUNCTION_DATA (thunk) = NULL;
   /* The thunk itself is not a constructor or destructor, even if
  the thing it is thunking to is.  */
-  DECL_DESTRUCTOR_P (thunk) = 0;
-  DECL_CONSTRUCTOR_P (thunk) = 0;
+  DECL_CXX_DESTRUCTOR_P (thunk) = 0;
+  DECL_CXX_CONSTRUCTOR_P (thunk) = 0;
   DECL_EXTERNAL (thunk) = 1;
   DECL_ARTIFICIAL (thunk) = 1;
   /* The THUNK is not a pending inline, even if the FUNCTION is.  */
@@ -223,8 +223,8 @@ make_alias_for (tree target, tree newid)
   if (TREE_CODE (alias) == FUNCTION_DECL)
 {
   DECL_SAVED_FUNCTION_DATA (alias) = NULL;
-  DECL_DESTRUCTOR_P (alias) = 0;
-  DECL_CON

Re: [PING^3][RFC, PATCH][ASAN] Implement dynamic allocas/VLAs sanitization.​

2017-07-12 Thread Andreas Schwab
On Jul 11 2017, Maxim Ostapenko  wrote:

> Ok, I see, it seems that we need to add convert in
> expand_asan_emit_allocas_unpoison too. This patch seems to work for me on
> aarch64 -mabi=ilp32, could you check it as well?

This fixes all regressions.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


[PATCH, rs6000] Do not do gimple-folding of expressions that are missing their LHS

2017-07-12 Thread Will Schmidt
Hi,

   
Do not do the gimple-folding optimizations of expressions that are
missing their LHS.  (Preventing an ICE on invalid code).

This was noticed during debug of PR81317, but is not a fix for that PR.
This is based on a patch suggested by Segher.

(This will need a revisit if/when we get as far as doing early gimple
folding for expressions without a lhs, but for now, this seems sufficient).

This seems straightforward.  regtest going on ppc64LE just in case.
OK for trunk?

[gcc]

2017-07-12  Will Schmidt  

* config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Return
early if there is no lhs.

[gcc/testsuite]

2017-07-12  Will Schmidt  

* gcc.target/powerpc/fold-vec-missing-lhs.c: New.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 10c5521..e21b56f 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -16297,6 +16297,9 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
 = (enum rs6000_builtins) DECL_FUNCTION_CODE (fndecl);
   tree arg0, arg1, lhs;
 
+  /*  Generic solution to prevent gimple folding of code without a LHS.  */
+  if (!gimple_call_lhs (stmt)) return false;
+
   switch (fn_code)
 {
 /* Flavors of vec_add.  We deliberately don't expand
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c
new file mode 100644
index 000..d62f913
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-missing-lhs.c
@@ -0,0 +1,24 @@
+/* This test is meant to verify that the gimple-folding does not
+occur when the LHS portion of an expression is missing.
+Though we would consider this invalid code, this should not generate an ICE.
+This was noticed during debug of PR81317.  */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-options "-maltivec" } */
+
+#include 
+
+vector signed short
+test1_nolhs (vector bool short x, vector signed short y)
+{
+  vec_add (x, y);
+  return vec_add (x, y);
+}
+
+vector signed short
+test2_nolhs (vector signed short x, vector bool short y)
+{
+  vec_add (x, y);
+  return vec_add (x, y);
+}




[rs6000] Avoid rotates of floating-point modes

2017-07-12 Thread Richard Sandiford
The little-endian VSX code uses rotates to swap the two 64-bit halves of
128-bit scalar modes.  This is fine for TImode and V1TImode, but it
isn't really valid to use RTL rotates on floating-point modes like
KFmode and TFmode, and doing that triggered an assert added by the
SVE series.  This patch uses bit-casts to V1TImode instead.

Tested on powerpc64le-linux-gnu.  OK to install?

Richard


2017-07-12  Richard Sandiford  

gcc/
* config/rs6000/rs6000-protos.h (rs6000_emit_le_vsx_permute): Declare.
* config/rs6000/rs6000.c (rs6000_gen_le_vsx_permute): Replace with...
(rs6000_emit_le_vsx_permute): ...this.  Take the destination as input.
Emit instructions rather than returning an expression.  Handle TFmode
and KFmode by casting to TImode.
(rs6000_emit_le_vsx_load): Update to use rs6000_emit_le_vsx_permute.
(rs6000_emit_le_vsx_store): Likewise.
* config/rs6000/vsx.md (VSX_LE_128I): New iterator.
(*vsx_le_permute_): Use it instead of VSX_LE_128.
(*vsx_le_undo_permute_): Likewise.
(*vsx_le_perm_load_): Use rs6000_emit_le_vsx_permute to
emit the split sequence.
(*vsx_le_perm_store_): Likewise.

Index: gcc/config/rs6000/rs6000-protos.h
===
--- gcc/config/rs6000/rs6000-protos.h   2017-06-30 12:50:38.886633045 +0100
+++ gcc/config/rs6000/rs6000-protos.h   2017-07-12 16:30:38.728631839 +0100
@@ -151,6 +151,7 @@ extern rtx rs6000_longcall_ref (rtx);
 extern void rs6000_fatal_bad_address (rtx);
 extern rtx create_TOC_reference (rtx, rtx);
 extern void rs6000_split_multireg_move (rtx, rtx);
+extern void rs6000_emit_le_vsx_permute (rtx, rtx, machine_mode);
 extern void rs6000_emit_le_vsx_move (rtx, rtx, machine_mode);
 extern bool valid_sf_si_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2017-07-08 11:37:45.740795846 +0100
+++ gcc/config/rs6000/rs6000.c  2017-07-12 16:30:38.732631678 +0100
@@ -10503,17 +10503,24 @@ rs6000_const_vec (machine_mode mode)
 
 /* Generate a permute rtx that represents an lxvd2x, stxvd2x, or xxpermdi
for a VSX load or store operation.  */
-rtx
-rs6000_gen_le_vsx_permute (rtx source, machine_mode mode)
+void
+rs6000_emit_le_vsx_permute (rtx dest, rtx source, machine_mode mode)
 {
   /* Use ROTATE instead of VEC_SELECT on IEEE 128-bit floating point, and
  128-bit integers if they are allowed in VSX registers.  */
-  if (FLOAT128_VECTOR_P (mode) || mode == TImode || mode == V1TImode)
-return gen_rtx_ROTATE (mode, source, GEN_INT (64));
+  if (FLOAT128_VECTOR_P (mode))
+{
+  dest = gen_lowpart (V1TImode, dest);
+  source = gen_lowpart (V1TImode, source);
+  mode = V1TImode;
+}
+  if (mode == TImode || mode == V1TImode)
+emit_insn (gen_rtx_SET (dest, gen_rtx_ROTATE (mode, source,
+ GEN_INT (64;
   else
 {
   rtx par = gen_rtx_PARALLEL (VOIDmode, rs6000_const_vec (mode));
-  return gen_rtx_VEC_SELECT (mode, source, par);
+  emit_insn (gen_rtx_SET (dest, gen_rtx_VEC_SELECT (mode, source, par)));
 }
 }
 
@@ -10523,8 +10530,6 @@ rs6000_gen_le_vsx_permute (rtx source, m
 void
 rs6000_emit_le_vsx_load (rtx dest, rtx source, machine_mode mode)
 {
-  rtx tmp, permute_mem, permute_reg;
-
   /* Use V2DImode to do swaps of types with 128-bit scalare parts (TImode,
  V1TImode).  */
   if (mode == TImode || mode == V1TImode)
@@ -10534,11 +10539,9 @@ rs6000_emit_le_vsx_load (rtx dest, rtx s
   source = adjust_address (source, V2DImode, 0);
 }
 
-  tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (dest) : dest;
-  permute_mem = rs6000_gen_le_vsx_permute (source, mode);
-  permute_reg = rs6000_gen_le_vsx_permute (tmp, mode);
-  emit_insn (gen_rtx_SET (tmp, permute_mem));
-  emit_insn (gen_rtx_SET (dest, permute_reg));
+  rtx tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (dest) : dest;
+  rs6000_emit_le_vsx_permute (tmp, source, mode);
+  rs6000_emit_le_vsx_permute (dest, tmp, mode);
 }
 
 /* Emit a little-endian store to vector memory location DEST from VSX
@@ -10547,8 +10550,6 @@ rs6000_emit_le_vsx_load (rtx dest, rtx s
 void
 rs6000_emit_le_vsx_store (rtx dest, rtx source, machine_mode mode)
 {
-  rtx tmp, permute_src, permute_tmp;
-
   /* This should never be called during or after reload, because it does
  not re-permute the source register.  It is intended only for use
  during expand.  */
@@ -10563,11 +10564,9 @@ rs6000_emit_le_vsx_store (rtx dest, rtx
   source = gen_lowpart (V2DImode, source);
 }
 
-  tmp = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (source) : source;
-  permute_src = rs6000_gen_le_vsx_permute (source, mode);
-  permute_tmp = rs6000_gen_le_vsx_permute (tmp, mode);
-  emit_insn (gen_rtx_SET (tmp, permut

[patch,avr,committed] Fix PR79883: Quote key words in diagnostics.

2017-07-12 Thread Georg-Johann Lay

This patchlet fixes a complaint from translation
projects because some non-quoted key words like
"interrupt" or "signal" caused problems there.

Enclosed the sequence in "WITH_AVRLIBC" because that is
only relevant when AVR-LibC start-up code is in use.

Also warns for functions named "ISR", "SIGNAL" and "INTERRUPT".
Such names indicate missing system include.

Applied as obvious.

Johann

PR target/79883
* config/avr/avr.c (avr_set_current_function): In diagnostic
messages: Quote keywords and (parts of) identifiers.
[WITH_AVRLIBC]: Warn for functions named "ISR", "SIGNAL" or
"INTERRUPT".

Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 250155)
+++ config/avr/avr.c	(revision 250156)
@@ -1076,12 +1076,6 @@ avr_set_current_function (tree decl)
 
   name = default_strip_name_encoding (name);
 
-  /* Silently ignore 'signal' if 'interrupt' is present.  AVR-LibC startet
- using this when it switched from SIGNAL and INTERRUPT to ISR.  */
-
-  if (cfun->machine->is_interrupt)
-cfun->machine->is_signal = 0;
-
   /* Interrupt handlers must be  void __vector (void)  functions.  */
 
   if (args && TREE_CODE (TREE_VALUE (args)) != VOID_TYPE)
@@ -1090,14 +1084,36 @@ avr_set_current_function (tree decl)
   if (TREE_CODE (ret) != VOID_TYPE)
 error_at (loc, "%qs function cannot return a value", isr);
 
+#if defined WITH_AVRLIBC
+  /* Silently ignore 'signal' if 'interrupt' is present.  AVR-LibC startet
+ using this when it switched from SIGNAL and INTERRUPT to ISR.  */
+
+  if (cfun->machine->is_interrupt)
+cfun->machine->is_signal = 0;
+
   /* If the function has the 'signal' or 'interrupt' attribute, ensure
  that the name of the function is "__vector_NN" so as to catch
  when the user misspells the vector name.  */
 
   if (!STR_PREFIX_P (name, "__vector"))
 warning_at (loc, OPT_Wmisspelled_isr, "%qs appears to be a misspelled "
-"%s handler, missing __vector prefix", name, isr);
+"%qs handler, missing %<__vector%> prefix", name, isr);
+#endif // AVR-LibC naming conventions
+}
+
+#if defined WITH_AVRLIBC
+  // Common problem is using "ISR" without first including avr/interrupt.h.
+  const char *name = IDENTIFIER_POINTER (DECL_NAME (decl));
+  name = default_strip_name_encoding (name);
+  if (0 == strcmp ("ISR", name)
+  || 0 == strcmp ("INTERRUPT", name)
+  || 0 == strcmp ("SIGNAL", name))
+{
+  warning_at (loc, OPT_Wmisspelled_isr, "%qs is a reserved indentifier"
+  " in AVR-LibC.  Consider %<#include %>"
+  " before using the %qs macro", name, name);
 }
+#endif // AVR-LibC naming conventions
 
   /* Don't print the above diagnostics more than once.  */
 


[PATCH v1 3/3] dwarf: purge DIEs for unreferenced extern globals.

2017-07-12 Thread Franklin “Snaipe” Mathieu
From: Franklin “Snaipe” Mathieu 

Due to an earlier change in gcc that split the dwarf info generation
in two steps (one early, one late), the DIE for unreferenced extern
globals are no longer removed (in fact, they didn't emit it at
all since they had already processed the translation unit and
knew whether or not a variable was referenced). This is no longer
the case during the early generation.

This change addresses this problem by revisiting during the late stage
global declarations on the C side, and for each namespace on the C++
side, removing DIEs when they are unreferenced in both cases.

gcc/c/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* c-decl.c (c_parse_final_cleanups): Call the late_global_decl
hook for each global in the extern block.

gcc/cp/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* decl2.c (c_parse_final_cleanups): Call the late_global_decl
for each extern variable in each namespace.
(purge_unused_extern_globals): New.

gcc/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* dwarf2out: Remove DIEs for unreferenced externs.
(struct die_struct): Add removed field.
(lookup_decl_die): Remove DIEs marked for removal.
(mark_removed): New.
(dwarf2out_late_global_decl): Refactor to remove DIE from the
sibling list, and actually check that the code filling new
new location information acts on the same conditions as it
had when it was called from the symtab code.
(dwarf2out_imported_module_or_decl_1): make the referenced
DIE perennial to avoid it being removed when deemed unused, as
it would be referenced by a DW_TAG_imported_declaration entry.

gcc/testsuite/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
gcc.dg/debug/dwarf2/pr81135.c: New test.
g++.dg/debug/dwarf2/pr81135.C: New test.
---
 gcc/c/c-decl.c  |  10 +++
 gcc/cp/decl2.c  |  27 
 gcc/dwarf2out.c | 101 +++-
 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C |  25 +++
 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C |  13 
 5 files changed, 159 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index e8bafed..69efb04 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -10900,6 +10900,16 @@ c_parse_final_cleanups (void)
   c_write_global_declarations_1 (BLOCK_VARS (ext_block));
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Purge unreferenced extern variables from the debug information.  */
+  if (!seen_error ())
+  {
+tree decl;
+for (decl = BLOCK_VARS (ext_block); decl; decl = DECL_CHAIN (decl))
+   if (DECL_EXTERNAL (decl))
+(*debug_hooks->late_global_decl) (decl);
+  }
+
   timevar_start (TV_PHASE_PARSING);
 
   ext_block = NULL;
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index a9511de..efbe2f6 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "decl.h"
 #include "toplev.h"
+#include "debug.h"
 #include "c-family/c-objc.h"
 #include "c-family/c-pragma.h"
 #include "dumpfile.h"
@@ -4523,6 +4524,27 @@ lower_var_init ()
 }
 }
 
+/* We call this routine on each namespace to remove unreferenced extern
+   variables from the debug information.  */
+
+static int
+purge_unused_extern_globals (tree name_space, void *data ATTRIBUTE_UNUSED)
+{
+  cp_binding_level *level = NAMESPACE_LEVEL (name_space);
+  tree decl;
+
+  if (seen_error ())
+return 1;
+
+  if (!level)
+return 0;
+
+  for (decl = level->names; decl; decl = DECL_CHAIN (decl))
+if (TREE_CODE (decl) == VAR_DECL && DECL_EXTERNAL (decl))
+  (*debug_hooks->late_global_decl) (decl);
+  return 0;
+}
+
 /* This routine is called at the end of compilation.
Its job is to create all the code needed to initialize and
destroy the global aggregates.  We do the destruction
@@ -4925,6 +4947,11 @@ c_parse_final_cleanups (void)
 }
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Perform a late round of debugging information, to remove unreferenced
+ extern variables from the outputted information.  */
+  walk_namespaces (purge_unused_extern_globals, NULL);
+
   timevar_start (TV_PHASE_PARSING);
 
   /* Indicate that we're done with front end processing.  */
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index f241073..d321e56 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "function.h"
 #include "rtl.h"
 #include "tree.h"
+#include "cp/cp-tree.h"
 #include "tm_p.h"
 #include "stringpool.h"
 #include "insn-config.h"
@@ -2709,6 +2710,7 @@ typedef struct G

[PATCH v1 2/3] dwarf: purge DIEs for unreferenced extern globals.

2017-07-12 Thread Franklin “Snaipe” Mathieu
From: Franklin “Snaipe” Mathieu 

Due to an earlier change in gcc that split the dwarf info generation
in two steps (one early, one late), the DIE for unreferenced extern
globals are no longer removed (in fact, they didn't emit it at
all since they had already processed the translation unit and
knew whether or not a variable was referenced). This is no longer
the case during the early generation.

This change addresses this problem by revisiting during the late stage
global declarations on the C side, and for each namespace on the C++
side, removing DIEs when they are unreferenced in both cases.

gcc/c/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* c-decl.c (c_parse_final_cleanups): Call the late_global_decl
hook for each global in the extern block.

gcc/cp/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* decl2.c (c_parse_final_cleanups): Call the late_global_decl
for each extern variable in each namespace.
(purge_unused_extern_globals): New.

gcc/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* dwarf2out: Remove DIEs for unreferenced externs.
(struct die_struct): Add removed field.
(lookup_decl_die): Remove DIEs marked for removal.
(mark_removed): New.
(dwarf2out_late_global_decl): Refactor to remove DIE from the
sibling list, and actually check that the code filling new
new location information acts on the same conditions as it
had when it was called from the symtab code.
(dwarf2out_imported_module_or_decl_1): make the referenced
DIE perennial to avoid it being removed when deemed unused, as
it would be referenced by a DW_TAG_imported_declaration entry.

gcc/testsuite/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
gcc.dg/debug/dwarf2/pr81135.c: New test.
g++.dg/debug/dwarf2/pr81135.C: New test.
---
 gcc/c/c-decl.c  |  10 +++
 gcc/cp/decl2.c  |  30 
 gcc/dwarf2out.c | 107 
 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C |  25 +++
 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C |  13 
 5 files changed, 156 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 2acedac..2596474 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -11259,6 +11259,16 @@ c_parse_final_cleanups (void)
   c_write_global_declarations_1 (BLOCK_VARS (ext_block));
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Purge unreferenced extern variables from the debug information.  */
+  if (!seen_error ())
+  {
+tree decl;
+for (decl = BLOCK_VARS (ext_block); decl; decl = DECL_CHAIN (decl))
+   if (DECL_EXTERNAL (decl))
+(*debug_hooks->late_global_decl) (decl);
+  }
+
   timevar_start (TV_PHASE_PARSING);
 
   ext_block = NULL;
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index fd5622b..ac098e3 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "decl.h"
 #include "toplev.h"
+#include "debug.h"
 #include "c-family/c-objc.h"
 #include "c-family/c-pragma.h"
 #include "dumpfile.h"
@@ -4439,6 +4440,31 @@ lower_var_init ()
 }
 }
 
+/* We call this routine on each namespace to remove unreferenced extern
+   variables from the debug information.  */
+
+static int
+purge_unused_extern_globals (tree name_space)
+{
+  cp_binding_level *level = NAMESPACE_LEVEL (name_space);
+  tree decl;
+
+  if (seen_error ())
+return 1;
+
+  if (!level)
+return 0;
+
+  for (decl = level->names; decl; decl = DECL_CHAIN (decl))
+if (TREE_CODE (decl) == VAR_DECL && DECL_EXTERNAL (decl))
+  (*debug_hooks->late_global_decl) (decl);
+
+  int rc = 0;
+  for (decl = level->namespaces; decl && !rc; decl = DECL_CHAIN (decl))
+rc = purge_unused_extern_globals (decl);
+  return rc;
+}
+
 /* This routine is called at the end of compilation.
Its job is to create all the code needed to initialize and
destroy the global aggregates.  We do the destruction
@@ -4844,6 +4870,10 @@ c_parse_final_cleanups (void)
 }
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Purge unreferenced extern variables from the debug information.  */
+  purge_unused_extern_globals (global_namespace);
+
   timevar_start (TV_PHASE_PARSING);
 
   /* Indicate that we're done with front end processing.  */
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index f78aaa9..b7e1770 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "function.h"
 #include "rtl.h"
 #include "tree.h"
+#include "cp/cp-tree.h"
 #include "memmodel.h"
 #include "tm_p.h"
 #include "stringpool.h"

[PATCH v1 1/3] dwarf: purge DIEs for unreferenced extern globals.

2017-07-12 Thread Franklin “Snaipe” Mathieu
From: Franklin “Snaipe” Mathieu 

Due to an earlier change in gcc that split the dwarf info generation
in two steps (one early, one late), the DIE for unreferenced extern
globals are no longer removed (in fact, they didn't emit it at
all since they had already processed the translation unit and
knew whether or not a variable was referenced). This is no longer
the case during the early generation.

This change addresses this problem by revisiting during the late stage
global declarations on the C side, and for each namespace on the C++
side, removing DIEs when they are unreferenced in both cases.

gcc/c/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* c-decl.c (c_parse_final_cleanups): Call the late_global_decl
hook for each global in the extern block.

gcc/cp/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* decl2.c (c_parse_final_cleanups): Call the late_global_decl
for each extern variable in each namespace.
(purge_unused_extern_globals): New.

gcc/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
* dwarf2out: Remove DIEs for unreferenced externs.
(struct die_struct): Add removed field.
(lookup_decl_die): Remove DIEs marked for removal.
(mark_removed): New.
(dwarf2out_late_global_decl): Refactor to remove DIE from the
sibling list, and actually check that the code filling new
new location information acts on the same conditions as it
had when it was called from the symtab code.
(dwarf2out_imported_module_or_decl_1): make the referenced
DIE perennial to avoid it being removed when deemed unused, as
it would be referenced by a DW_TAG_imported_declaration entry.

gcc/testsuite/ChangeLog:
2017-07-12  Franklin “Snaipe” Mathieu  

PR debug/81135
gcc.dg/debug/dwarf2/pr81135.c: New test.
g++.dg/debug/dwarf2/pr81135.C: New test.
---
 gcc/c/c-decl.c  |  10 +++
 gcc/cp/decl2.c  |  41 +++
 gcc/dwarf2out.c | 109 
 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C |  25 +++
 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C |  13 
 5 files changed, 168 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/debug/dwarf2/pr81135.C
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/pr81135.C

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 317d5cd..9d64046 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -11227,6 +11227,16 @@ c_parse_final_cleanups (void)
   c_write_global_declarations_1 (BLOCK_VARS (ext_block));
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Purge unreferenced extern variables from the debug information.  */
+  if (!seen_error ())
+  {
+tree decl;
+for (decl = BLOCK_VARS (ext_block); decl; decl = DECL_CHAIN (decl))
+   if (DECL_EXTERNAL (decl))
+(*debug_hooks->late_global_decl) (decl);
+  }
+
   timevar_start (TV_PHASE_PARSING);
 
   ext_block = NULL;
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 877745c..93929e6 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "decl.h"
 #include "toplev.h"
+#include "debug.h"
 #include "c-family/c-objc.h"
 #include "c-family/c-pragma.h"
 #include "dumpfile.h"
@@ -4408,6 +4409,42 @@ lower_var_init ()
 }
 }
 
+/* We call this routine on each namespace to remove unreferenced extern
+   variables from the debug information.  */
+
+static int
+purge_unused_extern_globals (tree name_space)
+{
+  int rc;
+  cp_binding_level *level = NAMESPACE_LEVEL (name_space);
+  tree decl;
+
+  if (seen_error ())
+return 1;
+
+  if (!level)
+return 0;
+
+  for (decl = level->names; decl; decl = DECL_CHAIN (decl))
+{
+  switch (TREE_CODE (decl))
+   {
+   case VAR_DECL:
+ if (DECL_EXTERNAL (decl))
+   (*debug_hooks->late_global_decl) (decl);
+ break;
+   case NAMESPACE_DECL:
+ if ((rc = purge_unused_extern_globals (decl)))
+   return rc;
+ break;
+   default:
+ break;
+   }
+}
+
+  return 0;
+}
+
 /* This routine is called at the end of compilation.
Its job is to create all the code needed to initialize and
destroy the global aggregates.  We do the destruction
@@ -4791,6 +4828,10 @@ c_parse_final_cleanups (void)
 }
 
   timevar_stop (TV_PHASE_DEFERRED);
+
+  /* Purge unreferenced extern variables from the debug information.  */
+  purge_unused_extern_globals (global_namespace);
+
   timevar_start (TV_PHASE_PARSING);
 
   /* Indicate that we're done with front end processing.  */
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index c277d27..261aeef 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "function.h"
 #in

[PATCH v1 0/3] dwarf: purge DIEs for unreferenced extern globals.

2017-07-12 Thread Franklin “Snaipe” Mathieu
From: Franklin “Snaipe” Mathieu 

Hello GCC folks,

This patch series addresses PR 81135 [1].

* patch 1/3 is for trunk (built/tested on trunk@250093).
* patch 2/3 is the gcc7 backport (built/tested on gcc-7-branch@249680).
* patch 3/3 is the gcc6 backport (built/tested on gcc-6-branch@249671).

[1]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81135

Kind Regards,

-- 
Franklin "Snaipe" Mathieu
Arista Networks, Ltd



Re: A potential bug in lra-constraints.c for special_memory_constraint?

2017-07-12 Thread Qing Zhao

> On Jul 12, 2017, at 10:28 AM, David Miller  wrote:
> 
> From: Qing Zhao 
> Date: Wed, 12 Jul 2017 08:49:52 -0500
> 
>> and it also clearly mentioned that “specially aligned memory might
>> use this constraint”.
> 
> It guarantees the achieve the opposite of what you are trying to do.
> 
> That is, it can be used to guarantee that something is aligned to a
> multiple of X or greater.

so, the spill code can guarantee to provide a special alignment for multiple of 
X or greater?


> 
> What you want is to know that something is guaranteed to be
> aligned less strongly that X.  And that invariant is not
> provided for.

are you implying that special_memory_constraint is NOT for my current purpose?

thanks.

Qing



[PATCH], PR target/81193, Add warning for using __builtin_cpu_* on old PowerPC GLIBC's

2017-07-12 Thread Michael Meissner
This patch adds a warning to the built-in function handling if the user used
the __builtin_cpu_supports and __builtin_cpu_is were used when the compiler was
configured to use an old GLIBC that does not provice the hardware capability
bits.  Previously, the compiler would silently change the built-in function to
return 0.

I also changed the current warning in target_clones handling to be an error
instead of a warning, since it really makes no sense to use target_clones if we
can't generate a resolver function.

To allow people to figure out whether the built-in functions __builtin_cpu_is
and __builtin_cpu_supports are supported, I added the following defines if you
are using a new glibc:

__BUILTIN_CPU_IS__
__BUILTIN_CPU_SUPPORTS__

I added documentation to state the requirement for using a new GLIBC.

I went through the testsuite and added the appropriate target requires for the
tests that require __builtin_cpu_supports (cpu support test and the bmi*
tests).

I changed the libgcc build so that the IEEE float128 ifunc support function
(float128-ifunc.c) is only built if GCC is configured to use a new GLIBC.  I
verified that if you build with a compiler configured to use an old GCC that it
does not generate the ifunc handlers, and it always uses floating point
emulation.  If you configure GCC to use a new GLIBC, it will generate the ifunc
handler.  I did verify that on a power8 system, it will call the software
emulation, and on a power9 system, it will use the hardware IEEE float128
instructions.

I built boostrap compilers with/without the patches on a little endian power8
system and there were no regressions.

On the little endian power8 system, I did one run using the host Ubuntu
libraries, which are based on GLIBC 2.19.  There were no regressions between a
compiler without the patches and with the patches.  I verified that all of the
tests that use __builtin_cpu_supports are now UNSUPPORTED.

On the same little endian power8 system, I did a second set of runs using the
Advance Toolchain AT-10.4 libraries, which are based on GLIBC 2.24.  There were
no regressions between a compiler without the patches and with the patches.  I
verified that all of the tests that use __builtin_cpu_supports were run and
were successful.

I also did a test with bootstrap on a big endian power7 system with both 32-bit
and 64-bit libraries, there were no regressions.  For this set of runs, I used
the older host libraries, which do not export the hardware capability bits.  I
could not test using the Advance Toolchain on that system, since that
particular system is too old to install the newer Advance Toolchain libraries
that provide the hardware capability bits.

Can I install these patches on the trunk?  After a burn-in period, can I
install equivalent patches to GCC 7 (which added the __builtin_cpu* built-in
functions)?

[gcc]
2017-07-12  Michael Meissner  

PR target/81193
* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): If GLIBC
provides the hardware capability bits, define __BUILTIN_CPU_IS__
and __BUILTIN_CPU_SUPPORTS__.
* config/rs6000/rs6000.c (cpu_expand_builtin): Generate a warning
if GLIBC does not provide the hardware capability bits.  Add a
gcc_unreachable call if the built-in cpu function is neither
__builtin_cpu_is nor __builtin_cpu_supports.
(rs6000_get_function_versions_dispatcher): Change the warnging
that an old GLIBC is used which does not export the capability
bits to be an error.
* doc/extend.texi (target_clones attribute): Document the
restriction that GLIBC 2.23 or newer is needed on the PowerPC.
(PowerPC built-in functions): Document that GLIBC 2.23 or newer is
needed by __builtin_cpu_is and __builtin_cpu_supports.  Document
the macros defined by GCC if the newer GLIBC is available.

[gcc/testsuite]
2017-07-12  Michael Meissner  

PR target/81193
* gcc.target/powerpc/bmi-andn-1.c: Add guard against using
__builtin_cpu_supports with old GLIBC's.
* gcc.target/powerpc/bmi-andn-2.c: Likewise.
* gcc.target/powerpc/bmi-bextr-1.c: Likewise.
* gcc.target/powerpc/bmi-bextr-2.c: Likewise.
* gcc.target/powerpc/bmi-bextr-4.c: Likewise.
* gcc.target/powerpc/bmi-bextr-5.c: Likewise.
* gcc.target/powerpc/bmi-blsi-1.c: Likewise.
* gcc.target/powerpc/bmi-blsi-2.c: Likewise.
* gcc.target/powerpc/bmi-blsmsk-1.c: Likewise.
* gcc.target/powerpc/bmi-blsmsk-2.c: Likewise.
* gcc.target/powerpc/bmi-blsr-1.c: Likewise.
* gcc.target/powerpc/bmi-blsr-2.c: Likewise.
* gcc.target/powerpc/bmi-tzcnt-1.c: Likewise.
* gcc.target/powerpc/bmi-tzcnt-2.c: Likewise.
* gcc.target/powerpc/bmi2-bzhi32-1.c: Likewise.
* gcc.target/powerpc/bmi2-bzhi64-1.c: Likewise.
* gcc.target/powerpc/bmi2-mulx32-1.c: Likewise.
* gcc.target/powerpc/bmi2-mulx3

Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-07-12 Thread Sandra Loosemore

On 07/11/2017 06:29 AM, Prathamesh Kulkarni wrote:


@@ -6074,6 +6076,12 @@ In C++ enumerated type mismatches in conditional 
expressions are also
 diagnosed and the warning is enabled by default.  In C this warning is
 enabled by @option{-Wall}.

+@item -Wenum-conversion @r{(C, Objective-C only)}
+@opindex Wenum-conversion
+@opindex Wno-enum-conversion
+Warn when an enum value of a type is implicitly converted to an enum valeu of
+another type. This warning is enabled by @option{-Wall}.
+
 @item -Wextra-semi @r{(C++, Objective-C++ only)}
 @opindex Wextra-semi
 @opindex Wno-extra-semi


Aside from the "valeu" typo, I think this would be more clearly phrased as

Warn when a value of enumerated type is implicitly converted to a 
different enumerated type.  This warning is enabled by @option{-Wall}.


The rest of the documentation changes are OK.

-Sandra



Re: A potential bug in lra-constraints.c for special_memory_constraint?

2017-07-12 Thread David Miller
From: Qing Zhao 
Date: Wed, 12 Jul 2017 08:49:52 -0500

> and it also clearly mentioned that “specially aligned memory might
> use this constraint”.

It guarantees the achieve the opposite of what you are trying to do.

That is, it can be used to guarantee that something is aligned to a
multiple of X or greater.

What you want is to know that something is guaranteed to be
aligned less strongly that X.  And that invariant is not
provided for.


Re: [PATCH] gcc/doc: list what version each attribute was introduced in

2017-07-12 Thread Sandra Loosemore

On 07/06/2017 07:25 AM, Daniel P. Berrange wrote:

There are several hundred named attribute keys that have been
introduced over many GCC releases. Applications typically need
to be compilable with multiple GCC versions, so it is important
for developers to know when GCC introduced support for each
attribute.

This augments the texi docs that list attribute keys with
a note of what version introduced the feature. The version
information was obtained through archaeology of the GCC source
repository release tags, back to gcc-4_0_0-release. For
attributes added in 4.0.0 or later, an explicit version will
be noted. Any attribute that predates 4.0.0 will simply note
that it has existed prior to 4.0.0. It is thought there is
little need to go further back in time than 4.0.0 since few,
if any, apps will still be using such old compiler versions.

Where a named attribute can be used in many contexts (ie the
'visibility' attribute can be used for both functions or
variables), it was assumed that the attribute was supported
in all use contexts at the same time.

Future patches that add new attributes to GCC should be
required to follow this new practice, by documenting the
version.


I have to reject this documentation patch as-is because the new material 
isn't in complete sentences ending with a period.


I'm also skeptical that it's a good idea overall to add this information 
to the GCC manual, although I'll bow to the wishes of the community on 
this.  The arguments I'd make against adding it are:


(1) The GCC manual is already tied to a specific version of GCC and 
searchable manuals for old versions of GCC are readily available online.


(2) Information about backward compatibility with old versions becomes 
less relevant as time goes on, and I've already removed a bunch of cruft 
describing changes that happened 10-20+ years ago.


(3) We don't have similar historical information for any other GCC 
language extensions, builtins, etc.


(4) The manual is already too long.

If the consensus of the community is that we really need this historical 
information in the current manual, I'll consider a revised patch.


-Sandra



[PATCH AArch64]Fix ICE in cortex-a57 fma steering pass

2017-07-12 Thread Bin Cheng
Hi,
After change @236817, AArch64 backend could avoid unnecessary conversion 
instructions
for register between different modes now.  As a result, GCC could initialize 
register
in larger mode and use it later in smaller mode.  such def-use chain is not 
supported
by current regrename.c analyzer, as described by its comment:

  /* Process the insn, determining its effect on the def-use
 chains and live hard registers.  We perform the following
 steps with the register references in the insn, simulating
 its effect:
 ...
 We cannot deal with situations where we track a reg in one mode
 and see a reference in another mode; these will cause the chain
 to be marked unrenamable or even cause us to abort the entire
 basic block.  */

In this case, regrename.c analyzer doesn't create chain for the use of the 
register.
OTOH, cortex-a57-fma-steering.c has below code:

@@ -973,10 +973,14 @@ func_fma_steering::analyze ()
break;
}
 
- /* We didn't find a chain with a def for this instruction.  */
- gcc_assert (i < dest_op_info->n_chains);
-
- this->analyze_fma_fmul_insn (forest, chain, head);

It assumes by gcc_assert that a chain must be found for dest register of 
fmul/fmac
instructions.  According to above analysis, this is not always true if the dest 
reg
is reused from one of its source register.

This patch fixes the issue by skipping such instructions if no du chain is 
found.
Bootstrap and test on AArch64/cortex-a57.  Is it OK?  If it's fine, I would 
also need to
backport it to 7/6 branches.

Thanks,
bin
2017-07-12  Bin Cheng  

PR target/81414
* config/aarch64/cortex-a57-fma-steering.c (analyze): Skip fmul/fmac
instructions if no du chain is found.

gcc/testsuite/ChangeLog
2017-07-12  Kyrylo Tkachov  

PR target/81414
* gcc.target/aarch64/pr81414.C: New.From ef2bc842993210a4399205d83fa46435eec5d7cd Mon Sep 17 00:00:00 2001
From: Bin Cheng 
Date: Wed, 12 Jul 2017 15:16:53 +0100
Subject: [PATCH] tmp

---
 gcc/config/aarch64/cortex-a57-fma-steering.c | 12 
 gcc/testsuite/gcc.target/aarch64/pr81414.C   | 10 ++
 2 files changed, 18 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr81414.C

diff --git a/gcc/config/aarch64/cortex-a57-fma-steering.c 
b/gcc/config/aarch64/cortex-a57-fma-steering.c
index 1bf804b..b2ee398 100644
--- a/gcc/config/aarch64/cortex-a57-fma-steering.c
+++ b/gcc/config/aarch64/cortex-a57-fma-steering.c
@@ -973,10 +973,14 @@ func_fma_steering::analyze ()
break;
}
 
- /* We didn't find a chain with a def for this instruction.  */
- gcc_assert (i < dest_op_info->n_chains);
-
- this->analyze_fma_fmul_insn (forest, chain, head);
+ /* Due to implementation of regrename, dest register can slip away
+from regrename's analysis.  As a result, there is no chain for
+the destination register of insn.  We simply skip the insn even
+it is a fmul/fmac instruction.  This case can happen when the
+dest register is also a source register of insn and the source
+reg is setup in larger mode than this insn.  */
+ if (i < dest_op_info->n_chains)
+   this->analyze_fma_fmul_insn (forest, chain, head);
}
 }
   free (bb_dfs_preorder);
diff --git a/gcc/testsuite/gcc.target/aarch64/pr81414.C 
b/gcc/testsuite/gcc.target/aarch64/pr81414.C
new file mode 100644
index 000..13666a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr81414.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=cortex-a57" } */
+
+typedef __Float32x2_t float32x2_t;
+__inline float32x2_t vdup_n_f32(float) {}
+ 
+float32x2_t vfma_lane_f32(float32x2_t __a, float32x2_t __b) {
+  int __lane;
+  return __builtin_aarch64_fmav2sf(__b, vdup_n_f32(__lane), __a);
+}
-- 
1.9.1



Re: [PATCH 3/3] matching tokens: C++ parts

2017-07-12 Thread Martin Sebor

On 07/12/2017 07:13 AM, Trevor Saunders wrote:

On Tue, Jul 11, 2017 at 11:24:45AM -0400, David Malcolm wrote:

+/* Some tokens naturally come in pairs e.g.'(' and ')'.
+   This class is for tracking such a matching pair of symbols.
+   In particular, it tracks the location of the first token,
+   so that if the second token is missing, we can highlight the
+   location of the first token when notifying the user about the
+   problem.  */
+
+template 


the style guide says template arguments should be in mixed case, so
TokenPairTraits, and the _t looks odd to my eyes.


+class token_pair
+{
+ private:
+  typedef token_pair_traits_t traits_t;


I'm not really sure what this is about, you can name it whatever you
like as a template argument, and this name seems less descriptive of
what its about.


In generic code, a typedef for a template parameter makes it
possible to refer to the parameter even when it's a member of
a type whose template parameters aren't known (or that's not
even a template).  In the C++ standard library the naming
convention is to end such typedefs with _type (e.g., value_type,
allocator_type, etc.)  GCC itself makes use of this convention
in its hash_table template. (I have no idea if token_pair is
ever used in type generic contexts where the typedef is needed.)

As an aside, it's interesting to note that names that end in _t
are reserved by POSIX, so (purely) pedantically speaking, making
use of them for own symbols is undefined (this is probably one
of the most commonly abused POSIX requirements; even the C++
standard flagrantly disregards it).

Martin


[PATCH] Fix PR81362: Vector peeling

2017-07-12 Thread Robin Dapp
The attached patch fixes PR81362.

npeel was erroneously overwritten by vect_peeling_hash_get_lowest_cost
although the corresponding dataref is not used afterwards.  It should be
safe to get rid of the npeel parameter since we use the returned
peeling_info's npeel anyway.  Also removed the body_cost_vec parameter
which is not used elsewhere.

Regards
 Robin


--

gcc/ChangeLog:

2017-07-12  Robin Dapp  

* (vect_enhance_data_refs_alignment):
Remove body_cost_vec from _vect_peel_extended_info.
tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost):
Do not set body_cost_vec.
(vect_peeling_hash_choose_best_peeling): Remove body_cost_vec
and npeel.
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 5103ba1..0b8eee7 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1161,7 +1161,6 @@ typedef struct _vect_peel_extended_info
   struct _vect_peel_info peel_info;
   unsigned int inside_cost;
   unsigned int outside_cost;
-  stmt_vector_for_cost body_cost_vec;
 } *vect_peel_extended_info;
 
 
@@ -1309,6 +1308,8 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot,
   vect_get_peeling_costs_all_drs (elem->dr, &inside_cost, &outside_cost,
   &body_cost_vec, elem->npeel, false);
 
+  body_cost_vec.release ();
+
   outside_cost += vect_get_known_peeling_cost
 (loop_vinfo, elem->npeel, &dummy,
  &LOOP_VINFO_SCALAR_ITERATION_COST (loop_vinfo),
@@ -1327,14 +1328,10 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot,
 {
   min->inside_cost = inside_cost;
   min->outside_cost = outside_cost;
-  min->body_cost_vec.release ();
-  min->body_cost_vec = body_cost_vec;
   min->peel_info.dr = elem->dr;
   min->peel_info.npeel = elem->npeel;
   min->peel_info.count = elem->count;
 }
-  else
-body_cost_vec.release ();
 
   return 1;
 }
@@ -1346,14 +1343,11 @@ vect_peeling_hash_get_lowest_cost (_vect_peel_info **slot,
 
 static struct _vect_peel_extended_info
 vect_peeling_hash_choose_best_peeling (hash_table *peeling_htab,
-   loop_vec_info loop_vinfo,
-   unsigned int *npeel,
-   stmt_vector_for_cost *body_cost_vec)
+   loop_vec_info loop_vinfo)
 {
struct _vect_peel_extended_info res;
 
res.peel_info.dr = NULL;
-   res.body_cost_vec = stmt_vector_for_cost ();
 
if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
@@ -1371,8 +1365,6 @@ vect_peeling_hash_choose_best_peeling (hash_table *peeling_hta
res.outside_cost = 0;
  }
 
-   *npeel = res.peel_info.npeel;
-   *body_cost_vec = res.body_cost_vec;
return res;
 }
 
@@ -1537,7 +1529,6 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   unsigned possible_npeel_number = 1;
   tree vectype;
   unsigned int nelements, mis, same_align_drs_max = 0;
-  stmt_vector_for_cost body_cost_vec = stmt_vector_for_cost ();
   hash_table peeling_htab (1);
 
   if (dump_enabled_p ())
@@ -1812,7 +1803,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
  unless aligned.  So we try to choose the best possible peeling from
 	 the hash table.  */
   peel_for_known_alignment = vect_peeling_hash_choose_best_peeling
-	(&peeling_htab, loop_vinfo, &npeel, &body_cost_vec);
+	(&peeling_htab, loop_vinfo);
 }
 
   /* Compare costs of peeling for known and unknown alignment. */
@@ -1838,7 +1829,8 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
 {
   /* Calculate the penalty for no peeling, i.e. leaving everything
 	 unaligned.
-	 TODO: Adapt vect_get_peeling_costs_all_drs and use here.  */
+	 TODO: Adapt vect_get_peeling_costs_all_drs and use here.
+	 TODO: Use nopeel_outside_cost or get rid of it?  */
   unsigned nopeel_inside_cost = 0;
   unsigned nopeel_outside_cost = 0;
 
@@ -1920,10 +1912,7 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   if (!stat)
 do_peeling = false;
   else
-	{
-	  body_cost_vec.release ();
-	  return stat;
-	}
+	return stat;
 }
 
   /* Cost model #1 - honor --param vect-max-peeling-for-alignment.  */
@@ -1999,19 +1988,16 @@ vect_enhance_data_refs_alignment (loop_vec_info loop_vinfo)
   dump_printf_loc (MSG_NOTE, vect_location,
"Peeling for alignment will be applied.\n");
 }
+
 	  /* The inside-loop cost will be accounted for in vectorizable_load
 	 and vectorizable_store correctly with adjusted alignments.
 	 Drop the body_cst_vec on the floor here.  */
-	  body_cost_vec.release ();
-
 	  stat = vect_verify_datarefs_alignment (loop_vinfo);
 	  gcc_assert (stat);
   return stat;
 }
 }
 
-  body_cost_vec.release ();
-
   /* (2) Versioning to force alignment.  */
 
   /* Try versioning if:


[Committed] S/390: Calculate costs for load/store on condition

2017-07-12 Thread Andreas Krebbel
This adds code to the backend rtx_costs function in order to model the
costs of a load/store on condition.

Bootstrapped and regression tested on s390x.

gcc/ChangeLog:

2017-07-12  Andreas Krebbel  

* config/s390/s390.c (s390_rtx_costs): Return proper costs for
load/store on condition.
---
 gcc/ChangeLog  |  5 +
 gcc/config/s390/s390.c | 42 ++
 2 files changed, 47 insertions(+)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e3d096c..128fc92 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2017-07-12  Andreas Krebbel  
+
+   * config/s390/s390.c (s390_rtx_costs): Return proper costs for
+   load/store on condition.
+
 2017-07-11  Michael Collison  
 
* config/aarch64/aarch64-simd.md (aarch64_sub_compare0):
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 958ee3b..31ced21 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -3414,6 +3414,48 @@ s390_rtx_costs (rtx x, machine_mode mode, int outer_code,
   *total = 0;
   return true;
 
+case SET:
+  {
+   /* Without this a conditional move instruction would be
+  accounted as 3 * COSTS_N_INSNS (set, if_then_else,
+  comparison operator).  That's a bit pessimistic.  */
+
+   if (!TARGET_Z196 || GET_CODE (SET_SRC (x)) != IF_THEN_ELSE)
+ return false;
+
+   rtx cond = XEXP (SET_SRC (x), 0);
+
+   if (!CC_REG_P (XEXP (cond, 0)) || !CONST_INT_P (XEXP (cond, 1)))
+ return false;
+
+   /* It is going to be a load/store on condition.  Make it
+  slightly more expensive than a normal load.  */
+   *total = COSTS_N_INSNS (1) + 1;
+
+   rtx dst = SET_DEST (x);
+   rtx then = XEXP (SET_SRC (x), 1);
+   rtx els = XEXP (SET_SRC (x), 2);
+
+   /* It is a real IF-THEN-ELSE.  An additional move will be
+  needed to implement that.  */
+   if (reload_completed
+   && !rtx_equal_p (dst, then)
+   && !rtx_equal_p (dst, els))
+ *total += COSTS_N_INSNS (1) / 2;
+
+   /* A minor penalty for constants we cannot directly handle.  */
+   if ((CONST_INT_P (then) || CONST_INT_P (els))
+   && (!TARGET_Z13 || MEM_P (dst)
+   || (CONST_INT_P (then) && !satisfies_constraint_K (then))
+   || (CONST_INT_P (els) && !satisfies_constraint_K (els
+ *total += COSTS_N_INSNS (1) / 2;
+
+   /* A store on condition can only handle register src operands.  */
+   if (MEM_P (dst) && (!REG_P (then) || !REG_P (els)))
+ *total += COSTS_N_INSNS (1) / 2;
+
+   return true;
+  }
 case IOR:
   /* risbg */
   if (GET_CODE (XEXP (x, 0)) == AND
-- 
2.9.1



[Committed] S/390: Remove loc splitter

2017-07-12 Thread Andreas Krebbel
The backend splitter splitting a 3 operand load on condition into 2 is
wrong.  The S/390 load on condition instruction might trap on the
memory operand even if the condition is false.  So if the first load
on condition overwrites a register used as part of the memory address
of the second the second might trigger a segfault even if it does not
actually perform the load.

Trying to fix this I noticed that the generated code looks anyway
better without the splitter.  So removing the splitter entirely is the
way to go here.

Bootstrapped and regression tested on s390x.

gcc/ChangeLog:

2017-07-12  Andreas Krebbel  

* config/s390/s390.md: Remove movcc splitter.
---
 gcc/ChangeLog   |  4 
 gcc/config/s390/s390.md | 30 --
 2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 128fc92..31f9832 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2017-07-12  Andreas Krebbel  
 
+   * config/s390/s390.md: Remove movcc splitter.
+
+2017-07-12  Andreas Krebbel  
+
* config/s390/s390.c (s390_rtx_costs): Return proper costs for
load/store on condition.
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index cfae171..0eef9b1 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -6600,14 +6600,14 @@
 })
 
 ; locr, loc, stoc, locgr, locg, stocg, lochi, locghi
-(define_insn_and_split "*movcc"
-  [(set (match_operand:GPR 0 "nonimmediate_operand"   "=d,d,d,d,d,d,S,S,&d")
+(define_insn "*movcc"
+  [(set (match_operand:GPR 0 "nonimmediate_operand"   "=d,d,d,d,d,d,S,S")
(if_then_else:GPR
  (match_operator 1 "s390_comparison"
-   [(match_operand 2 "cc_reg_operand"" c,c,c,c,c,c,c,c,c")
+   [(match_operand 2 "cc_reg_operand"" c,c,c,c,c,c,c,c")
 (match_operand 5 "const_int_operand" "")])
- (match_operand:GPR 3 "loc_operand" " d,0,S,0,K,0,d,0,S")
- (match_operand:GPR 4 "loc_operand" " 0,d,0,S,0,K,0,d,S")))]
+ (match_operand:GPR 3 "loc_operand" " d,0,S,0,K,0,d,0")
+ (match_operand:GPR 4 "loc_operand" " 0,d,0,S,0,K,0,d")))]
   "TARGET_Z196"
   "@
locr%C1\t%0,%3
@@ -6617,23 +6617,9 @@
lochi%C1\t%0,%h3
lochi%D1\t%0,%h4
stoc%C1\t%3,%0
-   stoc%D1\t%4,%0
-   #"
-  "&& reload_completed
-   && MEM_P (operands[3]) && MEM_P (operands[4])"
-  [(set (match_dup 0)
-   (if_then_else:GPR
-(match_op_dup 1 [(match_dup 2) (const_int 0)])
-(match_dup 3)
-(match_dup 0)))
-   (set (match_dup 0)
-   (if_then_else:GPR
-(match_op_dup 1 [(match_dup 2) (const_int 0)])
-(match_dup 0)
-(match_dup 4)))]
-  ""
-  [(set_attr "op_type" "RRF,RRF,RSY,RSY,RIE,RIE,RSY,RSY,*")
-   (set_attr "cpu_facility" "*,*,*,*,z13,z13,*,*,*")])
+   stoc%D1\t%4,%0"
+  [(set_attr "op_type" "RRF,RRF,RSY,RSY,RIE,RIE,RSY,RSY")
+   (set_attr "cpu_facility" "*,*,*,*,z13,z13,*,*")])
 
 ;;
 ;;- Multiply instructions.
-- 
2.9.1



Re: [PATCH] PR libstdc++/80316 make promise::set_value throw no_state error

2017-07-12 Thread Christophe Lyon
On 12 July 2017 at 12:15, Jonathan Wakely  wrote:
> On 12/07/17 09:46 +0200, Christophe Lyon wrote:
>>
>> On 11 July 2017 at 14:39, Jonathan Wakely  wrote:
>>>
>>> On 11/07/17 12:53 +0100, Jonathan Wakely wrote:


 On 21/04/17 15:54 +0100, Jonathan Wakely wrote:
>
>
> On 4 April 2017 at 20:44, Jonathan Wakely wrote:
>>
>>
>> We got a bug report from a customer pointing out that calling
>> promise::set_value on a moved-from promise crashes instead of throwing
>> an exception with error code future_errc::no_state.
>>
>> This fixes it, by moving the _S_check calls to *before* we deference
>> the pointer that the calls check!
>>
>> This passes all tests, including the more comprehensive ones I've
>> added as part of this commit, but I think it can wait for stage 1
>> anyway. We've been shipping this bug for a couple of releases already.
>>
>>PR libstdc++/80316
>>* include/std/future (_State_baseV2::_Setter::operator()):
>> Remove
>>_S_check calls that are done after the pointer to the shared
>> state
>> is
>>already dereferenced.
>>(_State_baseV2::_Setter<_Res, void>): Define specialization for
>> void
>>as partial specialization so it can be defined within the
>> definition
>>of _State_baseV2.
>>(_State_baseV2::__setter): Call _S_check.
>>(_State_baseV2::__setter(promise*)): Add overload for use
>> by
>>promise::set_value and
>> promise::set_value_at_thread_exit.
>>(promise, promise, promise): Make _State a friend.
>>(_State_baseV2::_Setter): Remove explicit
>> specialization.
>>(promise::set_value,
>> promise::set_value_at_thread_exit):
>>Use new __setter overload.
>>* testsuite/30_threads/promise/members/at_thread_exit2.cc: New
>> test.
>>* testsuite/30_threads/promise/members/set_exception.cc: Test
>>promise and promise specializations.
>>* testsuite/30_threads/promise/members/set_exception2.cc:
>> Likewise.
>>Test for no_state error condition.
>>* testsuite/30_threads/promise/members/set_value2.cc: Likewise.
>
>
>
>
> This is now committed to trunk.



 And now also to gcc-7-branch.
>>>
>>>
>>>
>>> And gcc-6-branch.
>>>
>>
>> Hi Jonathan,
>>
>> The new test fails to compile on arm when forcing -march=armv5t:
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
>> In function 'void test01()':^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:31:
>> error: aggregate 'std::promise p1' has incomplete type and cannot
>> be defined^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:44:
>> error: 'make_exception_ptr' is not a member of 'std'^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:52:
>> error: variable 'std::promise p2' has initializer but incomplete
>> type^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:64:
>> error: 'make_exception_ptr' is not a member of 'std'^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
>> In function 'void test02()':^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:75:
>> error: aggregate 'std::promise p1' has incomplete type and
>> cannot be defined^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:89:
>> error: 'make_exception_ptr' is not a member of 'std'^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:97:
>> error: variable 'std::promise p2' has initializer but incomplete
>> type^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:110:
>> error: 'make_exception_ptr' is not a member of 'std'^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
>> In function 'void test03()':^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:121:
>> error: aggregate 'std::promise p1' has incomplete type and
>> cannot be defined^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:134:
>> error: 'make_exception_ptr' is not a member of 'std'^M
>>
>> /aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:142:
>> error: variable 'std::promise p2' has initial

[PATCH] PR81393: S/390: libgo: Fix ptrace register set accessors.

2017-07-12 Thread Andreas Krebbel
ptrace SETREGS and GETREGS were never supported on S/390.  The macros
were accidentally defined in the Glibc header though.  A recent Glibc
change removed them breaking libgo build on S/390.

This patch changes the ptrace calls to use PEEKUSR_AREA/POKEUSR_AREA to
access the register sets.  That's what GDB does.

Bootstrapped and regression tested on s390x.

Bye,

-Andreas-
---
 libgo/go/syscall/syscall_linux_s390.go  | 16 +---
 libgo/go/syscall/syscall_linux_s390x.go | 30 +++---
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/libgo/go/syscall/syscall_linux_s390.go 
b/libgo/go/syscall/syscall_linux_s390.go
index d6d3f6a..9364bcd 100644
--- a/libgo/go/syscall/syscall_linux_s390.go
+++ b/libgo/go/syscall/syscall_linux_s390.go
@@ -4,6 +4,8 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
+// See the s390x version for why we don't use GETREGSET/SETREGSET
+
 package syscall
 
 import "unsafe"
@@ -12,10 +14,18 @@ func (r *PtraceRegs) PC() uint64 { return 
uint64(r.Psw.addr) }
 
 func (r *PtraceRegs) SetPC(pc uint64) { r.Psw.addr = uint32(pc) }
 
-func PtraceGetRegs(pid int, regsout *PtraceRegs) (err error) {
-   return ptrace(PTRACE_GETREGS, pid, 0, uintptr(unsafe.Pointer(regsout)))
+func PtraceGetRegs(pid int, regs *PtraceRegs) (err error) {
+   parea :=  _ptrace_area{ _sizeof_ptrace_area,
+   0,
+   uint32(uintptr(unsafe.Pointer(regs))) }
+   return ptrace(PTRACE_PEEKUSR_AREA, pid,
+ uintptr(unsafe.Pointer(&parea)), 0)
 }
 
 func PtraceSetRegs(pid int, regs *PtraceRegs) (err error) {
-   return ptrace(PTRACE_SETREGS, pid, 0, uintptr(unsafe.Pointer(regs)))
+   parea := _ptrace_area{ _sizeof_ptrace_area,
+  0,
+  uint32(uintptr(unsafe.Pointer(regs))) }
+   return ptrace(PTRACE_POKEUSR_AREA, pid,
+ uintptr(unsafe.Pointer(&parea)), 0)
 }
diff --git a/libgo/go/syscall/syscall_linux_s390x.go 
b/libgo/go/syscall/syscall_linux_s390x.go
index f3701dc..8d9ca84 100644
--- a/libgo/go/syscall/syscall_linux_s390x.go
+++ b/libgo/go/syscall/syscall_linux_s390x.go
@@ -4,6 +4,24 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
+
+// The PtraceRegs struct generated for go looks like this:
+//
+// type PtraceRegs struct
+// {
+//   Psw _psw_t;
+//   Gprs [15+1]uint64;
+//   Acrs [15+1]uint32;
+//   Orig_gpr2 uint64;
+//   Fp_regs _s390_fp_regs;
+//   Per_info _per_struct;
+//   Ieee_instruction_pointer uint64;
+// }
+//
+// The GETREGSET/SETREGSET ptrace commands on S/390 only read/write
+// the content up to Orig_gpr2.  Hence, we use
+// PEEKUSR_AREA/POKEUSR_AREA like GDB does.
+
 package syscall
 
 import "unsafe"
@@ -12,10 +30,16 @@ func (r *PtraceRegs) PC() uint64 { return r.Psw.addr }
 
 func (r *PtraceRegs) SetPC(pc uint64) { r.Psw.addr = pc }
 
-func PtraceGetRegs(pid int, regsout *PtraceRegs) (err error) {
-   return ptrace(PTRACE_GETREGS, pid, 0, uintptr(unsafe.Pointer(regsout)))
+func PtraceGetRegs(pid int, regs *PtraceRegs) (err error) {
+   parea := _ptrace_area{ _sizeof_ptrace_area,
+  0,
+  uint64(uintptr(unsafe.Pointer(regs))) }
+   return ptrace(PTRACE_PEEKUSR_AREA, pid, 
uintptr(unsafe.Pointer(&parea)), 0)
 }
 
 func PtraceSetRegs(pid int, regs *PtraceRegs) (err error) {
-   return ptrace(PTRACE_SETREGS, pid, 0, uintptr(unsafe.Pointer(regs)))
+   parea := _ptrace_area{ _sizeof_ptrace_area,
+  0,
+  uint64(uintptr(unsafe.Pointer(regs))) }
+   return ptrace(PTRACE_POKEUSR_AREA, pid, 
uintptr(unsafe.Pointer(&parea)), 0)
 }
-- 
2.9.1



[PATCH] Fix when -lssp is added by driver (PR middle-end/81400).

2017-07-12 Thread Martin Liška

Hi.

Following patch adds -lspp when one uses -mstack-protector-guard=global.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?

Martin

gcc/ChangeLog:

2017-07-12  Martin Liska  

PR middle-end/81400
* gcc.c: Add -lssp when one uses -mstack-protector-guard=global.
---
 gcc/gcc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


diff --git a/gcc/gcc.c b/gcc/gcc.c
index e8e3d6687c3..0043f86d8d2 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -869,7 +869,8 @@ proper position among the other output files.  */
 #ifndef LINK_SSP_SPEC
 #ifdef TARGET_LIBC_PROVIDES_SSP
 #define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all" \
-		   "|fstack-protector-strong|fstack-protector-explicit:}"
+		   "|fstack-protector-strong|fstack-protector-explicit:" \
+		   "%{mstack-protector-guard=global:-lssp}}"
 #else
 #define LINK_SSP_SPEC "%{fstack-protector|fstack-protector-all" \
 		   "|fstack-protector-strong|fstack-protector-explicit" \



[PATCH, GCC/testsuite/ARM] Fix coprocessor intrinsic test failures on ARMv8-A

2017-07-12 Thread Thomas Preudhomme

Coprocessor intrinsic tests in gcc.target/arm/acle test whether
__ARM_FEATURE_COPROC has the right bit defined before calling the
intrinsic. This allows to test both the correct setting of that macro
and the availability and correct working of the intrinsic. However the
__ARM_FEATURE_COPROC macro is no longer defined for ARMv8-A since
r249399.

This patch changes the testcases to skip that test for ARMv8-A and
ARMv8-R targets.  It also fixes some irregularity in the coprocessor
effective targets:
- add ldcl and stcl to the list of instructions listed as guarded by
  arm_coproc1_ok
- enable tests guarded by arm_coproc2_ok, arm_coproc3_ok and
  arm_coproc4_ok for Thumb-2 capable targets but disable for Thumb-1
  targets.

ChangeLog entry is as follows:

*** gcc/testsuite/ChangeLog ***

2017-07-04  Thomas Preud'homme  

* gcc.target/arm/acle/cdp.c: Skip __ARM_FEATURE_COPROC check for
ARMv8-A and ARMv8-R.
* gcc.target/arm/acle/cdp2.c: Likewise.
* gcc.target/arm/acle/ldc.c: Likewise.
* gcc.target/arm/acle/ldc2.c: Likewise.
* gcc.target/arm/acle/ldc2l.c: Likewise.
* gcc.target/arm/acle/ldcl.c: Likewise.
* gcc.target/arm/acle/mcr.c: Likewise.
* gcc.target/arm/acle/mcr2.c: Likewise.
* gcc.target/arm/acle/mcrr.c: Likewise.
* gcc.target/arm/acle/mcrr2.c: Likewise.
* gcc.target/arm/acle/mrc.c: Likewise.
* gcc.target/arm/acle/mrc2.c: Likewise.
* gcc.target/arm/acle/mrrc.c: Likewise.
* gcc.target/arm/acle/mrrc2.c: Likewise.
* gcc.target/arm/acle/stc.c: Likewise.
* gcc.target/arm/acle/stc2.c: Likewise.
* gcc.target/arm/acle/stc2l.c: Likewise.
* gcc.target/arm/acle/stcl.c: Likewise.
* lib/target-supports.exp:
(check_effective_target_arm_coproc1_ok_nocache): Mention ldcl
and stcl in the comment.
(check_effective_target_arm_coproc2_ok_nocache): Allow Thumb-2 targets
and disable Thumb-1 targets.
(check_effective_target_arm_coproc3_ok_nocache): Likewise.
(check_effective_target_arm_coproc4_ok_nocache): Likewise.

Tested by running all tests in gcc.target/arm/acle before and after this
patch for ARMv6-M, ARMv7-M, ARMv7E-M, ARMv3, ARMv4 (ARM state), ARMv4T
(Thumb state), ARMv5 (ARM state), ARMv5TE (ARM state), ARMv6 (ARM
state), ARMv6T2 (Thumb state) and and ARMv8-A (both state). The only
changes are for ARMv8-A where tests FAILing are now PASSing again.

Is this ok for trunk?

Best regards,

Thomas
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp.c b/gcc/testsuite/gcc.target/arm/acle/cdp.c
index cebd8c4024ea1930f490f63e5267a33bac59a3a8..cfa922a797cddbf4a99f27ec156fd2d2fc9a460d 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cdp.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp.c
@@ -5,7 +5,8 @@
 /* { dg-require-effective-target arm_coproc1_ok } */
 
 #include "arm_acle.h"
-#if (__ARM_FEATURE_COPROC & 0x1) == 0
+#if (__ARM_ARCH < 8 || !defined (__ARM_ARCH_ISA_ARM)) \
+&& (__ARM_FEATURE_COPROC & 0x1) == 0
   #error "__ARM_FEATURE_COPROC does not have correct feature bits set"
 #endif
 
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp2.c b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
index 945d435d2fb99962ff47d921d9cb3633cb75bb79..b18076c26274043be8ac71e6516b9b6eac3b4137 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cdp2.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
@@ -5,7 +5,8 @@
 /* { dg-require-effective-target arm_coproc2_ok } */
 
 #include "arm_acle.h"
-#if (__ARM_FEATURE_COPROC & 0x2) == 0
+#if (__ARM_ARCH < 8 || !defined (__ARM_ARCH_ISA_ARM)) \
+&& (__ARM_FEATURE_COPROC & 0x2) == 0
   #error "__ARM_FEATURE_COPROC does not have correct feature bits set"
 #endif
 
diff --git a/gcc/testsuite/gcc.target/arm/acle/ldc.c b/gcc/testsuite/gcc.target/arm/acle/ldc.c
index cd57343208fc5b17e5391d11d126d20e224d6566..10c879f4a15e7c293541c61dc974d972798ecedf 100644
--- a/gcc/testsuite/gcc.target/arm/acle/ldc.c
+++ b/gcc/testsuite/gcc.target/arm/acle/ldc.c
@@ -5,7 +5,8 @@
 /* { dg-require-effective-target arm_coproc1_ok } */
 
 #include "arm_acle.h"
-#if (__ARM_FEATURE_COPROC & 0x1) == 0
+#if (__ARM_ARCH < 8 || !defined (__ARM_ARCH_ISA_ARM)) \
+&& (__ARM_FEATURE_COPROC & 0x1) == 0
   #error "__ARM_FEATURE_COPROC does not have correct feature bits set"
 #endif
 
diff --git a/gcc/testsuite/gcc.target/arm/acle/ldc2.c b/gcc/testsuite/gcc.target/arm/acle/ldc2.c
index d7691e30d763d1e921817fd586b47888e1b5c78f..d561adacccf358a1dbfa9db253c9bc08847c7e33 100644
--- a/gcc/testsuite/gcc.target/arm/acle/ldc2.c
+++ b/gcc/testsuite/gcc.target/arm/acle/ldc2.c
@@ -5,7 +5,8 @@
 /* { dg-require-effective-target arm_coproc2_ok } */
 
 #include "arm_acle.h"
-#if (__ARM_FEATURE_COPROC & 0x2) == 0
+#if (__ARM_ARCH < 8 || !defined (__ARM_ARCH_ISA_ARM)) \
+&& (__ARM_FEATURE_COPROC & 0x2) == 0
   #error "__ARM_FEATURE_COPROC does not have correct feature bits set"
 #endif
 
diff --git a/gcc/testsuite/gcc.target/arm/acle/ldc2l.c b/gcc/testsuite

Re: [ping #4][patch] Fix PR80929: Realistic PARALLEL cost in seq_cost.

2017-07-12 Thread Georg-Johann Lay

On 12.07.2017 14:11, Segher Boessenkool wrote:

Hi,

On Tue, Jul 11, 2017 at 10:47:27AM +0200, Georg-Johann Lay wrote:

This small addition improves costs of PARALLELs in
rtlanal.c:seq_cost().  Up to now, these costs are
assumed to be 1 which gives gross inexact costs for,
e.g. divmod which is represented as PARALLEL.


insn_rtx_cost returns 0 ("unknown") for such a PARALLEL, so your
current patch does not change this at all?


Huh?  It returns the costs of 1st SET in a PARALLEL (provided it
has one), no?  Or even costs for come compares.



--- rtlanal.c   (revision 248745)
+++ rtlanal.c   (working copy)
@@ -5300,6 +5300,9 @@ seq_cost (const rtx_insn *seq, bool spee
set = single_set (seq);
if (set)
  cost += set_rtx_cost (set, speed);


So, why does this not use insn_rtx_cost as well?


You'll have to ask the author of that line...

And I didn't intend to change existing behaviour except for a
case where I know that "1" is far off the real costs.




+  else if (INSN_P (seq)
+  && PARALLEL == GET_CODE (PATTERN (seq)))


Yoda conditions have we should not.


hmm, I didn't find something like PARALLEL_P (rtx).
Is comparing rtx_codes deprecated now?


+   cost += 1 + insn_rtx_cost (PATTERN (seq), speed);
else
  cost++;
  }


This whole thing could be something like

   if (INSN_P (seq))
 {
   int t = insn_rtx_cost (PATTERN (seq), speed);


This will behave differently.  The original set_src_cost calls
rtx_costs with SET and outer_code = INSN, insn_rtx_cost does not.

My intentions was to be conservative and not silently introduce
performance degradations in whatever back-end by changing the
seen RTXes or codes.

Everything that rtx_costs was called for will be the same.
Nothing changes, just some new calls are added.  But neither are
existing calls removed nor are ones changes to up different
arguments.



   cost += t ? t : COST_N_INSNS (1);
 }
   else
 cost++;

But set_rtx_cost does *not* return the same answer as insn_rtx_cost.

(Why do you need a check for INSN_P here?  Why does it increment the



cost for non-insns?  So many questions).


Again, you'll have to ask the original author for reasoning.

Johann




Segher





Re: [PATCH 3/3] matching tokens: C++ parts

2017-07-12 Thread Trevor Saunders
On Tue, Jul 11, 2017 at 11:24:45AM -0400, David Malcolm wrote:
> +/* Some tokens naturally come in pairs e.g.'(' and ')'.
> +   This class is for tracking such a matching pair of symbols.
> +   In particular, it tracks the location of the first token,
> +   so that if the second token is missing, we can highlight the
> +   location of the first token when notifying the user about the
> +   problem.  */
> +
> +template 

the style guide says template arguments should be in mixed case, so
TokenPairTraits, and the _t looks odd to my eyes.

> +class token_pair
> +{
> + private:
> +  typedef token_pair_traits_t traits_t;

I'm not really sure what this is about, you can name it whatever you
like as a template argument, and this name seems less descriptive of
what its about.

> + public:
> +  /* token_pair's ctor.  */
> +  token_pair () : m_open_loc (UNKNOWN_LOCATION) {}

What do you think of passing the parser to the ctor, and dropping it
from the other arguments?  It doesn't seem to make sense to support
passing in different parsers?

> +  /* If the next token is the closing symbol for this pair, consume it
> + and return it.
> + Otherwise, issue an error, highlighting the location of the
> + corrsponding opening token, and return NULL.  */

typo.

> +/* A subclass of token_pair for tracking matching pairs of parentheses.  */
> +
> +class matching_parens : public token_pair

It seems a little strange for this class to both subclass and be the
traits, given that the token_pair class doesn't need objeects of the
template argument type.  I'd consider writing this as

struct matching_paren_traits
{
  static const cpp_ttype open_token_type = CPP_OPEN_PAREN;
  ...
};

typedef token_pair matching_parens;

> +{
> + public:
> +  static const enum cpp_ttype open_token_type;
> +  static const enum required_token required_token_open;
> +  static const enum cpp_ttype close_token_type;
> +  static const enum required_token required_token_close;

Given that these are static consts of integer type I think its fine to
define them inline in the class.

> +class matching_braces : public token_pair

same comments here.

thanks

Trev



Re: [PATCH, rs6000] Add support for vec_revb builtin

2017-07-12 Thread Segher Boessenkool
Hi Carl,

On Tue, Jul 11, 2017 at 09:55:31PM -0700, Carl Love wrote:
> 2017-07-11  Carl Love  
> 
>   * config/rs6000/rs6000-c.c: Add support for built-in functions
>   vector bool char vec_revb (vector bool char);
>   vector bool short vec_revb (vector short char);
>   vector bool int vec_revb (vector bool int);
>   vector bool long long vec_revb (vector bool long long);
>   * doc/extend.texi: Update the built-in documentation file for the
>   new built-in functions.
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-07-11  Carl Love  
> 
>   * gcc.target/powerpc/p9-xxbr-1.c (rev_bool_char, rev_bool_short,
>   rev_bool_int): Add test cases for builtins.
>   * gcc.target/powerpc/p9-xxbr-2.c (rev_long_long, rev_ulong_ulong): Add
>   test cases for builtins.

That looks fine, please commit.  Thanks,


Segher


Re: [PATCH][RFA/RFC] Stack clash mitigation patch 07/08

2017-07-12 Thread Trevor Saunders
On Tue, Jul 11, 2017 at 08:02:26PM -0600, Jeff Law wrote:
> On 07/11/2017 06:20 PM, Wilco Dijkstra wrote:
> > Jeff Law wrote:
> >> aarch64 is the first target that does not have any implicit probes in
> >> the caller.  Thus at prologue entry it must make conservative
> >> assumptions about the offset of the most recent probed address relative
> >> to the stack pointer.
> > 
> > No - like I mentioned before that's not correct nor acceptable as it would 
> > imply
> > that ~70% of functions need a probe at entry. I did a quick run across SPEC 
> > and
> > found the outgoing argument size is > 1024 in just 9 functions out of 
> > 147000!
> > Those 9 were odd special cases due to auto generated code to interface 
> > between
> > C and Fortran. This is extremely unlikely to occur anywhere else. So even 
> > assuming
> > an unchecked caller, large outgoing arguments are simply not a realistic 
> > threat.
> Whether or not such frames exist in SPEC isn't the question.   THere's
> nothing in the ABI or ISA that allows us to avoid those probes without
> compromising security.
> 
> Mixed code scenarios are going to be a fact of life, probably forever
> unless we can convince every ISV providing software that works on top of
> RHEL/*BSD/whatever  to turn on probing (based on my experiences, that
> has exactly a zero chance of occurring).

On the other hand if probing is fast enough that it can be on by default
in gcc there should be much less of it.  Even if you did change the ABI
to require probing it seems unlikely that code violating that
requirement would hit problems other than this security concern, so I'd
expect there will be some non compliant asm out there.

> In a mixed code scenario a caller may have a large alloca or large
> outgoing argument area that pushes the stack pointer into the guard page
> without actually accessing the guard page.  That requires a callee which
> is compiled with stack checking to make worst case assumptions at
> function entry to protect itself as much as possible from these attacks.

It seems to me pretty important to ask how many programs out there have
a caller that can push the stack into the guard page, but not past it.
I'd expect that's not the case for most allocas, either they are safe,
or can increase the stack arbitrarily.  I'd expect its more likely with
outgoing arg or large buffer allocations though.  I think the largest
buffer Qualys found was less than 400k? So 1 256k guard page should
protect 95% of functions, and 1m or 2m  seems like enough to protect
against all non malicious programs.  I'm not sure, but this is a 64 bit
arch, so it seems like we should have the adress space for large guard
pages like that.

There's also the trade off that increasing the amount of code that
probes reduces the set of code that can either move the stack into or
past the guard page.

> THe aarch64 maintainers can certain nix what I've done or modify it in
> ways that suit them.  That is their choice as port maintainers.
> However, Red Hat will have to evaluate the cost of reducing security for
> our customer base against the performance improvement of such changes.
> As I've said before, I do not know where that decision would fall.
> 
> 
> > 
> > Therefore even when using a tiny 4K probe size we can safely adjust SP by 
> > 3KB
> > before needing an explicit probe - now only 0.6% of functions need a probe.
> > If we choose a proper minimum probe distance, say 64KB, explicit probes are
> > basically non-existent (just 35 functions, or ~0.02% of all functions are > 
> > 64KB).
> > Clearly inserting probes can be the default as the impact on code quality 
> > is negligible.
> Again, there's nothing that says 3k is safe.   You're picking an
> arbitrary point that is safe in a codebase you've looked at.  But I'm
> looking at this from a "what guarantees do I have from an ABI or ISA
> standpoint".  The former may be more performant, but it's inherently
> less secure than the latter.

On the other hand making -fstack-check=clash the default seems to me
like a very significant security improvement.

Trev


Re: Backports to gcc 6.x

2017-07-12 Thread Segher Boessenkool
Hi Kelvin,

On Tue, Jul 11, 2017 at 04:43:47PM -0600, Kelvin Nilsen wrote:
> I would like to backport the following patches to the GCC 6 branch.
> 
> PR9: Fix failure of gcc.dg/loop-8.c on Power
>   https://gcc.gnu.org/ml/gcc-patches/2017-01/msg01788.html
> 
> PR68972: g++.dg/cpp1y/vla-initlist1.C test case fails on power
>   https://gcc.gnu.org/ml/gcc-patches/2017-02/msg00541.html
> 
> Handle conflicting target options -mno-power9-vector and -mcpu=power9
>   https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01192.html
> 
> PR80103: Fix ICE with cross compiler
>   https://gcc.gnu.org/ml/gcc-patches/2017-03/msg01335.html
> 
> PR80101: Fix ICE in store_data_bypass_p
>   https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00953.html
> 
> 
> Each of these patches has been bootstrapped and regression tested on the
> GCC 6 branch.  In backport, patch PR80103 omits certain changes to
> existing comments that are not present in GCC6.
> 
> Are these patches ok for backporting to GCC 6?

Yes, all are okay.  Thanks!


Segher


Re: [ping #4][patch] Fix PR80929: Realistic PARALLEL cost in seq_cost.

2017-07-12 Thread Segher Boessenkool
Hi,

On Tue, Jul 11, 2017 at 10:47:27AM +0200, Georg-Johann Lay wrote:
> This small addition improves costs of PARALLELs in
> rtlanal.c:seq_cost().  Up to now, these costs are
> assumed to be 1 which gives gross inexact costs for,
> e.g. divmod which is represented as PARALLEL.

insn_rtx_cost returns 0 ("unknown") for such a PARALLEL, so your
current patch does not change this at all?

> --- rtlanal.c (revision 248745)
> +++ rtlanal.c (working copy)
> @@ -5300,6 +5300,9 @@ seq_cost (const rtx_insn *seq, bool spee
>set = single_set (seq);
>if (set)
>  cost += set_rtx_cost (set, speed);

So, why does this not use insn_rtx_cost as well?

> +  else if (INSN_P (seq)
> +&& PARALLEL == GET_CODE (PATTERN (seq)))

Yoda conditions have we should not.

> + cost += 1 + insn_rtx_cost (PATTERN (seq), speed);
>else
>  cost++;
>  }

This whole thing could be something like

  if (INSN_P (seq))
{
  int t = insn_rtx_cost (PATTERN (seq), speed);
  cost += t ? t : COST_N_INSNS (1);
}
  else
cost++;

But set_rtx_cost does *not* return the same answer as insn_rtx_cost.

(Why do you need a check for INSN_P here?  Why does it increment the
cost for non-insns?  So many questions).


Segher


Re: [patch,avr] PR81407: Error if progmem variable needs constructing.

2017-07-12 Thread Denis Chertykov
2017-07-12 12:45 GMT+04:00 Georg-Johann Lay :
> Hi, if the C++ front-end decides that something will need constructing,
> it will silently put the stuff into .rodata so that according
> pgm_read_xxx will read garbage from .progmem.
>
> As proposed by Jason, this patch diagnoses such situations.
>
> Ok to commit?
>
> Johann
>
> PR target/81407
> * config/avr/avr.c (avr_encode_section_info)
> [progmem && !TREE_READONLY]: Error if progmem object needs
> constructing.

Approved.


Re: [PATCH] PR libstdc++/80316 make promise::set_value throw no_state error

2017-07-12 Thread Jonathan Wakely

On 12/07/17 09:46 +0200, Christophe Lyon wrote:

On 11 July 2017 at 14:39, Jonathan Wakely  wrote:

On 11/07/17 12:53 +0100, Jonathan Wakely wrote:


On 21/04/17 15:54 +0100, Jonathan Wakely wrote:


On 4 April 2017 at 20:44, Jonathan Wakely wrote:


We got a bug report from a customer pointing out that calling
promise::set_value on a moved-from promise crashes instead of throwing
an exception with error code future_errc::no_state.

This fixes it, by moving the _S_check calls to *before* we deference
the pointer that the calls check!

This passes all tests, including the more comprehensive ones I've
added as part of this commit, but I think it can wait for stage 1
anyway. We've been shipping this bug for a couple of releases already.

   PR libstdc++/80316
   * include/std/future (_State_baseV2::_Setter::operator()): Remove
   _S_check calls that are done after the pointer to the shared
state
is
   already dereferenced.
   (_State_baseV2::_Setter<_Res, void>): Define specialization for
void
   as partial specialization so it can be defined within the
definition
   of _State_baseV2.
   (_State_baseV2::__setter): Call _S_check.
   (_State_baseV2::__setter(promise*)): Add overload for use
by
   promise::set_value and
promise::set_value_at_thread_exit.
   (promise, promise, promise): Make _State a friend.
   (_State_baseV2::_Setter): Remove explicit
specialization.
   (promise::set_value,
promise::set_value_at_thread_exit):
   Use new __setter overload.
   * testsuite/30_threads/promise/members/at_thread_exit2.cc: New
test.
   * testsuite/30_threads/promise/members/set_exception.cc: Test
   promise and promise specializations.
   * testsuite/30_threads/promise/members/set_exception2.cc:
Likewise.
   Test for no_state error condition.
   * testsuite/30_threads/promise/members/set_value2.cc: Likewise.




This is now committed to trunk.



And now also to gcc-7-branch.



And gcc-6-branch.



Hi Jonathan,

The new test fails to compile on arm when forcing -march=armv5t:
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test01()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:31:
error: aggregate 'std::promise p1' has incomplete type and cannot
be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:44:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:52:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:64:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test02()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:75:
error: aggregate 'std::promise p1' has incomplete type and
cannot be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:89:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:97:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:110:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test03()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:121:
error: aggregate 'std::promise p1' has incomplete type and
cannot be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:134:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:142:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:153:
error: 'make_exception_ptr' is not a member of 'std'^M
compiler exited with status 1

I can see this on target arm-none-linux-gnueabihf --with-mode arm
--with-cpu cortex-a9 --with-fpu vfp and setting -march=armv5t in
runtestflags.

It also looks like you forgot to add a ChangeLog entry for the
testsuite changes.


These changes?

* testsuite/30_threads/promise/members/at_thread_exit2.cc: New test.
* testsuite/30_threads/promise/members/set_excepti

Re: [PATCH, GCC/ARM] Remove ARMv8-M code for D17-D31

2017-07-12 Thread Thomas Preudhomme

Hi Richard,

On 07/07/17 15:19, Richard Earnshaw (lists) wrote:


Hmm, I think that's because really this is a partial conversion.  It
looks like doing this properly would involve moving that existing code
to use sbitmaps as well.  I think doing that would be better for
long-term maintenance perspectives, but I'm not going to insist that you
do it now.


There's also the assert later but I've found a way to improve it slightly. While 
switching to auto_sbitmap I also changed the code slightly to allocate directly 
bitmaps to the right size. Since the change is probably bigger than what you had 
in mind I'd appreciate if you can give me an OK again. See updated patch in 
attachment. ChangeLog entry is unchanged:


2017-06-13  Thomas Preud'homme  

* config/arm/arm.c (arm_option_override): Forbid ARMv8-M Security
Extensions with more than 16 double VFP registers.
(cmse_nonsecure_entry_clear_before_return): Remove second entry of
to_clear_mask and all code related to it.  Replace the remaining
entry by a sbitmap and adapt code accordingly.



As a result I'll let you take the call as to whether you keep this
version or go back to your earlier patch.  If you do decide to keep this
version, then see the comment below.


Given the changes I'm more happy with how the patch looks now and making it go 
in can be a nice incentive to change other ARMv8-M Security Extension related 
code later on.


Best regards,

Thomas
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 259597d8890ee84c5bd92b12b6f9f6521c8dcd2e..4d09e891c288c7bf9b519ad7cd62bd38df661a02 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3620,6 +3620,11 @@ arm_option_override (void)
   if (use_cmse && !arm_arch_cmse)
 error ("target CPU does not support ARMv8-M Security Extensions");
 
+  /* We don't clear D16-D31 VFP registers for cmse_nonsecure_call functions
+ and ARMv8-M Baseline and Mainline do not allow such configuration.  */
+  if (use_cmse && LAST_VFP_REGNUM > LAST_LO_VFP_REGNUM)
+error ("ARMv8-M Security Extensions incompatible with selected FPU");
+
   /* Disable scheduling fusion by default if it's not armv7 processor
  or doesn't prefer ldrd/strd.  */
   if (flag_schedule_fusion == 2
@@ -24996,42 +25001,39 @@ thumb1_expand_prologue (void)
 void
 cmse_nonsecure_entry_clear_before_return (void)
 {
-  uint64_t to_clear_mask[2];
+  bool clear_vfpregs = TARGET_HARD_FLOAT && !TARGET_THUMB1;
+  int regno, maxregno = clear_vfpregs ? LAST_VFP_REGNUM : IP_REGNUM;
   uint32_t padding_bits_to_clear = 0;
   uint32_t * padding_bits_to_clear_ptr = &padding_bits_to_clear;
-  int regno, maxregno = IP_REGNUM;
+  auto_sbitmap to_clear_bitmap (LAST_VFP_REGNUM);
   tree result_type;
   rtx result_rtl;
 
-  to_clear_mask[0] = (1ULL << (NUM_ARG_REGS)) - 1;
-  to_clear_mask[0] |= (1ULL << IP_REGNUM);
+  bitmap_clear (to_clear_bitmap);
+  bitmap_set_range (to_clear_bitmap, R0_REGNUM, NUM_ARG_REGS);
+  bitmap_set_bit (to_clear_bitmap, IP_REGNUM);
 
   /* If we are not dealing with -mfloat-abi=soft we will need to clear VFP
  registers.  We also check that TARGET_HARD_FLOAT and !TARGET_THUMB1 hold
  to make sure the instructions used to clear them are present.  */
-  if (TARGET_HARD_FLOAT && !TARGET_THUMB1)
+  if (clear_vfpregs)
 {
-  uint64_t float_mask = (1ULL << (D7_VFP_REGNUM + 1)) - 1;
-  maxregno = LAST_VFP_REGNUM;
-
-  float_mask &= ~((1ULL << FIRST_VFP_REGNUM) - 1);
-  to_clear_mask[0] |= float_mask;
+  int float_bits = D7_VFP_REGNUM - FIRST_VFP_REGNUM + 1;
 
-  float_mask = (1ULL << (maxregno - 63)) - 1;
-  to_clear_mask[1] = float_mask;
+  bitmap_set_range (to_clear_bitmap, FIRST_VFP_REGNUM, float_bits);
 
   /* Make sure we don't clear the two scratch registers used to clear the
 	 relevant FPSCR bits in output_return_instruction.  */
   emit_use (gen_rtx_REG (SImode, IP_REGNUM));
-  to_clear_mask[0] &= ~(1ULL << IP_REGNUM);
+  bitmap_clear_bit (to_clear_bitmap, IP_REGNUM);
   emit_use (gen_rtx_REG (SImode, 4));
-  to_clear_mask[0] &= ~(1ULL << 4);
+  bitmap_clear_bit (to_clear_bitmap, 4);
 }
 
   /* If the user has defined registers to be caller saved, these are no longer
  restored by the function before returning and must thus be cleared for
  security purposes.  */
-  for (regno = NUM_ARG_REGS; regno < LAST_VFP_REGNUM; regno++)
+  for (regno = NUM_ARG_REGS; regno <= maxregno; regno++)
 {
   /* We do not touch registers that can be used to pass arguments as per
 	 the AAPCS, since these should never be made callee-saved by user
@@ -25041,29 +25043,45 @@ cmse_nonsecure_entry_clear_before_return (void)
   if (IN_RANGE (regno, IP_REGNUM, PC_REGNUM))
 	continue;
   if (call_used_regs[regno])
-	to_clear_mask[regno / 64] |= (1ULL << (regno % 64));
+	bitmap_set_bit (to_clear_bitmap, regno);
 }
 
   /* Make sure we do not clear the registers used to return the result in.  */
  

[patch,avr] PR81407: Error if progmem variable needs constructing.

2017-07-12 Thread Georg-Johann Lay

Hi, if the C++ front-end decides that something will need constructing,
it will silently put the stuff into .rodata so that according
pgm_read_xxx will read garbage from .progmem.

As proposed by Jason, this patch diagnoses such situations.

Ok to commit?

Johann

PR target/81407
* config/avr/avr.c (avr_encode_section_info)
[progmem && !TREE_READONLY]: Error if progmem object needs
constructing.
Index: config/avr/avr.c
===
--- config/avr/avr.c	(revision 250093)
+++ config/avr/avr.c	(working copy)
@@ -10380,14 +10380,22 @@ avr_encode_section_info (tree decl, rtx
   && !DECL_EXTERNAL (decl)
   && avr_progmem_p (decl, DECL_ATTRIBUTES (decl)))
 {
-  // Don't warn for (implicit) aliases like in PR80462.
   tree asmname = DECL_ASSEMBLER_NAME (decl);
   varpool_node *node = varpool_node::get_for_asmname (asmname);
   bool alias_p = node && node->alias;
 
-  if (!alias_p)
-warning (OPT_Wuninitialized, "uninitialized variable %q+D put into "
- "program memory area", decl);
+  if (!TREE_READONLY (decl))
+{
+  // This might happen with C++ if stuff needs constructing.
+  error ("variable %q+D with dynamic initialization put "
+ "into program memory area", decl);
+}
+  else if (!alias_p)
+{
+  // Don't warn for (implicit) aliases like in PR80462.
+  warning (OPT_Wuninitialized, "uninitialized variable %q+D put "
+   "into program memory area", decl);
+}
 }
 
   default_encode_section_info (decl, rtl, new_decl_p);


Re: A potential bug in lra-constraints.c for special_memory_constraint?

2017-07-12 Thread Eric Botcazou
> Actually, My major question is whether the current handling of
> special_memory_constraint in lra_constraints.c is correct or NOT based on
> GCC internal documentation. I thought that’s independent from this
> misaligned insns generation for M8, but looks I was wrong.

The answer is yes, the current handling is definitely correct, which probably 
means that a special_memory_constraint ought not to be used here and that a 
usual memory_constraint should work just fine.

-- 
Eric Botcazou



Re: [PATCH][testsuite] Add dg-require-stack-check

2017-07-12 Thread Christophe Lyon
On 11 July 2017 at 16:09, Christophe Lyon  wrote:
> On 10 July 2017 at 10:01, Christophe Lyon  wrote:
>> Hi,
>>
>>
>> On 6 July 2017 at 06:50, Jeff Law  wrote:
>>> On 07/04/2017 02:50 AM, Christophe Lyon wrote:
 On 3 July 2017 at 17:30, Jeff Law  wrote:
> On 07/03/2017 09:00 AM, Christophe Lyon wrote:
>> Hi,
>>
>> This is a follow-up to
>> https://gcc.gnu.org/ml/gcc-patches/2017-06/msg01791.html
>>
>> This patch adds dg-require-stack-check and updates the tests that use
>> dg-options "-fstack-check" to avoid failures on configurations that to
>> not support it.
>>
>> I merely copied what we currently do to check if visibility flags are
>> supported, and cross-tested on aarch64 and arm targets with the
>> results I expected.
>>
>> This means that my testing does not cover the changes I propose for
>> i386 and gnat.
>>
>> Is it OK nonetheless?
>>
>> Thanks,
>>
>> Christophe
>>
>>
>> stack-check-et.chlog.txt
>>
>>
>> 2017-07-03  Christophe Lyon  
>>
>>   * lib/target-supports-dg.exp (dg-require-stack-check): New.
>>   * lib/target-supports.exp (check_stack_check_available): New.
>>   * g++.dg/other/i386-9.C: Add dg-require-stack-check.
>>   * gcc.c-torture/compile/stack-check-1.c: Likewise.
>>   * gcc.dg/graphite/run-id-pr47653.c: Likewise.
>>   * gcc.dg/pr47443.c: Likewise.
>>   * gcc.dg/pr48134.c: Likewise.
>>   * gcc.dg/pr70017.c: Likewise.
>>   * gcc.target/aarch64/stack-checking.c: Likewise.
>>   * gcc.target/arm/stack-checking.c: Likewise.
>>   * gcc.target/i386/pr48723.c: Likewise.
>>   * gcc.target/i386/pr55672.c: Likewise.
>>   * gcc.target/i386/pr67265-2.c: Likewise.
>>   * gcc.target/i386/pr67265.c: Likewise.
>>   * gnat.dg/opt49.adb: Likewise.
>>   * gnat.dg/stack_check1.adb: Likewise.
>>   * gnat.dg/stack_check2.adb: Likewise.
>>   * gnat.dg/stack_check3.adb: Likewise.
> ACK once you address Rainer's comments.  I've got further stack-check
> tests in the queue which I'll update once your change goes in.
>
> jeff
 Here is an updated version, which adds documentation for 
 dg-require-stack-check.

 I also ran make-check on and x86_64 with ada enabled and checked the logs:
 the updated i386/* and gnat.dg* tests all pass, and are preceded by
 the compilation
 of the "stack_check" sample.

 OK?

 Thanks,

 Christophe


 stack-check-et.chlog.txt


 2017-07-04  Christophe Lyon  

   gcc/
   * doc/sourcebuild.texi (Test Directives, Variants of
   dg-require-support): Add documentation for dg-require-stack-check.

   gcc/testsuite/
   * lib/target-supports-dg.exp (dg-require-stack-check): New.
   * lib/target-supports.exp (check_stack_check_available): New.
   * g++.dg/other/i386-9.C: Add dg-require-stack-check.
   * gcc.c-torture/compile/stack-check-1.c: Likewise.
   * gcc.dg/graphite/run-id-pr47653.c: Likewise.
   * gcc.dg/pr47443.c: Likewise.
   * gcc.dg/pr48134.c: Likewise.
   * gcc.dg/pr70017.c: Likewise.
   * gcc.target/aarch64/stack-checking.c: Likewise.
   * gcc.target/arm/stack-checking.c: Likewise.
   * gcc.target/i386/pr48723.c: Likewise.
   * gcc.target/i386/pr55672.c: Likewise.
   * gcc.target/i386/pr67265-2.c: Likewise.
   * gcc.target/i386/pr67265.c: Likewise.
   * gnat.dg/opt49.adb: Likewise.
   * gnat.dg/stack_check1.adb: Likewise.
   * gnat.dg/stack_check2.adb: Likewise.
   * gnat.dg/stack_check3.adb: Likewise.
>>> OK for the trunk.  Thanks for doing this!
>>>
>>
>> I've committed this as r250013.
>>
>> Since then, I've noticed that pr48134 randomly fails.
>>
>> According to gcc.log, this seems related the order wrt pr47443.
>> pr48134 uses -fstack-check=specific, while pr47443 uses 
>> -fstack-check=generic.
>>
>> When pr47443 appears before pr48134 in gcc.log, the latter fails,
>> otherwise it is unsupported.
>>
>> Looking at gcc.log, it seems that dg-require-stack-check is not always 
>> called.
>> Is there some caching in dejagnu I'm not aware of, that would ignore
>> the value of the
>> parameter (assuming that dg-require-stack-check "specific" and
>> dg-require-stack-check "generic" return the same value?)
>>
>> Am I missing anything obvious?
>>
>
> It turns out I was... check_no_compiler_messages actually caches the
> results using the testcase name, so using "stack_check" was insufficient.
>
> The attached patch uses "stack_check_$stack_kind" instead, to make it
> unique per fstack-check option.
>

I went ahead and committed it as r250149.

Christophe

> OK?
>
> Thanks,
>
> Christophe
>
>> Thanks,
>>
>> Christophe
>>
>>
>>> Jeff


Re: [PATCH] PR libstdc++/80316 make promise::set_value throw no_state error

2017-07-12 Thread Christophe Lyon
On 11 July 2017 at 14:39, Jonathan Wakely  wrote:
> On 11/07/17 12:53 +0100, Jonathan Wakely wrote:
>>
>> On 21/04/17 15:54 +0100, Jonathan Wakely wrote:
>>>
>>> On 4 April 2017 at 20:44, Jonathan Wakely wrote:

 We got a bug report from a customer pointing out that calling
 promise::set_value on a moved-from promise crashes instead of throwing
 an exception with error code future_errc::no_state.

 This fixes it, by moving the _S_check calls to *before* we deference
 the pointer that the calls check!

 This passes all tests, including the more comprehensive ones I've
 added as part of this commit, but I think it can wait for stage 1
 anyway. We've been shipping this bug for a couple of releases already.

PR libstdc++/80316
* include/std/future (_State_baseV2::_Setter::operator()): Remove
_S_check calls that are done after the pointer to the shared
 state
 is
already dereferenced.
(_State_baseV2::_Setter<_Res, void>): Define specialization for
 void
as partial specialization so it can be defined within the
 definition
of _State_baseV2.
(_State_baseV2::__setter): Call _S_check.
(_State_baseV2::__setter(promise*)): Add overload for use
 by
promise::set_value and
 promise::set_value_at_thread_exit.
(promise, promise, promise): Make _State a friend.
(_State_baseV2::_Setter): Remove explicit
 specialization.
(promise::set_value,
 promise::set_value_at_thread_exit):
Use new __setter overload.
* testsuite/30_threads/promise/members/at_thread_exit2.cc: New
 test.
* testsuite/30_threads/promise/members/set_exception.cc: Test
promise and promise specializations.
* testsuite/30_threads/promise/members/set_exception2.cc:
 Likewise.
Test for no_state error condition.
* testsuite/30_threads/promise/members/set_value2.cc: Likewise.
>>>
>>>
>>>
>>> This is now committed to trunk.
>>
>>
>> And now also to gcc-7-branch.
>
>
> And gcc-6-branch.
>

Hi Jonathan,

The new test fails to compile on arm when forcing -march=armv5t:
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test01()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:31:
error: aggregate 'std::promise p1' has incomplete type and cannot
be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:44:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:52:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:64:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test02()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:75:
error: aggregate 'std::promise p1' has incomplete type and
cannot be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:89:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:97:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:110:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:
In function 'void test03()':^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:121:
error: aggregate 'std::promise p1' has incomplete type and
cannot be defined^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:134:
error: 'make_exception_ptr' is not a member of 'std'^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:142:
error: variable 'std::promise p2' has initializer but incomplete
type^M
/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libstdc++-v3/testsuite/30_threads/promise/members/at_thread_exit2.cc:153:
error: 'make_exception_ptr' is not a member of 'std'^M
compiler exited with status 1

I can see this on target arm-none-linux-gnueabihf --with-mode arm
--with-cpu cortex-a9 --with-fpu vfp and setting -march=armv5t in
runtestflags.

It also looks like you forgot to add a ChangeLog