Re: [PATCH] x86 interrupt attribute patch [1/2]

2016-04-28 Thread Yulia Koval
Thanks,
Here is the patch. Is it ok?

Update TARGET_FUNCTION_INCOMING_ARG documentation

On x86, interrupt handlers are only called by processors which push
interrupt data onto stack at the address where the normal return address
is.  Since interrupt handlers must access interrupt data via pointers so
that they can update interrupt data, the pointer argument is passed as
"argument pointer - word".

TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
Normally it returns REG, NULL, or CONST_INT.  This patch adds arbitrary
address computation based on hard register, which can be forced into a
register, to the list.

When copying an incoming argument onto stack, assign_parm_setup_stack
has:

if (argument in memory)
  copy argument in memory to stack
else
  move argument to stack

Since an arbitrary address computation may be passed as an argument, we
change it to:

if (argument in memory)
  copy argument in memory to stack
else
  {
    if (argument isn't in register)
      force argument into a register
    move argument to stack
  }

* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.
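
As a rough illustration of the control flow described above, the following standalone C++ sketch models the decision with a hypothetical `ArgLocation` enum and records the steps as strings. None of these names exist in GCC; the real code in assign_parm_setup_stack operates on RTL, so this only mirrors the shape of the pseudocode.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical classification of how the callee sees an incoming argument.
enum class ArgLocation { Memory, Register, AddressComputation };

// Returns the steps taken to place the argument in its stack slot,
// following the revised pseudocode: anything that is neither in memory
// nor already in a register is first forced into a register.
std::vector<std::string> setup_stack_steps(ArgLocation loc) {
  std::vector<std::string> steps;
  if (loc == ArgLocation::Memory) {
    steps.push_back("copy memory to stack");
  } else {
    if (loc != ArgLocation::Register)
      steps.push_back("force into register");
    steps.push_back("move register to stack");
  }
  return steps;
}
```

The only new path is the `AddressComputation` case, which gains one extra step compared with a plain register argument.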

On Thu, Apr 28, 2016 at 10:32 PM, H.J. Lu  wrote:
> On Thu, Apr 28, 2016 at 11:22 AM, Yulia Koval  wrote:
>> Thank you,
>> Here is the repost.
>>
>> Update TARGET_FUNCTION_INCOMING_ARG documentation
>>
>> On x86, interrupt handlers are only called by processors which push
>> interrupt data onto stack at the address where the normal return address
>> is.  Since interrupt handlers must access interrupt data via pointers so
>> that they can update interrupt data, the pointer argument is passed as
>> "argument pointer - word".
>>
>> TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
>> Normally it returns REG, NULL, or CONST_INT.  This patch adds arbitrary
>> address computation based on hard register, which can be forced into a
>> register, to the list.
>>
>> When copying an incoming argument onto stack, assign_parm_setup_stack
>> has:
>>
>> if (argument in memory)
>>   copy argument in memory to stack
>> else
>>   move argument to stack
>>
>> Since an arbitrary address computation may be passed as an argument, we
>> change it to:
>>
>> if (argument in memory)
>>   copy argument in memory to stack
>> else
>>   {
>>     if (argument isn't in register)
>>       force argument into a register
>>     move argument to stack
>>   }
>>
>> * function.c (assign_parm_setup_stack): Force source into a
>> register if needed.
>> * target.def (function_incoming_arg): Update documentation to
>> allow arbitrary address computation based on hard register.
>> * doc/tm.texi: Regenerated.
>>
>>
>> Br,
>> Yulia
>>
>
> You also need to update
>
> DEFHOOK
> (function_incoming_arg,
>  "Define this hook if the target machine has ``register windows'', so\n\
> that the register in which a function sees an arguments is not\n\
> necessarily the same as the one in which the caller passed the\n\
> argument.\n\
> \n\
> .
>
> in target.def.
>
> --
> H.J.




Re: [PATCH, rs6000] Add support for vector element-reversal built-ins

2016-04-28 Thread Segher Boessenkool
On Wed, Apr 27, 2016 at 01:30:54PM -0500, Bill Schmidt wrote:
> 2016-04-27  Bill Schmidt  
> 
>   * config/rs6000/altivec.h: Change definitions of vec_xl and
>   vec_xst.
>   * config/rs6000/rs6000-builtin.def (LD_ELEMREV_V2DF): New.
>   (LD_ELEMREV_V2DI): New.
>   (LD_ELEMREV_V4SF): New.
>   (LD_ELEMREV_V4SI): New.
>   (LD_ELEMREV_V8HI): New.
>   (LD_ELEMREV_V16QI): New.
>   (ST_ELEMREV_V2DF): New.
>   (ST_ELEMREV_V2DI): New.
>   (ST_ELEMREV_V4SF): New.
>   (ST_ELEMREV_V4SI): New.
>   (ST_ELEMREV_V8HI): New.
>   (ST_ELEMREV_V16QI): New.
>   (XL): New.
>   (XST): New.
>   * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
>   descriptions for VSX_BUILTIN_VEC_XL and VSX_BUILTIN_VEC_XST.
>   * config/rs6000/rs6000.c (rs6000_builtin_mask_calculate): Map from
>   TARGET_P9_VECTOR to RS6000_BTM_P9_VECTOR.
>   (altivec_expand_builtin): Add handling for
>   VSX_BUILTIN_ST_ELEMREV_ and VSX_BUILTIN_LD_ELEMREV_.
>   (rs6000_invalid_builtin): Add error-checking for
>   RS6000_BTM_P9_VECTOR.
>   (altivec_init_builtins): Define builtins used to implement vec_xl
>   and vec_xst.
>   (rs6000_builtin_mask_names): Define power9-vector.
>   * config/rs6000/rs6000.h (MASK_P9_VECTOR): Define.
>   (RS6000_BTM_P9_VECTOR): Define.
>   (RS6000_BTM_COMMON): Include RS6000_BTM_P9_VECTOR.
>   * config/rs6000/vsx.md (vsx_ld_elemrev_v2di): New define_insn.
>   (vsx_ld_elemrev_v2df): Likewise.
>   (vsx_ld_elemrev_v4sf): Likewise.
>   (vsx_ld_elemrev_v4si): Likewise.
>   (vsx_ld_elemrev_v8hi): Likewise.
>   (vsx_ld_elemrev_v16qi): Likewise.
>   (vsx_st_elemrev_v2df): Likewise.
>   (vsx_st_elemrev_v2di): Likewise.
>   (vsx_st_elemrev_v4sf): Likewise.
>   (vsx_st_elemrev_v4si): Likewise.
>   (vsx_st_elemrev_v8hi): Likewise.
>   (vsx_st_elemrev_v16qi): Likewise.
>   * doc/extend.texi: Add prototypes for vec_xl and vec_xst.  Correct
>   grammar.
> 
> [gcc/testsuite]
> 
> 2016-04-27  Bill Schmidt  
> 
>   * gcc.target/powerpc/vsx-elemrev-1.c: New.
>   * gcc.target/powerpc/vsx-elemrev-2.c: New.
>   * gcc.target/powerpc/vsx-elemrev-3.c: New.
>   * gcc.target/powerpc/vsx-elemrev-4.c: New.

This is okay for trunk.  One nit:

>  #define RS6000_BTM_P8_VECTOR MASK_P8_VECTOR  /* ISA 2.07 vector.  */
> +#define RS6000_BTM_P9_VECTOR MASK_P9_VECTOR  /* ISA 3.00 vector.  */

> +If the ISA 3.00 additions to the vector/scalar (power9-vector)

It's called "ISA 3.0".

Thanks,


Segher


[PATCH, rs6000] Backport some swap optimization improvements

2016-04-28 Thread Bill Schmidt
Hi,

The lack of certain swap optimizations added in GCC 6 has shown up as a
performance issue in some customer code, where the customer is unable to
move off of GCC 5.  To accommodate this, I would like to backport these
changes to GCC 5.  They have all been burned in on trunk for many
months.  The same code has also been provided in
branches/ibm/gcc-5-branch since early this year, used to build code in
Ubuntu 16.04 and included in the latest AT9.0 releases.  I feel that it
is therefore pretty solid at this point.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Is this ok for GCC 5.4?

Thanks,
Bill


[gcc]

2016-04-28  Bill Schmidt  

PR target/69868 + swap optimization backports
* config/rs6000/rs6000.c (swap_web_entry): Enlarge
special_handling bitfield.
(special_handling_values): Add SH_XXPERMDI, SH_CONCAT, SH_VPERM,
and SH_VPERM_COMP.
(const_load_sequence_p): New.
(load_comp_mask_p): New.
(v2df_reduction_p): New.
(rtx_is_swappable_p): Perform special handling for XXPERMDI and
for reductions.
(insn_is_swappable_p): Perform special handling for VEC_CONCAT,
V2DF reductions, and various permutes.
(adjust_xxpermdi): New.
(adjust_concat): New.
(find_swapped_load_and_const_vector): New.
(replace_const_vector_in_load): New.
(adjust_vperm): New.
(adjust_vperm_comp): New.
(handle_special_swappables): Call adjust_xxpermdi, adjust_concat,
adjust_vperm, and adjust_vperm_comp.
(replace_swap_with_copy): Allow vector NOT operations to also be
replaced by copies.
(dump_swap_insn_table): Handle new special handling values.

[gcc/testsuite]

2016-04-28  Bill Schmidt  

PR target/69868 + swap optimization backports
* gcc.target/powerpc/swaps-p8-20.c: New.
* gcc.target/powerpc/swaps-p8-22.c: New.
* gcc.target/powerpc/swaps-p8-23.c: New.
* gcc.target/powerpc/swaps-p8-24.c: New.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 235582)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -34134,10 +34134,8 @@ emit_fusion_gpr_load (rtx target, rtx mem)
throughout the computation, we can get correct behavior by replacing
M with M' as follows:
 
-{ M[i+8]+8 : i < 8, M[i+8] in [0,7] U [16,23]
-M'[i] = { M[i+8]-8 : i < 8, M[i+8] in [8,15] U [24,31]
-{ M[i-8]+8 : i >= 8, M[i-8] in [0,7] U [16,23]
-{ M[i-8]-8 : i >= 8, M[i-8] in [8,15] U [24,31]
+M'[i] = { (M[i]+8)%16      : M[i] in [0,15]
+        { ((M[i]+8)%16)+16 : M[i] in [16,31]
 
This seems promising at first, since we are just replacing one mask
with another.  But certain masks are preferable to others.  If M
@@ -34155,8 +34153,12 @@ emit_fusion_gpr_load (rtx target, rtx mem)
mask to be produced by an UNSPEC_LVSL, in which case the mask 
cannot be known at compile time.  In such a case we would have to
generate several instructions to compute M' as above at run time,
-   and a cost model is needed again.  */
+   and a cost model is needed again.
 
+   However, when the mask M for an UNSPEC_VPERM is loaded from the
+   constant pool, we can replace M with M' as above at no cost
+   beyond adding a constant pool entry.  */
+
 /* This is based on the union-find logic in web.c.  web_entry_base is
defined in df.h.  */
 class swap_web_entry : public web_entry_base
@@ -34191,7 +34193,7 @@ class swap_web_entry : public web_entry_base
   /* A nonzero value indicates what kind of special handling for this
  insn is required if doublewords are swapped.  Undefined if
  is_swappable is not set.  */
-  unsigned int special_handling : 3;
+  unsigned int special_handling : 4;
   /* Set if the web represented by this entry cannot be optimized.  */
   unsigned int web_not_optimizable : 1;
   /* Set if this insn should be deleted.  */
@@ -34205,7 +34207,11 @@ enum special_handling_values {
   SH_NOSWAP_LD,
   SH_NOSWAP_ST,
   SH_EXTRACT,
-  SH_SPLAT
+  SH_SPLAT,
+  SH_XXPERMDI,
+  SH_CONCAT,
+  SH_VPERM,
+  SH_VPERM_COMP
 };
 
 /* Union INSN with all insns containing definitions that reach USE.
@@ -34340,6 +34346,164 @@ insn_is_swap_p (rtx insn)
   return 1;
 }
 
+/* Return TRUE if insn is a swap fed by a load from the constant pool.  */
+static bool
+const_load_sequence_p (swap_web_entry *insn_entry, rtx insn)
+{
+  unsigned uid = INSN_UID (insn);
+  if (!insn_entry[uid].is_swap || insn_entry[uid].is_load)
+return false;
+
+  /* Find the unique use in the swap and locate its def.  If the def
+ isn't unique, punt.  */
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref use;
+  FOR_EACH_INSN_INFO_USE (use, insn_info)
+{
+  struct df_link *def_link = DF_REF_CHAIN (use);
+  
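
The simplified mask rewrite M' given in the comment near the top of this patch can be sketched in standalone C++ as follows. This is only an illustration of the formula, assuming a 16-entry mask of byte selectors in [0,31]; the function names are hypothetical and are not the ones used in rs6000.c.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Applies the formula to one selector: rotate by 8 within its own
// 16-byte half, so [0,15] stays in [0,15] and [16,31] stays in [16,31].
std::uint8_t swap_adjust_selector(std::uint8_t s) {
  assert(s < 32);
  std::uint8_t rotated = static_cast<std::uint8_t>((s + 8) % 16);
  return s < 16 ? rotated : static_cast<std::uint8_t>(rotated + 16);
}

// Builds M' from M element by element.
std::array<std::uint8_t, 16> swap_adjust_mask(
    const std::array<std::uint8_t, 16>& m) {
  std::array<std::uint8_t, 16> out{};
  for (std::size_t i = 0; i < m.size(); ++i)
    out[i] = swap_adjust_selector(m[i]);
  return out;
}
```

Because each M'[i] depends only on M[i], the new mask can be computed at compile time whenever M is a known constant, which is the case the added comment describes for constant-pool loads.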

[patch] Remove trailing whitespaces from libstdc++-v3/config

2016-04-28 Thread Chris Gregory
I had to break this patch up into multiple edits.

Here is the one for ``libstdc++-v3/config''.

-- 
Chris Gregory

Index: libstdc++-v3/config/abi/compatibility.h
===
--- libstdc++-v3/config/abi/compatibility.h	(revision 235619)
+++ libstdc++-v3/config/abi/compatibility.h	(working copy)
@@ -28,7 +28,7 @@
  */
 
 // Switch for symbol version macro.
-#ifndef _GLIBCXX_APPLY_SYMVER 
+#ifndef _GLIBCXX_APPLY_SYMVER
 #error must define _GLIBCXX_APPLY_SYMVER before including __FILE__
 #endif
 
@@ -36,7 +36,7 @@
 _ZNSt19istreambuf_iteratorIcSt11char_traitsIcEEppEv
 _ZNSt19istreambuf_iteratorIwSt11char_traitsIwEEppEv
  */
-namespace 
+namespace
 {
 _GLIBCXX_APPLY_SYMVER(_ZNSt21istreambuf_iteratorXXIcSt11char_traitsIcEEppEv,
 		  _ZNSt19istreambuf_iteratorIcSt11char_traitsIcEEppEv)
@@ -76,7 +76,7 @@ _ZNSt13basic_istreamIwSt11char_traitsIwEE6ignoreEv
 _ZNSt11char_traitsIcE2eqERKcS2_
 _ZNSt11char_traitsIwE2eqERKwS2_
  */
-namespace 
+namespace
 {
 _GLIBCXX_APPLY_SYMVER(_ZNSt11char_traitsIcE4eqXXERKcS2_,
 		  _ZNSt11char_traitsIcE2eqERKcS2_)
Index: libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
===
--- libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver	(revision 235619)
+++ libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver	(working copy)
@@ -77,7 +77,7 @@ GLIBCXX_7.0 {
 # locale
 _ZNSt3__79has_facetINS_*;
 
-# hash 
+# hash
 _ZNSt8__detail3__712__prime_listE;
 _ZNSt3tr18__detail3__712__prime_listE;
 
@@ -110,7 +110,7 @@ GLIBCXX_7.0 {
 _ZN9__gnu_cxx3__76__poolILb[01]EE10_M_destroyEv;
 _ZN9__gnu_cxx3__76__poolILb1EE16_M_get_thread_idEv;
 
-_ZN9__gnu_cxx3__717__pool_alloc_base9_M_refillE[jmy];
+_ZN9__gnu_cxx3__717__pool_alloc_base9_M_refillE[jmy];
 _ZN9__gnu_cxx3__717__pool_alloc_base16_M_get_free_listE[jmy];
 _ZN9__gnu_cxx3__717__pool_alloc_base12_M_get_mutexEv;
 
Index: libstdc++-v3/config/abi/pre/none.ver
===
--- libstdc++-v3/config/abi/pre/none.ver	(revision 235619)
+++ libstdc++-v3/config/abi/pre/none.ver	(working copy)
@@ -1,6 +1,6 @@
-# 
+#
 # This is a placeholder file.  It does nothing and is not used.
-# 
+#
 # If you are seeing this file as your linker script (named
 # libstdc++-symbols.ver), then either 1) the configuration process
 # determined that symbol versioning should not be done, or 2) you
Index: libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h
===
--- libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h	(revision 235619)
+++ libstdc++-v3/config/cpu/arm/cxxabi_tweaks.h	(working copy)
@@ -33,7 +33,7 @@
 #ifdef __cplusplus
 namespace __cxxabiv1
 {
-  extern "C" 
+  extern "C"
   {
 #endif
 
@@ -49,7 +49,7 @@ namespace __cxxabiv1
 
   // We also want the element size in array cookies.
 #define _GLIBCXX_ELTSIZE_IN_COOKIE 1
-  
+
   // __cxa_vec_ctor should return a pointer to the array.
   typedef void * __cxa_vec_ctor_return_type;
 #define _GLIBCXX_CXA_VEC_CTOR_RETURN(x) return x
@@ -79,4 +79,4 @@ namespace __cxxabiv1
 } // namespace __cxxabiv1
 #endif
 
-#endif 
+#endif
Index: libstdc++-v3/config/cpu/cris/atomic_word.h
===
--- libstdc++-v3/config/cpu/cris/atomic_word.h	(revision 235619)
+++ libstdc++-v3/config/cpu/cris/atomic_word.h	(working copy)
@@ -28,4 +28,4 @@
 // This entity must not cross a page boundary.
 typedef int _Atomic_word __attribute__ ((__aligned__ (4)));
 
-#endif 
+#endif
Index: libstdc++-v3/config/cpu/generic/atomic_word.h
===
--- libstdc++-v3/config/cpu/generic/atomic_word.h	(revision 235619)
+++ libstdc++-v3/config/cpu/generic/atomic_word.h	(working copy)
@@ -37,4 +37,4 @@ typedef int _Atomic_word;
 // This is a memory order release fence.
 #define _GLIBCXX_WRITE_MEM_BARRIER __atomic_thread_fence (__ATOMIC_RELEASE)
 
-#endif 
+#endif
Index: libstdc++-v3/config/cpu/generic/atomicity_builtins/atomicity.h
===
--- libstdc++-v3/config/cpu/generic/atomicity_builtins/atomicity.h	(revision 235619)
+++ libstdc++-v3/config/cpu/generic/atomicity_builtins/atomicity.h	(working copy)
@@ -30,7 +30,7 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-  _Atomic_word 
+  _Atomic_word
   __attribute__ ((__unused__))
   __exchange_and_add(volatile _Atomic_word* __mem, int __val) throw ()
   { return __atomic_fetch_add(__mem, __val, __ATOMIC_ACQ_REL); }
Index: libstdc++-v3/config/cpu/generic/atomicity_mutex/atomicity.h
===
--- libstdc++-v3/config/cpu/generic/atomicity_mutex/atomicity.h	(revision 235619)
+++ libstdc++-v3/config/cpu/generic/atomicity_mutex/atomicity.h	(working copy)
@@ -25,7 +25,7 

Re: [C++ Patch] PR 66644

2016-04-28 Thread Paolo Carlini

Hi Jason,

On 28/04/2016 23:45, Jason Merrill wrote:
> I would expect this to cause a false negative on a union of two
> anonymous structs, both of which have initialized members.
>
> I think better would be to have a local any_default_members rather
> than passing the same pointer through all levels.
>
> Also, you can look at 'type' rather than DECL_CONTEXT (field).

.
... and thanks a lot for your very helpful reply. Now I realize that 
something was wrong in the general logic here, not a tiny detail in a 
conditional. Thus the below tries to closely follow your advice: the 
idea is that any_default_members as required by check_field_decls should 
still compute to the same value but the logic to emit the "multiple 
fields in union initialized" diagnostic in check_field_decl is 
different: it relies on the incoming any_default_members, not on what is 
computed and returned at that level. Thus we can reject, e.g., your case 
of two anonymous structs. Also, for test5, which we already correctly 
rejected, we do not emit an additional redundant error, and that's more 
evidence that something bigger was wrong in check_field_decl. Anyway, 
the below passes testing. How does it look?
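
A minimal standalone sketch of the logic described above, using a hypothetical `Field` type in place of FIELD_DECL trees. The caller loop in `count_errors` is an assumption (the corresponding hunk of check_field_decls is cut off in the patch as posted): it is assumed to OR each return value into any_default_members.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a FIELD_DECL: either a plain member that may
// carry a default member initializer, or an anonymous aggregate.
struct Field {
  bool has_initializer = false;
  std::vector<Field> anon_members;  // non-empty => anonymous struct/union
};

// Mirrors the reworked check_field_decl: the diagnostic depends only on the
// incoming flag, while the return value reports whether this field, or any
// member of an anonymous aggregate, has an initializer.
bool check_field(const Field& f, bool in_union, bool any_default_members,
                 int& errors) {
  bool found = false;
  for (const Field& m : f.anon_members)
    found |= check_field(m, in_union, any_default_members, errors);
  if (f.has_initializer) {
    if (in_union && any_default_members)
      ++errors;  // "multiple fields in union %qT initialized"
    found = true;
  }
  return found;
}

// Assumed caller behaviour: accumulate the flag across the class's fields.
int count_errors(const std::vector<Field>& fields, bool is_union) {
  int errors = 0;
  bool any_default_members = false;
  for (const Field& f : fields)
    any_default_members |=
        check_field(f, is_union, any_default_members, errors);
  return errors;
}
```

Under these assumptions, a union of two anonymous structs that each contain an initialized member is diagnosed once, when the second struct is visited.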


Thanks,
Paolo.

/
Index: cp/class.c
===
--- cp/class.c  (revision 235615)
+++ cp/class.c  (working copy)
@@ -139,7 +139,7 @@ static int count_fields (tree);
 static int add_fields_to_record_type (tree, struct sorted_fields_type*, int);
 static void insert_into_classtype_sorted_fields (tree, tree, int);
 static bool check_bitfield_decl (tree);
-static void check_field_decl (tree, tree, int *, int *, int *);
+static bool check_field_decl (tree, tree, int *, int *, bool);
 static void check_field_decls (tree, tree *, int *, int *);
 static tree *build_base_field (record_layout_info, tree, splay_tree, tree *);
 static void build_base_fields (record_layout_info, splay_tree, tree *);
@@ -3541,14 +3541,15 @@ check_bitfield_decl (tree field)
enclosing type T.  Issue any appropriate messages and set appropriate
flags.  */
 
-static void
+static bool
 check_field_decl (tree field,
  tree t,
  int* cant_have_const_ctor,
  int* no_const_asn_ref,
- int* any_default_members)
+ bool any_default_members)
 {
   tree type = strip_array_types (TREE_TYPE (field));
+  bool any_default_members_field = false;
 
   /* In C++98 an anonymous union cannot contain any fields which would change
  the settings of CANT_HAVE_CONST_CTOR and friends.  */
@@ -3558,12 +3559,13 @@ check_field_decl (tree field,
  structs.  So, we recurse through their fields here.  */
   else if (ANON_AGGR_TYPE_P (type))
 {
-  tree fields;
-
-  for (fields = TYPE_FIELDS (type); fields; fields = DECL_CHAIN (fields))
+  for (tree fields = TYPE_FIELDS (type); fields;
+  fields = DECL_CHAIN (fields))
if (TREE_CODE (fields) == FIELD_DECL && !DECL_C_BIT_FIELD (field))
- check_field_decl (fields, t, cant_have_const_ctor,
-   no_const_asn_ref, any_default_members);
+ any_default_members_field |= check_field_decl (fields, t,
+cant_have_const_ctor,
+no_const_asn_ref,
+any_default_members);
 }
   /* Check members with class type for constructors, destructors,
  etc.  */
@@ -3623,10 +3625,12 @@ check_field_decl (tree field,
 {
   /* `build_class_init_list' does not recognize
 non-FIELD_DECLs.  */
-  if (TREE_CODE (t) == UNION_TYPE && *any_default_members != 0)
+  if (TREE_CODE (t) == UNION_TYPE && any_default_members)
error ("multiple fields in union %qT initialized", t);
-  *any_default_members = 1;
+  any_default_members_field = true;
 }
+
+  return any_default_members_field;
 }
 
 /* Check the data members (both static and non-static), class-scoped
@@ -3662,7 +3666,7 @@ check_field_decls (tree t, tree *access_decls,
   tree *field;
   tree *next;
   bool has_pointers;
-  int any_default_members;
+  bool any_default_members;
   int cant_pack = 0;
   int field_access = -1;
 
@@ -3672,7 +3676,7 @@ check_field_decls (tree t, tree *access_decls,
   has_pointers = false;
   /* Assume none of the members of this class have default
  initializations.  */
-  any_default_members = 0;
+  any_default_members = false;
 
   for (field = &TYPE_FIELDS (t); *field; field = next)
 {
@@ -3868,10 +3872,10 @@ check_field_decls (tree t, tree *access_decls,
   /* We set DECL_C_BIT_FIELD in grokbitfield.
 If the type and width are valid, we'll also set DECL_BIT_FIELD.  */
   if (! DECL_C_BIT_FIELD (x) || ! check_bitfield_decl (x))
-   check_field_decl (x, t,
- cant_have_const_ctor_p,
- 

Re: [PATCH] Improve detection of constant conditions during jump threading

2016-04-28 Thread Patrick Palka
On Thu, Apr 28, 2016 at 12:52 PM, Jeff Law  wrote:
> On 04/20/2016 03:02 AM, Richard Biener wrote:
>>
>> On Tue, Apr 19, 2016 at 7:50 PM, Patrick Palka 
>> wrote:
>>>
>>> This patch makes the jump threader look through the BIT_AND_EXPRs and
>>> BIT_IOR_EXPRs within a condition so that we could find dominating
>>> ASSERT_EXPRs that could help make the overall condition evaluate to a
>>> constant.  For example, we currently don't perform any jump threading in
>>> the following test case even though it's known that if the code calls
>>> foo() then it can't possibly call bar() afterwards:
>
> I'd always envisioned we'd do more simplifications than we're doing now and
> this fits well within how I expected to exploit ASSERT_EXPRs and DOM's
> available expressions/const/copies tables.
>
> However, I do have some long term direction plans that may make how we do
> this change a bit.  In the mean time I don't see a reason not to go forward
> with your change.
>
>
>
>
>>>
>>> void
>>> baz_1 (int a, int b, int c)
>>> {
>>>   if (a && b)
>>>     foo ();
>>>   if (!b && c)
>>>     bar ();
>>> }
>>>
>>>   <bb 2>:
>>>   _4 = a_3(D) != 0;
>>>   _6 = b_5(D) != 0;
>>>   _7 = _4 & _6;
>>>   if (_7 != 0)
>>>     goto <bb 3>;
>>>   else
>>>     goto <bb 4>;
>>>
>>>   <bb 3>:
>>>   b_15 = ASSERT_EXPR <b_5(D), b_5(D) != 0>;
>>>   foo ();
>>>
>>>   <bb 4>:
>>>   _10 = b_5(D) == 0;
>>>   _12 = c_11(D) != 0;
>>>   _13 = _10 & _12;
>>>   if (_13 != 0)
>>>     goto <bb 5>;
>>>   else
>>>     goto <bb 6>;
>>>
>>>   <bb 5>:
>>>   bar ();
>>>
>>>   <bb 6>:
>>>   return;
>>>
>>> So we here miss a jump threading opportunity that would have made bb 3
>>> jump
>>> straight to bb 6 instead of falling through to bb 4.
>>>
>>> If we inspect the operands of the BIT_AND_EXPR of _13 we'll notice that
>>> there is an ASSERT_EXPR that says its left operand b_5 is non-zero.  We
>>> could use this ASSERT_EXPR to deduce that the condition (_13 != 0) is
>>> always false.  This is what this patch does, basically by making
>>> simplify_control_stmt_condition recurse into BIT_AND_EXPRs and
>>> BIT_IOR_EXPRs.
>>>
>>> Does this seem like a good idea/approach?
>
> So the other approach I've been pondering for a while is backwards
> substitution.
>
> So given _13 != 0, we expand that to
>
> (_10 & _12) != 0
>
> Which further expands into
> ((b_5 == 0) & (c_11 != 0)) != 0
>
> And we follow b_5 back to the ASSERT_EXPR which allows us to start
> simplifying terms.
>
>
> The glitch in that plan is there is no easy linkage between the use of b_5
> in bb4 and the ASSERT_EXPR in bb3.  That's something Aldy, Andrew and myself
> are looking at independently for some of Aldy's work.
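
The deduction discussed above (an asserted fact about one operand deciding a whole BIT_AND_EXPR or BIT_IOR_EXPR) can be sketched with three-valued logic. This is a standalone illustration with hypothetical names, not the threader's actual data structures.

```cpp
#include <cassert>

// Three-valued truth: a condition is known false, known true, or unknown.
enum class Tri { False, True, Unknown };

// AND is decided as soon as either operand is known false.
Tri tri_and(Tri a, Tri b) {
  if (a == Tri::False || b == Tri::False) return Tri::False;
  if (a == Tri::True && b == Tri::True) return Tri::True;
  return Tri::Unknown;
}

// OR is decided as soon as either operand is known true.
Tri tri_or(Tri a, Tri b) {
  if (a == Tri::True || b == Tri::True) return Tri::True;
  if (a == Tri::False && b == Tri::False) return Tri::False;
  return Tri::Unknown;
}

// Value of "x == 0" given what is known about "x != 0".
Tri is_zero(Tri x_nonzero) {
  if (x_nonzero == Tri::True) return Tri::False;
  if (x_nonzero == Tri::False) return Tri::True;
  return Tri::Unknown;
}
```

With b_5 known to be non-zero on the path through bb 3, `tri_and(is_zero(Tri::True), Tri::Unknown)` evaluates to `Tri::False`, which corresponds to (_13 != 0) being false regardless of c_11.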

I see.. One other deficiency I noticed in the existing threading code
is that there may have been multiple ASSERT_EXPRs registered for b_5,
so bb3 could look like

:
b_15 = ASSERT_EXPR ;
b_16 = ASSERT_EXPR ;
foo ();

but currently we don't consider the 2nd ASSERT_EXPR because we only
look at the immediate uses of b_5.  This oversight makes us fail to
thread

void bar (void);
void baz (void);

void
foo (int a)
{
  if (a != 5 && a != 10)
    bar ();
  if (a == 10)
    baz ();
}

>
> But that's all future work...  Back to your patch...
>
>>>
>>> Notes:
>>>
>>> 1. This patch introduces a "regression" in
>>> gcc.dg/tree-ssa/ssa-thread-11.c
>>> in that we no longer perform FSM threading during vrp2 but instead we
>>> detect two new jump threading opportunities during vrp1.  Not sure if
>>> the new code is better but it is shorter.  I wonder how this should be
>>> resolved...
>>
>>
>> Try adjusting the testcase so that it performs the FSM threading again
>> or adjust the expected outcome...
>
> Right.  We just need to look closely at the before/after dumps, make a
> decision about whether the result is better or worse.  If it's better, then
> we adjust the output to the new better result (and I would claim that the
> same threading, but done earlier is better).
>
> Shorter isn't generally a good indicator of whether or not something is
> better.  The thing to look at is the number of conditionals executed on the
> various paths through the CFG.
>
> In this specific instance, there's a good chance your analysis is catching
> something earlier and allowing it to be better simplified.  But let's do the
> analysis to make sure.

From what I can tell, the patch does cause fewer conditionals to get
executed in general.  I spot two extra jumps that are threaded in the
final code compared to without the patch.  I wouldn't trust my
analysis though!

By the way, the test case ssa-thread-11.c is somewhat buggy since its
two functions lack return statements.  Also I would expect abort() to
have the noreturn attribute.

>
>>
>>> 2. I haven't tested the performance impact of this patch.  What would be
>>> a good way to do this?
>
> I have ways to do that.  Jump threading inherently is about reducing the
> 

Re: [PATCH] c++/66561 - __builtin_LINE at al. should yield constant expressions

2016-04-28 Thread Martin Sebor

+/* Fold __builtin_FILE() to a constant string.  */

NIT: When we refer to functions, we don't have the trailing().  So drop
it from the comment.


+
+/* Fold __builtin_FUNCTION() to a constant string.  */

Similarly.


+
+/* Fold __builtin_LINE() to an integer constant.  */

Similarly.


Okay, I'll fix it before committing once the patch is approved (or
in the next revision of the patch, whichever comes first).



So the big question I have is why we have to treat these differently
than __func__ __FUNCTION__ and __PRETTY_FUNCTION__.

Which leads me to:

constexpr_fn_retval which has:

case DECL_EXPR:
   {
 tree decl = DECL_EXPR_DECL (body);
 if (TREE_CODE (decl) == USING_DECL
 /* Accept __func__, __FUNCTION__, and __PRETTY_FUNCTION__.  */
 || DECL_ARTIFICIAL (decl))
   return NULL_TREE;
 return error_mark_node;
   }

Is the distinction here that the __builtin_XXX are essentially functions,
yet the others are _DECLs?  Thus we have to fold the builtin function so
that it is considered a constant?


Yes, the built-ins are functions that return a pointer and the three
symbols above are objects, basically static local arrays represented
as "artificial" VAR_DECLs, and they need to be folded to be usable
in constant expressions (in both C++ and C).

IIUC, the piece of code quoted above is a workaround put in place
for the C++ 11 restriction that constexpr functions cannot declare
local variables, but users expect to be able to refer to __func__
and related (bug 55425).  The code path only seems to be exercised
in C++ 11 mode, not later.  There's also the problem of C++ not
allowing static local variables in constexpr functions, and I think
there's another workaround for that somewhere.
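
As a small illustration of what folding buys, the sketch below uses the built-ins as default arguments of constexpr functions. It assumes a compiler in which these built-ins are accepted in constant expressions (which is what this patch is about); the helper names are hypothetical.

```cpp
#include <cassert>

// Default arguments are evaluated at the call site, so these report the
// caller's line number and enclosing function name.
constexpr int current_line(int line = __builtin_LINE()) { return line; }

constexpr const char* current_function(
    const char* fn = __builtin_FUNCTION()) {
  return fn;
}

// This only compiles if the built-in is folded to an integer constant.
static_assert(current_line() > 0, "__builtin_LINE folds to a constant");
```

The `static_assert` is the interesting part: without folding, the call would not be a constant expression and the declaration would be rejected.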

Martin


Re: [PATCH], Fix _Complex when there are multiple FP types the same size

2016-04-28 Thread Michael Meissner
On Thu, Apr 28, 2016 at 05:10:07PM -0500, Segher Boessenkool wrote:
> On Thu, Apr 28, 2016 at 05:06:14PM -0400, Michael Meissner wrote:
> Hi Mike,
> 
> > * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Add
> > support for __float128 complex datatypes.
> > (rs6000_hard_regno_mode_ok): Likewise.
> > (rs6000_setup_reg_addr_masks): Likewise.
> > (rs6000_complex_function_value): Likewise.
> > 
> > * config/rs6000/rs6000.h (FLOAT128_IEEE_P): Add support for
> > __float128 and __ibm128 complex.
> > (FLOAT128_IBM_P): Likewise.
> > (ALTIVEC_COMPLEX): Likewise.
> > (ALTIVEC_ARG_MAX_RETURN): Likewise.
> > 
> > * doc/extend.texi (Additional Floating Types): Document that
> > -mfloat128 must be used to enable __float128.  Document complex
> > __float128 and __ibm128 support.
> > 
> > [gcc/testsuite]
> > 2016-04-28  Michael Meissner  
> > 
> > * gcc.target/powerpc/float128-complex-1.c: New test for complex
> > __float128.
> 
> It is more normal to not have blank lines between files in the changelog.

Ok.

> >  #define FLOAT128_IEEE_P(MODE)                                         \
> > -  (((MODE) == TFmode && TARGET_IEEEQUAD)                              \
> > -   || ((MODE) == KFmode))
> > +  ((TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode))        \
> > +   || ((MODE) == KFmode) || ((MODE) == KCmode))
> >  
> >  #define FLOAT128_IBM_P(MODE)                                          \
> > -  (((MODE) == TFmode && !TARGET_IEEEQUAD)                             \
> > -   || ((MODE) == IFmode))
> > +  ((!TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode))       \
> > +   || ((MODE) == IFmode) || ((MODE) == ICmode))
> 
> Maybe these should be inline functions now?

No, they can't be without moving them somewhere else. The problem is that if
you declare them as functions, machine_mode must be defined, and it isn't in a
lot of cases, such as compiling the gen* functions. Even if machine_mode might
eventually be defined, rs6000.h can be included before machmode.h is defined.
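
For reference, the predicates in the two macros above can be written as functions once a mode type is visible. The sketch below uses a hypothetical `Mode` enum and passes TARGET_IEEEQUAD as a parameter; it only restates the macro logic and is not a proposal for where such functions could live.

```cpp
#include <cassert>

// Hypothetical stand-in for the machine modes involved; the real
// machine_mode type is not visible everywhere rs6000.h is included.
enum class Mode { TFmode, TCmode, KFmode, KCmode, IFmode, ICmode, DFmode };

// Same condition as the patched FLOAT128_IEEE_P.
constexpr bool float128_ieee_p(Mode m, bool target_ieeequad) {
  return (target_ieeequad && (m == Mode::TFmode || m == Mode::TCmode))
         || m == Mode::KFmode || m == Mode::KCmode;
}

// Same condition as the patched FLOAT128_IBM_P.
constexpr bool float128_ibm_p(Mode m, bool target_ieeequad) {
  return (!target_ieeequad && (m == Mode::TFmode || m == Mode::TCmode))
         || m == Mode::IFmode || m == Mode::ICmode;
}
```

The complex modes simply follow their scalar counterparts: TCmode tracks TFmode, KCmode tracks KFmode, and ICmode tracks IFmode.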

> > +/* _Complex __float128 needs two registers.  */
> > +#define ALTIVEC_COMPLEX ((TARGET_FLOAT128) ? 1 : 0)

I wrote it that way to make ALTIVEC_ARG_MAX_RETURN fit in 79 columns. If you
want, I can probably rework ALTIVEC_ARG_MAX_RETURN.

> 
> This name is pretty confusing, way too generic too.  It is only used
> right below, you can just code it there?
> 
> >  /* Return registers */
> >  #define GP_ARG_RETURN GP_ARG_MIN_REG
> >  #define FP_ARG_RETURN FP_ARG_MIN_REG
> >  #define ALTIVEC_ARG_RETURN (FIRST_ALTIVEC_REGNO + 2)
> >  #define FP_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2 ? FP_ARG_RETURN \
> >     : (FP_ARG_RETURN + AGGR_ARG_NUM_REG - 1))
> > -#define ALTIVEC_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2 ? ALTIVEC_ARG_RETURN \
> > +#define ALTIVEC_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2 \
> > +   ? (ALTIVEC_ARG_RETURN + ALTIVEC_COMPLEX) \
> >     : (ALTIVEC_ARG_RETURN + AGGR_ARG_NUM_REG - 1))
> 
> So you are changing the ABI here (for non-v2).  Are there any complications
> to that, is it compatible in all cases?

I don't think so, all it is saying is we return complex float128 in 2 altivec
registers. Given you could not access complex float128 before, I don't think it
is an issue. Note, presently you must use -mfloat128 to enable it at all.

> 
> > -#define PRINT_OPERAND_PUNCT_VALID_P(CODE)  ((CODE) == '&')
> > +#define PRINT_OPERAND_PUNCT_VALID_P(CODE)  ((CODE) == '&' || (CODE) == '@')
> 
> I think you mixed this in from another change?

Yes, this is from another change.

I will issue another patch tomorrow.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



[PATCH] Fix PR tree-optimization/51513

2016-04-28 Thread Peter Bergner
This patch fixes PR tree-optimization/51513, namely the generation of
wild branches due to switch case statements that only contain calls to
__builtin_unreachable().  For example, when compiling the following with
-O2 -fjump-tables --param case-values-threshold=1 (to easily expose the bug):

  switch (which)
    {
    case 0:
      return v0;
    case 1:
      return v1;
    case 2:
      return v2;
    default:
      __builtin_unreachable ();
    }

we currently generate for powerpc64le-linux:

        cmpldi 7,3,2
        bgt 7,.L2       <- Invalid branch
        ...
.L3:
        mr 3,4
        blr
        .p2align 4,,15
.L2:                    <- Invalid branch target
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .size   bug,.-bug

...and for x86_64-linux:

bug:
.LFB0:
        .cfi_startproc
        cmpq    $2, %rdi
        ja      .L2     <- Invalid branch
        jmp     *.L4(,%rdi,8)
        ...
.L3:
        movq    %rsi, %rax
        ret
        .p2align 4,,10
        .p2align 3
.L2:                    <- Invalid branch target
        .cfi_endproc
.LFE0:
        .size   bug, .-bug

The bug is that we end up deleting the unreachable block(s) from the CFG,
but we never remove the label(s) for the block(s) in the switch jump table.
We fix this by removing the case labels and their associated edges for
unreachable blocks.  Normal CFG cleanup removes the unreachable blocks.
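
The label-vector compaction step can be sketched in standalone C++ as below. It models the case labels as a `std::vector` of pointers in which removed labels have been set to null; the function name and container are illustrative assumptions, not the GIMPLE API used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compacts the vector in place, dropping null entries while keeping the
// order of the survivors, then shrinks it to new_size.  new_size must not
// exceed the number of non-null entries.
void compress_labels(std::vector<const char*>& labels, std::size_t new_size) {
  assert(new_size <= labels.size());
  std::size_t j = 0;
  for (std::size_t i = 0; i < new_size; ++i) {
    while (labels[j] == nullptr)
      ++j;  // skip labels that were removed
    labels[i] = labels[j++];
  }
  labels.resize(new_size);
}
```

Since the read index never falls behind the write index, the compaction needs no temporary storage.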

This has passed bootstrap and regtesting on powerpc64le-linux and x86_64-linux
with no regressions.  Ok for trunk?

Peter


gcc/
PR tree-optimization/51513
* tree-cfg.c (gimple_unreachable_bb_p): New function.
(assert_unreachable_fallthru_edge_p): Use it.
(compress_case_label_vector): New function.
(group_case_labels_stmt): Use it.
(cleanup_dead_labels): Call gimple_unreachable_bb_p() and
compress_case_label_vector().  Remove labels and edges leading
to unreachable blocks.

gcc/testsuite/
PR tree-optimization/51513
* gcc.c-torture/execute/pr51513.c: New test.


Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 235531)
+++ gcc/tree-cfg.c  (working copy)
@@ -408,6 +408,33 @@ computed_goto_p (gimple *t)
  && TREE_CODE (gimple_goto_dest (t)) != LABEL_DECL);
 }
 
+/* Returns true if the basic block BB has no successors and only contains
+   a call to __builtin_unreachable ().  */
+
+static bool
+gimple_unreachable_bb_p (basic_block bb)
+{
+  gimple_stmt_iterator gsi;
+  gimple *stmt;
+
+  if (EDGE_COUNT (bb->succs) != 0)
+return false;
+
+  gsi = gsi_after_labels (bb);
+  if (gsi_end_p (gsi))
+return false;
+
+  stmt = gsi_stmt (gsi);
+  while (is_gimple_debug (stmt) || gimple_clobber_p (stmt))
+{
+  gsi_next (&gsi);
+  if (gsi_end_p (gsi))
+   return false;
+  stmt = gsi_stmt (gsi);
+}
+  return gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE);
+}
+
 /* Returns true for edge E where e->src ends with a GIMPLE_COND and
the other edge points to a bb with just __builtin_unreachable ().
I.e. return true for C->M edge in:
@@ -431,23 +458,7 @@ assert_unreachable_fallthru_edge_p (edge
   basic_block other_bb = EDGE_SUCC (pred_bb, 0)->dest;
   if (other_bb == e->dest)
other_bb = EDGE_SUCC (pred_bb, 1)->dest;
-  if (EDGE_COUNT (other_bb->succs) == 0)
-   {
- gimple_stmt_iterator gsi = gsi_after_labels (other_bb);
- gimple *stmt;
-
- if (gsi_end_p (gsi))
-   return false;
- stmt = gsi_stmt (gsi);
- while (is_gimple_debug (stmt) || gimple_clobber_p (stmt))
-   {
- gsi_next (&gsi);
- if (gsi_end_p (gsi))
-   return false;
- stmt = gsi_stmt (gsi);
-   }
- return gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE);
-   }
+  return gimple_unreachable_bb_p (other_bb);
 }
   return false;
 }
@@ -1401,6 +1412,26 @@ cleanup_dead_labels_eh (void)
   }
 }
 
+/* Compress the case labels in the label vector to remove NULL labels,
+   and adjust the length of the vector.  */
+
+void
+compress_case_label_vector (gswitch *stmt, size_t new_size, size_t old_size)
+{
+  size_t i, j;
+
+  gcc_assert (new_size <= old_size);
+
+  for (i = 0, j = 0; i < new_size; i++)
+{
+  while (!gimple_switch_label (stmt, j))
+   j++;
+  gimple_switch_set_label (stmt, i,
+  gimple_switch_label (stmt, j++));
+}
+
+  gimple_switch_set_num_labels (stmt, new_size);
+}
 
 /* Cleanup redundant labels.  This is a three-step process:
  1) Find the leading label for each block.
@@ -1485,17 +1516,81 @@ cleanup_dead_labels (void)
case GIMPLE_SWITCH:
  {
gswitch *switch_stmt = as_a <gswitch *> (stmt);
-   size_t i, n = gimple_switch_num_labels (switch_stmt);
-
-   /* Replace all destination labels.  */
-   

Re: [ping][patch] update handling of 'acc parallel loop' reductions for PR70626

2016-04-28 Thread Cesar Philippidis
Ping.

Cesar

On 04/15/2016 02:30 PM, Cesar Philippidis wrote:
> This patch makes the c, c++ and fortran FEs duplicate the reduction
> clauses in a combined 'acc parallel loop' directive when it splits that
> directive into separate parallel and loop directives. So given something
> like
> 
>   #pragma acc parallel loop reduction(+:var)
>   for (i = 0; i < 10; i++)
> var++;
> 
> all of the front ends will split the original directive into
> 
>   #pragma acc parallel reduction(+:var)
>   #pragma acc loop reduction(+:var)
>   for (i = 0; i < 10; i++)
> var++;
> 
> Later on, the gimplifier will add an implicit copy(var) clause so that
> the original variable gets updated.
> 
> There are still a couple of loose ends with this implementation. For
> instance, consider
> 
>   #pragma acc parallel loop reduction(+:var) vector num_gangs(10)
>   for (i = 0; i < 10; i++)
> var++;
> 
> This will get expanded into
> 
>   #pragma acc parallel reduction(+:var) num_gangs(10)
>   #pragma acc loop vector reduction(+:var)
>   for (i = 0; i < 10; i++)
> var++;
> 
> And because OpenACC permits gang to operate in a redundant mode, the
> output of this loop can be either 10 or 100, depending on whether the compiler
> wants to assign gangs to the acc loop (which is permissible because
> vector clause implies independence). I'm still working out these details
> with the acc technical committee, but there is a consensus that the
> reduction clause needs to be duplicated in both the parallel and loop
> directives.  Depending on who I talk to, some people consider the
> num_gangs example as invalid while others think that the compiler should
> add an implicit gang clause.
> 
> By the way, I had to adjust template-reduction.C because of a parser bug
> that I found in PR70688. I didn't want to make this patch too big by
> including multiple bug fixes, so I modified that test case temporarily.
> Later on, once I fix PR70626, I will re-enable that test coverage.
> 
> Is this patch ok for gcc-6 and trunk? I bootstrapped and regression
> tested on x86_64-none-linux-gnu.
> 
> Cesar
> 
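
As a rough model of the clause splitting described in the quoted message, the
sketch below sorts the clauses of a combined directive into an outer
(parallel/kernels) list and a loop list.  The enum, the function name and the
array-based interface are hypothetical; GCC works on OMP_CLAUSE_* tree chains
instead.  Keeping the reduction on the loop only when is_parallel is false is
an assumption, since the message only states that the clause is duplicated for
'acc parallel loop'.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical clause kinds; GCC uses OMP_CLAUSE_* tree codes instead.  */
enum clause_kind
{
  CLAUSE_NUM_GANGS,
  CLAUSE_VECTOR,
  CLAUSE_REDUCTION
};

/* Split the N clauses of a combined directive.  Loop-only clauses go to
   LOOP_OUT, everything else to OUTER_OUT.  A reduction goes to LOOP_OUT
   and, when IS_PARALLEL is true, is also copied to OUTER_OUT.  Both
   output arrays must have room for N entries.  */
void
split_loop_clauses (const enum clause_kind *clauses, size_t n,
                    bool is_parallel,
                    enum clause_kind *outer_out, size_t *n_outer,
                    enum clause_kind *loop_out, size_t *n_loop)
{
  *n_outer = *n_loop = 0;

  for (size_t i = 0; i < n; i++)
    switch (clauses[i])
      {
      case CLAUSE_VECTOR:
        loop_out[(*n_loop)++] = clauses[i];
        break;

      case CLAUSE_REDUCTION:
        /* Reductions appear on both constructs of 'parallel loop'.  */
        if (is_parallel)
          outer_out[(*n_outer)++] = clauses[i];
        loop_out[(*n_loop)++] = clauses[i];
        break;

      default:
        outer_out[(*n_outer)++] = clauses[i];
        break;
      }
}
```

Feeding it the clauses of the num_gangs example (reduction, vector, num_gangs)
with is_parallel set yields reduction and num_gangs on the outer construct and
reduction and vector on the loop, matching the expansion shown in the message.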

2016-04-15  Cesar Philippidis  

	gcc/c-family/
	PR middle-end/70626
	* c-common.h (c_oacc_split_loop_clauses): Add boolean argument.
	* c-omp.c (c_oacc_split_loop_clauses): Use it to duplicate
	reduction clauses in acc parallel loops.

	gcc/c/
	PR middle-end/70626
	* c-parser.c (c_parser_oacc_loop): Don't augment mask with
	OACC_LOOP_CLAUSE_MASK.
	(c_parser_oacc_kernels_parallel): Update call to
	c_oacc_split_loop_clauses.

	gcc/cp/
	PR middle-end/70626
	* parser.c (cp_parser_oacc_loop): Don't augment mask with
	OACC_LOOP_CLAUSE_MASK.
	(cp_parser_oacc_kernels_parallel): Update call to
	c_oacc_split_loop_clauses.

	gcc/fortran/
	PR middle-end/70626
	* trans-openmp.c (gfc_trans_oacc_combined_directive): Duplicate
	the reduction clause in both parallel and loop directives.

	gcc/testsuite/
	PR middle-end/70626
	* c-c++-common/goacc/combined-reduction.c: New test.
	* gfortran.dg/goacc/reduction-2.f95: Add check for kernels reductions.

	libgomp/
	PR middle-end/70626
	* testsuite/libgomp.oacc-c++/template-reduction.C: Adjust test.
	* testsuite/libgomp.oacc-c-c++-common/combined-reduction.c: New test.
	* testsuite/libgomp.oacc-fortran/combined-reduction.f90: New test.


diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fa3746c..dd74d0d 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1276,7 +1276,7 @@ extern bool c_omp_check_loop_iv (tree, tree, walk_tree_lh);
 extern bool c_omp_check_loop_iv_exprs (location_t, tree, tree, tree, tree,
    walk_tree_lh);
 extern tree c_finish_oacc_wait (location_t, tree, tree);
-extern tree c_oacc_split_loop_clauses (tree, tree *);
+extern tree c_oacc_split_loop_clauses (tree, tree *, bool);
 extern void c_omp_split_clauses (location_t, enum tree_code, omp_clause_mask,
  tree, tree *);
 extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 5469d0d5..be401bb 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -861,9 +861,10 @@ c_omp_check_loop_iv_exprs (location_t stmt_loc, tree declv, tree decl,
#pragma acc parallel loop  */
 
 tree
-c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
+c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses,
+			   bool is_parallel)
 {
-  tree next, loop_clauses;
+  tree next, loop_clauses, nc;
 
   loop_clauses = *not_loop_clauses = NULL_TREE;
   for (; clauses ; clauses = next)
@@ -882,7 +883,23 @@ c_oacc_split_loop_clauses (tree clauses, tree *not_loop_clauses)
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_INDEPENDENT:
 	case OMP_CLAUSE_PRIVATE:
+	  OMP_CLAUSE_CHAIN (clauses) = loop_clauses;
+	  loop_clauses = clauses;
+	  break;
+
+	  /* Reductions must be duplicated on both constructs.  */
 	case OMP_CLAUSE_REDUCTION:
+	  if (is_parallel)
+	{
+	  nc = build_omp_clause (OMP_CLAUSE_LOCATION (clauses),
+

Re: [PATCH] c++/66561 - __builtin_LINE at al. should yield constant expressions

2016-04-28 Thread Jeff Law

On 04/26/2016 05:59 PM, Martin Sebor wrote:

PR c++/66639 asked to declare __func__, __FUNCTION__ and
__PRETTY_FUNCTION__ as constexpr.  With the request fulfilled
sometime in the 6.0 time frame (possibly as a result of fixing
c++/70353), the attached patch implements the corresponding
change suggested in c++/66561 - __builtin_LINE et al. should
yield constant expressions, and updates the manual to reflect
both.  It has been tested on x86_64 Linux.

Martin

gcc-66561.patch


PR c++/66561 - __builtin_LINE at al. should yield constant expressions
PR c++/66639 - declare __func__ , __FUNCTION__ & __PRETTY_FUNCTION__ constexpr

gcc/testsuite/ChangeLog:
2016-04-26  Martin Sebor  

PR c++/66561
* c-c++-common/builtin_location.c: New test.
* g++.dg/cpp1y/builtin_location.C: New test.

gcc/cp/ChangeLog:
2016-04-26  Martin Sebor  

PR c++/66561
* tree.c (builtin_valid_in_constant_expr_p): Treat BUILT_IN_FILE,
BUILT_IN_FUNCTION, and BUILT_IN_LINE as constant expressions.

gcc/ChangeLog:
2016-04-26  Martin Sebor  

PR c++/66561
* builtins.c (fold_builtin_FILE): New function.
(fold_builtin_FUNCTION, fold_builtin_LINE): New functions.
(fold_builtin_0): Call them.

PR c++/66561
* doc/extend.texi (Other Builtins): Update __builtin_FILE,
__builtin_FUNCTION, and __builtin_LINE to reflect they yield
constants.

PR c++/66639
* doc/extend.texi (Function Names as Strings): Update __func__,
__FUNCTION__, __PRETTY_FUNCTION__ to reflect they evaluate to
constants.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3d89baf..44c4f63 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7927,6 +7927,39 @@ fold_builtin_arith_overflow (location_t loc, enum built_in_function fcode,
   return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, store, ovfres);
 }

+/* Fold __builtin_FILE() to a constant string.  */
NIT: When we refer to functions, we don't add the trailing ().  So drop
it from the comment.



+
+/* Fold __builtin_FUNCTION() to a constant string.  */

Similarly.


+
+/* Fold __builtin_LINE() to an integer constant.  */

Similarly.

So the big question I have is why we have to treat these differently 
than __func__ __FUNCTION__ and __PRETTY_FUNCTION__.


Which leads me to:

constexpr_fn_retval which has:

   case DECL_EXPR:
  {
tree decl = DECL_EXPR_DECL (body);
if (TREE_CODE (decl) == USING_DECL
/* Accept __func__, __FUNCTION__, and __PRETTY_FUNCTION__.  */
|| DECL_ARTIFICIAL (decl))
  return NULL_TREE;
return error_mark_node;
  }

Is the distinction here that the __builtin_XXX are essentially functions, 
yet the others are _DECLs?  Thus we have to fold the builtin function so 
that it is considered a constant?
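
For reference, the two forms being compared look like this from the user's
side.  This is only a sketch, assuming GCC or a compatible compiler that
provides __builtin_FILE, __builtin_FUNCTION and __builtin_LINE; the wrapper
function names are made up for the illustration, and it does not settle how
the front end should treat either form.

```c
/* The builtins are written as calls with no arguments.  */
const char *
name_via_builtin (void)
{
  return __builtin_FUNCTION ();
}

/* __func__ is a predefined identifier that names an array object.  */
const char *
name_via_func (void)
{
  return __func__;
}

/* Return the source line on which the builtin call appears.  */
int
line_of_call (void)
{
  return __builtin_LINE ();
}

/* Return the name of the source file containing the builtin call.  */
const char *
file_of_call (void)
{
  return __builtin_FILE ();
}
```

Both name_via_builtin and name_via_func return the name of their own enclosing
function; the difference under discussion is only in how the compiler
represents the two spellings internally.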



jeff



Re: [PATCH] Re-use cc1-checksum.c for stage-final

2016-04-28 Thread Jeff Law

On 04/28/2016 02:49 AM, Richard Biener wrote:


The following prototype patch re-uses cc1-checksum.c from the
previous stage when compiling stage-final.  This eventually
allows comparing cc1 from the last two stages to fix the
lack of a true comparison when doing LTO bootstrap (it
compiles LTO bytecode from the compile-stage there, not the
final optimization result).

Bootstrapped on x86_64-unknown-linux-gnu.

When stripping gcc/cc1 and prev-gcc/cc1 after the bootstrap
they now compare identical (with LTO bootstrap it should
not require stripping as that doesn't do a bootstrap-debug AFAIK).

Is something like this acceptable?  (consider it also done for cp/Make-lang.in)

In theory we can compare all stage1 languages but I guess comparing
the required ones for a LTO bootstrap, cc1, cc1plus and lto1 would
be sufficient (or even just comparing one binary in which case
comparing lto1 would not require any patches).

This also gets rid of the annoying warning that cc1-checksum.o
differs (obviously).

Thanks,
Richard.

2016-04-28  Richard Biener  

c/
* Make-lang.in (cc1-checksum.c): For stage-final re-use
the checksum from the previous stage.
I won't object if you add a comment into the fragment indicating why 
you're doing this.


jeff



Re: [PATCH] Fix ICE during operand_equal_p hash checking (PR middle-end/70843)

2016-04-28 Thread Jeff Law

On 04/28/2016 09:50 AM, Jakub Jelinek wrote:

Hi!

As reported in the PR and can be seen on this simplified testcase
everywhere, the FEs sometimes call operand_equal_p e.g. on a SAVE_EXPR
that contains a BIND_EXPR in it, and if arg0 == arg1, operand_equal_p
can return non-zero on it.
The ICE is because inchash::add_expr is unprepared to hash some trees,
it handles just tcc_declaration, selected specific trees and expressions of
all kinds, the last one usually by just recursing on all their operands.
For BIND_EXPR, the last operand is usually a BLOCK which we ICE on though,
and the middle argument usually a STATEMENT_LIST that we ICE on as well.

The first hunk is just an optimization (but fixes the ICE anyway),
I think we really don't need to verify that a hash function for the same
argument always returns the same value.  But I can imagine e.g.
a SAVE_EXPR of BIND_EXPR + var and var + the same SAVE_EXPR being compared
using operand_equal_p and there we wouldn't be equal at the top level and
still ICE.
The second hunk alone fixes the ICE too, by making sure we handle those
(just ignoring BLOCK and OMP_CLAUSE; the latter for now, if we find we want
to hash pre-OMP expansion trees too often we could adjust).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-28  Jakub Jelinek  

PR middle-end/70843
* fold-const.c (operand_equal_p): Don't verify hash value equality
if arg0 == arg1.
* tree.c (inchash::add_expr): Handle STATEMENT_LIST.  Ignore BLOCK
and OMP_CLAUSE.

* gcc.dg/pr70843.c: New test.

OK.

Jeff



Re: [PATCH], Fix _Complex when there are multiple FP types the same size

2016-04-28 Thread Segher Boessenkool
On Thu, Apr 28, 2016 at 05:06:14PM -0400, Michael Meissner wrote:
Hi Mike,

>   * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Add
>   support for __float128 complex datatypes.
>   (rs6000_hard_regno_mode_ok): Likewise.
>   (rs6000_setup_reg_addr_masks): Likewise.
>   (rs6000_complex_function_value): Likewise.
> 
>   * config/rs6000/rs6000.h (FLOAT128_IEEE_P): Add support for
>   __float128 and __ibm128 complex.
>   (FLOAT128_IBM_P): Likewise.
>   (ALTIVEC_COMPLEX): Likewise.
>   (ALTIVEC_ARG_MAX_RETURN): Likewise.
> 
>   * doc/extend.texi (Additional Floating Types): Document that
>   -mfloat128 must be used to enable __float128.  Document complex
>   __float128 and __ibm128 support.
> 
> [gcc/testsuite]
> 2016-04-28  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-complex-1.c: New test for complex
>   __float128.

It is more normal to not have blank lines between files in the changelog.

>  #define FLOAT128_IEEE_P(MODE) \
> -  (((MODE) == TFmode && TARGET_IEEEQUAD) \
> -   || ((MODE) == KFmode))
> +  ((TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode)) \
> +   || ((MODE) == KFmode) || ((MODE) == KCmode))
>  
>  #define FLOAT128_IBM_P(MODE) \
> -  (((MODE) == TFmode && !TARGET_IEEEQUAD) \
> -   || ((MODE) == IFmode))
> +  ((!TARGET_IEEEQUAD && ((MODE) == TFmode || (MODE) == TCmode)) \
> +   || ((MODE) == IFmode) || ((MODE) == ICmode))

Maybe these should be inline functions now?
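
A sketch of what the inline-function form could look like, purely as an
illustration of the suggestion.  The enum and the ieeequad parameter are
hypothetical stand-ins for the real machine modes and TARGET_IEEEQUAD, so this
is not a drop-in replacement for the macros in rs6000.h.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the rs6000 machine modes named in the patch.  */
enum fp_mode { M_DF, M_TF, M_TC, M_KF, M_KC, M_IF, M_IC };

/* True if MODE is an IEEE 128-bit scalar or complex mode.  TF/TC count as
   IEEE only when IEEEQUAD (standing in for TARGET_IEEEQUAD) is set.  */
static inline bool
float128_ieee_p (enum fp_mode mode, bool ieeequad)
{
  return ((ieeequad && (mode == M_TF || mode == M_TC))
          || mode == M_KF || mode == M_KC);
}

/* True if MODE is an IBM extended double scalar or complex mode.  TF/TC
   count as IBM only when IEEEQUAD is clear.  */
static inline bool
float128_ibm_p (enum fp_mode mode, bool ieeequad)
{
  return ((!ieeequad && (mode == M_TF || mode == M_TC))
          || mode == M_IF || mode == M_IC);
}
```

Compared with the macros, the functions evaluate their argument once and get
type checking on the mode.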

> +/* _Complex __float128 needs two registers.  */
> +#define ALTIVEC_COMPLEX  ((TARGET_FLOAT128) ? 1 : 0)

This name is pretty confusing, way too generic too.  It is only used
right below, you can just code it there?

>  /* Return registers */
>  #define GP_ARG_RETURN GP_ARG_MIN_REG
>  #define FP_ARG_RETURN FP_ARG_MIN_REG
>  #define ALTIVEC_ARG_RETURN (FIRST_ALTIVEC_REGNO + 2)
>  #define FP_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2 ? FP_ARG_RETURN  \
>  : (FP_ARG_RETURN + AGGR_ARG_NUM_REG - 1))
> -#define ALTIVEC_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2 ? ALTIVEC_ARG_RETURN \
> +#define ALTIVEC_ARG_MAX_RETURN (DEFAULT_ABI != ABI_ELFv2  \
> + ? (ALTIVEC_ARG_RETURN + ALTIVEC_COMPLEX) \
>   : (ALTIVEC_ARG_RETURN + AGGR_ARG_NUM_REG - 1))

So you are changing the ABI here (for non-v2).  Are there any complications
to that, is it compatible in all cases?

> -#define PRINT_OPERAND_PUNCT_VALID_P(CODE)  ((CODE) == '&')
> +#define PRINT_OPERAND_PUNCT_VALID_P(CODE)  ((CODE) == '&' || (CODE) == '@')

I think you mixed this in from another change?


Segher


[PATCH] Fix wrong argument to build_modify_expr

2016-04-28 Thread Jeff Law


As Andrew pointed out a while ago, build_modify_expr's last argument is
supposed to be a tree type.  However in two cases within
fix_builtin_array_notation_fn we end up passing down a tree expression.


Thankfully in the contexts where this happens, the last argument isn't
actually used, so it hasn't caused problems.


This patch fixes the two call sites.  Andrew and I independently 
bootstrapped and regression tested this change on x86_64.


Installed on the trunk.

Jeff
commit bd4625a0c2cb387b7e05ab7785b1063a8a0a
Author: law 
Date:   Thu Apr 28 22:00:19 2016 +

2016-04-28  Andrew MacLeod  

* c-array-notation.c (fix_builtin_array_notation_fn): Fix final
argument to build_modify_expr in two cases.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@235614 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index a641721..5161f7d 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,8 @@
+2016-04-28  Andrew MacLeod  
+
+   * c-array-notation.c (fix_builtin_array_notation_fn): Fix final
+   argument to build_modify_expr in two cases.
+
 2016-04-27  Bernd Schmidt  
 
* c-parser.c (c_parser_postfix_expression_after_primary): Call
diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
index 716bd11..c7cf66a 100644
--- a/gcc/c/c-array-notation.c
+++ b/gcc/c/c-array-notation.c
@@ -489,7 +489,7 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree *new_var)
  new_yes_expr = build_modify_expr
(location, array_ind_value, TREE_TYPE (array_ind_value),
 NOP_EXPR,
-location, func_parm, TREE_OPERAND (array_op0, 1));
+location, func_parm, TREE_TYPE (TREE_OPERAND (array_op0, 1)));
}
   new_yes_list = alloc_stmt_list ();
   append_to_statement_list (new_yes_ind, &new_yes_list);
@@ -539,7 +539,7 @@ fix_builtin_array_notation_fn (tree an_builtin_fn, tree *new_var)
  new_yes_expr = build_modify_expr
(location, array_ind_value, TREE_TYPE (array_ind_value),
 NOP_EXPR,
-location, func_parm, TREE_OPERAND (array_op0, 1));
+location, func_parm, TREE_TYPE (TREE_OPERAND (array_op0, 1)));
}
   new_yes_list = alloc_stmt_list ();
   append_to_statement_list (new_yes_ind, &new_yes_list);


Re: not a type buglet in c/c-array-notation

2016-04-28 Thread Jeff Law

On 03/31/2016 07:00 AM, Andrew MacLeod wrote:

Another potential buglet  I stumbled across whilst testing the tree-type
work:
in c/c-array-notation.c::fix_builtin_array_notation_fn()
<...>
  if (list_size > 1)
{
  new_yes_ind = build_modify_expr
(location, *new_var, TREE_TYPE (*new_var), NOP_EXPR,
 location, an_loop_info[0].var, TREE_TYPE
(an_loop_info[0].var));
  new_yes_expr = build_modify_expr
(location, array_ind_value, TREE_TYPE (array_ind_value),
 NOP_EXPR,
 location, func_parm, TREE_TYPE ((*array_operand)[0]));
}
  else
{
  new_yes_ind = build_modify_expr
(location, *new_var, TREE_TYPE (*new_var), NOP_EXPR,
 location, TREE_OPERAND (array_op0, 1),
 TREE_TYPE (TREE_OPERAND (array_op0, 1)));
  new_yes_expr = build_modify_expr
(location, array_ind_value, TREE_TYPE (array_ind_value),
 NOP_EXPR,
 location, func_parm, TREE_OPERAND (array_op0, 1));
<<<
}

'build_modify_expr' expects a type in that final parameter position. It
triggered as showing a non-type being passed into a type parameter.

I think the last operand on the last line ought to be wrapped in
TREE_TYPE() just like it is in the first expression of the else.  Either
that, or someone who understands the code needs to figure out what's
really wanted there.  There is a second occurrence later in the same
routine.  Patch number 1 makes this change and bootstraps/passes
regression.

If someone that understands the code wants to have a look, the test can
be triggered in the cilk testsuite by adding an assert to
build_modify_expr (that's patch number 2 for convenience) and running
testcase:
make RUNTESTFLAGS=cilk-plus.exp=sec_reduce_ind_same_value.c check-gcc
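
The kind of check described above can be modelled outside GCC as follows.
This is only a sketch: struct node is a hypothetical miniature of a tree node,
and treating a null pointer as "no original type recorded" is an assumption
about how the real parameter may be used, not something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of a tree node: either a type, or an expression
   that points at its type.  */
struct node
{
  bool is_type;
  const struct node *type;   /* NULL for type nodes.  */
};

/* Counterpart of TREE_TYPE for the miniature node.  */
const struct node *
node_type (const struct node *n)
{
  return n->type;
}

/* Return true if ARG is acceptable where a type is expected: either no
   type at all (NULL) or a node that really is a type.  */
bool
valid_type_arg_p (const struct node *arg)
{
  return arg == NULL || arg->is_type;
}
```

Passing an expression node fails the check, while passing node_type of that
expression passes it, which is the same shape as wrapping the operand in
TREE_TYPE().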

 I still have a couple from last fall that were discussed.  I'm queuing
these up for once stage 1 opens and we can discuss them again.

Looking through build_modify_expr and how it uses that last operand, 
it's unlikely that the goof in fix_builtin_array_notation_fn would cause 
problems.  In fact, I'm pretty sure it'll never be used when called from 
either of those sites.  A combination of factors come into play and we 
certainly can't guarantee those factors won't change over time, so we 
ought to go ahead and fix.


I'm going to cons up a ChangeLog and commit your patch.

jeff


Re: [C++ Patch] PR 66644

2016-04-28 Thread Jason Merrill
I would expect this to cause a false negative on a union of two 
anonymous structs, both of which have initialized members.


I think better would be to have a local any_default_members rather than 
passing the same pointer through all levels.


Also, you can look at 'type' rather than DECL_CONTEXT (field).

Jason


[PATCH], Fix _Complex when there are multiple FP types the same size

2016-04-28 Thread Michael Meissner
In GCC 6.x, I was not able to get complex __float128 to work before the cut off
period for stage1 submissions. This patch enables complex support for PowerPC
__float128. Note, it does not address adding complex support in libgcc.

Note, similar to the x86_64, you cannot say:

typedef _Complex __float128 f128_complex_type;

f128_complex_type a, b, c;

instead you must use the alternate form with __attribute__:

typedef _Complex float __attribute__((mode(KC))) f128_complex_type;

The problem is that when layout_type creates the complex type, mode_for_size is
used to find the appropriate complex type, using a size of 2 times the base
type.  Unfortunately, on PowerPC we have two 128-bit floating point types,
and mode_for_size returns the wrong one.

I solved this by having genmodes.c build a table that, given a particular mode,
gives the mode that is the complex type for that mode, and having tree.c and
stor-layout.c use this type instead of using mode_for_size.
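
To make the ambiguity concrete, here is a small self-contained sketch
comparing a size-based search with a direct table lookup.  The mode names,
sizes and helper functions are hypothetical simplifications, and the
"first match wins" search is only an assumed stand-in for mode_for_size; the
real tables are generated by genmodes.c.

```c
/* Hypothetical subset of machine modes: two distinct 128-bit float modes
   (IBM extended double and IEEE quad) plus the complex counterparts.  */
enum mode { M_VOID, M_DF, M_IF, M_KF, M_DC, M_IC, M_KC, NUM_MODES };

/* Size of each mode in bits.  */
static const unsigned mode_bits[NUM_MODES]
  = { 0, 64, 128, 128, 128, 256, 256 };

/* Nonzero for complex modes.  */
static const unsigned char mode_is_complex[NUM_MODES]
  = { 0, 0, 0, 0, 1, 1, 1 };

/* Direct map from a component mode to its complex mode.  */
static const unsigned char mode_complex[NUM_MODES]
  = { M_VOID, M_DC, M_IC, M_KC, M_VOID, M_VOID, M_VOID };

/* Size-based lookup: return the first complex mode of the given size.
   With two 256-bit complex modes this cannot tell them apart.  */
enum mode
complex_mode_for_size (unsigned bits)
{
  for (int m = 0; m < NUM_MODES; m++)
    if (mode_is_complex[m] && mode_bits[m] == bits)
      return (enum mode) m;
  return M_VOID;
}

/* Table-based lookup: each component mode names its own complex mode.  */
enum mode
complex_mode_for (enum mode component)
{
  return (enum mode) mode_complex[component];
}
```

In this model, asking for a 256-bit complex mode on behalf of M_KF finds M_IC
first, whereas the table maps M_KF to M_KC and M_IF to M_IC.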

I did a bootstrap and regression test and it did not show any regressions.

Are these patches acceptable to check into the trunk? After they go into the
trunk, and if they don't destabilize the other ports, I would like to check
these changes into the GCC 6.x branch. Is this ok?

There are 3 attachments to this mail:

   1)   The first attachment (gcc-stage7.patch001b) are the machine independent
changes.

   2)   The second attachment (gcc-stage7.patch001c) are the changes to the
PowerPC compiler.

   3)   The third attachment (gcc-stage7.patch001d) is a new test to make sure
complex float128 continues to work.

I or somebody else will submit a patch later to add support for libgcc.

[gcc]
2016-04-28  Michael Meissner  

* machmode.h (mode_complex): Add support to give the complex mode
for a given mode.
(GET_MODE_COMPLEX_MODE): Likewise.

* stor-layout.c (layout_type): For COMPLEX_TYPE, use the mode
stored by build_complex_type instead of trying to figure out the
appropriate mode based on the size.

* genmodes.c (struct mode_data): Add field for the complex type of
the given type.
(blank_mode): Likewise.
(make_complex_modes): Remember the complex mode created in the
base type.
(emit_mode_complex): Write out the mode_complex array to map a
type mode to the complex version.
(emit_insn_modes_c): Likewise.

* tree.c (build_complex_type): Set the complex type to use before
calling layout_type.

* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Add
support for __float128 complex datatypes.
(rs6000_hard_regno_mode_ok): Likewise.
(rs6000_setup_reg_addr_masks): Likewise.
(rs6000_complex_function_value): Likewise.

* config/rs6000/rs6000.h (FLOAT128_IEEE_P): Add support for
__float128 and __ibm128 complex.
(FLOAT128_IBM_P): Likewise.
(ALTIVEC_COMPLEX): Likewise.
(ALTIVEC_ARG_MAX_RETURN): Likewise.

* doc/extend.texi (Additional Floating Types): Document that
-mfloat128 must be used to enable __float128.  Document complex
__float128 and __ibm128 support.

[gcc/testsuite]
2016-04-28  Michael Meissner  

* gcc.target/powerpc/float128-complex-1.c: New test for complex
__float128.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/machmode.h
===
--- gcc/machmode.h  (revision 235529)
+++ gcc/machmode.h  (working copy)
@@ -269,6 +269,10 @@ extern const unsigned char mode_wider[NU
 extern const unsigned char mode_2xwider[NUM_MACHINE_MODES];
 #define GET_MODE_2XWIDER_MODE(MODE) ((machine_mode) mode_2xwider[MODE])
 
+/* Get the complex mode from the component mode.  */
+extern const unsigned char mode_complex[NUM_MACHINE_MODES];
+#define GET_MODE_COMPLEX_MODE(MODE) ((machine_mode) mode_complex[MODE])
+
 /* Return the mode for data of a given size SIZE and mode class CLASS.
If LIMIT is nonzero, then don't use modes bigger than MAX_FIXED_MODE_SIZE.
The value is BLKmode if no other mode is found.  */
Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   (revision 235529)
+++ gcc/stor-layout.c   (working copy)
@@ -2146,11 +2146,19 @@ layout_type (tree type)
 
 case COMPLEX_TYPE:
   TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
-  SET_TYPE_MODE (type,
-mode_for_size (2 * TYPE_PRECISION (TREE_TYPE (type)),
-   (TREE_CODE (TREE_TYPE (type)) == REAL_TYPE
-? MODE_COMPLEX_FLOAT : MODE_COMPLEX_INT),
-0));
+
+  /* 

Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.

2016-04-28 Thread Joern Wolfgang Rennecke



On 28/04/16 21:31, Claudiu Zissulescu wrote:


Otherwise, I'd suggest using a traditional integer letter.  'J' is free.

Thanks for the suggestion, I will use 'J'.


Why do you remove half of the indentation?

Unwanted reformatting, sorry for this, I will revert it.

I have the feeling you are happy with my new patch. Is there anything 
to be added to it besides fixing the above issues?

No, otherwise it looks OK.


Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.

2016-04-28 Thread Claudiu Zissulescu


Otherwise, I'd suggest using a traditional integer letter.  'J' is free.

Thanks for the suggestion, I will use 'J'.


Why do you remove half of the indentation?

Unwanted reformatting, sorry for this, I will revert it.

I have the feeling you are happy with my new patch. Is there anything to 
be added to it besides fixing the above issues?


Thanks,
Claudiu


Re: Move "X +- C1 CMP C2 to X CMP C2 -+ C1" to match.pd

2016-04-28 Thread Marc Glisse

On Wed, 27 Apr 2016, Richard Biener wrote:


--- trunk4/gcc/fold-const.h (revision 235452)
+++ trunk4/gcc/fold-const.h (working copy)
@@ -13,20 +13,22 @@ WARRANTY; without even the implied warra
 FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 for more details.

 You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 .  */

 #ifndef GCC_FOLD_CONST_H
 #define GCC_FOLD_CONST_H

+#include 
+


I think the canonical way is to include options.h where you include
fold-const.h ...
(ick)

Doesn't the prototype serve as a forward declaration only and thus including
options.h from gimple-match-head.c is enough?


Doesn't look like it. If I remove this include, I get build failures for
a large part of the C front-end (through c-family/c-common.h) and
tree-ssa-scopedtables.c. Including options.h in those 2 files seems to
work (I didn't check if all the files in config/ that include
fold-const.h also indirectly include options.h). If you really think
that's better, I'll do it...

--
Marc Glisse


[C++ Patch] PR 66644

2016-04-28 Thread Paolo Carlini

Hi,

only when Jakub bumped some bugs in preparation for the release did I notice
that this one had remained assigned to me for far too long...


Roughly speaking, the problem is caused by the fact that when we have a 
GNU anonymous struct inside a union the fields are flattened out and 
appear to be just individual fields of the union. Thus we end up 
rejecting test1 below because multiple fields are initialized and we 
don't simply handle them later on as NSDMIs. A GNU anonymous struct is 
required for the issue to show up, thus the details of the way we want 
to handle such code are debatable, but we know that, eg, both EDG and 
clang accept test1 too (in relaxed GNU mode, at least), besides test2 
and test3.


Given the underlying cause of the rejection I could easily imagine other 
less straightforward ways to match the cases at issue in 
check_field_decl (eg, along the lines DECL_CONTEXT (field) == t ??), but 
the below certainly passes testing on x86_64-linux.


Thanks,
Paolo.


/cp
2016-04-28  Paolo Carlini  

PR c++/66644
* class.c (check_field_decl): Do not reject multiple initialized
fields in anonymous struct.

/testsuite
2016-04-28  Paolo Carlini  

PR c++/66644
* g++.dg/cpp0x/nsdmi-anon-struct1.C: New.
Index: cp/class.c
===
--- cp/class.c  (revision 235557)
+++ cp/class.c  (working copy)
@@ -3623,7 +3623,11 @@ check_field_decl (tree field,
 {
   /* `build_class_init_list' does not recognize
 non-FIELD_DECLs.  */
-  if (TREE_CODE (t) == UNION_TYPE && *any_default_members != 0)
+  if (TREE_CODE (t) == UNION_TYPE && *any_default_members != 0
+ /* As a GNU extension initializing in C++11 multiple fields
+of an anonymous struct living inside a union is fine.  */
+ && !(TREE_CODE (DECL_CONTEXT (field)) == RECORD_TYPE
+  && ANON_AGGR_TYPE_P (DECL_CONTEXT (field))))
error ("multiple fields in union %qT initialized", t);
   *any_default_members = 1;
 }
Index: testsuite/g++.dg/cpp0x/nsdmi-anon-struct1.C
===
--- testsuite/g++.dg/cpp0x/nsdmi-anon-struct1.C (revision 0)
+++ testsuite/g++.dg/cpp0x/nsdmi-anon-struct1.C (working copy)
@@ -0,0 +1,30 @@
+// PR c++/66644
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wno-pedantic" }
+
+struct test1  
+{
+  union
+  {
+struct { char a=0, b=0; };
+char buffer[16];
+  };
+};
+
+struct test2 
+{
+  union  
+  {
+struct { char a=0, b; };
+char buffer[16];
+  };
+};
+
+struct test3
+{
+  union
+  {
+struct { char a, b; } test2{0,0};
+char buffer[16];
+  };
+};


Go patch committed: Mark concurrent calls

2016-04-28 Thread Ian Lance Taylor
This patch by Chris Manghane marks concurrent calls in the Go
frontend.  This is a small change that prepares for future patches.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 235602)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-b17e404f5b8954e008b512741296d238ab7b2ef9
+50b2b468a85045c66d60112dc094c31ec4897123
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h (revision 235602)
+++ gcc/go/gofrontend/expressions.h (working copy)
@@ -1985,8 +1985,8 @@ class Call_expression : public Expressio
   fn_(fn), args_(args), type_(NULL), results_(NULL), call_(NULL),
   call_temp_(NULL), expected_result_count_(0), is_varargs_(is_varargs),
   varargs_are_lowered_(false), types_are_determined_(false),
-  is_deferred_(false), issued_error_(false), is_multi_value_arg_(false),
-  is_flattened_(false)
+  is_deferred_(false), is_concurrent_(false), issued_error_(false),
+  is_multi_value_arg_(false), is_flattened_(false)
   { }
 
   // The function to call.
@@ -2057,6 +2057,16 @@ class Call_expression : public Expressio
   set_is_deferred()
   { this->is_deferred_ = true; }
 
+  // Whether this call is concurrently executed.
+  bool
+  is_concurrent() const
+  { return this->is_concurrent_; }
+
+  // Note that the call is concurrently executed.
+  void
+  set_is_concurrent()
+  { this->is_concurrent_ = true; }
+
   // We have found an error with this call expression; return true if
   // we should report it.
   bool
@@ -2170,6 +2180,8 @@ class Call_expression : public Expressio
   bool types_are_determined_;
   // True if the call is an argument to a defer statement.
   bool is_deferred_;
+  // True if the call is an argument to a go statement.
+  bool is_concurrent_;
   // True if we reported an error about a mismatch between call
   // results and uses.  This is to avoid producing multiple errors
   // when there are multiple Call_result_expressions.
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc (revision 234304)
+++ gcc/go/gofrontend/statements.cc (working copy)
@@ -2532,7 +2532,9 @@ Thunk_statement::build_thunk(Gogo* gogo,
 
   gogo->flatten_block(function, b);
 
-  if (may_call_recover || recover_arg != NULL)
+  if (may_call_recover
+  || recover_arg != NULL
+  || this->classification() == STATEMENT_GO)
 {
   // Dig up the call expression, which may have been changed
   // during lowering.
@@ -2546,6 +2548,8 @@ Thunk_statement::build_thunk(Gogo* gogo,
{
  if (may_call_recover)
ce->set_is_deferred();
+ if (this->classification() == STATEMENT_GO)
+   ce->set_is_concurrent();
  if (recover_arg != NULL)
ce->set_recover_arg(recover_arg);
}


Re: [RFA][PATCH] Adding missing calls to bitmap_clear

2016-04-28 Thread Jeff Law

On 03/22/2016 03:37 AM, Richard Biener wrote:

On Mon, Mar 21, 2016 at 9:32 PM, Jeff Law  wrote:

On 03/21/2016 01:10 PM, Bernd Schmidt wrote:


On 03/21/2016 08:06 PM, Jeff Law wrote:



As noted last week, find_removable_extensions initializes several
bitmaps, but doesn't clear them.

This is not strictly a leak as the GC system should find dead data, but
it's better to go ahead and clear the bitmaps.  That releases the
elements back to the cache and presumably makes things easier for the GC
system as well.

Bootstrapped and regression tested on x86_64-linux-gnu.

OK for the trunk?



Looks like they don't leak anywhere, so ok. Probably ok even to install
it now but maybe stage1 would be better timing.


I don't mind waiting for the next stage1, this is a pretty minor issue.


It's ok at this stage as it will also fix -fmem-report.  Please also move
the thing back to heap, see below.

Btw we should disallow bitmap_initialize (, NULL) as it does not do
the same thing as BITMAP_ALLOC (NULL), it does the same thing
as BITMAP_ALLOC_GC ().  Thus I'd rather have a bitmap_initialize_gc ()
and a bitmap_initialize (, NULL) that ends up using the global
bitmap obstack.  No idea where REE came from history wise.

A grep shows only

ira.c:  bitmap_initialize (_insns, NULL);
ree.c:  bitmap_initialize (, NULL);
ree.c:  bitmap_initialize (, NULL);
ree.c:  bitmap_initialize (, NULL);
ree.c:  bitmap_initialize (, NULL);

It's more than that.  Sadly folks have passed in "0" instead of NULL in 
various places.


./haifa-sched.c:  bitmap_initialize (, 0);
./haifa-sched.c:  bitmap_initialize (, 0);
./haifa-sched.c:  bitmap_initialize (_ready, 0);
./sched-ebb.c:  bitmap_initialize (_calc_deps, 0);
./sched-rgn.c:  bitmap_initialize (_in_df, 0);
./testsuite/gcc.dg/pr45352.c:  bitmap_initialize_stat (0);
./ira.c:  bitmap_initialize (, 0);
./ira.c:  bitmap_initialize (, 0);
./ira.c:  bitmap_initialize (, 0);
./ira.c:  bitmap_initialize (, 0);
./ira.c:  bitmap_initialize (_as_input, 0);
./ira.c:  bitmap_initialize (local, 0);
./ira.c:  bitmap_initialize (transp, 0);
./ira.c:  bitmap_initialize (moveable, 0);
./ira.c:  bitmap_initialize (_new, 0);
./ira.c:  bitmap_initialize (, 0);
./sel-sched.c:  bitmap_initialize (forced_ebb_heads, 0);
./sched-deps.c:   bitmap_initialize (_dependency_cache[i], 0);
./sched-deps.c:   bitmap_initialize (_dependency_cache[i], 0);
./sched-deps.c:   bitmap_initialize (_dependency_cache[i], 0);
./sched-deps.c:   bitmap_initialize (_dependency_cache[i], 0);
./sched-deps.c:bitmap_initialize (_dependency_cache[i], 0);



btw, so please consider simply changing bitmap_initialize behavior.  The IRA
use also should use the global bitmap obstack as users around that use
use BITMAP_ALLOC (NULL).  [use a default arg for 'obstack' if possible,
you have to verify it works with/without --enable-gather-detailed-mem-stats]

The problem is ensuring that allocating off the default bitmap obstack 
is appropriate for all those uses.


I'm tempted to change them all to NULL.  Then iterate over them one by 
one to ensure we're routing to gc vs the default bitmap obstack as 
appropriate and that we're calling bitmap_clear as appropriate.


Once we've fixed all of 'em, we simply assert that bitmap_initialize is 
never passed NULL and avoid getting in this situation again in the future.
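
The distinction discussed above can be modelled roughly as follows.  This
is only a toy sketch, not GCC code: obstack_model, backing and the model_*
functions are made-up names standing in for the real bitmap API, and the
behaviour encoded here is just what the thread describes.

```cpp
#include <cassert>

struct obstack_model {};  // hypothetical stand-in for a bitmap obstack

enum class backing { gc, default_obstack, given_obstack };

// Models the behaviour described for BITMAP_ALLOC (obstack):
// a NULL obstack means the global default bitmap obstack.
inline backing model_bitmap_alloc(const obstack_model* o) {
  return o ? backing::given_obstack : backing::default_obstack;
}

// Models the behaviour described for today's bitmap_initialize (..., NULL):
// a NULL obstack means GC-allocated elements instead.
inline backing model_bitmap_initialize(const obstack_model* o) {
  return o ? backing::given_obstack : backing::gc;
}

// Models the proposed end state: NULL is rejected by an assertion.
inline backing model_bitmap_initialize_strict(const obstack_model* o) {
  assert(o != nullptr && "bitmap_initialize must be given an obstack");
  (void) o;
  return backing::given_obstack;
}
```

The mismatch between the first two functions for a NULL argument is the
trap; the third removes the ambiguity by refusing NULL altogether.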


Thoughts?
jeff




[PATCH] Fix PR target/69810 (GCC 7)

2016-04-28 Thread David Edelsohn
This PR was fixed earlier with a patch that was deemed safe for GCC 6
through the removal of splitters for zero extend and sign extend to
HImode.

Now that trunk has opened for GCC 7 development, the following patch
restores the splitters and fixes the bug in the more aggressive manner
originally proposed: disallow patterns for extension to HImode by
removing HImode from the iterator.  The PowerPC architecture does not
provide any instructions that directly operate on HImode, so it's
better for GCC to operate on it as SUBREG except for loads and stores,
as this patch accomplishes.

Bootstrapped on powerpc-ibm-aix7.1.0.0.

Thanks, David

PR target/69810
* config/rs6000/rs6000.md (EXTQI): Don't allow extension to HImode.
(zero_extendqi<mode>2_dot): Revert earlier conversion from
define_insn_and_split to define_insn.
(zero_extendqi<mode>2_dot2): Same.
(extendqi<mode>2_dot): Same.
(extendqi<mode>2_dot2): Same.

Index: rs6000.md
===
--- rs6000.md   (revision 235573)
+++ rs6000.md   (working copy)
@@ -322,7 +322,7 @@
 (define_mode_iterator INT1 [QI HI SI (DI "TARGET_POWERPC64")])

 ; Everything we can extend QImode to.
-(define_mode_iterator EXTQI [HI SI (DI "TARGET_POWERPC64")])
+(define_mode_iterator EXTQI [SI (DI "TARGET_POWERPC64")])

 ; Everything we can extend HImode to.
 (define_mode_iterator EXTHI [SI (DI "TARGET_POWERPC64")])
@@ -711,7 +711,7 @@
rlwinm %0,%1,0,0xff"
   [(set_attr "type" "load,shift")])

-(define_insn "*zero_extendqi<mode>2_dot"
+(define_insn_and_split "*zero_extendqi<mode>2_dot"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
(compare:CC (zero_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,
r"))
(const_int 0)))
@@ -719,12 +719,19 @@
   "rs6000_gen_cell_microcode"
   "@
andi. %0,%1,0xff
-   rlwinm %0,%1,0,0xff\;cmpwi %2,%0,0"
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+   (zero_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
   [(set_attr "type" "logical")
(set_attr "dot" "yes")
(set_attr "length" "4,8")])

-(define_insn "*zero_extendqi<mode>2_dot2"
+(define_insn_and_split "*zero_extendqi<mode>2_dot2"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
(compare:CC (zero_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,
r"))
(const_int 0)))
@@ -733,7 +740,14 @@
   "rs6000_gen_cell_microcode"
   "@
andi. %0,%1,0xff
-   rlwinm %0,%1,0,0xff\;cmpwi %2,%0,0"
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+   (zero_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
   [(set_attr "type" "logical")
(set_attr "dot" "yes")
(set_attr "length" "4,8")])
@@ -851,7 +865,7 @@
   "extsb %0,%1"
   [(set_attr "type" "exts")])

-(define_insn "*extendqi<mode>2_dot"
+(define_insn_and_split "*extendqi<mode>2_dot"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
(compare:CC (sign_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,
r"))
(const_int 0)))
@@ -859,12 +873,19 @@
   "rs6000_gen_cell_microcode"
   "@
extsb. %0,%1
-   extsb %0,%1\;cmpwi %2,%0,0"
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+   (sign_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
   [(set_attr "type" "exts")
(set_attr "dot" "yes")
(set_attr "length" "4,8")])

-(define_insn "*extendqi<mode>2_dot2"
+(define_insn_and_split "*extendqi<mode>2_dot2"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x,?y")
(compare:CC (sign_extend:EXTQI (match_operand:QI 1 "gpc_reg_operand" "r,
r"))
(const_int 0)))
@@ -873,7 +894,14 @@
   "rs6000_gen_cell_microcode"
   "@
extsb. %0,%1
-   extsb %0,%1\;cmpwi %2,%0,0"
+   #"
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[2], CCmode)"
+  [(set (match_dup 0)
+   (sign_extend:EXTQI (match_dup 1)))
+   (set (match_dup 2)
+   (compare:CC (match_dup 0)
+   (const_int 0)))]
+  ""
   [(set_attr "type" "exts")
(set_attr "dot" "yes")
(set_attr "length" "4,8")])


Re: [PATCH] x86 interrupt attribute patch [1/2]

2016-04-28 Thread H.J. Lu
On Thu, Apr 28, 2016 at 11:22 AM, Yulia Koval  wrote:
> Thank you,
> Here is the repost.
>
> Update TARGET_FUNCTION_INCOMING_ARG documentation
>
> On x86, interrupt handlers are only called by processors which push
> interrupt data onto stack at the address where the normal return address
> is.  Since interrupt handlers must access interrupt data via pointers so
> that they can update interrupt data, the pointer argument is passed as
> "argument pointer - word".
>
> TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
> Normally it returns REG, NULL, or CONST_INT.  This patch adds arbitrary
> address computation based on hard register, which can be forced into a
> register, to the list.
>
> When copying an incoming argument onto stack, assign_parm_setup_stack
> has:
>
> if (argument in memory)
>   copy argument in memory to stack
> else
>   move argument to stack
>
> Since an arbitrary address computation may be passed as an argument, we
> change it to:
>
> if (argument in memory)
>   copy argument in memory to stack
> else
>   {
> if (argument isn't in register)
>   force argument into a register
> move argument to stack
>   }
>
> * function.c (assign_parm_setup_stack): Force source into a
> register if needed.
> * target.def (function_incoming_arg): Update documentation to
> allow arbitrary address computation based on hard register.
> * doc/tm.texi: Regenerated.
>
>
> Br,
> Yulia
>
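
For readers following the quoted pseudocode, the revised control flow can
be sketched as a small stand-alone model.  This is not the code in
function.c; arg_kind and setup_stack_steps are hypothetical names used
only to make the three cases explicit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Where an incoming argument lives, in the terms of the description above.
enum class arg_kind { in_memory, in_register, address_computation };

// Returns the sequence of steps the revised logic would take for one argument.
std::vector<std::string> setup_stack_steps(arg_kind kind) {
  std::vector<std::string> steps;
  if (kind == arg_kind::in_memory) {
    steps.push_back("copy argument in memory to stack");
  } else {
    // New case: an address computation is neither memory nor a register,
    // so it is first forced into a register.
    if (kind != arg_kind::in_register)
      steps.push_back("force argument into a register");
    steps.push_back("move argument to stack");
  }
  return steps;
}
```

Only the address_computation case takes two steps; the other two cases
behave as before the change.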

You also need to update

DEFHOOK
(function_incoming_arg,
 "Define this hook if the target machine has ``register windows'', so\n\
that the register in which a function sees an arguments is not\n\
necessarily the same as the one in which the caller passed the\n\
argument.\n\
\n\
.

in target.def.

-- 
H.J.


Re: [PATCH] Disable some i?86 builtins for -m32 (PR target/70858)

2016-04-28 Thread Uros Bizjak
On Thu, Apr 28, 2016 at 9:14 PM, Jakub Jelinek  wrote:
> Hi!
>
> The PR reported one ICE caused by a builtin for __x86_64__ guarded
> intrinsics to be mistakenly available in -m32 too, I've looked for
> INT64 substrings in the various i386.c builtin tables and for each
> that has been missing OPTION_MASK_ISA_64BIT in the mask looked at
> whether the uses of the builtin in *intrin.h aren't guarded #ifdef __x86_64__
> and whether the corresponding insn isn't TARGET_64BIT only.
>
> This is the result of that effort, 7 builtins that each ICEs when used in 
> -m32.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?
>
> 2016-04-28  Jakub Jelinek  
>
> PR target/70858
> * config/i386/i386.c (bdesc_special_args): Add | OPTION_MASK_ISA_64BIT
> to __builtin_ia32_lwpval64 and __builtin_ia32_lwpins64.
> (bdesc_args): Add | OPTION_MASK_ISA_64BIT to __builtin_ia32_bextr_u64,
> __builtin_ia32_bextri_u64, __builtin_ia32_bzhi_di,
> __builtin_ia32_pdep_di and __builtin_ia32_pext_di.
>
> * gcc.target/i386/pr70858.c: New test.

OK everywhere.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2016-04-28 17:26:10.0 +0200
> +++ gcc/config/i386/i386.c  2016-04-28 18:39:06.506976486 +0200
> @@ -32996,9 +32996,9 @@ static const struct builtin_description
>{ OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcb, "__builtin_ia32_llwpcb", 
> IX86_BUILTIN_LLWPCB, UNKNOWN, (int) VOID_FTYPE_PVOID },
>{ OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcb, "__builtin_ia32_slwpcb", 
> IX86_BUILTIN_SLWPCB, UNKNOWN, (int) PVOID_FTYPE_VOID },
>{ OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalsi3, "__builtin_ia32_lwpval32", 
> IX86_BUILTIN_LWPVAL32, UNKNOWN, (int) VOID_FTYPE_UINT_UINT_UINT },
> -  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvaldi3, "__builtin_ia32_lwpval64", 
> IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT64_UINT_UINT },
> +  { OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, CODE_FOR_lwp_lwpvaldi3, 
> "__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) 
> VOID_FTYPE_UINT64_UINT_UINT },
>{ OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinssi3, "__builtin_ia32_lwpins32", 
> IX86_BUILTIN_LWPINS32, UNKNOWN, (int) UCHAR_FTYPE_UINT_UINT_UINT },
> -  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinsdi3, "__builtin_ia32_lwpins64", 
> IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT64_UINT_UINT },
> +  { OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, CODE_FOR_lwp_lwpinsdi3, 
> "__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) 
> UCHAR_FTYPE_UINT64_UINT_UINT },
>
>/* FSGSBASE */
>{ OPTION_MASK_ISA_FSGSBASE | OPTION_MASK_ISA_64BIT, CODE_FOR_rdfsbasesi, 
> "__builtin_ia32_rdfsbase32", IX86_BUILTIN_RDFSBASE32, UNKNOWN, (int) 
> UNSIGNED_FTYPE_VOID },
> @@ -33933,12 +33933,12 @@ static const struct builtin_description
>
>/* BMI */
>{ OPTION_MASK_ISA_BMI, CODE_FOR_bmi_bextr_si, "__builtin_ia32_bextr_u32", 
> IX86_BUILTIN_BEXTR32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
> -  { OPTION_MASK_ISA_BMI, CODE_FOR_bmi_bextr_di, "__builtin_ia32_bextr_u64", 
> IX86_BUILTIN_BEXTR64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
> +  { OPTION_MASK_ISA_BMI | OPTION_MASK_ISA_64BIT, CODE_FOR_bmi_bextr_di, 
> "__builtin_ia32_bextr_u64", IX86_BUILTIN_BEXTR64, UNKNOWN, (int) 
> UINT64_FTYPE_UINT64_UINT64 },
>{ OPTION_MASK_ISA_BMI, CODE_FOR_ctzhi2,   "__builtin_ctzs",   
> IX86_BUILTIN_CTZS,UNKNOWN, (int) UINT16_FTYPE_UINT16 },
>
>/* TBM */
>{ OPTION_MASK_ISA_TBM, CODE_FOR_tbm_bextri_si, 
> "__builtin_ia32_bextri_u32", IX86_BUILTIN_BEXTRI32, UNKNOWN, (int) 
> UINT_FTYPE_UINT_UINT },
> -  { OPTION_MASK_ISA_TBM, CODE_FOR_tbm_bextri_di, 
> "__builtin_ia32_bextri_u64", IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) 
> UINT64_FTYPE_UINT64_UINT64 },
> +  { OPTION_MASK_ISA_TBM | OPTION_MASK_ISA_64BIT, CODE_FOR_tbm_bextri_di, 
> "__builtin_ia32_bextri_u64", IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) 
> UINT64_FTYPE_UINT64_UINT64 },
>
>/* F16C */
>{ OPTION_MASK_ISA_F16C, CODE_FOR_vcvtph2ps, "__builtin_ia32_vcvtph2ps", 
> IX86_BUILTIN_CVTPH2PS, UNKNOWN, (int) V4SF_FTYPE_V8HI },
> @@ -33948,11 +33948,11 @@ static const struct builtin_description
>
>/* BMI2 */
>{ OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_bzhi_si3, "__builtin_ia32_bzhi_si", 
> IX86_BUILTIN_BZHI32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
> -  { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_bzhi_di3, "__builtin_ia32_bzhi_di", 
> IX86_BUILTIN_BZHI64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
> +  { OPTION_MASK_ISA_BMI2 | OPTION_MASK_ISA_64BIT, CODE_FOR_bmi2_bzhi_di3, 
> "__builtin_ia32_bzhi_di", IX86_BUILTIN_BZHI64, UNKNOWN, (int) 
> UINT64_FTYPE_UINT64_UINT64 },
>{ OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_pdep_si3, "__builtin_ia32_pdep_si", 
> IX86_BUILTIN_PDEP32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
> -  { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_pdep_di3, "__builtin_ia32_pdep_di", 
> IX86_BUILTIN_PDEP64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
> +  

[PATCH, i386]: Extend TARGET_READ_MODIFY{,_WRITE} peepholes to all integer modes

2016-04-28 Thread Uros Bizjak
Hello!

Attached patch extends TARGET_READ_MODIFY{,_WRITE} peepholes to handle
all integer modes, while also taking care not to introduce additional
QImode register stalls.

While looking at the insn enable condition, I noticed that we don't
use "probe_stack" pattern any more, as the stack check loop is now
implemented in a different way.

2016-04-28  Uros Bizjak  

* config/i386/i386.md (peephole2s for operations with memory inputs):
Use SWI mode iterator.
(peephole2s for operations with memory outputs): Ditto.
Do not check for stack checking probe.

(probe_stack): Remove expander.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 235582)
+++ config/i386/i386.md (working copy)
@@ -17552,20 +17552,6 @@
   DONE;
 })
 
-;; Use IOR for stack probes, this is shorter.
-(define_expand "probe_stack"
-  [(match_operand 0 "memory_operand")]
-  ""
-{
-  rtx (*gen_ior3) (rtx, rtx, rtx);
-
-  gen_ior3 = (GET_MODE (operands[0]) == DImode
- ? gen_iordi3 : gen_iorsi3);
-
-  emit_insn (gen_ior3 (operands[0], operands[0], const0_rtx));
-  DONE;
-})
-
 (define_insn "adjust_stack_and_probe"
   [(set (match_operand:P 0 "register_operand" "=r")
(unspec_volatile:P [(match_operand:P 1 "register_operand" "0")]
@@ -17894,11 +17880,11 @@
 
 ;; Don't do logical operations with memory inputs.
 (define_peephole2
-  [(match_scratch:SI 2 "r")
-   (parallel [(set (match_operand:SI 0 "register_operand")
-   (match_operator:SI 3 "arith_or_logical_operator"
+  [(match_scratch:SWI 2 "<r>")
+   (parallel [(set (match_operand:SWI 0 "register_operand")
+  (match_operator:SWI 3 "arith_or_logical_operator"
  [(match_dup 0)
-  (match_operand:SI 1 "memory_operand")]))
+ (match_operand:SWI 1 "memory_operand")]))
   (clobber (reg:CC FLAGS_REG))])]
   "!(TARGET_READ_MODIFY || optimize_insn_for_size_p ())"
   [(set (match_dup 2) (match_dup 1))
@@ -17907,10 +17893,10 @@
   (clobber (reg:CC FLAGS_REG))])])
 
 (define_peephole2
-  [(match_scratch:SI 2 "r")
-   (parallel [(set (match_operand:SI 0 "register_operand")
-   (match_operator:SI 3 "arith_or_logical_operator"
- [(match_operand:SI 1 "memory_operand")
+  [(match_scratch:SWI 2 "<r>")
+   (parallel [(set (match_operand:SWI 0 "register_operand")
+  (match_operator:SWI 3 "arith_or_logical_operator"
+[(match_operand:SWI 1 "memory_operand")
   (match_dup 0)]))
   (clobber (reg:CC FLAGS_REG))])]
   "!(TARGET_READ_MODIFY || optimize_insn_for_size_p ())"
@@ -17962,15 +17948,13 @@
 ; the same decoder scheduling characteristics as the original.
 
 (define_peephole2
-  [(match_scratch:SI 2 "r")
-   (parallel [(set (match_operand:SI 0 "memory_operand")
-   (match_operator:SI 3 "arith_or_logical_operator"
+  [(match_scratch:SWI 2 "<r>")
+   (parallel [(set (match_operand:SWI 0 "memory_operand")
+  (match_operator:SWI 3 "arith_or_logical_operator"
  [(match_dup 0)
-  (match_operand:SI 1 "nonmemory_operand")]))
+ (match_operand:SWI 1 "<nonmemory_operand>")]))
   (clobber (reg:CC FLAGS_REG))])]
-  "!(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
-   /* Do not split stack checking probes.  */
-   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
+  "!(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())"
   [(set (match_dup 2) (match_dup 0))
(parallel [(set (match_dup 2)
(match_op_dup 3 [(match_dup 2) (match_dup 1)]))
@@ -17978,15 +17962,13 @@
(set (match_dup 0) (match_dup 2))])
 
 (define_peephole2
-  [(match_scratch:SI 2 "r")
-   (parallel [(set (match_operand:SI 0 "memory_operand")
-   (match_operator:SI 3 "arith_or_logical_operator"
- [(match_operand:SI 1 "nonmemory_operand")
+  [(match_scratch:SWI 2 "<r>")
+   (parallel [(set (match_operand:SWI 0 "memory_operand")
+  (match_operator:SWI 3 "arith_or_logical_operator"
+[(match_operand:SWI 1 "<nonmemory_operand>")
   (match_dup 0)]))
   (clobber (reg:CC FLAGS_REG))])]
-  "!(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())
-   /* Do not split stack checking probes.  */
-   && GET_CODE (operands[3]) != IOR && operands[1] != const0_rtx"
+  "!(TARGET_READ_MODIFY_WRITE || optimize_insn_for_size_p ())"
   [(set (match_dup 2) (match_dup 0))
(parallel [(set (match_dup 2)
(match_op_dup 3 [(match_dup 1) (match_dup 2)]))


Re: [PATCHv2 7/7] gcc/arc: Add an nps400 specific testcase

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:

* gcc.target/arc/nps400-1.c: New file.

Thanks.  I have applied this patch.


[PATCH] Disable some i?86 builtins for -m32 (PR target/70858)

2016-04-28 Thread Jakub Jelinek
Hi!

The PR reported an ICE caused by a builtin for __x86_64__ guarded
intrinsics being mistakenly available in -m32 too.  I've looked for
INT64 substrings in the various i386.c builtin tables and, for each
entry that was missing OPTION_MASK_ISA_64BIT in the mask, checked
whether the uses of the builtin in *intrin.h are guarded by
#ifdef __x86_64__ and whether the corresponding insn is TARGET_64BIT only.

This is the result of that effort: 7 builtins, each of which ICEs when
used in -m32.
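
To illustrate why the extra mask bit matters, here is a deliberately
simplified model.  It assumes a builtin is enabled only when every bit of
its mask is set in the enabled ISA flags; the real check in i386.c is more
involved, and MASK_ISA_BMI, MASK_ISA_64BIT, builtin_entry and
builtin_enabled are names invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the ISA option bits; real values live in GCC.
constexpr std::uint64_t MASK_ISA_BMI   = std::uint64_t{1} << 0;
constexpr std::uint64_t MASK_ISA_64BIT = std::uint64_t{1} << 1;

struct builtin_entry {
  std::uint64_t mask;  // ISA bits the builtin requires
  const char* name;
};

// Simplified rule assumed here: all required bits must be enabled.
constexpr bool builtin_enabled(const builtin_entry& e,
                               std::uint64_t enabled_isa) {
  return (e.mask & enabled_isa) == e.mask;
}

// The same table entry before and after adding the 64-bit requirement.
constexpr builtin_entry bextr_u64_before =
  { MASK_ISA_BMI, "__builtin_ia32_bextr_u64" };
constexpr builtin_entry bextr_u64_after =
  { MASK_ISA_BMI | MASK_ISA_64BIT, "__builtin_ia32_bextr_u64" };
```

Under this model, a -m32 -mbmi style configuration (only MASK_ISA_BMI set)
enables the "before" entry but not the "after" one, which is the intended
effect of the patch.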

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk/6.2?

2016-04-28  Jakub Jelinek  

PR target/70858
* config/i386/i386.c (bdesc_special_args): Add | OPTION_MASK_ISA_64BIT
to __builtin_ia32_lwpval64 and __builtin_ia32_lwpins64.
(bdesc_args): Add | OPTION_MASK_ISA_64BIT to __builtin_ia32_bextr_u64,
__builtin_ia32_bextri_u64, __builtin_ia32_bzhi_di,
__builtin_ia32_pdep_di and __builtin_ia32_pext_di.

* gcc.target/i386/pr70858.c: New test.

--- gcc/config/i386/i386.c.jj   2016-04-28 17:26:10.0 +0200
+++ gcc/config/i386/i386.c  2016-04-28 18:39:06.506976486 +0200
@@ -32996,9 +32996,9 @@ static const struct builtin_description
   { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_llwpcb, "__builtin_ia32_llwpcb", 
IX86_BUILTIN_LLWPCB, UNKNOWN, (int) VOID_FTYPE_PVOID },
   { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_slwpcb, "__builtin_ia32_slwpcb", 
IX86_BUILTIN_SLWPCB, UNKNOWN, (int) PVOID_FTYPE_VOID },
   { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvalsi3, "__builtin_ia32_lwpval32", 
IX86_BUILTIN_LWPVAL32, UNKNOWN, (int) VOID_FTYPE_UINT_UINT_UINT },
-  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpvaldi3, "__builtin_ia32_lwpval64", 
IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) VOID_FTYPE_UINT64_UINT_UINT },
+  { OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, CODE_FOR_lwp_lwpvaldi3, 
"__builtin_ia32_lwpval64", IX86_BUILTIN_LWPVAL64, UNKNOWN, (int) 
VOID_FTYPE_UINT64_UINT_UINT },
   { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinssi3, "__builtin_ia32_lwpins32", 
IX86_BUILTIN_LWPINS32, UNKNOWN, (int) UCHAR_FTYPE_UINT_UINT_UINT },
-  { OPTION_MASK_ISA_LWP, CODE_FOR_lwp_lwpinsdi3, "__builtin_ia32_lwpins64", 
IX86_BUILTIN_LWPINS64, UNKNOWN, (int) UCHAR_FTYPE_UINT64_UINT_UINT },
+  { OPTION_MASK_ISA_LWP | OPTION_MASK_ISA_64BIT, CODE_FOR_lwp_lwpinsdi3, 
"__builtin_ia32_lwpins64", IX86_BUILTIN_LWPINS64, UNKNOWN, (int) 
UCHAR_FTYPE_UINT64_UINT_UINT },
 
   /* FSGSBASE */
   { OPTION_MASK_ISA_FSGSBASE | OPTION_MASK_ISA_64BIT, CODE_FOR_rdfsbasesi, 
"__builtin_ia32_rdfsbase32", IX86_BUILTIN_RDFSBASE32, UNKNOWN, (int) 
UNSIGNED_FTYPE_VOID },
@@ -33933,12 +33933,12 @@ static const struct builtin_description
 
   /* BMI */
   { OPTION_MASK_ISA_BMI, CODE_FOR_bmi_bextr_si, "__builtin_ia32_bextr_u32", 
IX86_BUILTIN_BEXTR32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
-  { OPTION_MASK_ISA_BMI, CODE_FOR_bmi_bextr_di, "__builtin_ia32_bextr_u64", 
IX86_BUILTIN_BEXTR64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
+  { OPTION_MASK_ISA_BMI | OPTION_MASK_ISA_64BIT, CODE_FOR_bmi_bextr_di, 
"__builtin_ia32_bextr_u64", IX86_BUILTIN_BEXTR64, UNKNOWN, (int) 
UINT64_FTYPE_UINT64_UINT64 },
   { OPTION_MASK_ISA_BMI, CODE_FOR_ctzhi2,   "__builtin_ctzs",   
IX86_BUILTIN_CTZS,UNKNOWN, (int) UINT16_FTYPE_UINT16 },
 
   /* TBM */
   { OPTION_MASK_ISA_TBM, CODE_FOR_tbm_bextri_si, "__builtin_ia32_bextri_u32", 
IX86_BUILTIN_BEXTRI32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
-  { OPTION_MASK_ISA_TBM, CODE_FOR_tbm_bextri_di, "__builtin_ia32_bextri_u64", 
IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
+  { OPTION_MASK_ISA_TBM | OPTION_MASK_ISA_64BIT, CODE_FOR_tbm_bextri_di, 
"__builtin_ia32_bextri_u64", IX86_BUILTIN_BEXTRI64, UNKNOWN, (int) 
UINT64_FTYPE_UINT64_UINT64 },
 
   /* F16C */
   { OPTION_MASK_ISA_F16C, CODE_FOR_vcvtph2ps, "__builtin_ia32_vcvtph2ps", 
IX86_BUILTIN_CVTPH2PS, UNKNOWN, (int) V4SF_FTYPE_V8HI },
@@ -33948,11 +33948,11 @@ static const struct builtin_description
 
   /* BMI2 */
   { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_bzhi_si3, "__builtin_ia32_bzhi_si", 
IX86_BUILTIN_BZHI32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
-  { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_bzhi_di3, "__builtin_ia32_bzhi_di", 
IX86_BUILTIN_BZHI64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
+  { OPTION_MASK_ISA_BMI2 | OPTION_MASK_ISA_64BIT, CODE_FOR_bmi2_bzhi_di3, 
"__builtin_ia32_bzhi_di", IX86_BUILTIN_BZHI64, UNKNOWN, (int) 
UINT64_FTYPE_UINT64_UINT64 },
   { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_pdep_si3, "__builtin_ia32_pdep_si", 
IX86_BUILTIN_PDEP32, UNKNOWN, (int) UINT_FTYPE_UINT_UINT },
-  { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_pdep_di3, "__builtin_ia32_pdep_di", 
IX86_BUILTIN_PDEP64, UNKNOWN, (int) UINT64_FTYPE_UINT64_UINT64 },
+  { OPTION_MASK_ISA_BMI2 | OPTION_MASK_ISA_64BIT, CODE_FOR_bmi2_pdep_di3, 
"__builtin_ia32_pdep_di", IX86_BUILTIN_PDEP64, UNKNOWN, (int) 
UINT64_FTYPE_UINT64_UINT64 },
   { OPTION_MASK_ISA_BMI2, CODE_FOR_bmi2_pext_si3, "__builtin_ia32_pext_si", 
IX86_BUILTIN_PEXT32, UNKNOWN, 

Go patch committed: export String_index_expression

2016-04-28 Thread Ian Lance Taylor
This patch to the Go frontend by Chris Manghane makes
String_index_expression accessible outside of expressions.cc.  This is
a step toward future patches.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.
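
As a rough picture of what the exported class allows, the following
stand-alone mock mirrors the classification-based downcast used by the
accessor.  It is a sketch only: the classes are cut down to the minimum,
and indexed_string_or_null is a hypothetical caller, not part of the
frontend.

```cpp
#include <cassert>

enum Expression_classification { EXPRESSION_ARRAY_INDEX, EXPRESSION_STRING_INDEX };

class String_index_expression;

class Expression {
 public:
  explicit Expression(Expression_classification c) : classification_(c) {}
  virtual ~Expression() {}

  // Returns the derived object if the classification matches, else NULL.
  String_index_expression* string_index_expression();

 protected:
  template<typename Expression_class,
           Expression_classification expr_classification>
  Expression_class* convert() {
    return (this->classification_ == expr_classification
            ? static_cast<Expression_class*>(this)
            : nullptr);
  }

 private:
  Expression_classification classification_;
};

class String_index_expression : public Expression {
 public:
  explicit String_index_expression(Expression* string)
    : Expression(EXPRESSION_STRING_INDEX), string_(string) {}

  // Return the string being indexed.
  Expression* string() const { return this->string_; }

 private:
  Expression* string_;
};

inline String_index_expression* Expression::string_index_expression() {
  return this->convert<String_index_expression, EXPRESSION_STRING_INDEX>();
}

// Hypothetical caller outside expressions.cc: fetch the indexed string,
// or NULL if E is not a string index expression.
inline Expression* indexed_string_or_null(Expression* e) {
  String_index_expression* sie = e->string_index_expression();
  return sie != nullptr ? sie->string() : nullptr;
}
```

With the class definition visible in a header, code in other files can
call string() on the result instead of needing a helper inside
expressions.cc.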

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 235452)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-ba520fdcbea95531ebb9ef3d5be2de405ca90df3
+b17e404f5b8954e008b512741296d238ab7b2ef9
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 235452)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -10380,66 +10380,7 @@ Expression::make_array_index(Expression*
   return new Array_index_expression(array, start, end, cap, location);
 }
 
-// A string index.  This is used for both indexing and slicing.
-
-class String_index_expression : public Expression
-{
- public:
-  String_index_expression(Expression* string, Expression* start,
- Expression* end, Location location)
-: Expression(EXPRESSION_STRING_INDEX, location),
-  string_(string), start_(start), end_(end)
-  { }
-
- protected:
-  int
-  do_traverse(Traverse*);
-
-  Expression*
-  do_flatten(Gogo*, Named_object*, Statement_inserter*);
-
-  Type*
-  do_type();
-
-  void
-  do_determine_type(const Type_context*);
-
-  void
-  do_check_types(Gogo*);
-
-  Expression*
-  do_copy()
-  {
-return Expression::make_string_index(this->string_->copy(),
-this->start_->copy(),
-(this->end_ == NULL
- ? NULL
- : this->end_->copy()),
-this->location());
-  }
-
-  bool
-  do_must_eval_subexpressions_in_order(int* skip) const
-  {
-*skip = 1;
-return true;
-  }
-
-  Bexpression*
-  do_get_backend(Translate_context*);
-
-  void
-  do_dump_expression(Ast_dump_context*) const;
-
- private:
-  // The string we are getting a value from.
-  Expression* string_;
-  // The start or only index.
-  Expression* start_;
-  // The end index of a slice.  This may be NULL for a single index,
-  // or it may be a nil expression for the length of the string.
-  Expression* end_;
-};
+// Class String_index_expression.
 
 // String index traversal.
 
Index: gcc/go/gofrontend/expressions.h
===
--- gcc/go/gofrontend/expressions.h (revision 235452)
+++ gcc/go/gofrontend/expressions.h (working copy)
@@ -44,6 +44,7 @@ class Func_descriptor_expression;
 class Unknown_expression;
 class Index_expression;
 class Array_index_expression;
+class String_index_expression;
 class Map_index_expression;
 class Bound_method_expression;
 class Field_reference_expression;
@@ -675,6 +676,13 @@ class Expression
   array_index_expression()
  { return this->convert<Array_index_expression, EXPRESSION_ARRAY_INDEX>(); }
 
+  // If this is an expression which refers to indexing in a string,
+  // return the String_index_expression structure.  Otherwise, return
+  // NULL.
+  String_index_expression*
+  string_index_expression()
+  { return this->convert<String_index_expression, EXPRESSION_STRING_INDEX>(); }
+
   // If this is an expression which refers to indexing in a map,
   // return the Map_index_expression structure.  Otherwise, return
   // NULL.
@@ -2583,6 +2591,72 @@ class Array_index_expression : public Ex
   Type* type_;
 };
 
+// A string index.  This is used for both indexing and slicing.
+
+class String_index_expression : public Expression
+{
+ public:
+  String_index_expression(Expression* string, Expression* start,
+ Expression* end, Location location)
+: Expression(EXPRESSION_STRING_INDEX, location),
+  string_(string), start_(start), end_(end)
+  { }
+
+  // Return the string being indexed.
+  Expression*
+  string() const
+  { return this->string_; }
+
+ protected:
+  int
+  do_traverse(Traverse*);
+
+  Expression*
+  do_flatten(Gogo*, Named_object*, Statement_inserter*);
+
+  Type*
+  do_type();
+
+  void
+  do_determine_type(const Type_context*);
+
+  void
+  do_check_types(Gogo*);
+
+  Expression*
+  do_copy()
+  {
+return Expression::make_string_index(this->string_->copy(),
+this->start_->copy(),
+(this->end_ == NULL
+ ? NULL
+ : this->end_->copy()),
+this->location());
+  }
+
+  bool
+  do_must_eval_subexpressions_in_order(int* skip) const
+  {
+*skip = 1;
+return true;
+  }

Re: [PATCHv2 6/7] gcc/arc: Mask integer 'L' operands to 32-bit

2016-04-28 Thread Joern Wolfgang Rennecke

Thanks.  I have applied this patch.


Re: RFA: PATCH to tell gdb to skip over is-a.h inlines

2016-04-28 Thread Jason Merrill

On 04/25/2016 01:28 PM, Jason Merrill wrote:

There doesn't seem to be any need to step through the is-a inline
functions.  OK for trunk?


Likewise line-map.h.

Jason

commit 59d1faf71bda301f6ac608534d3a7208f99cb8a5
Author: Jason Merrill 
Date:   Mon Apr 25 21:45:33 2016 -0400

	* gdbinit.in: Skip line-map.h.

diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index d221130..041c716 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -249,6 +249,9 @@ skip file tree.h
 # Also skip inline functions in is-a.h.
 skip file is-a.h
 
+# And line-map.h.
+skip file line-map.h
+
 # Likewise, skip various inline functions in rtl.h.
 skip rtx_expr_list::next
 skip rtx_expr_list::element


C++ PATCH to implement C++17 [[nodiscard]] attribute

2016-04-28 Thread Jason Merrill
C++17 adds a [[nodiscard]] attribute, which is similar to the GNU 
warn_unused_result attribute, except that it applies to function 
declarations and class/enum types rather than to function types and is 
suppressed by an explicit conversion to void. I considered just treating 
it as warn_unused_result, but fn_type_req is hard to get around, so I 
decided to handle it separately.
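
A small illustration of the behavior described above.  This is a sketch,
not taken from the patch or its testsuite; error_code, open_device,
checksum and use_results are made-up names.

```cpp
#include <cassert>

// [[nodiscard]] on an enum type: discarding a function result of this type
// is expected to produce a warning.
enum class [[nodiscard]] error_code { ok, failed };

// [[nodiscard]] on a function declaration.
[[nodiscard]] int checksum(int a, int b) { return (a * 31 + b) & 0xff; }

error_code open_device(int id) {
  return id >= 0 ? error_code::ok : error_code::failed;
}

int use_results() {
  // checksum(1, 2);       // would warn: nodiscard function result ignored
  // open_device(3);       // would warn: result of nodiscard type ignored
  (void) checksum(1, 2);   // explicit conversion to void suppresses the warning
  error_code ec = open_device(3);
  return ec == error_code::ok ? checksum(1, 2) : -1;
}
```

The two commented-out calls are the cases the new warning targets; the
(void) cast is the sanctioned way to say the result is deliberately unused.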


While I was at it, I fixed PR 38172, an outstanding bug with 
warn_unused_result false negatives in the C++ front end, and moved its 
testcase into c-c++-common.


This patch does not address bug 66425, which argues that (void) ought to 
suppress the warn_unused_result warning, as it does for [[nodiscard]]. 
I agree with this argument, but will leave that change for a future patch.


Tested x86_64-pc-linux-gnu, applying to trunk.
commit 4c253f9fdbd6acc274f9cf141887ae0e6f1c31ec
Author: Jason Merrill 
Date:   Thu Apr 28 10:17:09 2016 -0400

	Implement C++17 [[nodiscard]] attribute.

	PR c++/38172
	PR c++/54379
gcc/c-family/
	* c-lex.c (c_common_has_attribute): Handle nodiscard.
gcc/cp/
	* parser.c (cp_parser_std_attribute): Handle [[nodiscard]].
	* tree.c (handle_nodiscard_attribute): New.
	(cxx_attribute_table): Add [[nodiscard]].
	* cvt.c (cp_get_fndecl_from_callee, cp_get_callee_fndecl): New.
	(maybe_warn_nodiscard): New.
	(convert_to_void): Call it.

diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index ff7eb25..38a428d 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -347,7 +347,8 @@ c_common_has_attribute (cpp_reader *pfile)
 		result = 200809;
 	  else if (is_attribute_p ("deprecated", attr_name))
 		result = 201309;
-	  else if (is_attribute_p ("maybe_unused", attr_name))
+	  else if (is_attribute_p ("maybe_unused", attr_name)
+		   || is_attribute_p ("nodiscard", attr_name))
 		result = 201603;
 	  if (result)
 		attr_name = NULL_TREE;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f6ea0b7..8a06609 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5695,6 +5695,8 @@ extern tree cp_convert(tree, tree, tsubst_flags_t);
 extern tree cp_convert_and_check(tree, tree, tsubst_flags_t);
 extern tree cp_fold_convert			(tree, tree);
 extern tree cp_get_callee			(tree);
+extern tree cp_get_callee_fndecl		(tree);
+extern tree cp_get_fndecl_from_callee		(tree);
 extern tree convert_to_void			(tree, impl_conv_void,
  		 tsubst_flags_t);
 extern tree convert_force			(tree, tree, int,
diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index 8c9d78b..2e2bac7 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -918,6 +918,104 @@ cp_get_callee (tree call)
   return NULL_TREE;
 }
 
+/* FN is the callee of a CALL_EXPR or AGGR_INIT_EXPR; return the FUNCTION_DECL
+   if we can.  */
+
+tree
+cp_get_fndecl_from_callee (tree fn)
+{
+  if (fn == NULL_TREE)
+return fn;
+  if (TREE_CODE (fn) == FUNCTION_DECL)
+return fn;
+  tree type = TREE_TYPE (fn);
+  if (type == unknown_type_node)
+return NULL_TREE;
+  gcc_assert (POINTER_TYPE_P (type));
+  fn = maybe_constant_init (fn);
+  STRIP_NOPS (fn);
+  if (TREE_CODE (fn) == ADDR_EXPR)
+{
+  fn = TREE_OPERAND (fn, 0);
+  if (TREE_CODE (fn) == FUNCTION_DECL)
+	return fn;
+}
+  return NULL_TREE;
+}
+
+/* Like get_callee_fndecl, but handles AGGR_INIT_EXPR as well and uses the
+   constexpr machinery.  */
+
+tree
+cp_get_callee_fndecl (tree call)
+{
+  return cp_get_fndecl_from_callee (cp_get_callee (call));
+}
+
+/* Subroutine of convert_to_void.  Warn if we're discarding something with
+   attribute [[nodiscard]].  */
+
+static void
+maybe_warn_nodiscard (tree expr, impl_conv_void implicit)
+{
+  tree call = expr;
+  if (TREE_CODE (expr) == TARGET_EXPR)
+call = TARGET_EXPR_INITIAL (expr);
+  location_t loc = EXPR_LOC_OR_LOC (call, input_location);
+  tree callee = cp_get_callee (call);
+  if (!callee)
+return;
+
+  tree type = TREE_TYPE (callee);
+  if (TYPE_PTRMEMFUNC_P (type))
+type = TYPE_PTRMEMFUNC_FN_TYPE (type);
+  if (POINTER_TYPE_P (type))
+type = TREE_TYPE (type);
+
+  tree rettype = TREE_TYPE (type);
+  tree fn = cp_get_fndecl_from_callee (callee);
+  if (implicit != ICV_CAST && fn
+  && lookup_attribute ("nodiscard", DECL_ATTRIBUTES (fn)))
+{
+  if (warning_at (loc, OPT_Wunused_result,
+		  "ignoring return value of %qD, "
+		  "declared with attribute nodiscard", fn))
+	inform (DECL_SOURCE_LOCATION (fn), "declared here");
+}
+  else if (implicit != ICV_CAST
+	   && lookup_attribute ("nodiscard", TYPE_ATTRIBUTES (rettype)))
+{
+  if (warning_at (loc, OPT_Wunused_result,
+		  "ignoring returned value of type %qT, "
+		  "declared with attribute nodiscard", rettype))
+	{
+	  if (fn)
+	inform (DECL_SOURCE_LOCATION (fn),
+		"in call to %qD, declared here", fn);
+	  inform (DECL_SOURCE_LOCATION (TYPE_NAME (rettype)),
+		  "%qT declared here", 

Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-04-28 Thread Martin Sebor

I'm sorry I'm a little late but I have a couple of minor comments
on the patch:


+  epoch = strtoll (source_date_epoch, &endptr, 10);
+  if ((errno == ERANGE && (epoch == LLONG_MAX || epoch == LLONG_MIN))
+  || (errno != 0 && epoch == 0))
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"strtoll: %s\n", xstrerror(errno));
+  if (endptr == source_date_epoch)
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"No digits were found: %s\n", endptr);
+  if (*endptr != '\0')
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"Trailing garbage: %s\n", endptr);
+  if (epoch < 0)
+fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+"Value must be nonnegative: %lld \n", epoch);


In most texts (e.g. the C and POSIX standards), the name of
an environment variable doesn't include the dollar sign.  In
other diagnostic messages GCC doesn't print one.  I suggest
following the established practice and removing the dollar sign
from this error message as well.

I would also suggest issuing a single generic error message
explaining what the valid value of the variable is instead of
trying to describe what's wrong with it, for example as follows
(note also the hyphen in "non-negative" which is the prevalent
style used by other GCC messages and GNU documentation).

  "environment variable SOURCE_DATE_EPOCH must expand to a non-
  negative integer less than or equal to %qlli", LLONG_MAX
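
A minimal sketch of what collapsing the checks into one validity test could look like.  The helper name parse_source_date_epoch is hypothetical, and a plain return code stands in for fatal_error so the fragment does not depend on GCC internals.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not GCC code.  Store the parsed value in *OUT
   and return 0 if S is a non-negative decimal integer that fits in
   long long; return -1 otherwise.  */
static int
parse_source_date_epoch (const char *s, long long *out)
{
  char *endptr;
  long long epoch;

  errno = 0;
  epoch = strtoll (s, &endptr, 10);

  /* Range error, no digits, trailing characters and negative values
     all fall under the same "invalid value" diagnostic.  */
  if (errno != 0 || endptr == s || *endptr != '\0' || epoch < 0)
    return -1;

  *out = epoch;
  return 0;
}
```

A caller would then issue the single message suggested above whenever the helper returns -1.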

One comment about the documentation:

> +The value of @env{SOURCE_DATE_EPOCH} must be a UNIX timestamp,
> +defined as the number of seconds (excluding leap seconds) since
> +01 Jan 1970 00:00:00 represented in ASCII, identical to the output of
> +@samp{@command{date +%s}}.

The +%s option to the date command is a non-standard extension
that's not universally available.  To avoid confusing users on
systems that don't support it, I would suggest either not
mentioning it or clarifying that it's a Linux command.

Martin


Re: [PATCHv2 5/7] gcc/arc: Add nps400 bitops support

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:

Add support for nps400 bit operation instructions.  There's a new flag
-mbitops that turns this feature on.  There are new instructions, some
changes to existing instructions, a new register class to support the
new instructions, and some new expand and peephole optimisations.

gcc/ChangeLog:

* config/arc/arc.c (arc_conditional_register_usage): Take
TARGET_RRQ_CLASS into account.
(arc_print_operand): Support printing 'p' and 's' operands.
* config/arc/arc.h (TARGET_NPS_BITOPS_DEFAULT): Provide default
as 0.
(TARGET_RRQ_CLASS): Define.
(IS_POWEROF2_OR_0_P): Define.
* config/arc/arc.md (*movsi_insn): Add w/Clo, w/Chi, and w/Cbi
alternatives.
(*tst_movb): New define_insn.
(*tst): Avoid recognition if it could prevent '*tst_movb'
combination; replace c/CnL with c/Chs alternative.
(*tst_bitfield_tst): New define_insn.
(*tst_bitfield_asr): New define_insn.
(*tst_bitfield): New define_insn.
(andsi3_i): Add Rrq variant.
(extzv): New define_expand.
(insv): New define_expand.
(*insv_i): New define_insn.
(*movb): New define_insn.
(*movb_signed): New define_insn.
(*movb_high): New define_insn.
(*movb_high_signed): New define_insn.
(*movb_high_signed + 1): New define_split pattern.
(*mrgb): New define_insn.
(*mrgb + 1): New define_peephole2 pattern.
(*mrgb + 2): New define_peephole2 pattern.
* config/arc/arc.opt (mbitops): New option for nps400, uses
TARGET_NPS_BITOPS_DEFAULT.
* config/arc/constraints.md (q): Make register class conditional.
(Rrq): New register constraint.
(Chs): New constraint.
(Clo): New constraint.
(Chi): New constraint.
(Cbf): New constraint.
(Cbn): New constraint.
(C18): New constraint.
(Cbi): New constraint.

gcc/testsuite/ChangeLog:

* gcc.target/arc/extzv-1.c: New file.
* gcc.target/arc/insv-1.c: New file.
* gcc.target/arc/insv-2.c: New file.
* gcc.target/arc/movb-1.c: New file.
* gcc.target/arc/movb-2.c: New file.
* gcc.target/arc/movb-3.c: New file.
* gcc.target/arc/movb-4.c: New file.
* gcc.target/arc/movb-5.c: New file.
* gcc.target/arc/movb_cl-1.c: New file.
* gcc.target/arc/movb_cl-2.c: New file.
* gcc.target/arc/movbi_cl-1.c: New file.
* gcc.target/arc/movl-1.c: New file.


 Thanks.  I have applied this patch.


Re: Thoughts on memcmp expansion (PR43052)

2016-04-28 Thread Bernd Schmidt

On 01/18/2016 10:22 AM, Richard Biener wrote:

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=52171 - the inline
expansion for small sizes and equality compares should be done on GIMPLE.
Today the strlen pass might be an appropriate place to do this given its
superior knowledge about string lengths.

The idea of turning eq feeding memcmp into a special memcmp_eq is good but
you have to avoid doing that too early - otherwise you'd lose on

   res = memcmp (p, q, sz);
   if (memcmp (p, q, sz) == 0)
...

that is, you have to make sure CSE got the chance to common the two calls.
This is why I think this kind of transform needs to happen in specific places
(like during strlen opt) rather than in generic folding.


Ok, here's an update. I kept pieces of your patch from that PR, but am also
translating memcmps larger than a single operation into memcmp_eq as in
my previous patch.
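
To illustrate why an equality-only memcmp is attractive for by-pieces expansion, here is a rough C sketch of a 12-byte comparison done as one 8-byte and one 4-byte piece.  The function name memcmp_eq12 is invented for this example; it is not the builtin and not code from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Illustration only: return zero iff the 12 bytes at P and Q are
   equal.  The result carries no ordering information, so the pieces
   can be combined with XOR/OR instead of being compared bytewise.  */
static int
memcmp_eq12 (const void *p, const void *q)
{
  const unsigned char *a = p;
  const unsigned char *b = q;
  uint64_t a8, b8;
  uint32_t a4, b4;

  /* memcpy keeps the loads valid for unaligned input.  */
  memcpy (&a8, a, 8);
  memcpy (&b8, b, 8);
  memcpy (&a4, a + 8, 4);
  memcpy (&b4, b + 8, 4);

  return ((a8 ^ b8) | (uint64_t) (a4 ^ b4)) != 0;
}
```

A full memcmp cannot be expanded this way on a little-endian target, because the sign of the result depends on the first differing byte.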


Then, I added by_pieces infrastructure for memcmp expansion. To avoid 
any more code duplication in this area, I abstracted the existing code 
and converted it to C++ classes since that seemed to fit pretty well.


There are a few possible ways I could go with this, which is why I'm 
posting it more as a RFD at this point.

 - should store_by_pieces be eliminated in favour of doing just
   move_by_pieces with constfns?
 - the C++ification could be extended, making move_by_pieces_d and
   compare_by_pieces_d classes inheriting from a common base. This
   would get rid of the callbacks, replacing them with virtuals,
   and also make some of the current struct members private.
 - could all of the by_pieces stuff move out into a separate file?

Later, I think we'll also want to extend this to allow vector mode 
operations, but I think that's a separate patch.


So, any opinions on what I should be doing with this patch? FWIW it
bootstraps and tests OK on x86_64-linux.



Bernd
Index: gcc/builtins.c
===
--- gcc/builtins.c	(revision 235474)
+++ gcc/builtins.c	(working copy)
@@ -3671,53 +3671,24 @@ expand_cmpstr (insn_code icode, rtx targ
   return NULL_RTX;
 }
 
-/* Try to expand cmpstrn or cmpmem operation ICODE with the given operands.
-   ARG3_TYPE is the type of ARG3_RTX.  Return the result rtx on success,
-   otherwise return null.  */
-
-static rtx
-expand_cmpstrn_or_cmpmem (insn_code icode, rtx target, rtx arg1_rtx,
-			  rtx arg2_rtx, tree arg3_type, rtx arg3_rtx,
-			  HOST_WIDE_INT align)
-{
-  machine_mode insn_mode = insn_data[icode].operand[0].mode;
-
-  if (target && (!REG_P (target) || HARD_REGISTER_P (target)))
-target = NULL_RTX;
-
-  struct expand_operand ops[5];
-  create_output_operand (&ops[0], target, insn_mode);
-  create_fixed_operand (&ops[1], arg1_rtx);
-  create_fixed_operand (&ops[2], arg2_rtx);
-  create_convert_operand_from (&ops[3], arg3_rtx, TYPE_MODE (arg3_type),
-			   TYPE_UNSIGNED (arg3_type));
-  create_integer_operand (&ops[4], align);
-  if (maybe_expand_insn (icode, 5, ops))
-return ops[0].value;
-  return NULL_RTX;
-}
-
 /* Expand expression EXP, which is a call to the memcmp built-in function.
Return NULL_RTX if we failed and the caller should emit a normal call,
-   otherwise try to get the result in TARGET, if convenient.  */
+   otherwise try to get the result in TARGET, if convenient.
+   RESULT_EQ is true if we can relax the returned value to be either zero
+   or nonzero, without caring about the sign.  */
 
 static rtx
-expand_builtin_memcmp (tree exp, rtx target)
+expand_builtin_memcmp (tree exp, rtx target, bool result_eq)
 {
   if (!validate_arglist (exp,
  			 POINTER_TYPE, POINTER_TYPE, INTEGER_TYPE, VOID_TYPE))
 return NULL_RTX;
 
-  /* Note: The cmpstrnsi pattern, if it exists, is not suitable for
- implementing memcmp because it will stop if it encounters two
- zero bytes.  */
-  insn_code icode = direct_optab_handler (cmpmem_optab, SImode);
-  if (icode == CODE_FOR_nothing)
-return NULL_RTX;
-
   tree arg1 = CALL_EXPR_ARG (exp, 0);
   tree arg2 = CALL_EXPR_ARG (exp, 1);
   tree len = CALL_EXPR_ARG (exp, 2);
+  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  location_t loc = EXPR_LOCATION (exp);
 
   unsigned int arg1_align = get_pointer_alignment (arg1) / BITS_PER_UNIT;
   unsigned int arg2_align = get_pointer_alignment (arg2) / BITS_PER_UNIT;
@@ -3726,22 +3697,39 @@ expand_builtin_memcmp (tree exp, rtx tar
   if (arg1_align == 0 || arg2_align == 0)
 return NULL_RTX;
 
-  machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
-  location_t loc = EXPR_LOCATION (exp);
   rtx arg1_rtx = get_memory_rtx (arg1, len);
   rtx arg2_rtx = get_memory_rtx (arg2, len);
-  rtx arg3_rtx = expand_normal (fold_convert_loc (loc, sizetype, len));
+  rtx len_rtx = expand_normal (fold_convert_loc (loc, sizetype, len));
 
   /* Set MEM_SIZE as appropriate.  */
-  if (CONST_INT_P (arg3_rtx))
+  if (CONST_INT_P (len_rtx))
 {
-  set_mem_size (arg1_rtx, INTVAL (arg3_rtx));
-  set_mem_size (arg2_rtx, INTVAL 

Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-04-28 Thread Dhole
On 16-04-28 15:14:20, Jakub Jelinek wrote:
> On Thu, Apr 28, 2016 at 03:10:26PM +0200, Bernd Schmidt wrote:
> > On 04/28/2016 12:35 PM, Jakub Jelinek wrote:
> > >On Thu, Apr 28, 2016 at 12:31:40PM +0200, Bernd Schmidt wrote:
> > >>I really don't see anything in that function that looks like a huge time
> > >>sink, so I'm not that worried about it. I think it's likely to be buried
> > >>way down in the noise.
> > >
> > >True, but the noise sums up, and the result is terrible speed of compiling
> > >empty source files, something that e.g. Linux kernel or other packages
> > >that have lots of small source files, care about a lot.
> > >If initializing it early would buy us anything on code clarity etc., it
> > >could be justified, but IMHO it doesn't, the code in libcpp already has the
> > >delayed initialization anyway.
> > 
> > Well, it does buy us early (and reliable) error checks against the
> > environment variable.
> 
> I'm not sure we really care about the env var unless it actually needs to be
> used.  If we error only if it is used, people could e.g. use it in another
> way, to verify their code doesn't contain any __TIME__ uses, compile with
> the env var set to some invalid string and just compile everything with
> that, it would diagnose any uses of __TIME__.

There is the Wdate-time flag, that warns on using __DATE__, __TIME__ and
__TIMESTAMP__.  Although that alone will not make the compilation fail
unless it's used with Werror.

The reason behind using fatal_error (rather than a warning) when
SOURCE_DATE_EPOCH contains an invalid value is due to the
SOURCE_DATE_EPOCH specification [1]:

  SOURCE_DATE_EPOCH
  (...)
  If the value is malformed, the build process SHOULD exit with a non-zero
  error code.

And the reason for reading and parsing the env var in gcc/ rather than
when the macro is expanded for the first time (in libcpp/) is from a
comment by Joseph Myers made the first time I submitted this patch [2].
The cleanest way I found to read the env var from gcc/ was to do it
during the initialization.  But if you think this should be done
differently I'm open to changing the implementation.
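
For comparison, the delayed variant discussed above could look roughly like the following C sketch.  All names here are hypothetical; the counter exists only to show that the environment is consulted once.

```c
#include <stdlib.h>

/* Hypothetical state, not taken from gcc/ or libcpp/.  */
static int epoch_checked;       /* Nonzero once getenv has been called.  */
static const char *epoch_text;  /* Raw value, or NULL when unset.  */
static int env_lookups;         /* Number of real getenv calls.  */

/* Return the raw SOURCE_DATE_EPOCH string, reading the environment
   only on the first call.  Validation would happen at the same
   point, so an unused variable never costs anything.  */
static const char *
get_source_date_epoch_lazily (void)
{
  if (!epoch_checked)
    {
      epoch_text = getenv ("SOURCE_DATE_EPOCH");
      epoch_checked = 1;
      env_lookups++;
    }
  return epoch_text;
}
```

The trade-off is the one already raised in this thread: a malformed value is only diagnosed if one of the date/time macros is actually expanded.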


Bernd: I'll see if I can prepare a testcase; first I need to get
familiar with the testing framework and learn how to set environment
variables in tests.  Any tips on that will be really welcome!


Also, I'll take a look at the -fcompare-debug, see what's the best way
to get the same __TIME__ and __DATE__ with the help of
SOURCE_DATE_EPOCH.


[1] https://reproducible-builds.org/specs/source-date-epoch/
[2] https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02270.html

Cheers,
-- 
Dhole




Re: [PATCHv2 4/7] gcc/arc: Add support for nps400 cmem xld/xst instructions

2016-04-28 Thread Joern Wolfgang Rennecke

Thanks.  I have merged this patch.



Re: [PATCH] x86 interrupt attribute patch [1/2]

2016-04-28 Thread Yulia Koval
Thank you,
Here is the repost.

Update TARGET_FUNCTION_INCOMING_ARG documentation

On x86, interrupt handlers are only called by processors which push
interrupt data onto stack at the address where the normal return address
is.  Since interrupt handlers must access interrupt data via pointers so
that they can update interrupt data, the pointer argument is passed as
"argument pointer - word".

TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
Normally it returns REG, NULL, or CONST_INT.  This patch adds arbitrary
address computation based on hard register, which can be forced into a
register, to the list.

When copying an incoming argument onto stack, assign_parm_setup_stack
has:

if (argument in memory)
  copy argument in memory to stack
else
  move argument to stack

Since an arbitrary address computation may be passed as an argument, we
change it to:

if (argument in memory)
  copy argument in memory to stack
else
  {
if (argument isn't in register)
  force argument into a register
move argument to stack
  }

* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.


Br,
Yulia

On Thu, Apr 28, 2016 at 7:32 PM, Jeff Law  wrote:
> On 04/20/2016 07:48 AM, Koval, Julia wrote:
>>
>> Sorry, here is the right patch.
>>
>> -Original Message-
>> From: Koval, Julia
>> Sent: Wednesday, April 20, 2016 4:42 PM
>> To: 'gcc-patches@gcc.gnu.org' 
>> Cc: Lu, Hongjiu ; 'vaalfr...@gmail.com'
>> ; 'ubiz...@gmail.com' ;
>> 'l...@redhat.com' ; Zamyatin, Igor 
>> Subject: [PATCH] x86 interrupt attribute patch [1/2]
>>
>> Hi,
>> Here is the new version of interrupt attribute patch.
>> Bootstrapped/regtested for Linux/x86_64. Ok for trunk?
>>
>> Update TARGET_FUNCTION_INCOMING_ARG documentation
>>
>> On x86, interrupt handlers are only called by processors which push
>> interrupt data onto stack at the address where the normal return
>> address
>> is.  Since interrupt handlers must access interrupt data via pointers
>> so
>> that they can update interrupt data, the pointer argument is passed as
>> "argument pointer - word".
>>
>> TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
>> Normally it returns REG, NULL, or CONST_INT.  This patch adds
>> arbitrary
>> address computation based on hard register, which can be forced into a
>> register, to the list.
>>
>> When copying an incoming argument onto stack, assign_parm_setup_stack
>> has:
>>
>> if (argument in memory)
>>   copy argument in memory to stack
>> else
>>   move argument to stack
>>
>> Since an arbitrary address computation may be passed as an argument,
>> we
>> change it to:
>>
>> if (argument in memory)
>>   copy argument in memory to stack
>> else
>>   {
>> if (argument isn't in register)
>>   force argument into a register
>> move argument to stack
>>   }
>>
>> * function.c (assign_parm_setup_stack): Force source into a
>> register if needed.
>> * target.def (function_incoming_arg): Update documentation to
>> allow arbitrary address computation based on hard register.
>> * doc/tm.texi: Regenerated.
>>
> So I think the function.c changes are fine.  But I think we need to do a
> tiny bit more on the documentation side before we can install the change.
>
> While I think a rewrite of the whole argument passing section would be
> advisable, that may be a bit much to expect.  So let's try to just cleanup
> FUNCTION_INCOMING_ARG.
>
> FUNCTION_INCOMING_ARG has text like "Define this hook if the target machine
> has register windows ..."
>
> I'd change that text to be something like
>
> "Define this hook if the caller and callee on the target have different
> views of where arguments are passed.  Also define this hook if there are
> functions that are never directly called, but are invoked by the hardware
> and which have nonstandard calling conventions."
>
> Or something along those lines.
>
>
> At one time I thought we'd want to specify how the cumulative args structure
> would or would not be updated for these special arguments. But after further
> reflection, I think that can be a target dependent implementation detail.
>
>
>
> I think with that one documentation update this will be OK, but I would like
> you to repost it so I can look at it one final time.
>
> jeff
>
>


patch
Description: Binary data


Re: [PATCH] sbitmap: Remove popcount

2016-04-28 Thread Bernd Schmidt

On 04/28/2016 07:30 PM, Segher Boessenkool wrote:

In r193072 sbitmap_popcount was removed, so we cannot ask for the popcount
of an sbitmap anymore.  Nothing calls sbitmap_alloc_with_popcount either.
This patch removes everything else popcount-related from sbitmap.

Tested on powerpc64-linux; is this okay for trunk?


Ok.


Bernd



Re: [PATCH][AArch64] Improve aarch64_case_values_threshold setting

2016-04-28 Thread Wilco Dijkstra
Kyrill Tkachov wrote:
> On 25/04/16 20:21, Wilco Dijkstra wrote:
> > The GCC switch expansion is awful, so
> > even with a good indirect predictor it is better to use conditional
> > branches.
> 
> In what way is it awful? If there's something we can do better at
> can you file a bug report with a testcase so that we can work on
> improving it rather than tweaking a heuristic in the backend.

In every way :-( 

Try this simple example and see whether you agree this is terrible,
especially on targets that don't use PC-relative tables...

int i;
int func(int a)
{
  switch(a)
  {
case 0:   i = 20; break;
case 1:   i = 50; break;
case 2:   i = 29; break;
case 3:   i = 20; break;
case 4:   i = 50; break;
case 5:   i = 29; break;
case 6:   i = 20; break;
case 7:   i = 50; break;
case 8:   i = 29; break;
case 9:   i = 79; break;
case 110: i = 27; break;
default:  i = 77; break;
  }
  return i;
}

See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70861.

It's also disappointing this doesn't get turned into:

return i = (unsigned)a <= 110 ? table[a] : 77;
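
Spelled out by hand, the table form would be something like the sketch below.  The name func_table and the table contents are written out here for illustration; the gaps 10..109 must hold the default value.

```c
/* The "[10 ... 109]" range designator is a GNU C extension.  */
static const unsigned char table[111] = {
  [0] = 20, [1] = 50, [2] = 29,
  [3] = 20, [4] = 50, [5] = 29,
  [6] = 20, [7] = 50, [8] = 29,
  [9] = 79,
  [10 ... 109] = 77,
  [110] = 27
};

int i;

int
func_table (int a)
{
  /* The unsigned compare rejects negative values and values above
     110 with a single branch.  */
  return i = (unsigned) a <= 110 ? table[a] : 77;
}
```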

Wilco



[PATCH] rs6000: Rename insn_chain_scanned_p to spe_insn_chain_scanned_p

2016-04-28 Thread Segher Boessenkool
This makes it clearer this field is only for SPE.  Committing.


Segher


2016-04-28  Segher Boessenkool  

* config/rs6000/rs6000.c (machine_function): Rename
insn_chain_scanned_p to spe_insn_chain_scanned_p.
(rs6000_stack_info): Adjust.

---
 gcc/config/rs6000/rs6000.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index c351aa6..d0ebdb1d 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -130,7 +130,7 @@ typedef struct rs6000_stack {
 typedef struct GTY(()) machine_function
 {
   /* Whether the instruction chain has been scanned already.  */
-  int insn_chain_scanned_p;
+  int spe_insn_chain_scanned_p;
   /* Flags if __builtin_return_address (n) with n >= 1 was used.  */
   int ra_needs_full_frame;
   /* Flags if __builtin_return_address (0) was used.  */
@@ -23387,10 +23387,10 @@ rs6000_stack_info (void)
   if (TARGET_SPE)
 {
   /* Cache value so we don't rescan instruction chain over and over.  */
-  if (cfun->machine->insn_chain_scanned_p == 0)
-   cfun->machine->insn_chain_scanned_p
+  if (cfun->machine->spe_insn_chain_scanned_p == 0)
+   cfun->machine->spe_insn_chain_scanned_p
  = spe_func_has_64bit_regs_p () + 1;
-  info_ptr->spe_64bit_regs_used = cfun->machine->insn_chain_scanned_p - 1;
+  info_ptr->spe_64bit_regs_used = cfun->machine->spe_insn_chain_scanned_p - 1;
 }
 
   /* Select which calling sequence.  */
-- 
1.9.3



Re: [PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.

2016-04-28 Thread Joern Wolfgang Rennecke



On 28/04/16 18:10, Claudiu Zissulescu wrote:

Please find the updated patch.

Claudiu

gcc/
2016-04-28  Claudiu Zissulescu  

* config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define.
* config/arc/arc.md (umulhisi3): Use arc_short_operand predicate.
(umulhisi3_imm): Update predicates and constraint letters.
(umulhisi3_reg): Declare instruction as commutative.
* config/arc/constraints.md (U12, U16): New constraints.

I'm not sure how to feel about this.  U16 looks intuitive, but we have
traditionally used U for memory constraints.  And we use it for ARC
for that purpose, too, even though with a compatible constraint
length of 3.
I suppose it's fine if you're sure we never want to have an addressing
mode that's best described with "12" or "16", or some other number
we might want for an unsigned integer.

Otherwise, I'd suggest using a traditional integer letter.  'J' is free.
  
  (define_expand "umulhisi3"

[(set (match_operand:SI 0 "register_operand"   "")
(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand"  ""))
-(zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""))))]
+(zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""))))]
"TARGET_MPYW"
"{
  if (CONSTANT_P (operands[2]))
  {
-  emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
-  DONE;
+ emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
+ DONE;

Why do you remove half of the indentation?


Re: [patch] cleanup *finish_omp_clauses

2016-04-28 Thread Cesar Philippidis
On 04/28/2016 01:56 AM, Jakub Jelinek wrote:
> On Wed, Apr 27, 2016 at 07:37:17PM -0700, Cesar Philippidis wrote:
>> This patch replaces all of the bool arguments to c_finish_omp_clauses and
>> finish_omp_clauses in the c and c++ front ends, respectively. Right now
>> there are three bool arguments, one for is_omp/allow_fields,
>> declare_simd and is_cilk, the latter two have default values set.
>> OpenACC will require some special handling in *finish_omp_clauses in the
>> near future, too, so rather than add an is_oacc argument, I introduced
>> an enum c_omp_region_type, similar to the one in gimplify.c.
>>
>> Is this patch ok for trunk? I'll make use of C_ORT_ACC shortly in a
>> follow up patch.
> 
> I've been long wanting to use just tree_code there, but as we don't have one
> e.g. for DECLARE_SIMD, perhaps a separate enum is better.
> 
>> --- a/gcc/c-family/c-common.h
>> +++ b/gcc/c-family/c-common.h
>> @@ -1261,6 +1261,17 @@ enum c_omp_clause_split
>>C_OMP_CLAUSE_SPLIT_TASKLOOP = C_OMP_CLAUSE_SPLIT_FOR
>>  };
>>  
>> +enum c_omp_region_type
>> +{
>> +  C_ORT_NONE= 0,
>> +  C_ORT_OMP = 1 << 0,
>> +  C_ORT_SIMD= 1 << 1,
>> +  C_ORT_CILK= 1 << 2,
>> +  C_ORT_ACC = 1 << 3,
>> +  C_ORT_OMP_SIMD= C_ORT_OMP | C_ORT_SIMD,
>> +  C_ORT_OMP_CILK= C_ORT_OMP | C_ORT_CILK
> 
> That said, the above names are just weird, it is non-obvious
> what they mean at all.  What is C_ORT_NONE for?  We surely don't
> have any clauses that aren't OpenMP, nor Cilk+, nor OpenACC
> (ok, maybe the simd attribute, but donno if it ever calls the
> *finish_omp_clauses functions).

*parser_cilk_for was just passing is_omp/allow_fields = false.

> So, IMHO the originating specification should be one thing, so
> C_ORT_OMP, C_ORT_CILK, C_ORT_ACC.
> And, beyond that the C FE cares about whether it is a clause
> on #pragma omp declare simd or its Cilk+ counterpart (vector attribute),
> so you want C_ORT_DECLARE_SIMD possibly ored with the language
> (note, not C_ORT_SIMD, that is way too confusing - we have
> #pragma omp simd (OpenMP), #pragma simd (Cilk+), and we certainly do not
> want the declare simd behavior for those.
> Perhaps #pragma omp declare target is another construct that is handled
> differently and could be visible in the bitmask too.
> 
> The C++ finish_omp_clauses also cares about whether fields (meaning
> non-static members) are allowed, i.e. whether
> struct S { int p; void foo () {
> ...
> #pragma omp ... private (p)
> ...
> }};
> should be allowed or not.  That can be derived from the language and
> the other construct bits though, I believe right now only OpenMP constructs
> should handle it, and declare simd should not, and similarly declare target
> should not.

That sounds reasonable. This patch removes C_ORT_OMP_SIMD and
C_ORT_OMP_CILK from the enum and introduces C_ORT_DECLARE_SIMD.
Additionally, instead of taking an enum c_omp_region_type argument,
*finish_omp_clauses just takes in an unsigned int bitmask. That way the
enum doesn't get overly complicated. Note that I didn't include an
allow_fields entry for c++ in the enum, since that information can be
derived from C_ORT_OMP anyway.
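
As a rough sketch of how a caller-side check against the bitmask might read: the enum values below match the patch, but the helper ort_allows_fields is hypothetical and only reflects my reading of the rule described above (OpenMP constructs allow fields, declare simd does not).

```c
#include <stdbool.h>

enum c_omp_region_type
{
  C_ORT_OMP          = 1 << 0,
  C_ORT_CILK         = 1 << 1,
  C_ORT_ACC          = 1 << 2,
  C_ORT_DECLARE_SIMD = 1 << 3
};

/* Hypothetical helper, not part of the patch: non-static members
   are accepted for OpenMP clauses unless the clauses belong to a
   declare simd construct.  */
static bool
ort_allows_fields (unsigned int ort)
{
  return (ort & C_ORT_OMP) != 0 && (ort & C_ORT_DECLARE_SIMD) == 0;
}
```

A call site for #pragma omp declare simd would then pass C_ORT_OMP | C_ORT_DECLARE_SIMD, and the OpenACC parsers would pass C_ORT_ACC.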

Is this ok for trunk?

Cesar

2016-04-28  Cesar Philippidis  

	gcc/c-family/
	* c-common.h (enum c_omp_region_type): Define.

	gcc/c/
	* c-parser.c (c_parser_oacc_all_clauses): Update call to
	c_finish_omp_clauses.
	(c_parser_omp_all_clauses): Likewise.
	(c_parser_oacc_cache): Likewise.
	(c_parser_oacc_loop): Likewise.
	(omp_split_clauses): Likewise.
	(c_parser_omp_declare_target): Likewise.
	(c_parser_cilk_all_clauses): Likewise.
	(c_parser_cilk_for): Likewise.
	* c-typeck.c (c_finish_omp_clauses): Replace bool arguments
	is_omp, declare_simd, and is_cilk with bitmask ort.

	gcc/cp/
	* cp-tree.h (finish_omp_clauses): Update prototype.
	* parser.c (cp_parser_oacc_all_clauses): Update call to
	finish_omp_clauses.
	(cp_parser_omp_all_clauses): Likewise.
	(cp_parser_omp_for_loop): Likewise.
	(cp_omp_split_clauses): Likewise.
	(cp_parser_oacc_cache): Likewise.
	(cp_parser_oacc_loop): Likewise.
	(cp_parser_omp_declare_target): Likewise.
	(cp_parser_cilk_simd_all_clauses): Likewise.
	(cp_parser_cilk_for): Likewise.
	* pt.c (tsubst_attribute): Likewise.
	(tsubst_omp_clauses): Likewise.
	(tsubst_omp_for_iterator): Likewise.
	* semantics.c (finish_omp_clauses): Replace bool arguments
	allow_fields, declare_simd, and is_cilk with bitmask ort.
	(finish_omp_for): Update call to finish_omp_clauses.


diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 3a7805f..3473e66 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1261,6 +1261,14 @@ enum c_omp_clause_split
   C_OMP_CLAUSE_SPLIT_TASKLOOP = C_OMP_CLAUSE_SPLIT_FOR
 };
 
+enum c_omp_region_type
+{
+  C_ORT_OMP		= 1 << 0,
+  C_ORT_CILK		= 1 << 1,
+  C_ORT_ACC		= 1 << 2,
+  C_ORT_DECLARE_SIMD	= 1 << 3
+};
+
 extern tree c_finish_omp_master (location_t, tree);
 extern tree 

[PATCH] sbitmap: Remove popcount

2016-04-28 Thread Segher Boessenkool
In r193072 sbitmap_popcount was removed, so we cannot ask for the popcount
of an sbitmap anymore.  Nothing calls sbitmap_alloc_with_popcount either.
This patch removes everything else popcount-related from sbitmap.

Tested on powerpc64-linux; is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

* cfganal.c (bitmap_intersection_of_succs): Delete assert checking
dst->popcount.
(bitmap_intersection_of_preds): Ditto.
(bitmap_union_of_succs): Ditto.
(bitmap_union_of_preds): Ditto.
* sbitmap.c (do_popcount): Delete.
(BITMAP_DEBUGGING): Delete.
(sbitmap_verify_popcount): Delete.
(sbitmap_alloc): Don't initialize the popcount field.
(sbitmap_alloc_with_popcount): Delete.
(sbitmap_resize): Don't resize the popcount array.
(sbitmap_vector_alloc): Don't initialize the popcount field.
(bitmap_copy): Don't copy the popcount array.
(bitmap_clear): Don't clear the popcount array.
(bitmap_clear): Delete the popcount array handling.
(bitmap_ior_and_compl): Delete the popcount assert.
(bitmap_not): Ditto.
(bitmap_and_compl): Ditto.
(bitmap_and): Delete the popcount array handling.
(bitmap_xor): Ditto.
(bitmap_ior): Ditto.
(bitmap_or_and): Delete the popcount assert.
(bitmap_and_or): Ditto.
(popcount_table): Delete.
(sbitmap_elt_popcount): Delete.
* sbitmap.h (simple_bitmap_def): Delete the popcount field.
(bitmap_set_bit): Delete the popcount assert.
(bitmap_clear_bit): Ditto.
(sbitmap_free): Don't free the popcount array.
(sbitmap_alloc_with_popcount): Delete declaration.
(sbitmap_popcount): Ditto.

---
 gcc/cfganal.c |   8 ---
 gcc/sbitmap.c | 167 ++
 gcc/sbitmap.h |   6 ---
 3 files changed, 5 insertions(+), 176 deletions(-)

diff --git a/gcc/cfganal.c b/gcc/cfganal.c
index bf9866b..189762c 100644
--- a/gcc/cfganal.c
+++ b/gcc/cfganal.c
@@ -1378,8 +1378,6 @@ bitmap_intersection_of_succs (sbitmap dst, sbitmap *src, basic_block b)
   edge e;
   unsigned ix;
 
-  gcc_assert (!dst->popcount);
-
   for (e = NULL, ix = 0; ix < EDGE_COUNT (b->succs); ix++)
 {
   e = EDGE_SUCC (b, ix);
@@ -1419,8 +1417,6 @@ bitmap_intersection_of_preds (sbitmap dst, sbitmap *src, basic_block b)
   edge e;
   unsigned ix;
 
-  gcc_assert (!dst->popcount);
-
   for (e = NULL, ix = 0; ix < EDGE_COUNT (b->preds); ix++)
 {
   e = EDGE_PRED (b, ix);
@@ -1460,8 +1456,6 @@ bitmap_union_of_succs (sbitmap dst, sbitmap *src, basic_block b)
   edge e;
   unsigned ix;
 
-  gcc_assert (!dst->popcount);
-
   for (ix = 0; ix < EDGE_COUNT (b->succs); ix++)
 {
   e = EDGE_SUCC (b, ix);
@@ -1501,8 +1495,6 @@ bitmap_union_of_preds (sbitmap dst, sbitmap *src, basic_block b)
   edge e;
   unsigned ix;
 
-  gcc_assert (!dst->popcount);
-
   for (ix = 0; ix < EDGE_COUNT (b->preds); ix++)
 {
   e = EDGE_PRED (b, ix);
diff --git a/gcc/sbitmap.c b/gcc/sbitmap.c
index 87e5c51..10b4347 100644
--- a/gcc/sbitmap.c
+++ b/gcc/sbitmap.c
@@ -22,25 +22,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "sbitmap.h"
 
-/* This suffices for roughly 99% of the hosts we run on, and the rest
-   don't have 256 bit integers.  */
-#if SBITMAP_ELT_BITS > 255
-#error Need to increase size of datatype used for popcount
-#endif
-
-#if GCC_VERSION >= 3400
-#  if SBITMAP_ELT_BITS == HOST_BITS_PER_LONG
-#define do_popcount(x) __builtin_popcountl (x)
-#  elif SBITMAP_ELT_BITS == HOST_BITS_PER_LONGLONG
-#define do_popcount(x) __builtin_popcountll (x)
-#  else
-#error "internal error: sbitmap.h and hwint.h are inconsistent"
-#  endif
-#else
-static unsigned long sbitmap_elt_popcount (SBITMAP_ELT_TYPE);
-#  define do_popcount(x) sbitmap_elt_popcount (x)
-#endif
-
 typedef SBITMAP_ELT_TYPE *sbitmap_ptr;
 typedef const SBITMAP_ELT_TYPE *const_sbitmap_ptr;
 
@@ -51,29 +32,6 @@ static inline unsigned int sbitmap_size_bytes (const_sbitmap 
map)
return map->size * sizeof (SBITMAP_ELT_TYPE);
 }
 
-/* This macro controls debugging that is as expensive as the
-   operations it verifies.  */
-
-/* #define BITMAP_DEBUGGING  */
-#ifdef BITMAP_DEBUGGING
-
-/* Verify the population count of sbitmap A matches the cached value,
-   if there is a cached value. */
-
-static void
-sbitmap_verify_popcount (const_sbitmap a)
-{
-  unsigned ix;
-  unsigned int lastword;
-
-  if (!a->popcount)
-return;
-
-  lastword = a->size;
-  for (ix = 0; ix < lastword; ix++)
-gcc_assert (a->popcount[ix] == do_popcount (a->elms[ix]));
-}
-#endif
 
 /* Bitmap manipulation routines.  */
 
@@ -92,17 +50,6 @@ sbitmap_alloc (unsigned int n_elms)
   bmap = (sbitmap) xmalloc (amt);
   bmap->n_bits = n_elms;
   bmap->size = size;
-  bmap->popcount = NULL;
-  return bmap;
-}
-
-/* Allocate a simple bitmap 

Re: [PATCHv2 3/7] gcc/arc: convert some constraints to define_constraint

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:

* config/arc/constraints.md (Usd): Convert to define_constraint.
(Us<): Likewise.
(Us>): Likewise.

Thanks.  I have applied this patch.


Re: [PATCHv2 1/7] gcc/arc: Add support for nps400 cpu type.

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:

The nps400 is an arc700 with a set of extension instructions produced by
Mellanox (formerly EZChip).  This commit adds support for the nps400
architecture to the arc backend.

After this commit it is possible to compile using -mcpu=nps400 in order
to specialise for the nps400.  Later commits add support for the
specific extension instructions.

gcc/ChangeLog:

* common/config/arc/arc-common.c (arc_handle_option): Add NPS400
support, setup defaults.
* config/arc/arc-opts.h (enum processor_type): Add NPS400.
* config/arc/arc.c (arc_init): Add NPS400 support.
* config/arc/arc.h (CPP_SPEC): Add NPS400 defines.
(TARGET_ARC700): NPS400 is also an ARC700.
* config/arc/arc.opt: Add NPS400 options to -mcpu=.

Thanks.  I have applied this patch.


Re: [PATCHv2 2/7] gcc/arc: Replace rI constraint with r & Cm2 for ld and update insns

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:
  


* config/arc/arc.md (*loadqi_update): Replace use of 'rI'
constraint with separate 'r' and 'Cm2' constraints.


Why don't you simply use rCm2?


[PATCH] [ARC] Fix unwanted match for sign extend 16-bit constant.

2016-04-28 Thread Claudiu Zissulescu
Please find the updated patch.

Claudiu

gcc/
2016-04-28  Claudiu Zissulescu  

* config/arc/arc.h (UNSIGNED_INT12, UNSIGNED_INT16): Define.
* config/arc/arc.md (umulhisi3): Use arc_short_operand predicate.
(umulhisi3_imm): Update predicates and constraint letters.
(umulhisi3_reg): Declare instruction as commutative.
* config/arc/constraints.md (U12, U16): New constraints.
* config/arc/predicates.md (short_unsigned_const_operand): New
predicate.
(arc_short_operand): Likewise.
* testsuite/gcc.target/arc/umulsihi3_z.c: New file.
---
 gcc/config/arc/arc.h   |  2 ++
 gcc/config/arc/arc.md  | 14 +++---
 gcc/config/arc/constraints.md  | 11 +++
 gcc/config/arc/predicates.md   |  8 
 gcc/testsuite/gcc.target/arc/umulsihi3_z.c | 23 +++
 5 files changed, 51 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/umulsihi3_z.c

diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 37c1afa..1b75099 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -795,6 +795,8 @@ extern enum reg_class arc_regno_reg_class[];
 #define UNSIGNED_INT6(X) ((unsigned) (X) < 0x40)
 #define UNSIGNED_INT7(X) ((unsigned) (X) < 0x80)
 #define UNSIGNED_INT8(X) ((unsigned) (X) < 0x100)
+#define UNSIGNED_INT12(X) ((unsigned) (X) < 0x800)
+#define UNSIGNED_INT16(X) ((unsigned) (X) < 0x10000)
 #define IS_ONE(X) ((X) == 1)
 #define IS_ZERO(X) ((X) == 0)
 
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 8ec0ce0..e0f74e4 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -1720,21 +1720,21 @@
 (define_expand "umulhisi3"
   [(set (match_operand:SI 0 "register_operand"   "")
(mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand"  ""))
-(zero_extend:SI (match_operand:HI 2 "nonmemory_operand" ""]
+(zero_extend:SI (match_operand:HI 2 "arc_short_operand" ""]
   "TARGET_MPYW"
   "{
 if (CONSTANT_P (operands[2]))
 {
-  emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
-  DONE;
+ emit_insn (gen_umulhisi3_imm (operands[0], operands[1], operands[2]));
+ DONE;
 }
   }"
 )
 
 (define_insn "umulhisi3_imm"
-  [(set (match_operand:SI 0 "register_operand"  "=r, 
r,r,  r,  r")
-   (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" " 0, 
r,0,  0,  r"))
-(match_operand:HI 2 "short_const_int_operand"  " L, 
L,I,C16,C16")))]
+  [(set (match_operand:SI 0 "register_operand"  "=r, 
r,  r,  r,  r")
+   (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "%0, r, 
 0,  0,  r"))
+(match_operand:HI 2 "short_unsigned_const_operand" " L, 
L,U12,U16,U16")))]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
   [(set_attr "length" "4,4,4,8,8")
@@ -1746,7 +1746,7 @@
 
 (define_insn "umulhisi3_reg"
   [(set (match_operand:SI 0 "register_operand"  
"=Rcqq, r, r")
-   (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "0, 
0, r"))
+   (mult:SI (zero_extend:SI (match_operand:HI 1 "register_operand" "   %0, 
0, r"))
 (zero_extend:SI (match_operand:HI 2 "register_operand" " Rcqq, 
r, r"]
   "TARGET_MPYW"
   "mpyuw%? %0,%1,%2"
diff --git a/gcc/config/arc/constraints.md b/gcc/config/arc/constraints.md
index 668b60a..cdf94ef 100644
--- a/gcc/config/arc/constraints.md
+++ b/gcc/config/arc/constraints.md
@@ -427,3 +427,14 @@
   "A memory with only a base register"
   (match_operand 0 "mem_noofs_operand"))
 
+(define_constraint "U12"
+  "@internal
+   An unsigned 12-bit integer constant."
+  (and (match_code "const_int")
+   (match_test "UNSIGNED_INT12 (ival)")))
+
+(define_constraint "U16"
+  "@internal
+   An unsigned 16-bit integer constant"
+  (and (match_code "const_int")
+   (match_test "UNSIGNED_INT16 (ival)")))
diff --git a/gcc/config/arc/predicates.md b/gcc/config/arc/predicates.md
index 3c657c6..9542b22 100644
--- a/gcc/config/arc/predicates.md
+++ b/gcc/config/arc/predicates.md
@@ -819,3 +819,11 @@
 (define_predicate "double_register_operand"
   (ior (match_test "even_register_operand (op, mode)")
(match_test "arc_double_register_operand (op, mode)")))
+
+(define_predicate "short_unsigned_const_operand"
+  (and (match_code "const_int")
+   (match_test "satisfies_constraint_U16 (op)")))
+
+(define_predicate "arc_short_operand"
+  (ior (match_test "register_operand (op, mode)")
+   (match_test "short_unsigned_const_operand (op, mode)")))
diff --git a/gcc/testsuite/gcc.target/arc/umulsihi3_z.c 
b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
new file mode 100644
index 000..cf1c00d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/umulsihi3_z.c
@@ -0,0 +1,23 @@
+/* Check if the 

[PATCH] tracer: Make bb_seen static

2016-04-28 Thread Segher Boessenkool
bb_seen is not used outside of tracer.c.  Committing as trivial.


2016-04-28  Segher Boessenkool

* tracer.c (bb_seen): Make static.

---
 gcc/tracer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tracer.c b/gcc/tracer.c
index 56788a2..477d8b3 100644
--- a/gcc/tracer.c
+++ b/gcc/tracer.c
@@ -65,7 +65,7 @@ static int branch_ratio_cutoff;
 
 /* A bit BB->index is set if BB has already been seen, i.e. it is
connected to some trace already.  */
-sbitmap bb_seen;
+static sbitmap bb_seen;
 
 static inline void
 mark_bb_seen (basic_block bb)
-- 
1.9.3



Re: [PATCHv2 0/7] ARC: Add support for nps400 variant

2016-04-28 Thread Joern Wolfgang Rennecke



On 28/04/16 16:31, Joern Wolfgang Rennecke wrote:


However, setting defaults and multilib sets at gcc configure time is
also quite useful, as otherwise every user is confronted with building
multilibs for a burgeoning array of variants.
P.S.: One way to do this is to introduce a new macro, SUBTARGET_SELF_SPECS,
to augment DRIVER_SELF_SPECS, and use config.gcc to set this in tm_defines to


"%{!mcpu:-mcpu=nps400}" for this subtarget

to set the default cpu.
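
As a rough illustration of what such a self spec achieves, the C sketch
below models the effect of "%{!mcpu:-mcpu=nps400}": if the user selected no
CPU, a default is added.  The helper name default_cpu_option is hypothetical
and this is not how the driver is implemented; the sketch also assumes that
any argument starting with "-mcpu=" counts as a CPU selection.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the option a default-CPU self spec would
   add to the command line, or NULL if the user already chose a CPU.  */
const char *
default_cpu_option (int argc, const char *const *argv)
{
  assert (argc == 0 || argv != NULL);
  for (int i = 0; i < argc; i++)
    if (strncmp (argv[i], "-mcpu=", 6) == 0)
      return NULL;              /* explicit -mcpu= wins, add nothing */
  return "-mcpu=nps400";        /* subtarget default from the message */
}
```

With the arguments "gcc -O2 foo.c" the helper returns "-mcpu=nps400"; with
"gcc -mcpu=arc700 foo.c" it returns NULL.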





Re: [PATCH] Improve detection of constant conditions during jump threading

2016-04-28 Thread Jeff Law

On 04/20/2016 03:02 AM, Richard Biener wrote:

On Tue, Apr 19, 2016 at 7:50 PM, Patrick Palka  wrote:

This patch makes the jump threader look through the BIT_AND_EXPRs and
BIT_IOR_EXPRs within a condition so that we could find dominating
ASSERT_EXPRs that could help make the overall condition evaluate to a
constant.  For example, we currently don't perform any jump threading in
the following test case even though it's known that if the code calls
foo() then it can't possibly call bar() afterwards:
I'd always envisioned we'd do more simplifications than we're doing now 
and this fits well within how I expected to exploit ASSERT_EXPRs and 
DOM's available expressions/const/copies tables.


However, I do have some long term direction plans that may make how we 
do this change a bit.  In the mean time I don't see a reason not to go 
forward with your change.






void
baz_1 (int a, int b, int c)
{
  if (a && b)
    foo ();
  if (!b && c)
    bar ();
}

   :
   _4 = a_3(D) != 0;
   _6 = b_5(D) != 0;
   _7 = _4 & _6;
   if (_7 != 0)
 goto ;
   else
 goto ;

   :
   b_15 = ASSERT_EXPR ;
   foo ();

   :
   _10 = b_5(D) == 0;
   _12 = c_11(D) != 0;
   _13 = _10 & _12;
   if (_13 != 0)
 goto ;
   else
 goto ;

   :
   bar ();

   :
   return;

So here we miss a jump threading opportunity that would have made bb 3 jump
straight to bb 6 instead of falling through to bb 4.

If we inspect the operands of the BIT_AND_EXPR of _13 we'll notice that
there is an ASSERT_EXPR that says its left operand b_5 is non-zero.  We
could use this ASSERT_EXPR to deduce that the condition (_13 != 0) is
always false.  This is what this patch does, basically by making
simplify_control_stmt_condition recurse into BIT_AND_EXPRs and
BIT_IOR_EXPRs.
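
To make the intended effect concrete, here is a minimal C sketch of baz_1
next to a hand-threaded variant, with a check that both make the same
calls.  The counters, baz_1_threaded and check_all_inputs are made up for
illustration; the threaded variant is written by hand and is not compiler
output.

```c
#include <assert.h>

/* Counters stand in for the side effects of foo () and bar ().  */
static int foo_calls, bar_calls;
static void foo (void) { foo_calls++; }
static void bar (void) { bar_calls++; }

/* The test case from the message.  */
static void
baz_1 (int a, int b, int c)
{
  if (a && b)
    foo ();
  if (!b && c)
    bar ();
}

/* Hand-threaded variant: once foo () has been called, b is known to be
   nonzero, so "!b && c" is false and the second test can be skipped.  */
static void
baz_1_threaded (int a, int b, int c)
{
  if (a && b)
    {
      foo ();
      return;
    }
  if (!b && c)
    bar ();
}

/* Compare both variants on every 0/1 combination of a, b and c.
   Returns the number of combinations checked.  */
int
check_all_inputs (void)
{
  int checked = 0;
  for (int bits = 0; bits < 8; bits++)
    {
      int a = bits & 1, b = (bits >> 1) & 1, c = (bits >> 2) & 1;

      foo_calls = bar_calls = 0;
      baz_1 (a, b, c);
      int foo_ref = foo_calls, bar_ref = bar_calls;

      foo_calls = bar_calls = 0;
      baz_1_threaded (a, b, c);

      assert (foo_ref == foo_calls && bar_ref == bar_calls);
      checked++;
    }
  return checked;
}
```

Only 0 and 1 are tried because the conditions depend solely on whether each
argument is zero.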

Does this seem like a good idea/approach?
So the other approach I've been pondering for a while is backwards 
substitution.


So given _13 != 0, we expand that to

(_10 & _12) != 0

Which further expands into
((b_5 == 0) & (c_11 != 0)) != 0

And we follow b_5 back to the ASSERT_EXPR which allows us to start 
simplifying terms.
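
A toy C model of that substitution, assuming a three-valued result
(false, true, unknown) and that all we track per SSA name is "zero",
"nonzero" or "anything".  All names here are hypothetical and this is not
how the threader or VRP represent ranges.

```c
#include <assert.h>

typedef enum { V_ZERO, V_NONZERO, V_ANY } known;   /* fact about a name */
typedef enum { K_FALSE, K_TRUE, K_UNKNOWN } tri;   /* folded condition */

/* Fold "v == 0" using what is known about v.  */
tri
eq_zero (known v)
{
  if (v == V_ZERO)
    return K_TRUE;
  if (v == V_NONZERO)
    return K_FALSE;
  return K_UNKNOWN;
}

/* Fold "v != 0" using what is known about v.  */
tri
ne_zero (known v)
{
  if (v == V_ZERO)
    return K_FALSE;
  if (v == V_NONZERO)
    return K_TRUE;
  return K_UNKNOWN;
}

/* Fold "x & y" on boolean operands: one false operand decides it.  */
tri
tri_and (tri x, tri y)
{
  if (x == K_FALSE || y == K_FALSE)
    return K_FALSE;
  if (x == K_TRUE && y == K_TRUE)
    return K_TRUE;
  return K_UNKNOWN;
}

/* _13 != 0 expanded to ((b_5 == 0) & (c_11 != 0)) != 0.  */
tri
cond_13 (known b_5, known c_11)
{
  return tri_and (eq_zero (b_5), ne_zero (c_11));
}

/* The deduction from the message: with b_5 known to be nonzero, the
   condition is false whatever c_11 is.  */
void
check_message_example (void)
{
  assert (cond_13 (V_NONZERO, V_ANY) == K_FALSE);
}
```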



The glitch in that plan is there is no easy linkage between the use of 
b_5 in bb4 and the ASSERT_EXPR in bb3.  That's something Aldy, Andrew 
and myself are looking at independently for some of Aldy's work.


But that's all future work...  Back to your patch...



Notes:

1. This patch introduces a "regression" in gcc.dg/tree-ssa/ssa-thread-11.c
in that we no longer perform FSM threading during vrp2 but instead we
detect two new jump threading opportunities during vrp1.  Not sure if
the new code is better but it is shorter.  I wonder how this should be
resolved...


Try adjusting the testcase so that it performs the FSM threading again
or adjust the expected outcome...
Right.  We just need to look closely at the before/after dumps and make a 
decision about whether the result is better or worse.  If it's better, 
then we adjust the output to the new better result (and I would claim 
that the same threading, but done earlier, is better).


Shorter isn't generally a good indicator of whether or not something is 
better.  The thing to look at is the number of conditionals executed on 
the various paths through the CFG.
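
As a small sketch of that metric, the C code below counts the conditional
jumps executed for baz_1, following the block numbering used in the
message (bb 2 and bb 4 each end in one conditional; threading lets bb 3 go
straight to bb 6).  The function names are invented for the example and
the counts are derived by hand from the dump, not measured.

```c
#include <assert.h>

/* Before threading: bb 2 and bb 4 are both executed on every path.  */
int
conds_before (int a, int b, int c)
{
  (void) a; (void) b; (void) c;
  return 2;
}

/* After threading: the path through foo () skips bb 4 entirely.  */
int
conds_after (int a, int b, int c)
{
  (void) c;
  return (a != 0 && b != 0) ? 1 : 2;
}

/* Sum a per-call count over all 0/1 combinations of a, b and c.  */
int
total_conditionals (int (*count) (int, int, int))
{
  int total = 0;
  assert (count);
  for (int bits = 0; bits < 8; bits++)
    total += count (bits & 1, (bits >> 1) & 1, (bits >> 2) & 1);
  return total;
}
```

Over the eight input combinations this gives 16 conditionals before and 14
after, since only the two inputs with both a and b nonzero take the
shortened path.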


In this specific instance, there's a good chance your analysis is 
catching something earlier and allowing it to be better simplified.  But 
let's do the analysis to make sure.





2. I haven't tested the performance impact of this patch.  What would be
a good way to do this?
I have ways to do that.  Jump threading inherently is about reducing the 
number of conditionals we have to evaluate at runtime.   Valgrind can 
give us that information.  So what I do is...


Build a control compiler.  Use that to build an older version of gcc 
(gcc-4.7.3 in particular).  I then use the just-built gcc-4.7.3 to 
build a bunch of .i files under valgrind control.


I then repeat, but the first compiler has whatever patch I want to 
evaluate installed.


I can then compare the output from valgrind to see which has better 
branching behaviour.


It takes a few hours, but it's not terrible and has provided good data 
through the years.  I'll throw your patch into the tester and see what 
it spits out.


I'll also have a look at the details of the patch.

jeff


[PATCH, i386]: "mov $0, reg" and "mov $-1, reg" peepholes cleanup

2016-04-28 Thread Uros Bizjak
No functional changes.

2016-04-28  Uros Bizjak  

* config/i386/i386.md (zeroing peephole2): Use general_reg_operand.
(or $-1,reg peephole2): Ditto.
(strict_low_part zeroing peephole2): Use SWI12 mode iterator.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 235561)
+++ i386.md (working copy)
@@ -18120,13 +18120,12 @@
copy_rtx (operands[2]));
 })
 
-;; Attempt to always use XOR for zeroing registers.
+;; Attempt to always use XOR for zeroing registers (including FP modes).
 (define_peephole2
-  [(set (match_operand 0 "register_operand")
+  [(set (match_operand 0 "general_reg_operand")
(match_operand 1 "const0_operand"))]
   "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
&& (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
-   && GENERAL_REGNO_P (REGNO (operands[0]))
&& peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0) (const_int 0))
  (clobber (reg:CC FLAGS_REG))])]
@@ -18133,11 +18132,9 @@
   "operands[0] = gen_lowpart (word_mode, operands[0]);")
 
 (define_peephole2
-  [(set (strict_low_part (match_operand 0 "register_operand"))
+  [(set (strict_low_part (match_operand:SWI12 0 "general_reg_operand"))
(const_int 0))]
-  "(GET_MODE (operands[0]) == QImode
-|| GET_MODE (operands[0]) == HImode)
-   && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
+  "(! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
&& peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (strict_low_part (match_dup 0)) (const_int 0))
  (clobber (reg:CC FLAGS_REG))])])
@@ -18144,10 +18141,9 @@
 
 ;; For HI, SI and DI modes, or $-1,reg is smaller than mov $-1,reg.
 (define_peephole2
-  [(set (match_operand:SWI248 0 "register_operand")
+  [(set (match_operand:SWI248 0 "general_reg_operand")
(const_int -1))]
-  "(optimize_insn_for_size_p () || TARGET_MOVE_M1_VIA_OR)
-   && GENERAL_REGNO_P (REGNO (operands[0]))
+  "(TARGET_MOVE_M1_VIA_OR || optimize_insn_for_size_p ())
&& peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0) (const_int -1))
  (clobber (reg:CC FLAGS_REG))])]


Re: [PATCH] x86 interrupt attribute patch [1/2]

2016-04-28 Thread Jeff Law

On 04/20/2016 07:48 AM, Koval, Julia wrote:

Sorry, here is the right patch.

-Original Message-
From: Koval, Julia
Sent: Wednesday, April 20, 2016 4:42 PM
To: 'gcc-patches@gcc.gnu.org' 
Cc: Lu, Hongjiu ; 'vaalfr...@gmail.com' ; 
'ubiz...@gmail.com' ; 'l...@redhat.com' ; Zamyatin, Igor 

Subject: [PATCH] x86 interrupt attribute patch [1/2]

Hi,
Here is the new version of the interrupt attribute patch. Bootstrapped/regtested for 
Linux/x86_64. Ok for trunk?

Update TARGET_FUNCTION_INCOMING_ARG documentation

On x86, interrupt handlers are only called by processors which push
interrupt data onto stack at the address where the normal return address
is.  Since interrupt handlers must access interrupt data via pointers so
that they can update interrupt data, the pointer argument is passed as
"argument pointer - word".

TARGET_FUNCTION_INCOMING_ARG defines how callee sees its argument.
Normally it returns REG, NULL, or CONST_INT.  This patch adds arbitrary
address computation based on hard register, which can be forced into a
register, to the list.

When copying an incoming argument onto stack, assign_parm_setup_stack
has:

if (argument in memory)
  copy argument in memory to stack
else
  move argument to stack

Since an arbitrary address computation may be passed as an argument, we
change it to:

if (argument in memory)
  copy argument in memory to stack
else
  {
    if (argument isn't in register)
      force argument into a register
    move argument to stack
  }
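
The revised control flow can be modelled in a few lines of C.  This is
only a sketch of the decision logic in the pseudocode above: the enum
names and stack_setup_steps are hypothetical, and the real
assign_parm_setup_stack works on RTL rather than on flags.

```c
#include <assert.h>

/* Hypothetical classification of where the incoming argument lives.  */
typedef enum
{
  SRC_IN_MEMORY,
  SRC_IN_REGISTER,
  SRC_ADDRESS_COMPUTATION   /* e.g. "argument pointer - word" */
} src_kind;

/* Steps the setup code would take, as a bit mask.  */
enum
{
  STEP_COPY_FROM_MEMORY = 1,
  STEP_FORCE_INTO_REGISTER = 2,
  STEP_MOVE_TO_STACK = 4
};

unsigned
stack_setup_steps (src_kind src)
{
  unsigned steps = 0;

  assert (src <= SRC_ADDRESS_COMPUTATION);
  if (src == SRC_IN_MEMORY)
    steps |= STEP_COPY_FROM_MEMORY;
  else
    {
      /* New in the patch: anything that is not already a register is
         forced into one before the move.  */
      if (src != SRC_IN_REGISTER)
        steps |= STEP_FORCE_INTO_REGISTER;
      steps |= STEP_MOVE_TO_STACK;
    }
  return steps;
}
```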

* function.c (assign_parm_setup_stack): Force source into a
register if needed.
* target.def (function_incoming_arg): Update documentation to
allow arbitrary address computation based on hard register.
* doc/tm.texi: Regenerated.

So I think the function.c changes are fine.  But I think we need to do a 
tiny bit more on the documentation side before we can install the change.


While I think a rewrite of the whole argument passing section would be 
advisable, that may be a bit much to expect.  So let's try to just 
cleanup FUNCTION_INCOMING_ARG.


FUNCTION_INCOMING_ARG has text like "Define this hook if the target 
machine has register windows ..."


I'd change that text to be something like

"Define this hook if the caller and callee on the target have different 
views of where arguments are passed.  Also define this hook if there are 
functions that are never directly called, but are invoked by the 
hardware and which have nonstandard calling conventions."


Or something along those lines.


At one time I thought we'd want to specify how the cumulative args 
structure would or would not be updated for these special arguments. 
But after further reflection, I think that can be a target dependent 
implementation detail.




I think with that one documentation update this will be OK, but I would 
like you to repost it so I can look at it one final time.


jeff




Re: [PATCH] asan: Don't check frame numbers in the testsuite

2016-04-28 Thread Segher Boessenkool
On Thu, Apr 28, 2016 at 06:03:38PM +0200, Jakub Jelinek wrote:
> On Thu, Apr 28, 2016 at 03:57:38PM +, Segher Boessenkool wrote:
> > On various PowerPC configurations, the top frame is often mentioned
> > twice in the backtrace, making many asan tests fail.  I see no particular
> > reason the asan tests want to check the frame number, so this patch
> > makes it check for " #. " instead of " #1 ", etc., in all of the
> > c-c++-common/asan tests.
> 
> Wouldn't it be better to fix the backtrace stuff for PowerPC, so that
> the top frame is not mentioned twice when it shouldn't?

Yes, but should this cause over a hundred asan tests to fail?  The actual
things those tests test work fine, just the backtrace is a little funny
(and people are used to that, things have worked this way for as long as
I can remember -- maybe it even cannot be fixed?).

> I mean, if it annoys the tests (where IMHO it is still useful to test
> the numbers, to avoid e.g. having some unrelated frames being printed first
> or bugs like this to be caught), it will annoy users as well.

How about I leave #0 in place, only replace #1 etc. by #. ?
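
For illustration, here is what the "#." wildcard accepts, sketched with
POSIX extended regular expressions in C.  This is an assumption-laden
stand-in: dg-output patterns are Tcl regexps, not POSIX ones, and
line_matches is a made-up helper; for these simple patterns the two
dialects are expected to behave the same.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE contains a match for the POSIX extended regex
   PATTERN, 0 otherwise.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);

  assert (rc == 0);   /* sketch only: patterns are assumed to compile */
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A line such as "    #2 0x10000abc in main" matches "#. 0x[0-9a-f]+" but
not "#1 0x[0-9a-f]+".  Note that "." stands for exactly one character, so
a two-digit frame number such as "#10" would not match "#. ".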


Segher


Re: [PATCH] Add peephole for -Os lock; dec (PR target/70821)

2016-04-28 Thread Uros Bizjak
On Thu, Apr 28, 2016 at 5:42 PM, Jakub Jelinek  wrote:
> Hi!
>
> Optimizing atomic_fetch_add followed by comparison into just testing
> the flags of the lock; sub is handled by a peephole2, which works usually
> fine, except that for -Os we have another peephole2 that transforms
> movl $-1, %reg into orl $-1, %reg and that causes the above mentioned
> peephole2 not to trigger anymore.
>
> Fixed by adding a peephole2 even for this case.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2016-04-28  Jakub Jelinek  
>
> PR target/70821
> * config/i386/sync.md (define_peephole2 *atomic_fetch_add_cmp):
> Add new peephole2 where the first insn is *mov_or instead of
> *mov_internal.
>
> * gcc.target/i386/pr70821.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/sync.md.jj  2016-01-04 14:55:56.0 +0100
> +++ gcc/config/i386/sync.md 2016-04-28 09:40:28.265764880 +0200
> @@ -467,6 +467,36 @@ (define_peephole2
>(plus:SWI (match_dup 1)
>  (match_dup 2)))])])
>
> +;; Likewise, but for the -Os special case of *mov_or.
> +(define_peephole2
> +  [(parallel [(set (match_operand:SWI 0 "register_operand")
> +  (match_operand:SWI 2 "constm1_operand"))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 0)
> +  (unspec_volatile:SWI
> +[(match_operand:SWI 1 "memory_operand")
> + (match_operand:SI 4 "const_int_operand")]
> +UNSPECV_XCHG))
> + (set (match_dup 1)
> +  (plus:SWI (match_dup 1)
> +(match_dup 0)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (set (reg:CCZ FLAGS_REG)
> +   (compare:CCZ (match_dup 0)
> +(match_operand:SWI 3 "const_int_operand")))]
> +  "peep2_reg_dead_p (3, operands[0])
> +   && (unsigned HOST_WIDE_INT) INTVAL (operands[2])
> +  == -(unsigned HOST_WIDE_INT) INTVAL (operands[3])
> +   && !reg_overlap_mentioned_p (operands[0], operands[1])"
> +  [(parallel [(set (reg:CCZ FLAGS_REG)
> +  (compare:CCZ
> +(unspec_volatile:SWI [(match_dup 1) (match_dup 4)]
> + UNSPECV_XCHG)
> +(match_dup 3)))
> + (set (match_dup 1)
> +  (plus:SWI (match_dup 1)
> +(match_dup 2)))])])
> +
>  (define_insn "*atomic_fetch_add_cmp"
>[(set (reg:CCZ FLAGS_REG)
> (compare:CCZ
> --- gcc/testsuite/gcc.target/i386/pr70821.c.jj  2016-04-28 09:56:06.239893613 
> +0200
> +++ gcc/testsuite/gcc.target/i386/pr70821.c 2016-04-28 09:55:23.0 
> +0200
> @@ -0,0 +1,16 @@
> +/* PR target/70821 */
> +/* { dg-do compile } */
> +/* { dg-options "-Os" } */
> +/* { dg-additional-options "-march=i686" { target ia32 } } */
> +
> +void bar (void);
> +
> +void
> +foo (int *p)
> +{
> +  if (__atomic_sub_fetch (p, 1, __ATOMIC_SEQ_CST))
> +bar ();
> +}
> +
> +/* { dg-final { scan-assembler "lock;? dec" } } */
> +/* { dg-final { scan-assembler-not "lock;? xadd" } } */
>
> Jakub


Re: [PATCH] Update gmp/mpfr/mpc in-tree versions

2016-04-28 Thread Bernd Edlinger
On 28.04.2016 16:29, Richard Biener wrote:
>
> Another option would be to try if mini-gmp is enough for our
> (in-tree) use and what the performance impact would be if we'd
> use that (in-tree).
>

Yes, we would certainly never need more than that subset.

But I don't see how mpfr can be built with mini-gmp.
I tried to and failed early in mpfr/configure.
Any ideas?

Bernd.


Re: [PATCH] asan: Don't check frame numbers in the testsuite

2016-04-28 Thread Jeff Law

On 04/28/2016 09:57 AM, Segher Boessenkool wrote:

On various PowerPC configurations, the top frame is often mentioned
twice in the backtrace, making many asan tests fail.  I see no particular
reason the asan tests want to check the frame number, so this patch
makes it check for " #. " instead of " #1 ", etc., in all of the
c-c++-common/asan tests.

Tested on powerpc64-linux, also -m32; is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

gcc/testsuite/
* c-c++-common/asan/global-overflow-1.c: Don't check frame numbers.
* c-c++-common/asan/heap-overflow-1.c: Ditto.
* c-c++-common/asan/memcmp-1.c: Ditto.
* c-c++-common/asan/misalign-1.c: Ditto.
* c-c++-common/asan/misalign-2.c: Ditto.
* c-c++-common/asan/null-deref-1.c: Ditto.
* c-c++-common/asan/pr64820.c: Ditto.
* c-c++-common/asan/sanity-check-pure-c-1.c: Ditto.
* c-c++-common/asan/stack-overflow-1.c: Ditto.
* c-c++-common/asan/strip-path-prefix-1.c: Ditto.
* c-c++-common/asan/strlen-overflow-1.c: Ditto.
* c-c++-common/asan/strncpy-overflow-1.c: Ditto.
* c-c++-common/asan/use-after-free-1.c: Ditto.
* c-c++-common/asan/use-after-return-1.c: Ditto.

One could argue that testing the frame numbers is a QoI (quality of 
implementation) test for our backtrace generation and is thus valid.


Has anyone looked into fixing the unwinder so that we're not getting 
duplicate frames?


I'd rather not dumb down the test since it is showing a real issue.

jeff




Re: [PATCH] asan: Don't check frame numbers in the testsuite

2016-04-28 Thread Yury Gribov

On 04/28/2016 06:57 PM, Segher Boessenkool wrote:

On various PowerPC configurations, the top frame is often mentioned
twice in the backtrace, making many asan tests fail.  I see no particular
reason the asan tests want to check the frame number, so this patch
makes it check for " #. " instead of " #1 ", etc., in all of the
c-c++-common/asan tests.


Why not fix libbacktrace though?


Tested on powerpc64-linux, also -m32; is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

gcc/testsuite/
* c-c++-common/asan/global-overflow-1.c: Don't check frame numbers.
* c-c++-common/asan/heap-overflow-1.c: Ditto.
* c-c++-common/asan/memcmp-1.c: Ditto.
* c-c++-common/asan/misalign-1.c: Ditto.
* c-c++-common/asan/misalign-2.c: Ditto.
* c-c++-common/asan/null-deref-1.c: Ditto.
* c-c++-common/asan/pr64820.c: Ditto.
* c-c++-common/asan/sanity-check-pure-c-1.c: Ditto.
* c-c++-common/asan/stack-overflow-1.c: Ditto.
* c-c++-common/asan/strip-path-prefix-1.c: Ditto.
* c-c++-common/asan/strlen-overflow-1.c: Ditto.
* c-c++-common/asan/strncpy-overflow-1.c: Ditto.
* c-c++-common/asan/use-after-free-1.c: Ditto.
* c-c++-common/asan/use-after-return-1.c: Ditto.

---
  gcc/testsuite/c-c++-common/asan/global-overflow-1.c |  2 +-
  gcc/testsuite/c-c++-common/asan/heap-overflow-1.c   |  6 +++---
  gcc/testsuite/c-c++-common/asan/memcmp-1.c  |  4 ++--
  gcc/testsuite/c-c++-common/asan/misalign-1.c|  4 ++--
  gcc/testsuite/c-c++-common/asan/misalign-2.c|  4 ++--
  gcc/testsuite/c-c++-common/asan/null-deref-1.c  |  4 ++--
  gcc/testsuite/c-c++-common/asan/pr64820.c   |  2 +-
  gcc/testsuite/c-c++-common/asan/sanity-check-pure-c-1.c |  8 
  gcc/testsuite/c-c++-common/asan/stack-overflow-1.c  |  2 +-
  gcc/testsuite/c-c++-common/asan/strip-path-prefix-1.c   |  2 +-
  gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c |  2 +-
  gcc/testsuite/c-c++-common/asan/strncpy-overflow-1.c|  8 
  gcc/testsuite/c-c++-common/asan/use-after-free-1.c  | 10 +-
  gcc/testsuite/c-c++-common/asan/use-after-return-1.c|  2 +-
  14 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c 
b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
index 8dd75df..6a659c8 100644
--- a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
@@ -23,6 +23,6 @@ int main() {
  }

  /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
  /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of global 
variable" } */
  /* { dg-output ".*YYY\[^\n\r]* of size 10\[^\n\r]*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c 
b/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
index 0377a6c..e7c0ba5 100644
--- a/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
@@ -24,8 +24,8 @@ int main(int argc, char **argv) {
  }

  /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
  /* { dg-output "\[^\n\r]*0x\[0-9a-f\]+ is located 0 bytes to the right of 10-byte 
region\[^\n\r]*(\n|\r\n|\r)" } */
  /* { dg-output "\[^\n\r]*allocated by thread T0 here:\[^\n\r]*(\n|\r\n|\r)" } 
*/
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in 
_*(interceptor_|wrap_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in 
_*(interceptor_|wrap_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main 
(\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/asan/memcmp-1.c 
b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
index 5915988..5a36353 100644
--- a/gcc/testsuite/c-c++-common/asan/memcmp-1.c
+++ b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
@@ -16,5 +16,5 @@ main ()
  }

  /* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow.*(\n|\r\n|\r)" 
} */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in 
_*(interceptor_|wrap_|)memcmp|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main|\[(\])\[^\n\r]*(\n|\r\n|\r)" 
} */
+/* { 

Re: [PATCH] asan: Don't check frame numbers in the testsuite

2016-04-28 Thread Jakub Jelinek
On Thu, Apr 28, 2016 at 03:57:38PM +, Segher Boessenkool wrote:
> On various PowerPC configurations, the top frame is often mentioned
> twice in the backtrace, making many asan tests fail.  I see no particular
> reason the asan tests want to check the frame number, so this patch
> makes it check for " #. " instead of " #1 ", etc., in all of the
> c-c++-common/asan tests.

Wouldn't it be better to fix the backtrace stuff for PowerPC, so that
the top frame is not mentioned twice when it shouldn't?
I mean, if it annoys the tests (where IMHO it is still useful to test
the numbers, to avoid e.g. having some unrelated frames being printed first
or bugs like this to be caught), it will annoy users as well.

Jakub


Re: [PATCH] nds32: Fix casesi (PR70668)

2016-04-28 Thread Jeff Law

On 04/28/2016 09:45 AM, Segher Boessenkool wrote:

Expanders do not have more elements in the operands array than declared
in the pattern.  So, we cannot use operands[5] here.  Instead just
declare and use another rtx.
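
The rule can be sketched in plain C.  This is not the nds32 code: ints
stand in for rtx values, casesi_slot is a hypothetical name, and the
function only mimics the generic casesi range check (operand 0 is the
index, 1 the lower bound, 2 the range).

```c
#include <assert.h>

enum { NUM_OPERANDS = 5 };   /* casesi declares operands 0..4 */

/* Return the zero-based table slot, or -1 for the default label.  */
int
casesi_slot (const int operands[NUM_OPERANDS])
{
  /* Scratch value kept in a local instead of the nonexistent
     operands[5].  Unsigned arithmetic makes an index below the lower
     bound wrap to a large value, so one comparison covers both ends.  */
  unsigned int offset
    = (unsigned int) operands[0] - (unsigned int) operands[1];

  assert (operands[2] >= 0);
  if (offset > (unsigned int) operands[2])
    return -1;
  return (int) offset;
}
```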

Built with a cross compiler; not tested otherwise.  Is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

PR target/70668
* config/nds32/nds32.md (casesi): Don't access the operands array
out of bounds.

OK.
jeff



[PATCH] Better location info for "incomplete type" error msg (PR c/70756)

2016-04-28 Thread Marek Polacek
The goal of this patch is to improve the location info for the "incomplete
type" error.  It turned out this isn't as trivial as it should be:
1) c_incomplete_type_error is a language hook, so I had to add the location
   parameter to the other hooks, too,
2) I had to add the location parameter to size_in_bytes, too.  I renamed
   it to size_in_bytes_loc and defined a macro so I don't have to change
   all the callsites,
3) for the C++ FE I used a macro so that I don't have to change all the
   cxx_incomplete_type_error calls now,
4) what with using EXPR_LOC_OR_LOC (exp, input_location), we sometimes
   still produce an imprecise location :(.
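
Item 2 follows a common pattern, sketched below in self-contained C.  The
types are stand-ins (struct type is not GCC's tree, and returning -1 on
error is an assumption of this sketch); only the names size_in_bytes,
size_in_bytes_loc and input_location come from the patch description.

```c
#include <assert.h>

typedef unsigned int location_t;            /* stand-in for location_t */
struct type { long size; int complete; };   /* stand-in for a type node */

location_t input_location;        /* global fallback location */
location_t last_error_location;   /* where the last error was reported */

/* Stand-in for the incomplete_type_error hook: record the location.  */
static void
incomplete_type_error (location_t loc, const struct type *t)
{
  (void) t;
  last_error_location = loc;
}

/* New entry point taking an explicit location.  */
long
size_in_bytes_loc (location_t loc, const struct type *t)
{
  assert (t);
  if (!t->complete)
    {
      incomplete_type_error (loc, t);
      return -1;
    }
  return t->size;
}

/* Old name kept as a macro so existing call sites compile unchanged.  */
#define size_in_bytes(T) size_in_bytes_loc (input_location, (T))
```

Callers that know a better location call size_in_bytes_loc directly; all
others keep using size_in_bytes and get input_location as before.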

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-04-28  Marek Polacek  

PR c/70756
* c-common.c (pointer_int_sum): Call size_in_bytes_loc instead of
size_in_bytes and pass LOC to it.

* c-decl.c (build_compound_literal): Pass LOC down to
c_incomplete_type_error.
* c-tree.h (require_complete_type): Adjust declaration.
(c_incomplete_type_error): Likewise.
* c-typeck.c (require_complete_type): Add location parameter, pass it
down to c_incomplete_type_error.
(c_incomplete_type_error): Add location parameter, pass it down to
error_at.
(build_component_ref): Pass location down to c_incomplete_type_error.
(default_conversion): Pass location down to require_complete_type.
(build_array_ref): Likewise.
(build_function_call_vec): Likewise.
(convert_arguments): Likewise.
(build_unary_op): Likewise.
(build_c_cast): Likewise.
(build_modify_expr): Likewise.
(convert_for_assignment): Likewise.
(c_finish_omp_clauses): Likewise.

* cp-tree.h (cxx_incomplete_type_error): Adjust declaration.
* typeck2.c (cxx_incomplete_type_error): Add location parameter.

* langhooks-def.h (lhd_incomplete_type_error): Adjust declaration.
* langhooks.c (lhd_incomplete_type_error): Add location parameter.
* langhooks.h (incomplete_type_error): Likewise.
* tree.c (size_in_bytes_loc): Renamed from size_in_bytes.  Add location
parameter, pass it down to incomplete_type_error.
* tree.h (size_in_bytes): Define macro.
(size_in_bytes_loc): Renamed from size_in_bytes.

* gcc.dg/pr70756.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index c086dee..decbe8b 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -4270,7 +4270,7 @@ pointer_int_sum (location_t loc, enum tree_code resultcode,
   size_exp = integer_one_node;
 }
   else
-size_exp = size_in_bytes (TREE_TYPE (result_type));
+size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
 
   /* We are manipulating pointer values, so we don't need to warn
  about relying on undefined signed overflow.  We disable the
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 16e4250..d714ec2 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5112,7 +5112,7 @@ build_compound_literal (location_t loc, tree type, tree init, bool non_const)
 
   if (type == error_mark_node || !COMPLETE_TYPE_P (type))
 {
-  c_incomplete_type_error (NULL_TREE, type);
+  c_incomplete_type_error (loc, NULL_TREE, type);
   return error_mark_node;
 }
 
diff --git gcc/c/c-tree.h gcc/c/c-tree.h
index 4633182..d3a6c4c 100644
--- gcc/c/c-tree.h
+++ gcc/c/c-tree.h
@@ -588,13 +588,13 @@ extern tree c_last_sizeof_arg;
 extern struct c_switch *c_switch_stack;
 
 extern tree c_objc_common_truthvalue_conversion (location_t, tree);
-extern tree require_complete_type (tree);
+extern tree require_complete_type (location_t, tree);
 extern int same_translation_unit_p (const_tree, const_tree);
 extern int comptypes (tree, tree);
 extern int comptypes_check_different_types (tree, tree, bool *);
 extern bool c_vla_type_p (const_tree);
 extern bool c_mark_addressable (tree);
-extern void c_incomplete_type_error (const_tree, const_tree);
+extern void c_incomplete_type_error (location_t, const_tree, const_tree);
 extern tree c_type_promotes_to (tree);
 extern struct c_expr default_function_array_conversion (location_t,
struct c_expr);
diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index 58c2139..32fd504 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -183,11 +183,12 @@ struct tagged_tu_seen_cache {
 static const struct tagged_tu_seen_cache * tagged_tu_seen_base;
 static void free_all_tagged_tu_seen_up_to (const struct tagged_tu_seen_cache *);
 
-/* Do `exp = require_complete_type (exp);' to make sure exp
-   does not have an incomplete type.  (That includes void types.)  */
+/* Do `exp = require_complete_type (loc, exp);' to make sure exp
+   does not have an incomplete type.  (That includes void types.)
+   LOC is the location of the use.  */
 
 tree
-require_complete_type (tree value)
+require_complete_type 

[PATCH] asan: Don't check frame numbers in the testsuite

2016-04-28 Thread Segher Boessenkool
On various PowerPC configurations, the top frame is often mentioned
twice in the backtrace, making many asan tests fail.  I see no particular
reason the asan tests want to check the frame number, so this patch
makes it check for " #. " instead of " #1 ", etc., in all of the
c-c++-common/asan tests.
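
A rough illustration of the effect, using POSIX extended regexes as a
stand-in for the Tcl regexes that dg-output really uses.  The helper
line_matches and the sample backtrace line in the note below are made up
for the example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the POSIX extended regex PATTERN, else 0.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);               /* the patterns used here are valid */
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With a line such as "    #2 0x4007a1 in main t.c:19", the pattern
"#1 0x[0-9a-f]+ +in main" does not match, while "#. 0x[0-9a-f]+ +in main"
does.  Note that a single "." matches exactly one character, so this
form covers frames 0 to 9 only.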

Tested on powerpc64-linux, also -m32; is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

gcc/testsuite/
* c-c++-common/asan/global-overflow-1.c: Don't check frame numbers.
* c-c++-common/asan/heap-overflow-1.c: Ditto.
* c-c++-common/asan/memcmp-1.c: Ditto.
* c-c++-common/asan/misalign-1.c: Ditto.
* c-c++-common/asan/misalign-2.c: Ditto.
* c-c++-common/asan/null-deref-1.c: Ditto.
* c-c++-common/asan/pr64820.c: Ditto.
* c-c++-common/asan/sanity-check-pure-c-1.c: Ditto.
* c-c++-common/asan/stack-overflow-1.c: Ditto.
* c-c++-common/asan/strip-path-prefix-1.c: Ditto.
* c-c++-common/asan/strlen-overflow-1.c: Ditto.
* c-c++-common/asan/strncpy-overflow-1.c: Ditto.
* c-c++-common/asan/use-after-free-1.c: Ditto.
* c-c++-common/asan/use-after-return-1.c: Ditto.

---
 gcc/testsuite/c-c++-common/asan/global-overflow-1.c |  2 +-
 gcc/testsuite/c-c++-common/asan/heap-overflow-1.c   |  6 +++---
 gcc/testsuite/c-c++-common/asan/memcmp-1.c  |  4 ++--
 gcc/testsuite/c-c++-common/asan/misalign-1.c|  4 ++--
 gcc/testsuite/c-c++-common/asan/misalign-2.c|  4 ++--
 gcc/testsuite/c-c++-common/asan/null-deref-1.c  |  4 ++--
 gcc/testsuite/c-c++-common/asan/pr64820.c   |  2 +-
 gcc/testsuite/c-c++-common/asan/sanity-check-pure-c-1.c |  8 
 gcc/testsuite/c-c++-common/asan/stack-overflow-1.c  |  2 +-
 gcc/testsuite/c-c++-common/asan/strip-path-prefix-1.c   |  2 +-
 gcc/testsuite/c-c++-common/asan/strlen-overflow-1.c |  2 +-
 gcc/testsuite/c-c++-common/asan/strncpy-overflow-1.c|  8 
 gcc/testsuite/c-c++-common/asan/use-after-free-1.c  | 10 +-
 gcc/testsuite/c-c++-common/asan/use-after-return-1.c|  2 +-
 14 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
index 8dd75df..6a659c8 100644
--- a/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/global-overflow-1.c
@@ -23,6 +23,6 @@ int main() {
 }
 
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*global-overflow-1.c:20|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r).*" } */
 /* { dg-output "0x\[0-9a-f\]+ is located 0 bytes to the right of global variable" } */
 /* { dg-output ".*YYY\[^\n\r]* of size 10\[^\n\r]*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c b/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
index 0377a6c..e7c0ba5 100644
--- a/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
+++ b/gcc/testsuite/c-c++-common/asan/heap-overflow-1.c
@@ -24,8 +24,8 @@ int main(int argc, char **argv) {
 }
 
 /* { dg-output "READ of size 1 at 0x\[0-9a-f\]+ thread T0.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*heap-overflow-1.c:21|\[^\n\r]*:0)|\[(\]).*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*0x\[0-9a-f\]+ is located 0 bytes to the right of 10-byte region\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*allocated by thread T0 here:\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)malloc|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*main (\[^\n\r]*heap-overflow-1.c:19|\[^\n\r]*:0)|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
diff --git a/gcc/testsuite/c-c++-common/asan/memcmp-1.c b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
index 5915988..5a36353 100644
--- a/gcc/testsuite/c-c++-common/asan/memcmp-1.c
+++ b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
@@ -16,5 +16,5 @@ main ()
 }
 
 /* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)memcmp|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#. 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)memcmp|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { 

[PATCH] Fix ICE during operand_equal_p hash checking (PR middle-end/70843)

2016-04-28 Thread Jakub Jelinek
Hi!

As reported in the PR and can be seen on this simplified testcase
everywhere, the FEs sometimes call operand_equal_p e.g. on a SAVE_EXPR
that contains a BIND_EXPR in it, and if arg0 == arg1, operand_equal_p
can return non-zero on it.
The ICE is because inchash::add_expr is unprepared to hash some trees,
it handles just tcc_declaration, selected specific trees and expressions of
all kinds, the last one usually by just recursing on all their operands.
For BIND_EXPR, the last operand is usually a BLOCK which we ICE on though,
and the middle argument usually a STATEMENT_LIST that we ICE on as well.

The first hunk is just an optimization (but fixes the ICE anyway),
I think we really don't need to verify that a hash function for the same
argument always returns the same value.  But I can imagine e.g.
a SAVE_EXPR of BIND_EXPR + var and var + the same SAVE_EXPR being compared
using operand_equal_p and there we wouldn't be equal at the top level and
still ICE.
The second hunk alone fixes the ICE too, by making sure we handle those
(just ignoring BLOCK and OMP_CLAUSE; the latter only for now, if we find we
want to hash pre-OMP expansion trees too often we could adjust).
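
The invariant at stake can be sketched on a toy tree.  This is an
assumption-laden miniature, not GCC code: struct node, nodes_equal,
hash_node and checked_equal are made-up stand-ins for tree,
operand_equal_p, inchash::add_expr and the checking wrapper, and BLOCK
here just means "a node kind the hasher ignores".

```c
#include <assert.h>
#include <stddef.h>

enum code { INT_CST, PLUS, BLOCK };

struct node {
  enum code code;
  long value;                 /* used by INT_CST */
  const struct node *op[2];   /* used by PLUS */
};

/* Structural equality; identical pointers are trivially equal.  */
int
nodes_equal (const struct node *a, const struct node *b)
{
  if (a == b)
    return 1;
  if (a == NULL || b == NULL || a->code != b->code)
    return 0;
  switch (a->code)
    {
    case INT_CST:
      return a->value == b->value;
    case PLUS:
      return (nodes_equal (a->op[0], b->op[0])
              && nodes_equal (a->op[1], b->op[1]));
    case BLOCK:
      return 0;               /* distinct blocks never compare equal */
    }
  return 0;
}

/* Mix node T into hash H.  BLOCK nodes are skipped instead of aborting.  */
unsigned long
hash_node (const struct node *t, unsigned long h)
{
  if (t == NULL)
    return h;
  switch (t->code)
    {
    case BLOCK:
      return h;               /* ignore */
    case INT_CST:
      return (h * 31 + (unsigned long) t->code) * 31 + (unsigned long) t->value;
    case PLUS:
      h = h * 31 + (unsigned long) t->code;
      h = hash_node (t->op[0], h);
      return hash_node (t->op[1], h);
    }
  return h;
}

/* Equality plus the sanity check that equal trees hash alike.  The
   check is skipped when both arguments are the very same node.  */
int
checked_equal (const struct node *a, const struct node *b)
{
  if (!nodes_equal (a, b))
    return 0;
  if (a != b)
    assert (hash_node (a, 0) == hash_node (b, 0));
  return 1;
}
```

Two distinct PLUS nodes that share one BLOCK child compare equal at the
child level by pointer identity, so the hasher must cope with the BLOCK
even though the top-level pointers differ.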

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-28  Jakub Jelinek  

PR middle-end/70843
* fold-const.c (operand_equal_p): Don't verify hash value equality
if arg0 == arg1.
* tree.c (inchash::add_expr): Handle STATEMENT_LIST.  Ignore BLOCK
and OMP_CLAUSE.

* gcc.dg/pr70843.c: New test.

--- gcc/fold-const.c.jj 2016-04-27 15:29:05.0 +0200
+++ gcc/fold-const.c    2016-04-28 13:28:56.272276557 +0200
@@ -2756,12 +2756,15 @@ operand_equal_p (const_tree arg0, const_
 {
   if (operand_equal_p (arg0, arg1, flags | OEP_NO_HASH_CHECK))
{
- inchash::hash hstate0 (0), hstate1 (0);
- inchash::add_expr (arg0, hstate0, flags);
- inchash::add_expr (arg1, hstate1, flags);
- hashval_t h0 = hstate0.end ();
- hashval_t h1 = hstate1.end ();
- gcc_assert (h0 == h1);
+ if (arg0 != arg1)
+   {
+ inchash::hash hstate0 (0), hstate1 (0);
+ inchash::add_expr (arg0, hstate0, flags);
+ inchash::add_expr (arg1, hstate1, flags);
+ hashval_t h0 = hstate0.end ();
+ hashval_t h1 = hstate1.end ();
+ gcc_assert (h0 == h1);
+   }
  return 1;
}
   else
--- gcc/tree.c.jj   2016-04-27 09:45:27.0 +0200
+++ gcc/tree.c  2016-04-28 13:24:01.770245254 +0200
@@ -7836,6 +7836,10 @@ add_expr (const_tree t, inchash::hash 
 case PLACEHOLDER_EXPR:
   /* The node itself doesn't matter.  */
   return;
+case BLOCK:
+case OMP_CLAUSE:
+  /* Ignore.  */
+  return;
 case TREE_LIST:
   /* A list of expressions, for a CALL_EXPR or as the elements of a
 VECTOR_CST.  */
@@ -7854,6 +7858,14 @@ add_expr (const_tree t, inchash::hash 
  }
return;
   }
+case STATEMENT_LIST:
+  {
+   tree_stmt_iterator i;
+   for (i = tsi_start (CONST_CAST_TREE (t));
+!tsi_end_p (i); tsi_next (&i))
+ inchash::add_expr (tsi_stmt (i), hstate, flags);
+   return;
+  }
 case FUNCTION_DECL:
   /* When referring to a built-in FUNCTION_DECL, use the __builtin__ form.
 Otherwise nodes that compare equal according to operand_equal_p might
--- gcc/testsuite/gcc.dg/pr70843.c.jj   2016-04-28 13:37:54.596022706 +0200
+++ gcc/testsuite/gcc.dg/pr70843.c  2016-04-28 13:37:38.0 +0200
@@ -0,0 +1,9 @@
+/* PR middle-end/70843 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int
+foo (int x, int y)
+{
+  return ({ int a = 5; a += x; a *= y; a; }) ? : 2;
+}

Jakub


Re: [ARM] Enable __fp16 as a function parameter and return type.

2016-04-28 Thread Joseph Myers
On Thu, 28 Apr 2016, Matthew Wahab wrote:

> Hello,
> 
> The ARM target supports the half-precision floating point type __fp16
> but does not allow its use as a function return or parameter type. This
> patch removes that restriction and defines the ACLE macro
> __ARM_FP16_ARGS to indicate this. The code generated for passing __fp16
> values into and out of functions depends on the level of hardware
> support but conforms to the AAPCS (see
> http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf).

The sole use of the TARGET_INVALID_PARAMETER_TYPE and 
TARGET_INVALID_RETURN_TYPE hooks was to disallow __fp16 use as a function 
return or parameter type.  Thus, I think this patch should completely 
remove those hooks and poison them in system.h.

This patch addresses one incompatibility of the original __fp16 
specification with the more recent ACLE specification and the 
specification in ISO/IEC TS 18661-3 for how such types should work.  
Another such incompatibility is the peculiar rule in the original 
specification that conversions from double to __fp16 go via float, with 
double rounding.  Do you have plans to eliminate that and move to the 
single-rounding semantics that are in current specifications?
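
A small sketch of why the two routes can differ.  It does not use
__fp16 at all: round_to_11_bits is a made-up helper that only models
the 11-bit significand of IEEE binary16 for inputs in [1, 2], ignoring
range, subnormals, NaNs and infinities.

```c
#include <assert.h>

/* Round X, assumed to lie in [1, 2], to 11 significant bits using
   round-to-nearest, ties-to-even.  Illustration only.  */
double
round_to_11_bits (double x)
{
  assert (x >= 1.0 && x <= 2.0);
  double scaled = x * 1024.0;              /* exact: power-of-two scaling */
  long whole = (long) scaled;              /* truncation; scaled > 0 */
  double frac = scaled - (double) whole;   /* exact remainder in [0, 1) */
  if (frac > 0.5 || (frac == 0.5 && (whole & 1)))
    whole++;                               /* round up; ties go to even */
  return (double) whole / 1024.0;
}

/* Single rounding: double straight to 11 bits.  */
double
convert_direct (double x)
{
  return round_to_11_bits (x);
}

/* Double rounding: double to float first, then to 11 bits.  */
double
convert_via_float (double x)
{
  volatile float f = (float) x;   /* volatile forces a real float rounding */
  return round_to_11_bits ((double) f);
}
```

For x = 1 + 2^-11 + 2^-30, the direct route rounds up to 1 + 2^-10.
Going via float first drops the 2^-30 term, leaving an exact tie at
1 + 2^-11, which then rounds to even, i.e. to 1.0.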

I note that that AAPCS revision says for __fp16, in 7.1.1 Arithmetic 
Types, "In a variadic function call this will be passed as a 
double-precision value.".  I haven't checked what this patch implements, 
but that could be problematic, and different from what's said under 7.2, 
"For variadic functions, float arguments that match the ellipsis (...) are 
converted to type double.".

In TS 18661-3, _Float16 is *not* affected by default argument promotions; 
only float is.  This reflects how the default conversion of float to 
double is a legacy feature; note for example how in C99 and C11 float 
_Imaginary is not promoted to double _Imaginary, and float _Complex is not 
promoted to double _Complex.
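
For reference, the legacy float promotion looks like this in plain C.
sum_values is a made-up function for the example; it shows only the
float case, since _Float16 support varies between compilers.

```c
#include <assert.h>
#include <stdarg.h>

/* Sum COUNT floating-point arguments.  Callers may pass float values,
   but the default argument promotions turn each one into a double, so
   va_arg must ask for double; va_arg (ap, float) would be undefined.  */
double
sum_values (int count, ...)
{
  va_list ap;
  double total = 0.0;
  assert (count >= 0);
  va_start (ap, count);
  for (int i = 0; i < count; i++)
    total += va_arg (ap, double);
  va_end (ap);
  return total;
}
```

Calling sum_values (2, a, b) with float a and b works only because both
arrive as double.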

Thus it would be better for compatibility with TS 18661-3 to pass __fp16 
values to variadic functions as themselves, unpromoted.  (Formally of 
course the lack of promotion is a language feature not an ABI feature; as 
long as va_arg for _Float16 named works correctly, you could promote at 
the ABI level and then convert back, and the only effect would be that 
sNaNs get quieted, so passing a _Float16 sNaN through variable arguments 
would act as a convertFormat operation instead of a copy operation.  It's 
not clear that having such an ABI-level promotion is a good idea, 
however.)

Now, in the context of the current implementation and current ACLE,
arithmetic on __fp16 values produces float results - the operands are 
promoted at the C language level.  This is different from TS 18661-3, 
where _Float16 arithmetic produces results whose semantics type is 
_Float16 but which, if FLT_EVAL_METHOD is 0, are evaluated with excess 
range and precision to the range and precision of float.  So if __fp16 and 
float are differently passed to variadic functions, you have the issue 
that if the argument is an expression resulting from __fp16 arithmetic, 
the way it is passed depends on whether current ACLE or TS 18661-3 are 
followed.  But if the eventual aim is for __fp16 (when using the IEEE 
format rather than the alternative format) to become just a typedef for 
_Float16, then these issues will need to be addressed.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH][reload1.c] Convert conditional compilation on WORD_REGISTER_OPERATIONS

2016-04-28 Thread Jeff Law

On 12/15/2015 10:28 AM, Kyrill Tkachov wrote:

Hi all,

This converts the preprocessor check for WORD_REGISTER_OPERATIONS into a
runtime check
in reload1.c.

Since this one is used to guard part of a condition, I'd appreciate it
if someone
double-checks that the logic is still equivalent.

Bootstrapped and tested on arm, aarch64, x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-12-15  Kyrylo Tkachov  

* reload1.c (eliminate_regs_1): Convert preprocessor check
for WORD_REGISTER_OPERATIONS to runtime check.

In this one we had

#if FOO
 && !(condition)
#endif

And you changed it to

  && !(FOO && condition))

Which I think is wrong.

Original:

FOO  condition  result
0    0          0
0    1          0
1    0          1
1    1          0


New:

FOO  condition  result
0    0          1
0    1          1
1    0          1
1    1          0

I think you really wanted

&& (FOO
&& ! (condition))
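
The two expressions above can be tabulated mechanically.  A small
sketch; clause_as_patched, clause_as_suggested and truth_table are
made-up names, and how the absent #if clause should be modelled when
FOO is 0 is a separate question this code does not settle.

```c
#include <assert.h>

/* The clause as rewritten in the patch under review.  */
int
clause_as_patched (int foo, int condition)
{
  return !(foo && condition);
}

/* The clause in the form suggested in this reply.  */
int
clause_as_suggested (int foo, int condition)
{
  return foo && !condition;
}

/* Pack the four results into one integer, most significant bit first,
   in the row order of the tables: (0,0) (0,1) (1,0) (1,1).  */
int
truth_table (int (*clause) (int, int))
{
  int bits = 0;
  assert (clause != 0);
  for (int foo = 0; foo <= 1; foo++)
    for (int condition = 0; condition <= 1; condition++)
      bits = (bits << 1) | (clause (foo, condition) ? 1 : 0);
  return bits;
}
```

truth_table yields binary 1110 for the patched form and 0010 for the
suggested one; the two differ only in the rows where FOO is 0.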



Jeff


[PATCH] nds32: Fix casesi (PR70668)

2016-04-28 Thread Segher Boessenkool
Expanders do not have more elements in the operands array than declared
in the pattern.  So, we cannot use operands[5] here.  Instead just
declare and use another rtx.
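
In miniature, and with made-up stand-ins (rtx_like, make_temp_reg,
struct emitted and expand_case are not GCC names), the shape of the fix
is:

```c
#include <assert.h>

#define NUM_DECLARED_OPERANDS 5   /* operands[0] .. operands[4] */

typedef int rtx_like;             /* hypothetical stand-in for rtx */

static int next_reg = 100;

/* Stand-in for gen_reg_rtx: hand out a fresh pseudo-register number.  */
rtx_like
make_temp_reg (void)
{
  return next_reg++;
}

/* What the expander would pass on to the internal pattern.  */
struct emitted { rtx_like index, table, scratch; };

/* OPERANDS has exactly NUM_DECLARED_OPERANDS elements, so the scratch
   register lives in a local instead of the nonexistent operands[5].  */
struct emitted
expand_case (const rtx_like operands[NUM_DECLARED_OPERANDS])
{
  struct emitted insn;
  assert (operands != 0);
  insn.index = operands[0];
  insn.table = operands[3];
  insn.scratch = make_temp_reg ();
  return insn;
}
```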

Built with a cross compiler; not tested otherwise.  Is this okay for trunk?


Segher


2016-04-28  Segher Boessenkool  

PR target/70668
* config/nds32/nds32.md (casesi): Don't access the operands array
out of bounds.

---
 gcc/config/nds32/nds32.md | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/config/nds32/nds32.md b/gcc/config/nds32/nds32.md
index 5cdd8b2..494a78d 100644
--- a/gcc/config/nds32/nds32.md
+++ b/gcc/config/nds32/nds32.md
@@ -2288,11 +2288,9 @@ (define_expand "casesi"
   emit_jump_insn (gen_cbranchsi4 (test, operands[0], operands[2],
  operands[4]));
 
-  operands[5] = gen_reg_rtx (SImode);
-  /* Step C, D, E, and F, using another temporary register operands[5].  */
-  emit_jump_insn (gen_casesi_internal (operands[0],
-  operands[3],
-  operands[5]));
+  /* Step C, D, E, and F, using another temporary register.  */
+  rtx tmp = gen_reg_rtx (SImode);
+  emit_jump_insn (gen_casesi_internal (operands[0], operands[3], tmp));
   DONE;
 })
 
-- 
1.9.3



[PATCH] Add peephole for -Os lock; dec (PR target/70821)

2016-04-28 Thread Jakub Jelinek
Hi!

Optimizing atomic_fetch_add followed by comparison into just testing
the flags of the lock; sub is handled by a peephole2, which works usually
fine, except that for -Os we have another peephole2 that transforms
movl $-1, %reg into orl $-1, %reg and that causes the above mentioned
peephole2 not to trigger anymore.
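
A typical source-level shape for this, written with C11 atomics rather
than the __atomic builtins; release_ref is a made-up name.  Whether a
given compiler emits "lock; dec" for it depends on target and options.

```c
#include <assert.h>
#include <stdatomic.h>

/* Drop one reference.  Returns 1 when the count reached zero.  Only
   the zero/non-zero outcome of the decrement is used.  */
int
release_ref (atomic_int *count)
{
  assert (count != 0);
  /* atomic_fetch_sub returns the old value, so old - 1 is the new one.  */
  return atomic_fetch_sub (count, 1) - 1 == 0;
}
```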

Fixed by adding a peephole2 even for this case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-04-28  Jakub Jelinek  

PR target/70821
* config/i386/sync.md (define_peephole2 *atomic_fetch_add_cmp):
Add new peephole2 where the first insn is *mov_or instead of
*mov_internal.

* gcc.target/i386/pr70821.c: New test.

--- gcc/config/i386/sync.md.jj  2016-01-04 14:55:56.0 +0100
+++ gcc/config/i386/sync.md 2016-04-28 09:40:28.265764880 +0200
@@ -467,6 +467,36 @@ (define_peephole2
   (plus:SWI (match_dup 1)
 (match_dup 2)))])])
 
+;; Likewise, but for the -Os special case of *mov_or.
+(define_peephole2
+  [(parallel [(set (match_operand:SWI 0 "register_operand")
+  (match_operand:SWI 2 "constm1_operand"))
+ (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 0)
+  (unspec_volatile:SWI
+[(match_operand:SWI 1 "memory_operand")
+ (match_operand:SI 4 "const_int_operand")]
+UNSPECV_XCHG))
+ (set (match_dup 1)
+  (plus:SWI (match_dup 1)
+(match_dup 0)))
+ (clobber (reg:CC FLAGS_REG))])
+   (set (reg:CCZ FLAGS_REG)
+   (compare:CCZ (match_dup 0)
+(match_operand:SWI 3 "const_int_operand")))]
+  "peep2_reg_dead_p (3, operands[0])
+   && (unsigned HOST_WIDE_INT) INTVAL (operands[2])
+  == -(unsigned HOST_WIDE_INT) INTVAL (operands[3])
+   && !reg_overlap_mentioned_p (operands[0], operands[1])"
+  [(parallel [(set (reg:CCZ FLAGS_REG)
+  (compare:CCZ
+(unspec_volatile:SWI [(match_dup 1) (match_dup 4)]
+ UNSPECV_XCHG)
+(match_dup 3)))
+ (set (match_dup 1)
+  (plus:SWI (match_dup 1)
+(match_dup 2)))])])
+
 (define_insn "*atomic_fetch_add_cmp"
   [(set (reg:CCZ FLAGS_REG)
(compare:CCZ
--- gcc/testsuite/gcc.target/i386/pr70821.c.jj  2016-04-28 09:56:06.239893613 +0200
+++ gcc/testsuite/gcc.target/i386/pr70821.c 2016-04-28 09:55:23.0 +0200
@@ -0,0 +1,16 @@
+/* PR target/70821 */
+/* { dg-do compile } */
+/* { dg-options "-Os" } */
+/* { dg-additional-options "-march=i686" { target ia32 } } */
+
+void bar (void);
+
+void
+foo (int *p)
+{
+  if (__atomic_sub_fetch (p, 1, __ATOMIC_SEQ_CST))
+bar ();
+}
+
+/* { dg-final { scan-assembler "lock;? dec" } } */
+/* { dg-final { scan-assembler-not "lock;? xadd" } } */

Jakub


Re: [PATCHv2 0/7] ARC: Add support for nps400 variant

2016-04-28 Thread Joern Wolfgang Rennecke



On 21/04/16 12:39, Andrew Burgess wrote:

This new iteration of the previous version is largely the same except
that I now no longer use configure time options to build in support
for nps400.  Instead, support is controlled with a -mcpu=nps400 command
line switch.  This change was made to mirror a similar change that was
requested when I pushed nps400 support upstream into binutils.

The considerations for these toolchain components are different;
it costs little to do a build with support for an entire architecture
with various variants in binutils.
gcc having run-time options to generate code for a subtarget is
also useful.
However, setting defaults and multilib sets at gcc configure time is
also quite useful, as otherwise every user is confronted with building
multilibs for a burgeoning array of variants.


Re: [PATCH][reload.c] Convert conditional compilation of WORD_REGISTER_OPERATIONS

2016-04-28 Thread Jeff Law

On 12/15/2015 10:27 AM, Kyrill Tkachov wrote:

Hi all,

This converts the preprocessor checks for WORD_REGISTER_OPERATIONS into
runtime checks
in reload.c.

Since this one is used to guard part of a large condition, I'd
appreciate it if someone
double-checks that the logic is still equivalent.

Bootstrapped and tested on arm, aarch64, x86_64.

Ok for trunk?

Thanks,
Kyrill

2015-12-15  Kyrylo Tkachov  

* reload.c (push_reload): Convert preprocessor checks
for WORD_REGISTER_OPERATIONS to runtime checks.

You just changed

#if FOO
   || (somecondition)
#endif

to

   || (FOO
   && (somecondition))
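
A quick exhaustive check of that equivalence, as a sketch.  The two
old_style functions model what the preprocessor produced for FOO == 0
and FOO == 1; base stands for the rest of the || chain.  All names are
made up.

```c
#include <assert.h>

/* Old style with FOO == 0: the preprocessor drops the extra clause.  */
int
old_style_foo_off (int base, int somecondition)
{
  (void) somecondition;
  return base;
}

/* Old style with FOO == 1: the extra clause is compiled in.  */
int
old_style_foo_on (int base, int somecondition)
{
  return base || somecondition;
}

/* New style: FOO is an ordinary value tested at run time.  */
int
new_style (int foo, int base, int somecondition)
{
  assert (foo == 0 || foo == 1);
  return base || (foo && somecondition);
}

/* Return 1 if the new form agrees with the old one for every input.  */
int
forms_agree (void)
{
  for (int base = 0; base <= 1; base++)
    for (int cond = 0; cond <= 1; cond++)
      {
        if (new_style (0, base, cond) != old_style_foo_off (base, cond))
          return 0;
        if (new_style (1, base, cond) != old_style_foo_on (base, cond))
          return 0;
      }
  return 1;
}
```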

Looks good to me.  OK for the trunk.

Thanks,
Jeff


Re: [PATCH] Update TARGET_FUNCTION_INCOMING_ARG documentation

2016-04-28 Thread H.J. Lu
On Thu, Apr 28, 2016 at 8:25 AM, Jeff Law  wrote:
> On 11/30/2015 03:35 AM, Bernd Schmidt wrote:
>>
>> On 11/29/2015 06:14 PM, H.J. Lu wrote:
>>>
>>> Is this safe for stage 3?
>>
>>
>> Is there a reason to do it now? This doesn't include a testcase.
>
> Handling the proposed attribute requires extensions to the current
> function_arg capabilities.
>
> I need to go back to the discussion between HJ, rth, Uros, myself and
> probably others to get the full details.  I recall two extensions to the
> current set of return values from function_arg.   One was to allow the
> target to return an address.  That address will be forced by the generic
> code into a pseudo.
>
> I thought we agreed to one other extension to support the interrupt
> mechanism, but again, I'll have to dig through the archives to remember the
> full details.

This is the only extension needed to implement x86 interrupt attribute. It
should have no impact on other targets which always have arguments
either in memory or register.

> These extensions were necessary to avoid some horrid hacks in the x86
> backend which Uros, quite reasonably, rejected.  We agreed to return to this
> after stage1 opened.
>

That is correct.

Thanks.

-- 
H.J.


Re: [PATCH] doc: discourage use of __attribute__((optimize())) in production code

2016-04-28 Thread Jeff Law

On 12/13/2015 01:19 AM, Markus Trippelsdorf wrote:

Many developers are still using __attribute__((optimize())) in
production code, although it is quite broken.

* doc/extend.texi (Common Function Attributes) [optimize]:
Discourage use of the optimize attribute.
I went back and reviewed the discussion as well as the BZs.  I think 
this patch is fine for the trunk.  I don't think a runtime warning is 
likely to be looked upon favorably by most users.


jeff



[PATCH PR70803]Require "vect_int_mult" for the test.

2016-04-28 Thread Bin Cheng
Hi,
This patch fixes PR70803 by skipping targets that don't support vect_int_mult.  
It's an obvious change.

Thanks,
bin

gcc/testsuite/ChangeLog
2016-04-29  Bin Cheng  

PR tree-optimization/70803
* gcc.dg/vect/pr56625.c: Require vect_int_mult.

diff --git a/gcc/testsuite/gcc.dg/vect/pr56625.c b/gcc/testsuite/gcc.dg/vect/pr56625.c
index b903be3..fe3fd7b 100644
--- a/gcc/testsuite/gcc.dg/vect/pr56625.c
+++ b/gcc/testsuite/gcc.dg/vect/pr56625.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_int_mult } */
 
 void foo (int a[], int b[])
 {


Re: [PATCH] Update TARGET_FUNCTION_INCOMING_ARG documentation

2016-04-28 Thread Jeff Law

On 11/30/2015 03:35 AM, Bernd Schmidt wrote:

On 11/29/2015 06:14 PM, H.J. Lu wrote:

Is this safe for stage 3?


Is there a reason to do it now? This doesn't include a testcase.
Handling the proposed attribute requires extensions to the current 
function_arg capabilities.


I need to go back to the discussion between HJ, rth, Uros, myself and 
probably others to get the full details.  I recall two extensions to the 
current set of return values from function_arg.   One was to allow the 
target to return an address.  That address will be forced by the generic 
code into a pseudo.


I thought we agreed to one other extension to support the interrupt 
mechanism, but again, I'll have to dig through the archives to remember 
the full details.


These extensions were necessary to avoid some horrid hacks in the x86 
backend which Uros, quite reasonably, rejected.  We agreed to return to 
this after stage1 opened.


jeff



Re: [PATCH] [ARC] Add new ARCv2 instructions.

2016-04-28 Thread Joern Wolfgang Rennecke



On 20/04/16 13:12, Claudiu Zissulescu wrote:

This patch adds new instruction variants as introduced by the ARCv2
architecture.

  
You have used groups of 8 spaces at line starts; tabs should be used 
instead for indentation.

(arc_dwarf_register_span): Remove enum keyword.

That bit should be separate.

(compact_memory_operand_p): New function.

The description of the arguments in the start-of-function comment
does not agree with how the arguments are used in the definition of the
"UTS" .

Moreover, it says:

CODE_DENSITY indicates ARCv2 code density operations are
+   available

which implies that these are additional opcodes that are available, yet for the
base+index register case, we have:

+{
+  return !code_density;
+}

Considering this comment:

+  /* Reverting for the moment since ld/st{w,h}_s does not have sp
+as a valid parameter.  */

The historical context is utterly lost, so the "Reverting for the moment since"
bit only confuses.



Re: [PATCH] add -fprolog-pad=N option to c-family

2016-04-28 Thread Jeff Law

On 04/28/2016 05:18 AM, Torsten Duwe wrote:

On Thu, Apr 28, 2016 at 11:39:48AM +0300, Maxim Kuvyrkov wrote:

On Apr 27, 2016, at 6:22 PM, Torsten Duwe  wrote:


Your current patch is great for experiments for the kernel engineers to check 
if suggested approaches to code patching will work.  Still, I prefer to 
implement LTO-friendly way of handling -fprolog-pad=N via function attributes.


That was exactly my intention. I only wanted *some* working compiler.
I'm sure you compiler people will have a better way to finally implement this.
Conceptually we have the concept of nops insn patterns, so generically 
I'd implement this by emitting a suitable set of nops followed by a 
scheduling barrier, then thread the mess at the start of the prologue. 
This would be 99.9% target independent changes.


We'd just punt targets that don't represent prologues as RTL.




All I can say so far about the ipa-ra issue is that it'd be great if
x9(?) could be left as volatile / scratch; the rest can be preserved.
ipa-ra doesn't really work that way.  It just notes what's used in the 
callee and the caller is allowed to look at that information and use it 
to optimize stuff on the caller side.


For example, call-clobbered registers that are not used in the callee 
can be used in the caller to hold values across the call.


This is going to wreak havoc for anything that assumes a call-clobbered 
register can always be safely used in the callee, particularly in the 
patched codepath.  One could argue that the patched codepath is the 
uncommon case and should be responsible for saving/restoring any 
register it uses to ensure it doesn't mess up any visible state.


Jeff




Re: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.

2016-04-28 Thread Joern Wolfgang Rennecke



On 28/04/16 15:11, Claudiu Zissulescu wrote:

Sure thing, running for ARC700, using original implementation and enabled 
guarded code for FPX handling:

[0x02a2] 0xc000 K Zld_s   r0,[sp,0x0] : lw [0x5000c0c0] => 0x : (w1) r0 <= 0x *
[0x02a4] 0xc101 K Zld_s   r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fef : (w1) r1 <= 0x7fef *
[0x02a6] 0xc202 K Zld_s   r2,[sp,0x8] : lw [0x5000c0c8] => 0x : (w1) r2 <= 0x *
[0x02a8] 0xc303 K Zld_s   r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fef : (w1) r3 <= 0x7fef *
[0x02aa] 0x0aea K Zbl 0x2e8 : (w0) r31 <= 0x02ae *
[0x0590] 0x091d00e1 K Zbrne.d r1,r3,0x1c
[0x0594] 0x2153050c K Zbmsk   r12,r1,0x14 : (w0) r12 <= 0x000f *
[0x0598] 0x200580be K Zor.f   0,r0,r2 *
[0x059c] 0x24cf1562 K  N   bset.ner12,r12,0x15 : (w0) r12 <= 0x002f *
[0x05a0] 0x2414904c K  N   add1.f r12,r12,r1 : (w0) r12 <= 0x000d *
[0x05a4] 0x7fe0 K   C  j_s.d  [blink] *
[0x05a6] 0x20cc8086 KD  C  cmp.cc r0,r2
  
  

I see, we basically have an overflow.
I think the DPFP_COMPAT / __HS__ variant should be something like:

brne DBL0H,DBL1H,.Lhighdiff
mov_s r12,0x0020
or.f 0,DBL0L,DBL1L
bset.ne r12,r12,0

add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
j_s.d   [blink]
cmp.cc  DBL0L,DBL1L
...

Where the mov_s could be replaced with something else that loads the 
same value, depending on what instructions are supported.


Re: [PATCH] Turn some compile-time tests into run-time tests

2016-04-28 Thread Jeff Law

On 04/28/2016 08:03 AM, Patrick Palka wrote:



The rest seem OK to me.  Note that I'm not convinced all these tests were
designed to be execution tests, even though they use __builtin_abort and
friends.  Though it's a good marker of something that can/should be looked
at.


True..  What made me look into this in the first place is that I
caught myself making a similar mistake, i.e. marking an execution test
case as dg-do compile instead of dg-do run out of habit.
It's an easy mistake to make, and it's pretty low in terms of real world 
impact :-)



But I suppose it's worth looking at the context of each of these tests to
see if they were not actually intended to be execution tests.  I'll
double check this and report back; in the meantime I also found some
more tests that ought to be looked at.
I think for the set you already identified go ahead and make the 
approved changes.  We don't really lose anything by doing so.  Going 
forward we just have to continue to watch for this kind of thing 
slipping through the cracks and updating tests as mistakes are identified.


jeff


Re: [PATCH] Update gmp/mpfr/mpc in-tree versions

2016-04-28 Thread Richard Biener
On Thu, 28 Apr 2016, Bernd Edlinger wrote:

> On 28.04.2016 14:35, Richard Biener wrote:
> > On Thu, 28 Apr 2016, Bernd Edlinger wrote:
> >
> >> Hi,
> >>
> >> here is the first part of the patch that addresses only the in-tree
> >> builds.  I tried different combinations of the documented supported
> >> in-tree versions, and all combinations seem to work.
> >> Then I changed the download_prerequisites batch to pick each pre-
> >> requisite's minimum version (that part is not tested, because I have
> >> no way to update the gcc.gnu.org ftp server).
> >>
> >> Various boot-straps for x86_64-linux-gnu and armv7-linux-gnueabihf
> >> were successful.
> >>
> >> Is it OK for trunk?
> >
> > Please do not document that in-tree versions greater than XXX are
> > supported, instead just point at download_prerequesites.
> >
> 
> OK, done.
> 
> > Why do you not update to latest mpc (there is 1.0.3) and mpfr but leave
> > bugfixes for mpfr on the plate (there is 3.1.4).
> >
> 
> There's not really a good reason for that choice.
> 
> I just started with the latest version, and later moved to older
> versions, because I did not want to restrict the supported versions
> more than absolutely necessary, not even in-tree.
> 
> Are there any bug-fixes that we could depend upon?
> 
> > Does it make sense to wait for a new GMP release that allows to get
> > rid of -DNO_ASM?
> >
> 
> I was _very_ surprised that gmp-6.0.0 did at first work in-tree but
> enabled invalid assembly code, in gmp-6.0.0/mpn/generic/div_qr_1n_pi1.c
> when __arm__ or __sparc__ or __s390x__ is defined together with NO_ASM.
> 
> All in all GMP contains really much assembler code that we don't need
> at all, my impression is that it is nearly impossible to test GMP
> on every possible target, although it is all about mathematics.
> So at least some choice would be good for us.
> 
> In that sense, I would not like to restrict the supported GMP versions
> to just one version, that is not even released at this time.

Another option would be to try if mini-gmp is enough for our
(in-tree) use and what the performance impact would be if we'd
use that (in-tree).

> > I will upload mpfr 3.1.4 and mpc 1.0.3.
> >
> 
> Good.  I updated the download_prerequisites to mpfr-3.1.4 and mpc-1.0.3
> again, but left gmp-6.1.0 at the moment.

Thanks,
Richard.


Re: [PATCH 1/4] PR c++/62314: add fixit hint for missing "template <> " in explicit specialization

2016-04-28 Thread Trevor Saunders
On Thu, Apr 28, 2016 at 10:28:15AM -0400, David Malcolm wrote:
> This is a resend of a patch kit I sent in stage 3; the original post
> was here:
>   https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01933.html
> 
> I've rebased the patches against yesterday's trunk and retested them.
> 
> They add various fix-it hints to existing diagnostics (PR 62314 is a
> catch-all for adding fix-its).
> 
> The first patch in the kit adds a fix-it insertion hint for missing
> "template <> " in explicit specializations, and improves the
> reported range of the type name by capturing the full range, rather
> than just one token within it.
> 
> I note that clang (http://clang.llvm.org/diagnostics.html) suggests
> inserting
>   template<>
> whereas our diagnostic talks about
>   template <>
> hence I have the fixit suggest inserting that.  Should we change our
> wording instead, and lose the space?

Selfishly I'd prefer to lose the space on the grounds all the other
projects I work on don't put one there and gcc is inconsistent about it.
That said, assuming there are projects that put a space there, it seems
unfortunate we need to pick one which will definitely be suboptimal for
some people.

Trev


Re: [PATCH] Improve AVX512F sse4_1_round* patterns

2016-04-28 Thread Kirill Yukhin
Hi Jakub,
On 27 Apr 23:34, Jakub Jelinek wrote:
> Hi!
> 
> While AVX512F doesn't contain EVEX encoded vround{ss,sd,ps,pd} instructions,
> it contains vrndscale* which performs the same thing if bits [4:7] of the
> immediate are zero.
> 
> For _mm*_round_{ps,pd} we actually already emit vrndscale* for -mavx512f
> instead of vround* unconditionally (because the _rndscale instruction has
> the same RTL as _round and the former, enabled for TARGET_AVX512F, comes
> first).  For the scalar cases (thus __builtin_round* or
> _mm*_round_s{s,d}) the patterns we have don't allow extended registers
> and thus we end up with unnecessary moves if the inputs and/or outputs
> are or could be most effectively allocated in the xmm16+ registers.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
Your patch is OK.
> 
> 2016-04-27  Jakub Jelinek  
> 
>   * config/i386/i386.md (sse4_1_round2): Add avx512f alternative.
>   * config/i386/sse.md (sse4_1_round): Likewise.
> 
>   * gcc.target/i386/avx-vround-1.c: New test.
>   * gcc.target/i386/avx-vround-2.c: New test.
>   * gcc.target/i386/avx512vl-vround-1.c: New test.
>   * gcc.target/i386/avx512vl-vround-2.c: New test.

--
Thanks, K
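
The relationship between the two immediates described above can be
sketched in portable C.  This is only a model under stated assumptions,
not the GCC patterns or the hardware definition: it assumes imm[1:0]
selects the rounding mode (0 nearest-even, 1 down, 2 up, 3 toward zero),
that imm[7:4] of vrndscale* gives the number of fraction bits M to keep,
and it ignores imm[2] (use MXCSR) and imm[3] (suppress precision
exception).  The names rndscale_model and round_model are hypothetical.

```c
#include <assert.h>

/* Floor for values that fit comfortably in a long long (enough for a demo).  */
static double floor_small(double x)
{
    long long i = (long long)x;               /* truncates toward zero */
    return ((double)i > x) ? (double)(i - 1) : (double)i;
}

/* Rounding step selected by imm[1:0]:
   0 = nearest-even, 1 = down, 2 = up, 3 = toward zero.  */
static double round_by_mode(double x, unsigned imm)
{
    double lo = floor_small(x);
    double frac = x - lo;

    switch (imm & 3u) {
    case 1:
        return lo;
    case 2:
        return frac > 0.0 ? lo + 1.0 : lo;
    case 3:
        return (double)(long long)x;
    default:
        if (frac < 0.5)
            return lo;
        if (frac > 0.5)
            return lo + 1.0;
        /* Exact tie: pick the even neighbour.  */
        return ((long long)lo % 2 == 0) ? lo : lo + 1.0;
    }
}

/* Model of vround*: round to an integral value.  */
double round_model(double x, unsigned imm)
{
    return round_by_mode(x, imm);
}

/* Model of vrndscale*: keep M = imm[7:4] fraction bits, i.e.
   2^-M * round(2^M * x).  With M == 0 this is round_model.  */
double rndscale_model(double x, unsigned imm)
{
    unsigned m = (imm >> 4) & 0xFu;
    double scale = (double)(1u << m);
    return round_by_mode(x * scale, imm) / scale;
}
```

With bits [4:7] zero the two models agree for every rounding mode, which
is the property the patch relies on when it adds the avx512f alternative.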


Re: [PATCH] Update gmp/mpfr/mpc in-tree versions

2016-04-28 Thread Bernd Edlinger
On 28.04.2016 14:35, Richard Biener wrote:
> On Thu, 28 Apr 2016, Bernd Edlinger wrote:
>
>> Hi,
>>
>> here is the first part of the patch that addresses only the in-tree
>> builds.  I tried different combinations of the documented supported
>> in-tree versions, and all combinations seem to work.
>> Then I changed the download_prerequisites batch to pick each pre-
>> requisite's minimum version (that part is not tested, because I have
>> no way to update the gcc.gnu.org ftp server).
>>
>> Various boot-straps for x86_64-linux-gnu and armv7-linux-gnueabihf
>> were successful.
>>
>> Is it OK for trunk?
>
> Please do not document that in-tree versions greater than XXX are
> supported, instead just point at download_prerequisites.
>

OK, done.

> Why do you not update to latest mpc (there is 1.0.3) and mpfr but leave
> bugfixes for mpfr on the plate (there is 3.1.4).
>

There's not really a good reason for that choice.

I just started with the latest version, and later moved to older
versions, because I did not want to restrict the supported versions
more than absolutely necessary, not even in-tree.

Are there any bug-fixes that we could depend upon?

> Does it make sense to wait for a new GMP release that allows to get
> rid of -DNO_ASM?
>

I was _very_ surprised that gmp-6.0.0 did at first work in-tree but
enabled invalid assembly code, in gmp-6.0.0/mpn/generic/div_qr_1n_pi1.c
when __arm__ or __sparc__ or __s390x__ is defined together with NO_ASM.

All in all GMP contains really much assembler code that we don't need
at all, my impression is that it is nearly impossible to test GMP
on every possible target, although it is all about mathematics.
So at least some choice would be good for us.

In that sense, I would not like to restrict the supported GMP versions
to just one version, that is not even released at this time.

> I will upload mpfr 3.1.4 and mpc 1.0.3.
>

Good.  I updated the download_prerequisites to mpfr-3.1.4 and mpc-1.0.3
again, but left gmp-6.1.0 at the moment.


Thanks,
Bernd.
2016-04-28  Bernd Edlinger  

* configure.ac (mpfr): Remove pre-3.1.0 mpfr compatibility code.
* configure: Regenerated.
* Makefile.def (gmp): Explicitly disable assembler.
(mpfr): Adjust lib_path.
(mpc): Likewise.
* Makefile.in: Regenerated.

gcc/
2016-04-28  Bernd Edlinger  

* doc/install.texi: Document supported in-tree gmp/mpfr/mpc versions.

contrib/
2016-04-28  Bernd Edlinger  

* download_prerequisites: Adjust gmp/mpfr/mpc versions.
Index: Makefile.def
===
--- Makefile.def	(revision 235487)
+++ Makefile.def	(working copy)
@@ -50,6 +50,7 @@ host_modules= { module= gcc; bootstrap=true;
 host_modules= { module= gmp; lib_path=.libs; bootstrap=true;
 		// Work around in-tree gmp configure bug with missing flex.
 		extra_configure_flags='--disable-shared LEX="touch lex.yy.c"';
+		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true;
 		// none-*-* disables asm optimizations, bootstrap-testing
 		// the compiler more thoroughly.
@@ -57,11 +58,11 @@ host_modules= { module= gmp; lib_path=.libs; boots
 		// gmp's configure will complain if given anything
 		// different from host for target.
 	target="none-${host_vendor}-${host_os}"; };
-host_modules= { module= mpfr; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpfr; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpfr_configure_flags@';
 		extra_make_flags='AM_CFLAGS="-DNO_ASM"';
 		no_install= true; };
-host_modules= { module= mpc; lib_path=.libs; bootstrap=true;
+host_modules= { module= mpc; lib_path=src/.libs; bootstrap=true;
 		extra_configure_flags='--disable-shared @extra_mpc_gmp_configure_flags@ @extra_mpc_mpfr_configure_flags@';
 		no_install= true; };
 host_modules= { module= isl; lib_path=.libs; bootstrap=true;
Index: Makefile.in
===
--- Makefile.in	(revision 235487)
+++ Makefile.in	(working copy)
@@ -639,12 +639,12 @@ HOST_LIB_PATH_gmp = \
 
 @if mpfr
 HOST_LIB_PATH_mpfr = \
-  $$r/$(HOST_SUBDIR)/mpfr/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/.libs:
+  $$r/$(HOST_SUBDIR)/mpfr/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpfr/src/.libs:
 @endif mpfr
 
 @if mpc
 HOST_LIB_PATH_mpc = \
-  $$r/$(HOST_SUBDIR)/mpc/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/.libs:
+  $$r/$(HOST_SUBDIR)/mpc/src/.libs:$$r/$(HOST_SUBDIR)/prev-mpc/src/.libs:
 @endif mpc
 
 @if isl
@@ -11300,7 +11300,7 @@ all-gmp: configure-gmp
 	s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
 	$(HOST_EXPORTS)  \
 	(cd $(HOST_SUBDIR)/gmp && \
-	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS)  \
+	  $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) $(STAGE1_FLAGS_TO_PASS) AM_CFLAGS="-DNO_ASM" \
 		$(TARGET-gmp))
 @endif gmp
 
@@ -11329,7 +11329,7 @@ all-stage1-gmp: 

Re: [ubsan PATCH] Fix compile-time hog with &TARGET_EXPRs (PR sanitizer/70342)

2016-04-28 Thread Jakub Jelinek
On Thu, Apr 28, 2016 at 04:10:01PM +0200, Marek Polacek wrote:
> That works too, though it of course affects all users, not just ubsan.  Here's

Of course, but I think that is a good thing ;)

> the patch with your suggested change.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2016-04-28  Marek Polacek  
>   Jakub Jelinek  
> 
>   PR sanitizer/70342
>   * fold-const.c (tree_single_nonzero_warnv_p): For TARGET_EXPR, use
>   TARGET_EXPR_SLOT as a base.
> 
>   * g++.dg/ubsan/null-7.C: New test.

Ok for trunk.
For 6.2 dunno, either the same patch after a while, or perhaps your original
patch is safer (though, wonder if e.g. one can construct a testcase where it
will use instrument &(TARGET_EXPR <...>.field) nested many times and still
trigger the compile time hog with your patch).

Jakub


Re: [PATCH] Mark predicates generated by genmatch as static

2016-04-28 Thread Patrick Palka
On Thu, Apr 28, 2016 at 10:02 AM, Richard Biener
 wrote:
> On Thu, Apr 28, 2016 at 3:50 PM, Patrick Palka  wrote:
>> The predicate functions emitted by genmatch are expected to only be used
>> locally within {gimple,generic}-match.c, so this patch marks them as
>> static.  Does this look OK to commit after bootstrap and regtest?
>
> Actually the idea was to for example generate predicates in match.pd
> format for things like vectorizer pattern recog (I've done this for a few,
> need to search for (partial) patches on my disk), so they are supposed
> to be externally visible.
>
> Of course we might want to make that explicit in some way with
> a (extern_match ...) [or by prefixing local ones with a '*' ...].

Oh, I see.  That sounds useful.

>
> Richard.
>
>> gcc/ChangeLog:
>>
>> * genmatch.c (write_predicate): Mark the emitted function as
>> static.
>> ---
>>  gcc/genmatch.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
>> index ce964fa..2f5147f 100644
>> --- a/gcc/genmatch.c
>> +++ b/gcc/genmatch.c
>> @@ -3552,7 +3552,7 @@ decision_tree::gen (FILE *f, bool gimple)
>>  void
>>  write_predicate (FILE *f, predicate_id *p, decision_tree &dt, bool gimple)
>>  {
>> -  fprintf (f, "\nbool\n"
>> +  fprintf (f, "\nstatic bool\n"
>>"%s%s (tree t%s%s)\n"
>>"{\n", gimple ? "gimple_" : "tree_", p->id,
>>p->nargs > 0 ? ", tree *res_ops" : "",
>> --
>> 2.8.1.361.g2fbef4c
>>
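
The trade-off discussed above (file-local predicates versus predicates
that stay externally visible) can be sketched as a small C helper.  This
is a hypothetical illustration, not genmatch code: the function name
emit_predicate_header and its "local" flag are assumptions standing in
for whatever explicit marker match.pd might grow, and only the part of
the format string visible in the quoted hunk is reproduced.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the opening lines of a generated predicate into BUF.
   LOCAL selects "static bool" (file-local) over plain "bool"
   (externally visible).  Returns what snprintf returns.  */
int emit_predicate_header(char *buf, size_t size, const char *id,
                          int gimple, int nargs, int local)
{
    return snprintf(buf, size,
                    "\n%sbool\n%s%s (tree t%s)\n{\n",
                    local ? "static " : "",
                    gimple ? "gimple_" : "tree_", id,
                    nargs > 0 ? ", tree *res_ops" : "");
}
```

Keeping the decision in one flag means the default linkage can stay
external while individual predicates opt in to being static.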


RE: [PATCH 4/6] [ARC] Handle FPX NaN within optimized floating point library.

2016-04-28 Thread Claudiu Zissulescu
Hi,

> Where exactly does the test go wrong?

The test which fails is this one: 
TEST_EQ (double, __DBL_MAX__, __DBL_MAX__, 1);
From the test file included in the patch.

> Can you show a trace of __eqdf2 with register values?

Sure thing, running for ARC700, using original implementation and enabled 
guarded code for FPX handling:

[0x02a2] 0xc000 K Zld_s   r0,[sp,0x0] : lw [0x5000c0c0] => 0x : (w1) r0 <= 0x *
[0x02a4] 0xc101 K Zld_s   r1,[sp,0x4] : lw [0x5000c0c4] => 0x7fef : (w1) r1 <= 0x7fef *
[0x02a6] 0xc202 K Zld_s   r2,[sp,0x8] : lw [0x5000c0c8] => 0x : (w1) r2 <= 0x *
[0x02a8] 0xc303 K Zld_s   r3,[sp,0xc] : lw [0x5000c0cc] => 0x7fef : (w1) r3 <= 0x7fef *
[0x02aa] 0x0aea K Zbl 0x2e8 : (w0) r31 <= 0x02ae *
[0x0590] 0x091d00e1 K Zbrne.d r1,r3,0x1c
[0x0594] 0x2153050c K Zbmsk   r12,r1,0x14 : (w0) r12 <= 0x000f *
[0x0598] 0x200580be K Zor.f   0,r0,r2 *
[0x059c] 0x24cf1562 K  N   bset.ne r12,r12,0x15 : (w0) r12 <= 0x002f *
[0x05a0] 0x2414904c K  N   add1.f r12,r12,r1 : (w0) r12 <= 0x000d *
[0x05a4] 0x7fe0 K   C  j_s.d  [blink] *
[0x05a6] 0x20cc8086 KD  C  cmp.cc r0,r2
 
For reference, the routine:

.global __eqdf2
.balign 4
HIDDEN_FUNC(__eqdf2)
/* Good performance as long as the difference in high word is
   well predictable (as seen from the branch predictor).  */
__eqdf2:
brne.d DBL0H,DBL1H,.Lhighdiff
bmsk r12,DBL0H,20
#ifndef __HS__
/* The next two instructions are required to recognize the FPX
NaN, which has a pattern like this: 0x7ff0__8000_, as
opposed to 0x7ff8___.  */
or.f 0,DBL0L,DBL1L
bset.ne r12,r12,21
#endif /* __HS__ */
add1.f  r12,r12,DBL0H /* set c iff NaN; also, clear z if NaN.  */
j_s.d   [blink]
cmp.cc  DBL0L,DBL1L
.balign 4
.Lhighdiff:
or  r12,DBL0H,DBL1H
or.f 0,DBL0L,DBL1L
j_s.d   [blink]
bmsk.eq.f r12,r12,30
ENDFUNC(__eqdf2)

All those results were collected using nsimfree.

Please let me know if you need more info,
Claudiu
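
The failing path can be modelled in C to see why DBL_MAX is treated as a
NaN.  This is a sketch under assumptions, not the libgcc source: it
assumes the usual ARCompact semantics (bmsk a,b,n keeps bits 0..n of b;
bset a,b,n sets bit n; add1 a,b,c computes b + (c << 1) and, with .f,
sets carry on 32-bit overflow), and the function name eqdf2_nan_carry is
hypothetical.  HI is the shared high word (the brne.d is not taken),
LO_OR is DBL0L | DBL1L.

```c
#include <assert.h>
#include <stdint.h>

/* Return the carry flag produced by the NaN check in __eqdf2.
   FPX_GUARD enables the two extra instructions under #ifndef __HS__.  */
int eqdf2_nan_carry(uint32_t hi, uint32_t lo_or, int fpx_guard)
{
    /* bmsk r12,DBL0H,20: keep bits 0..20 of the high word.  */
    uint32_t r12 = hi & 0x1fffffu;

    /* or.f 0,DBL0L,DBL1L ; bset.ne r12,r12,21  */
    if (fpx_guard && lo_or != 0)
        r12 |= (uint32_t)1 << 21;

    /* add1.f r12,r12,DBL0H: r12 + (hi << 1), carry out of bit 31.  */
    uint64_t sum = (uint64_t)r12 + (uint32_t)(hi << 1);
    return (int)(sum >> 32);
}
```

In this model, DBL_MAX (high word 0x7fefffff, non-zero low word) gives
r12 = 0x002fffff after bset and a carry out of the add, so the final
cmp.cc is skipped and the values compare as unequal.  Without the guarded
instructions there is no carry for DBL_MAX.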



[AArch64] Remove an unused reload hook.

2016-04-28 Thread Matthew Wahab

Hello,

Yvan Roux pointed out that the patch at
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01713.html was never
committed.

From the original submission:

  The LEGITIMIZE_RELOAD_ADDRESS macro is only needed for reload. Since
  the Aarch64 backend no longer supports reload, this macro is not
  needed and this patch removes it.

This is a rebased and retested version of that patch.

Tested aarch64-none-linux-gnu with native bootstrap and make check.

Ok for trunk?
Matthew

gcc/
2016-04-26  Matthew Wahab  

* config/aarch64/aarch64.h (LEGITIMIZE_RELOAD_ADDRESS): Remove.
* config/aarch64/aarch64-protos.h
(aarch64_legitimize_reload_address): Remove.
* config/aarch64/aarch64.c (aarch64_legitimize_reload_address):
Remove.
---
 gcc/config/aarch64/aarch64-protos.h |   1 -
 gcc/config/aarch64/aarch64.c| 114 
 gcc/config/aarch64/aarch64.h|  15 -
 3 files changed, 130 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index f22a31c..6a8a850 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -339,7 +339,6 @@ int aarch64_simd_attr_length_move (rtx_insn *);
 int aarch64_uxt_size (int, HOST_WIDE_INT);
 int aarch64_vec_fpconst_pow_of_2 (rtx);
 rtx aarch64_final_eh_return_addr (void);
-rtx aarch64_legitimize_reload_address (rtx *, machine_mode, int, int, int);
 rtx aarch64_mask_from_zextract_ops (rtx, rtx);
 const char *aarch64_output_move_struct (rtx *operands);
 rtx aarch64_return_addr (int, rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9995494..4a1acc9 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -5022,120 +5022,6 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
   return x;
 }
 
-/* Try a machine-dependent way of reloading an illegitimate address
-   operand.  If we find one, push the reload and return the new rtx.  */
-
-rtx
-aarch64_legitimize_reload_address (rtx *x_p,
-   machine_mode mode,
-   int opnum, int type,
-   int ind_levels ATTRIBUTE_UNUSED)
-{
-  rtx x = *x_p;
-
-  /* Do not allow mem (plus (reg, const)) if vector struct mode.  */
-  if (aarch64_vect_struct_mode_p (mode)
-  && GET_CODE (x) == PLUS
-  && REG_P (XEXP (x, 0))
-  && CONST_INT_P (XEXP (x, 1)))
-{
-  rtx orig_rtx = x;
-  x = copy_rtx (x);
-  push_reload (orig_rtx, NULL_RTX, x_p, NULL,
-		   BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
-		   opnum, (enum reload_type) type);
-  return x;
-}
-
-  /* We must recognize output that we have already generated ourselves.  */
-  if (GET_CODE (x) == PLUS
-  && GET_CODE (XEXP (x, 0)) == PLUS
-  && REG_P (XEXP (XEXP (x, 0), 0))
-  && CONST_INT_P (XEXP (XEXP (x, 0), 1))
-  && CONST_INT_P (XEXP (x, 1)))
-{
-  push_reload (XEXP (x, 0), NULL_RTX, &XEXP (x, 0), NULL,
-		   BASE_REG_CLASS, GET_MODE (x), VOIDmode, 0, 0,
-		   opnum, (enum reload_type) type);
-  return x;
-}
-
-  /* We wish to handle large displacements off a base register by splitting
- the addend across an add and the mem insn.  This can cut the number of
- extra insns needed from 3 to 1.  It is only useful for load/store of a
- single register with 12 bit offset field.  */
-  if (GET_CODE (x) == PLUS
-  && REG_P (XEXP (x, 0))
-  && CONST_INT_P (XEXP (x, 1))
-  && HARD_REGISTER_P (XEXP (x, 0))
-  && mode != TImode
-  && mode != TFmode
-  && aarch64_regno_ok_for_base_p (REGNO (XEXP (x, 0)), true))
-{
-  HOST_WIDE_INT val = INTVAL (XEXP (x, 1));
-  HOST_WIDE_INT low = val & 0xfff;
-  HOST_WIDE_INT high = val - low;
-  HOST_WIDE_INT offs;
-  rtx cst;
-  machine_mode xmode = GET_MODE (x);
-
-  /* In ILP32, xmode can be either DImode or SImode.  */
-  gcc_assert (xmode == DImode || xmode == SImode);
-
-  /* Reload non-zero BLKmode offsets.  This is because we cannot ascertain
-	 BLKmode alignment.  */
-  if (GET_MODE_SIZE (mode) == 0)
-	return NULL_RTX;
-
-  offs = low % GET_MODE_SIZE (mode);
-
-  /* 

Re: [ubsan PATCH] Fix compile-time hog with &TARGET_EXPRs (PR sanitizer/70342)

2016-04-28 Thread Marek Polacek
On Thu, Apr 28, 2016 at 11:07:30AM +0200, Jakub Jelinek wrote:
> On Wed, Apr 27, 2016 at 07:03:25PM +0200, Marek Polacek wrote:
> > This test took forever to compile with -fsanitize=null, because the
> > instrumentation was creating an incredible amount of duplicated
> > expressions, in a quadratic fashion.  I think the problem is that we
> > instrument &TARGET_EXPR <> expressions, which doesn't seem to be needed
> > -- we only need to instrument the initializers in TARGET_EXPRs.  With
> > this patch, we avoid creating tons of useless expressions and the
> > compile time is reduced from ~ infinity to <1s.
> > 
> > Jakub, do you see any problem with this?
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > 2016-04-27  Marek Polacek  
> > 
> > PR sanitizer/70342
> > * c-ubsan.c (ubsan_maybe_instrument_reference_or_call): Don't
> > null-instrument &TARGET_EXPR <...>.
> > 
> > * g++.dg/ubsan/null-7.C: New test.
> 
> I wonder if this wouldn't be better handled in tree_single_nonzero_warnv_p,
> perhaps like:
> 
>  case ADDR_EXPR:
>{
>   tree base = TREE_OPERAND (t, 0);
>  
>   if (!DECL_P (base))
> base = get_base_address (base);
> +
> + if (base && TREE_CODE (base) == TARGET_EXPR)
> +   base = TARGET_EXPR_SLOT (base);
>   
>   if (!base)
> return false;
> 
> (untested)?

That works too, though it of course affects all users, not just ubsan.  Here's
the patch with your suggested change.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-04-28  Marek Polacek  
Jakub Jelinek  

PR sanitizer/70342
* fold-const.c (tree_single_nonzero_warnv_p): For TARGET_EXPR, use
TARGET_EXPR_SLOT as a base.

* g++.dg/ubsan/null-7.C: New test.

diff --git gcc/fold-const.c gcc/fold-const.c
index 96d8484..171ac83 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -13531,6 +13531,9 @@ tree_single_nonzero_warnv_p (tree t, bool *strict_overflow_p)
if (!DECL_P (base))
  base = get_base_address (base);
 
+   if (base && TREE_CODE (base) == TARGET_EXPR)
+ base = TARGET_EXPR_SLOT (base);
+
if (!base)
  return false;
 
diff --git gcc/testsuite/g++.dg/ubsan/null-7.C gcc/testsuite/g++.dg/ubsan/null-7.C
index e69de29..8284bc7 100644
--- gcc/testsuite/g++.dg/ubsan/null-7.C
+++ gcc/testsuite/g++.dg/ubsan/null-7.C
@@ -0,0 +1,24 @@
+// PR sanitizer/70342
+// { dg-do compile }
+// { dg-options "-fsanitize=null" }
+
+class A {};
+class B {
+public:
+  B(A);
+};
+class C {
+public:
+  C operator<<(B);
+};
+class D {
+  D(const int &);
+  C m_blackList;
+};
+D::D(const int &) {
+  m_blackList << A() << A() << A() << A() << A() << A() << A() << A() << A()
+  << A() << A() << A() << A() << A() << A() << A() << A() << A()
+  << A() << A() << A() << A() << A() << A() << A() << A() << A()
+  << A() << A() << A() << A() << A() << A() << A() << A() << A()
+  << A() << A() << A() << A() << A() << A() << A() << A() << A();
+}

Marek
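
The effect of the fold-const.c change can be illustrated with a toy
model in C.  This is a sketch only: the toy_tree type, the TOY_* codes
and toy_addr_known_nonzero are hypothetical stand-ins and not GCC's tree
API.  The idea it shows is the one in the hunk above: when the base of
an address is a TARGET_EXPR, look at its slot (a temporary variable),
whose address can be proven non-null, so no null check is needed.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code {
    TOY_VAR_DECL,
    TOY_TARGET_EXPR,    /* op0 is the slot variable */
    TOY_COMPONENT_REF,  /* op0 is the containing object */
    TOY_ADDR_EXPR       /* op0 is the object whose address is taken */
};

struct toy_tree {
    enum toy_code code;
    struct toy_tree *op0;
    int is_weak;        /* VAR_DECL only: address may be null */
};

/* Strip member accesses to find the underlying object.  */
static struct toy_tree *toy_base_address(struct toy_tree *t)
{
    while (t && t->code == TOY_COMPONENT_REF)
        t = t->op0;
    return t;
}

/* Return 1 if ADDR is an address known to be non-null.
   USE_SLOT models the patched behaviour.  */
int toy_addr_known_nonzero(struct toy_tree *addr, int use_slot)
{
    if (!addr || addr->code != TOY_ADDR_EXPR)
        return 0;

    struct toy_tree *base = toy_base_address(addr->op0);

    if (use_slot && base && base->code == TOY_TARGET_EXPR)
        base = base->op0;

    if (!base)
        return 0;

    return base->code == TOY_VAR_DECL && !base->is_weak;
}
```

In the model, &(TARGET_EXPR <...>.field) is only recognised as non-null
once the slot is used as the base, which is what lets the sanitizer skip
the repeated instrumentation.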


[PATCH 4/4] C: add fixit hint to misspelled field names

2016-04-28 Thread David Malcolm
Similar to the C++ case, but more involved as the location of the
pertinent token isn't readily available.  The patch adds it as a param
to build_component_ref.  All callers are updated to provide the info,
apart from objc_build_component_ref; fixing the latter would lead to
a cascade of other changes, so it's simplest to provide UNKNOWN_LOCATION
there and have build_component_ref fall back gracefully for this case
to the old behavior of showing a hint in the message, without a fixit
replacement in the source view.

This does slightly change the location of the error; before we had:

test.c:11:13: error: 'union u' has no member named 'colour'; did you mean 'color'?
   return ptr->colour;
 ^~

with the patch we have:

test.c:11:15: error: 'union u' has no member named 'colour'; did you mean 'color'?
   return ptr->colour;
   ^~
   color

I think the location change is an improvement.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/c/ChangeLog:
* c-parser.c (c_parser_postfix_expression): In __builtin_offsetof
and structure element reference, capture the location of the
element name token and pass it to build_component_ref.
(c_parser_postfix_expression_after_primary): Likewise for
structure element dereference.
(c_parser_omp_variable_list): Likewise for
OMP_CLAUSE_{_CACHE, MAP, FROM, TO}.
* c-tree.h (build_component_ref): Add location_t param.
* c-typeck.c (build_component_ref): Add location_t param
COMPONENT_LOC.  Use it, if available, when issuing hints about
misspelled member names to provide a fixit replacement hint.

gcc/objc/ChangeLog:
* objc-act.c (objc_build_component_ref): Update call
to build_component_ref for added param, passing UNKNOWN_LOCATION.

gcc/testsuite/ChangeLog:
* gcc.dg/spellcheck-fields-2.c: New test case.
---
 gcc/c/c-parser.c   | 34 +-
 gcc/c/c-tree.h |  2 +-
 gcc/c/c-typeck.c   | 26 +++
 gcc/objc/objc-act.c|  3 ++-
 gcc/testsuite/gcc.dg/spellcheck-fields-2.c | 19 +
 5 files changed, 68 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/spellcheck-fields-2.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 36c44ab..19e6772 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7707,8 +7707,9 @@ c_parser_postfix_expression (c_parser *parser)
   accept sub structure and sub array references.  */
if (c_parser_next_token_is (parser, CPP_NAME))
  {
+   c_token *comp_tok = c_parser_peek_token (parser);
offsetof_ref = build_component_ref
- (loc, offsetof_ref, c_parser_peek_token (parser)->value);
+ (loc, offsetof_ref, comp_tok->value, comp_tok->location);
c_parser_consume_token (parser);
while (c_parser_next_token_is (parser, CPP_DOT)
   || c_parser_next_token_is (parser,
@@ -7734,9 +7735,10 @@ c_parser_postfix_expression (c_parser *parser)
c_parser_error (parser, "expected identifier");
break;
  }
+   c_token *comp_tok = c_parser_peek_token (parser);
offsetof_ref = build_component_ref
- (loc, offsetof_ref,
-  c_parser_peek_token (parser)->value);
+ (loc, offsetof_ref, comp_tok->value,
+  comp_tok->location);
c_parser_consume_token (parser);
  }
else
@@ -8213,7 +8215,7 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
 {
   struct c_expr orig_expr;
   tree ident, idx;
-  location_t sizeof_arg_loc[3];
+  location_t sizeof_arg_loc[3], comp_loc;
   tree sizeof_arg[3];
   unsigned int literal_zero_mask;
   unsigned int i;
@@ -8327,7 +8329,11 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
  c_parser_consume_token (parser);
  expr = default_function_array_conversion (expr_loc, expr);
  if (c_parser_next_token_is (parser, CPP_NAME))
-   ident = c_parser_peek_token (parser)->value;
+   {
+ c_token *comp_tok = c_parser_peek_token (parser);
+ ident = comp_tok->value;
+ comp_loc = comp_tok->location;
+   }
  else
{
  c_parser_error (parser, "expected identifier");
@@ -8339,7 +8345,8 @@ c_parser_postfix_expression_after_primary (c_parser *parser,
  start = expr.get_start ();
  finish = c_parser_peek_token (parser)->get_finish ();
  c_parser_consume_token (parser);
- expr.value = build_component_ref (op_loc, expr.value, 

[PATCH 2/4] PR c++/62314: add fixit hint for "expected ';' after class definition"

2016-04-28 Thread David Malcolm
Looking over the discussion of missing semicolons in
  "Quality of Implementation and Attention to Detail"
within
  http://clang.llvm.org/diagnostics.html
and comparing with
  https://gcc.gnu.org/wiki/ClangDiagnosticsComparison
I noticed that of the cases we do handle [1], there's room for
improvement; we currently emit:

test.c:2:11: error: expected ';' after struct definition
 struct a {}
   ^

whereas clang reportedly emits:

test.c:2:12: error: expected ';' after struct
 struct a {}
^
;

(note the offset of the location, and the fix-it hint)

The following patch gives us the latter, more readable output.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

[1] I've also filed PR c++/68970 about a case given on the clang
page that we still don't handle.

gcc/cp/ChangeLog:
PR c++/62314
* parser.c (cp_parser_class_specifier_1): When reporting
missing semicolons, use a fixit-hint to suggest insertion
of a semicolon immediately after the closing brace,
offsetting the reported column accordingly.

gcc/testsuite/ChangeLog:
PR c++/62314
* g++.dg/parse/error5.C: Update column
number of missing semicolon error.
* g++.dg/pr62314-2.C: New test case.
---
 gcc/cp/parser.c | 19 ---
 gcc/testsuite/g++.dg/parse/error5.C |  2 +-
 gcc/testsuite/g++.dg/pr62314-2.C| 22 ++
 3 files changed, 39 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr62314-2.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index ff16f73..e3133d0 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -21440,17 +21440,30 @@ cp_parser_class_specifier_1 (cp_parser* parser)
closing brace.  */
 if (closing_brace && TYPE_P (type) && want_semicolon)
   {
+   /* Locate the closing brace.  */
cp_token_position prev
  = cp_lexer_previous_token_position (parser->lexer);
cp_token *prev_token = cp_lexer_token_at (parser->lexer, prev);
location_t loc = prev_token->location;
 
+   /* We want to suggest insertion of a ';' immediately *after* the
+  closing brace, so, if we can, offset the location by 1 column.  */
+   location_t next_loc = loc;
+   if (!linemap_location_from_macro_expansion_p (line_table, loc))
+ next_loc = linemap_position_for_loc_and_offset (line_table, loc, 1);
+
+   rich_location richloc (line_table, next_loc);
+   richloc.add_fixit_insert (next_loc, ";");
+
if (CLASSTYPE_DECLARED_CLASS (type))
- error_at (loc, "expected %<;%> after class definition");
+ error_at_rich_loc (&richloc,
+"expected %<;%> after class definition");
else if (TREE_CODE (type) == RECORD_TYPE)
- error_at (loc, "expected %<;%> after struct definition");
+ error_at_rich_loc (&richloc,
+"expected %<;%> after struct definition");
else if (TREE_CODE (type) == UNION_TYPE)
- error_at (loc, "expected %<;%> after union definition");
+ error_at_rich_loc (&richloc,
+"expected %<;%> after union definition");
else
  gcc_unreachable ();
 
diff --git a/gcc/testsuite/g++.dg/parse/error5.C b/gcc/testsuite/g++.dg/parse/error5.C
index eb1f9c7..d14a476 100644
--- a/gcc/testsuite/g++.dg/parse/error5.C
+++ b/gcc/testsuite/g++.dg/parse/error5.C
@@ -13,7 +13,7 @@ class Foo { int foo() return 0; } };
 // need make cp_parser_error() report more accurate column numbers.
 // { dg-error "30:expected '\{' at end of input" "brace" { target *-*-* } 4 }
 
-// { dg-error "33:expected ';' after class definition" "semicolon" {target *-*-* } 4 }
+// { dg-error "34:expected ';' after class definition" "semicolon" {target *-*-* } 4 }
 
 // { dg-error "35:expected declaration before '\}' token" "declaration" {target *-*-* } 4 }
 
diff --git a/gcc/testsuite/g++.dg/pr62314-2.C b/gcc/testsuite/g++.dg/pr62314-2.C
new file mode 100644
index 000..deb0cb7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr62314-2.C
@@ -0,0 +1,22 @@
+// { dg-options "-fdiagnostics-show-caret" }
+
+template
+class a {} // { dg-error "11: expected .;. after class definition" }
+class temp {};
+a b;
+struct b {
+} // { dg-error "2: expected .;. after struct definition" }
+
+/* Verify that we emit fixit hints.  */
+
+/* { dg-begin-multiline-output "" }
+ class a {}
+   ^
+   ;
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+ }
+  ^
+  ;
+   { dg-end-multiline-output "" } */
-- 
1.8.5.3
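
The column arithmetic and the caret/fix-it rendering described in this
patch can be sketched in standalone C.  This is an illustration under
assumptions, not the diagnostics code: fixit_insert_column and
render_insert_fixit are hypothetical helpers, columns are 1-based, and
the layout (one leading space, caret line, then the inserted text below
it) simply mimics the expected output quoted in the test case.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Column at which to suggest the ';': one past the closing brace,
   unless the location comes from a macro expansion, in which case
   the brace's own column is kept.  */
int fixit_insert_column(int brace_col, int from_macro)
{
    return from_macro ? brace_col : brace_col + 1;
}

/* Render the source line, a caret under column COL, and TEXT under
   the caret.  Returns what snprintf returns.  */
int render_insert_fixit(char *out, size_t size, const char *line,
                        int col, const char *text)
{
    int pad = col - 1;  /* spaces before the caret, after the margin */
    return snprintf(out, size, " %s\n %*s^\n %*s%s\n",
                    line, pad, "", pad, "", text);
}
```

For "struct a {}" the brace is in column 11, so the hint lands in column
12, matching the 2:11 versus 2:12 difference shown in the description.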



[PATCH 3/4] PR c++/62314: C++: add fixit hint to misspelled member names

2016-04-28 Thread David Malcolm
When we emit a hint about a misspelled member name, it will slightly
aid readability if we use a fixit-hint to show the proposed
name in context within the source code (and in the future this
might support some kind of auto-apply in an IDE).

This patch adds such a hint to the C++ frontend, taking us from:

test.cc:10:15: error: 'struct foo' has no member named 'colour'; did you mean 'color'?
   return ptr->colour;
   ^~

to:

test.cc:10:15: error: 'struct foo' has no member named 'colour'; did you mean 'color'?
   return ptr->colour;
   ^~
   color

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/62314
* typeck.c (finish_class_member_access_expr): When
giving a hint about a possibly-misspelled member name,
add a fix-it replacement hint.

gcc/testsuite/ChangeLog:
PR c++/62314
* g++.dg/spellcheck-fields-2.C: New test case.
---
 gcc/cp/typeck.c| 18 +++---
 gcc/testsuite/g++.dg/spellcheck-fields-2.C | 19 +++
 2 files changed, 34 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/spellcheck-fields-2.C

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 7e12009..95c777d 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -2817,9 +2817,21 @@ finish_class_member_access_expr (cp_expr object, tree name, bool template_p,
  tree guessed_id = lookup_member_fuzzy (access_path, name,
 /*want_type=*/false);
  if (guessed_id)
-   error ("%q#T has no member named %qE; did you mean %qE?",
-  TREE_CODE (access_path) == TREE_BINFO
-  ? TREE_TYPE (access_path) : object_type, name, guessed_id);
+   {
+ location_t bogus_component_loc = input_location;
+ rich_location rich_loc (line_table, bogus_component_loc);
+ source_range bogus_component_range =
+   get_range_from_loc (line_table, bogus_component_loc);
+ rich_loc.add_fixit_replace
+   (bogus_component_range,
+IDENTIFIER_POINTER (guessed_id));
+ error_at_rich_loc
+   (_loc,
+"%q#T has no member named %qE; did you mean %qE?",
+TREE_CODE (access_path) == TREE_BINFO
+? TREE_TYPE (access_path) : object_type, name,
+guessed_id);
+   }
  else
error ("%q#T has no member named %qE",
   TREE_CODE (access_path) == TREE_BINFO
diff --git a/gcc/testsuite/g++.dg/spellcheck-fields-2.C b/gcc/testsuite/g++.dg/spellcheck-fields-2.C
new file mode 100644
index 000..eb10b44
--- /dev/null
+++ b/gcc/testsuite/g++.dg/spellcheck-fields-2.C
@@ -0,0 +1,19 @@
+// { dg-options "-fdiagnostics-show-caret" }
+
+union u
+{
+  int color;
+  int shape;
+};
+
+int test (union u *ptr)
+{
+  return ptr->colour; // { dg-error "did you mean .color.?" }
+}
+
+// Verify that we get an underline and a fixit hint.
+/* { dg-begin-multiline-output "" }
+   return ptr->colour;
+   ^~
+   color
+   { dg-end-multiline-output "" } */
-- 
1.8.5.3
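
How a "did you mean" candidate such as 'color' might be chosen for
'colour' can be sketched with a plain Levenshtein distance in C.  This
is an assumption-laden illustration, not the lookup_member_fuzzy
implementation: edit_distance and best_match are hypothetical names, and
the cut-off max_dist is an arbitrary parameter chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Levenshtein distance between A and B using one rolling row.
   Returns (size_t)-1 if B is too long for the fixed buffer.  */
size_t edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t row[64];

    if (lb >= 64)
        return (size_t)-1;

    for (size_t j = 0; j <= lb; j++)
        row[j] = j;

    for (size_t i = 1; i <= la; i++) {
        size_t diag = row[0];          /* value of row[i-1][j-1] */
        row[0] = i;
        for (size_t j = 1; j <= lb; j++) {
            size_t up = row[j];        /* value of row[i-1][j] */
            size_t cost = diag + (a[i - 1] != b[j - 1]);
            if (row[j - 1] + 1 < cost)
                cost = row[j - 1] + 1; /* insertion */
            if (up + 1 < cost)
                cost = up + 1;         /* deletion */
            row[j] = cost;
            diag = up;
        }
    }
    return row[lb];
}

/* Return the candidate closest to NAME, or NULL if none is within
   MAX_DIST edits.  */
const char *best_match(const char *name, const char *const *cands,
                       size_t n, size_t max_dist)
{
    const char *best = NULL;
    size_t best_d = max_dist + 1;

    for (size_t i = 0; i < n; i++) {
        size_t d = edit_distance(name, cands[i]);
        if (d < best_d) {
            best_d = d;
            best = cands[i];
        }
    }
    return best;
}
```

The returned string is what a fix-it replacement hint would print under
the misspelled member name.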



Re: [PATCH] Turn some compile-time tests into run-time tests

2016-04-28 Thread Patrick Palka
On Wed, Apr 27, 2016 at 5:36 PM, Jeff Law  wrote:
> On 03/10/2016 04:38 PM, Patrick Palka wrote:
>>
>> I ran the command
>>
> >>   git grep -l "dg-do compile" | xargs grep -l __builtin_abort | xargs grep -lw main
> >>
> >> to find tests marked as compile-time tests that likely ought to instead
> >> be marked as run-time tests, by the rationale that they use
> >> __builtin_abort and they also define main().  (I also then confirmed that
> >> they compile, link and run cleanly on my machine.)
>>
>> After this patch, the remaining test files reported by the above command
>> are:
>>
>>   These do not define all the functions they use:
>> gcc/testsuite/g++.dg/ipa/devirt-41.C
>> gcc/testsuite/g++.dg/ipa/devirt-44.C
>> gcc/testsuite/g++.dg/ipa/devirt-45.C
>> gcc/testsuite/gcc.target/i386/pr55672.c
>>
>>   These are non-x86 tests so I can't confirm that they run cleanly:
>> gcc/testsuite/gcc.target/arm/pr58041.c
>> gcc/testsuite/gcc.target/powerpc/pr35907.c
>> gcc/testsuite/gcc.target/s390/dwarfregtable-1.c
>> gcc/testsuite/gcc.target/s390/dwarfregtable-2.c
>> gcc/testsuite/gcc.target/s390/dwarfregtable-3.c
>>
>>   These use dg-error:
>> libstdc++-v3/testsuite/20_util/forward/c_neg.cc
>> libstdc++-v3/testsuite/20_util/forward/f_neg.cc
>>
>> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK to
>> commit?  Does anyone have another heuristic one can use to help find
>> these kinds of typos?
>>
>> gcc/testsuite/ChangeLog:
>>
>> * g++.dg/cpp0x/constexpr-aggr2.C: Make it a run-time test.
>> * g++.dg/cpp0x/nullptr32.C: Likewise.
>> * g++.dg/cpp1y/digit-sep-cxx11-neg.C: Likewise.
>> * g++.dg/cpp1y/digit-sep.C: Likewise.
>> * g++.dg/ext/flexary13.C: Likewise.
>> * gcc.dg/alias-14.c: Likewise.
>> * gcc.dg/ipa/PR65282.c: Likewise.
>> * gcc.dg/pr69644.c: Likewise.
>> * gcc.dg/tree-ssa/pr38533.c: Likewise.
>> * gcc.dg/tree-ssa/pr61385.c: Likewise.
>
> My worry with the 38533 test is that while the ASM defines "f" from the
> standpoint of dataflow, it does not actually emit any code to ensure "f" is
> actually defined.  This could lead to spurious aborts due to use of an
> uninitialized value at runtime.  Similarly for alias-14.c
>
> I'd be worried that we don't necessarily have sync_bool_compare_and_swap on
> all targets for 69644.

Ah yeah, good points..

>
> flexary13.C probably won't link on a cross target unless the cross libraries
> are available.  But that's probably OK.
>
> The rest seem OK to me.  Note that I'm not convinced all these tests were
> designed to be execution tests, even though they use __builtin_abort and
> friends.  Though it's a good marker of something that can/should be looked
> at.

True..  What made me look into this in the first place is that I
caught myself making a similar mistake, i.e. marking an execution test
case as dg-do compile instead of dg-do run out of habit.  But I
suppose it's worth looking at the context of each of these tests to
see if they were not actually intended to be execution tests.  I'll
double check this and report back; in the meantime I also found some
more tests that ought to be looked at.
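
To make the distinction concrete, here is a minimal sketch of what an
execution test looks like.  The names sum_to and run_checks are
hypothetical and not taken from any of the files listed above;
run_checks stands in for the main () a real test would define.  Under
"dg-do compile" the __builtin_abort call would only be compiled, so a
wrong result could never be noticed; under "dg-do run" the test is also
linked and executed.

```cpp
// { dg-do run }
// Illustrative sketch only: sum_to and run_checks are hypothetical names,
// not taken from any of the test files listed in this thread.

// Sum of the integers 1..n, computed with a plain loop.
static int
sum_to (int n)
{
  int total = 0;
  for (int i = 1; i <= n; i++)
    total += i;
  return total;
}

// Plays the role of main () in a real test: abort on a wrong result,
// return 0 otherwise.  The abort can only fire if the test is executed.
int
run_checks ()
{
  if (sum_to (10) != 55)
    __builtin_abort ();
  return 0;
}
```

In an actual testsuite file run_checks would simply be main ().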

>
>
> jeff
>


[PATCH 1/4] PR c++/62314: add fixit hint for missing "template <> " in explicit specialization

2016-04-28 Thread David Malcolm
This is a resend of a patch kit I sent in stage 3; the original post
was here:
  https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01933.html

I've rebased the patches against yesterday's trunk and retested them.

They add various fix-it hints to existing diagnostics (PR 62314 is a
catch-all for adding fix-its).

The first patch in the kit adds a fix-it insertion hint for missing
"template <> " in explicit specializations, and improves the
reported range of the type name by capturing the full range, rather
than just one token within it.

I note that clang (http://clang.llvm.org/diagnostics.html) suggests
inserting
  template<>
whereas our diagnostic talks about
  template <>
hence I have the fixit suggest inserting that.  Should we change our
wording instead, and lose the space?
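
For readers unfamiliar with the construct, the following sketch shows
the corrected form that the fix-it hint leads to.  The type names mirror
the new test case; the is_specialized member is an assumption added here
only so that the effect of the specialization can be observed.

```cpp
// Primary template.  The is_specialized member is hypothetical and exists
// only to make the selected definition observable.
template <typename T>
struct iterator_traits
{
  static constexpr bool is_specialized = false;
};

struct file_iterator;

// Writing "struct iterator_traits<file_iterator> { ... };" here without the
// leading "template <>" is what triggers the diagnostic.  With the prefix
// inserted, this is a valid explicit specialization.
template <>
struct iterator_traits<file_iterator>
{
  static constexpr bool is_specialized = true;
};
```

Any other instantiation, for example iterator_traits<int>, still uses
the primary template.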

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/62314
* parser.c (cp_parser_class_head): Capture the start location;
use it to emit a fix-it insertion hint when complaining
about missing "template <> " in explicit specializations.

gcc/testsuite/ChangeLog:
PR c++/62314
* g++.dg/pr62314.C: New test case.
---
 gcc/cp/parser.c| 18 --
 gcc/testsuite/g++.dg/pr62314.C | 17 +
 2 files changed, 33 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr62314.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 98a0cd4..ff16f73 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -21655,6 +21655,8 @@ cp_parser_class_head (cp_parser* parser,
   if (class_key == none_type)
 return error_mark_node;
 
+  location_t class_head_start_location = input_location;
+
   /* Parse the attributes.  */
   attributes = cp_parser_attributes_opt (parser);
 
@@ -21871,8 +21873,20 @@ cp_parser_class_head (cp_parser* parser,
   && parser->num_template_parameter_lists == 0
   && template_id_p)
 {
-  error_at (type_start_token->location,
-   "an explicit specialization must be preceded by %<template <>%>");
+  /* Build a location of this form:
+   struct typename 
+   ^~
+ with caret==start at the start token, and
+ finishing at the end of the type.  */
+  location_t reported_loc
+= make_location (class_head_start_location,
+ class_head_start_location,
+ get_finish (type_start_token->location));
+  rich_location richloc (line_table, reported_loc);
+  richloc.add_fixit_insert (class_head_start_location, "template <> ");
+  error_at_rich_loc
+(&richloc,
+ "an explicit specialization must be preceded by %<template <>%>");
   invalid_explicit_specialization_p = true;
   /* Take the same action that would have been taken by
 cp_parser_explicit_specialization.  */
diff --git a/gcc/testsuite/g++.dg/pr62314.C b/gcc/testsuite/g++.dg/pr62314.C
new file mode 100644
index 000..ebe75ec
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr62314.C
@@ -0,0 +1,17 @@
+// { dg-options "-fdiagnostics-show-caret" }
+
+template <typename T>
+struct iterator_traits {};
+
+struct file_iterator;
+
+struct iterator_traits<file_iterator> { // { dg-error "explicit specialization must be preceded by .template" }
+};
+
+/* Verify that we emit a fixit hint for this case.  */
+
+/* { dg-begin-multiline-output "" }
+ struct iterator_traits
+ ^
+ template <> 
+   { dg-end-multiline-output "" } */
-- 
1.8.5.3



Re: [PATCH] Mark predicates generated by genmatch as static

2016-04-28 Thread Richard Biener
On Thu, Apr 28, 2016 at 3:50 PM, Patrick Palka  wrote:
> The predicate functions emitted by genmatch are expected to only be used
> locally within {gimple,generic}-match.c, so this patch marks them as
> static.  Does this look OK to commit after bootstrap and regtest?

Actually the idea was to, for example, generate predicates in match.pd
format for things like vectorizer pattern recognition (I've done this for
a few; I need to search for the (partial) patches on my disk), so they are
supposed to be externally visible.

Of course we might want to make that explicit in some way with
a (extern_match ...) [or by prefixing local ones with a '*' ...].

Richard.

> gcc/ChangeLog:
>
> * genmatch.c (write_predicate): Mark the emitted function as
> static.
> ---
>  gcc/genmatch.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/genmatch.c b/gcc/genmatch.c
> index ce964fa..2f5147f 100644
> --- a/gcc/genmatch.c
> +++ b/gcc/genmatch.c
> @@ -3552,7 +3552,7 @@ decision_tree::gen (FILE *f, bool gimple)
>  void
>  write_predicate (FILE *f, predicate_id *p, decision_tree , bool gimple)
>  {
> -  fprintf (f, "\nbool\n"
> +  fprintf (f, "\nstatic bool\n"
>"%s%s (tree t%s%s)\n"
>"{\n", gimple ? "gimple_" : "tree_", p->id,
>p->nargs > 0 ? ", tree *res_ops" : "",
> --
> 2.8.1.361.g2fbef4c
>


Re: [PATCH] Fixup nb_iterations_upper_bound adjustment for vectorized loops

2016-04-28 Thread Richard Biener
On Thu, Apr 28, 2016 at 3:26 PM, Ilya Enkovich  wrote:
> On 27 Apr 16:05, Richard Biener wrote:
>> >>
>> >> I'd like to see testcases covering the corner-cases - have them have
>> >> upper bound estimates by adjusting known array sizes and also cover
>> >> the case of peeling for gaps.
>> >
>> > OK, I'll make more tests.
>> > Thanks,
>> > Ilya
>> >
>> >>
>> >> Richard.
>> >>
>
> Could you please look at new tests?  I added one simple case with
> known array size and similar tests with a peeling for gaps w/ and
> w/o vector iteration peeled.
>
> Checked new tests with RUNTESTFLAGS="vect.exp=vect-nb-iter-ub-* --target_board=unix{-m32,}"
> on x86_64-pc-linux-gnu.  OK for trunk?

Can you make the new testcases runtime ones, thus check that the
vectorized outcome is ok (so we don't forget any trailing iterations)?

Ok with that change.
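
A rough sketch of what such a run-time variant could look like, based on
the arrays and loop of vect-nb-iter-ub-1.c.  This is not the committed
test: run_check is a hypothetical driver standing in for main (), and
the dg-final scans of the original are omitted.

```cpp
/* { dg-do run } */
/* Sketch only: run_check is a hypothetical driver, and the loop bound of
   127 is taken from the array sizes in vect-nb-iter-ub-1.c.  */

int ii[127];
char cc[127];

void __attribute__ ((noinline))
foo (int s)
{
  int i;
  for (i = 0; i < s; i++)
    ii[i] = (int) cc[i];
}

/* Plays the role of main (): fill the input, run the loop that is expected
   to be vectorized, and verify every element, including the trailing ones
   that a wrong upper bound could drop.  */
int
run_check ()
{
  int i;
  for (i = 0; i < 127; i++)
    cc[i] = (char) i;
  foo (127);
  for (i = 0; i < 127; i++)
    if (ii[i] != i)
      __builtin_abort ();
  return 0;
}
```

Checking all 127 elements is the point: if the vectorized loop were cut
short, the last few entries of ii would keep their initial value of zero.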

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-04-28  Ilya Enkovich  
>
> * tree-vect-loop.c (vect_transform_loop): Fix
> nb_iterations_upper_bound computation for vectorized loop.
>
> gcc/testsuite/
>
> 2016-04-28  Ilya Enkovich  
>
> * gcc.target/i386/vect-unpack-2.c (avx512bw_test): Avoid
> optimization of vector loop.
> * gcc.target/i386/vect-unpack-3.c: New test.
> * gcc.dg/vect/vect-nb-iter-ub-1.c: New test.
> * gcc.dg/vect/vect-nb-iter-ub-2.c: New test.
> * gcc.dg/vect/vect-nb-iter-ub-3.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c
> new file mode 100644
> index 000..b7504a8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target 
> { i?86-*-* x86_64-*-* } } } */
> +
> +int ii[127];
> +char cc[127];
> +
> +void
> +foo (int s)
> +{
> +  int i;
> +   for (i = 0; i < s; i++)
> + ii[i] = (int) cc[i];
> +}
> +
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { 
> i?86-*-* x86_64-*-* } } } } */
> +/* { dg-final { scan-tree-dump "loop turned into non-loop; it never loops" 
> "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c 
> b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c
> new file mode 100644
> index 000..5332636
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target 
> { i?86-*-* x86_64-*-* } } } */
> +
> +int ii[128];
> +char cc[256];
> +
> +void
> +foo (int s)
> +{
> +  int i;
> +   for (i = 0; i < s; i++)
> + ii[i] = (int) cc[i*2];
> +}
> +
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { 
> i?86-*-* x86_64-*-* } } } } */
> +/* { dg-final { scan-tree-dump "loop turned into non-loop; it never loops" 
> "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c 
> b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c
> new file mode 100644
> index 000..5610f6a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target 
> { i?86-*-* x86_64-*-* } } } */
> +
> +int ii[130];
> +char cc[258];
> +
> +void
> +foo (int s)
> +{
> +  int i;
> +   for (i = 0; i < s; i++)
> + ii[i] = (int) cc[i*2];
> +}
> +
> +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { 
> i?86-*-* x86_64-*-* } } } } */
> +/* { dg-final { scan-tree-dump-not "loop turned into non-loop; it never 
> loops" "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
> diff --git a/gcc/testsuite/gcc.target/i386/vect-unpack-2.c 
> b/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
> index 4825248..51c518e 100644
> --- a/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
> +++ b/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
> @@ -6,19 +6,22 @@
>
>  #define N 120
>  signed int yy[1];
> +signed char zz[1];
>
>  void
> -__attribute__ ((noinline)) foo (signed char s)
> +__attribute__ ((noinline,noclone)) foo (int s)
>  {
> -   signed char i;
> +   int i;
> for (i = 0; i < s; i++)
> - yy[i] = (signed int) i;
> + yy[i] = zz[i];
>  }
>
>  void
>  avx512bw_test ()
>  {
>signed char i;
> +  for (i = 0; i < N; i++)
> +zz[i] = i;
>foo (N);
>for (i = 0; i < N; i++)
>  if ( (signed int)i != yy [i] )
> diff --git a/gcc/testsuite/gcc.target/i386/vect-unpack-3.c 
> b/gcc/testsuite/gcc.target/i386/vect-unpack-3.c
> new file mode 100644
> index 000..eb8a93e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/vect-unpack-3.c
> @@ -0,0 +1,29 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -fdump-tree-vect-details -ftree-vectorize -ffast-math 
> -mavx512bw -save-temps" } */
> +/* 

[PATCH] Do not build a pointer-to-element type for arrays in layout_type

2016-04-28 Thread Richard Biener

I stumbled over this odd call, present since CVS repo import (r348).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2016-04-28  Richard Biener  

* stor-layout.c (layout_type): Do not build a pointer-to-element
type for arrays.

Index: gcc/stor-layout.c
===
--- gcc/stor-layout.c   (revision 235557)
+++ gcc/stor-layout.c   (working copy)
@@ -2243,8 +2243,6 @@ layout_type (tree type)
tree index = TYPE_DOMAIN (type);
tree element = TREE_TYPE (type);
 
-   build_pointer_type (element);
-
/* We need to know both bounds in order to compute the size.  */
if (index && TYPE_MAX_VALUE (index) && TYPE_MIN_VALUE (index)
&& TYPE_SIZE (element))


[PATCH] Mark predicates generated by genmatch as static

2016-04-28 Thread Patrick Palka
The predicate functions emitted by genmatch are expected to only be used
locally within {gimple,generic}-match.c, so this patch marks them as
static.  Does this look OK to commit after bootstrap and regtest?

gcc/ChangeLog:

* genmatch.c (write_predicate): Mark the emitted function as
static.
---
 gcc/genmatch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index ce964fa..2f5147f 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -3552,7 +3552,7 @@ decision_tree::gen (FILE *f, bool gimple)
 void
 write_predicate (FILE *f, predicate_id *p, decision_tree , bool gimple)
 {
-  fprintf (f, "\nbool\n"
+  fprintf (f, "\nstatic bool\n"
   "%s%s (tree t%s%s)\n"
   "{\n", gimple ? "gimple_" : "tree_", p->id,
   p->nargs > 0 ? ", tree *res_ops" : "",
-- 
2.8.1.361.g2fbef4c



[PATCH][internal-fn.c][committed] Convert conditional compilation on WORD_REGISTER_OPERATIONS

2016-04-28 Thread Kyrill Tkachov

Hi all,

This is another instance of conditional compilation on WORD_REGISTER_OPERATIONS 
that's trivial to remove.

Bootstrapped and tested on arm, aarch64, x86_64.
Committing to trunk as obvious.
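
As a standalone illustration of the conversion, the sketch below puts
the two styles side by side.  The macro values are stand-ins chosen for
the example (in GCC they come from the target headers), and the two
function names are hypothetical.  The runtime form assumes the macro is
always defined to 0 or 1, and it has the advantage that both arms are
always parsed and type-checked.

```cpp
// Stand-in values for illustration; in GCC these come from the target
// headers.  The runtime form requires the macro to always be defined.
#define WORD_REGISTER_OPERATIONS 0
#define BITS_PER_WORD 64

// Old style: only one arm is ever seen by the compiler.
constexpr int
precision_preprocessor (int precop)
{
#if WORD_REGISTER_OPERATIONS
  return BITS_PER_WORD;
#else
  return precop;
#endif
}

// New style: both arms are compiled; the constant condition is folded away.
constexpr int
precision_runtime (int precop)
{
  return WORD_REGISTER_OPERATIONS ? BITS_PER_WORD : precop;
}
```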

Thanks,
Kyrill

2016-04-28  Kyrylo Tkachov  

* internal-fn.c (expand_arith_overflow): Convert preprocessor check
for WORD_REGISTER_OPERATIONS to runtime check.
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 3ceaffe67eaa694afe35de8f7a13a182c46f05ff..2cbe198924c7b3aed34d52c0cc35612e09f5646c 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1807,11 +1807,7 @@ expand_arith_overflow (enum tree_code code, gimple *stmt)
   /* For sub-word operations, retry with a wider type first.  */
   if (orig_precres == precres && precop <= BITS_PER_WORD)
 	{
-#if WORD_REGISTER_OPERATIONS
-	  int p = BITS_PER_WORD;
-#else
-	  int p = precop;
-#endif
+	  int p = WORD_REGISTER_OPERATIONS ? BITS_PER_WORD : precop;
 	  enum machine_mode m = smallest_mode_for_size (p, MODE_INT);
 	  tree optype = build_nonstandard_integer_type (GET_MODE_PRECISION (m),
 			uns0_p && uns1_p


Re: [PATCH] Fixup nb_iterations_upper_bound adjustment for vectorized loops

2016-04-28 Thread Ilya Enkovich
On 27 Apr 16:05, Richard Biener wrote:
> >>
> >> I'd like to see testcases covering the corner-cases - have them have
> >> upper bound estimates by adjusting known array sizes and also cover
> >> the case of peeling for gaps.
> >
> > OK, I'll make more tests.
> > Thanks,
> > Ilya
> >
> >>
> >> Richard.
> >>

Could you please look at new tests?  I added one simple case with
known array size and similar tests with a peeling for gaps w/ and
w/o vector iteration peeled.

Checked new tests with RUNTESTFLAGS="vect.exp=vect-nb-iter-ub-* --target_board=unix{-m32,}"
on x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-04-28  Ilya Enkovich  

* tree-vect-loop.c (vect_transform_loop): Fix
nb_iterations_upper_bound computation for vectorized loop.

gcc/testsuite/

2016-04-28  Ilya Enkovich  

* gcc.target/i386/vect-unpack-2.c (avx512bw_test): Avoid
optimization of vector loop.
* gcc.target/i386/vect-unpack-3.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-1.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-2.c: New test.
* gcc.dg/vect/vect-nb-iter-ub-3.c: New test.


diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c
new file mode 100644
index 000..b7504a8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target { i?86-*-* x86_64-*-* } } } */
+
+int ii[127];
+char cc[127];
+
+void
+foo (int s)
+{
+  int i;
+   for (i = 0; i < s; i++)
+ ii[i] = (int) cc[i];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "loop turned into non-loop; it never loops" "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c
new file mode 100644
index 000..5332636
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-2.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target { i?86-*-* x86_64-*-* } } } */
+
+int ii[128];
+char cc[256];
+
+void
+foo (int s)
+{
+  int i;
+   for (i = 0; i < s; i++)
+ ii[i] = (int) cc[i*2];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump "loop turned into non-loop; it never loops" "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c
new file mode 100644
index 000..5610f6a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-nb-iter-ub-3.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx512bw -fdump-tree-cunroll-details" { target { i?86-*-* x86_64-*-* } } } */
+
+int ii[130];
+char cc[258];
+
+void
+foo (int s)
+{
+  int i;
+   for (i = 0; i < s; i++)
+ ii[i] = (int) cc[i*2];
+}
+
+/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { target { i?86-*-* x86_64-*-* } } } } */
+/* { dg-final { scan-tree-dump-not "loop turned into non-loop; it never loops" "cunroll" { target { i?86-*-* x86_64-*-* } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/vect-unpack-2.c b/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
index 4825248..51c518e 100644
--- a/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
+++ b/gcc/testsuite/gcc.target/i386/vect-unpack-2.c
@@ -6,19 +6,22 @@
 
 #define N 120
 signed int yy[1];
+signed char zz[1];
 
 void
-__attribute__ ((noinline)) foo (signed char s)
+__attribute__ ((noinline,noclone)) foo (int s)
 {
-   signed char i;
+   int i;
for (i = 0; i < s; i++)
- yy[i] = (signed int) i;
+ yy[i] = zz[i];
 }
 
 void
 avx512bw_test ()
 {
   signed char i;
+  for (i = 0; i < N; i++)
+zz[i] = i;
   foo (N);
   for (i = 0; i < N; i++)
 if ( (signed int)i != yy [i] )
diff --git a/gcc/testsuite/gcc.target/i386/vect-unpack-3.c b/gcc/testsuite/gcc.target/i386/vect-unpack-3.c
new file mode 100644
index 000..eb8a93e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vect-unpack-3.c
@@ -0,0 +1,29 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -fdump-tree-vect-details -ftree-vectorize -ffast-math -mavx512bw -save-temps" } */
+/* { dg-require-effective-target avx512bw } */
+
+#include "avx512bw-check.h"
+
+#define N 120
+signed int yy[1];
+
+void
+__attribute__ ((noinline)) foo (signed char s)
+{
+   signed char i;
+   for (i = 0; i < s; i++)
+ yy[i] = (signed int) i;
+}
+
+void
+avx512bw_test ()
+{
+  signed char i;
+  foo (N);
+  for (i = 0; i < N; i++)
+if ( (signed int)i != yy [i] )
+  abort ();
+}
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+/* { dg-final { scan-assembler-not "vpmovsxbw\[ \\t\]+\[^\n\]*%zmm" } } */
diff 

RE: [PATCH][SMS] SMS use loop induction variable analysis instead of depending on doloop optimization

2016-04-28 Thread Shiva Chen
Hi, 

I fixed some bugs so that testing passes on x86-64 and updated the patch
as 0001-SMS-use-loop-induction-variable-analysis-v1.patch.

Thanks,
Shiva

-Original Message-
From: Shiva Chen 
Sent: Thursday, April 28, 2016 2:07 PM
To: GCC Patches ; Shiva Chen 
Subject: [PATCH][SMS] SMS use loop induction variable analysis instead of 
depending on doloop optimization

Hi, 

Following Richard's suggestion in
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01240.html
I am trying to remove the SMS dependency on the doloop pass.

SMS needs to adjust the kernel loop iteration count during the
transformation.

To adjust the loop iteration count, SMS needs to find count_reg, which
contains the loop iteration count, and then generate an adjustment
instruction.

Currently, SMS looks for the doloop_end pattern to get count_reg, and
generates the adjustment instruction according to the doloop optimization
result (the loop is transformed to count down to zero with step = 1).

If it can't find the doloop_end pattern, or the loop is not in the form
that the doloop optimization produces, SMS skips the loop.

Doloop optimization can benefit some targets even if the target doesn't
support a special loop instruction.

E.g. for ARM, doloop optimization can transform the instructions into
subs and branch, which saves the comparison instruction.

However, if the loop iteration count computation of the doloop
optimization is too complicated, performance drops.
(The default value of PARAM_MAX_ITERATIONS_COMPUTATION_COST is 10, which
may be too high for targets that do not support a special loop
instruction.)

This kind of loop is not suitable for doloop optimization, so SMS can't be
activated for it.

To free SMS from its dependency on doloop optimization, I use loop
induction variable analysis to find count_reg and generate the kernel loop
adjustment instruction for loop forms without doloop optimization
(increment/decrement loops with step != 1).
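
To illustrate why a step other than 1 matters, here is a small sketch of
the arithmetic for the iteration count of an incrementing loop.  It is
not code from the patch, which works on RTL through the induction
variable analysis in loop-iv.c; iteration_count is a hypothetical helper
and it assumes a positive step and no overflow.

```cpp
// Number of iterations of
//   for (i = start; i < bound; i += step)
// assuming step > 0 and no arithmetic overflow.  Hypothetical helper, not
// part of the patch.
constexpr long
iteration_count (long start, long bound, long step)
{
  // Round the distance up to a whole number of steps.
  return start >= bound ? 0 : (bound - start + step - 1) / step;
}
```

For example, a loop from 0 to 10 with step 3 visits i = 0, 3, 6, 9, so
the count is 4, whereas the doloop form would simply count 4, 3, 2, 1
down to zero.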

Without doloop optimization, the induction variable could be a
POST_INC/POST_DEC/PRE_INC/PRE_DEC in a memory reference, which the current
implementation won't identify as a loop iv. So I modified the relevant
code in loop-iv.c to identify this case.

With the patch, a backend target can activate SMS without defining a
doloop_end pattern.

Could anyone help review the patch?
Any suggestions would be very helpful.

Thanks,
Shiva



0001-SMS-use-loop-induction-variable-analysis-v1.patch
Description: 0001-SMS-use-loop-induction-variable-analysis-v1.patch

