date:20141210

Re: [PATCH] Fix a typo in range_entry_cmp (PR tree-optimization/61686)

2014-12-10 Thread Marek Polacek

On Wed, Dec 10, 2014 at 08:59:09AM +0100, Jakub Jelinek wrote:
 On Wed, Dec 10, 2014 at 07:57:46AM +0100, Marek Polacek wrote:
  I don't really know this code, but this typo looks obvious enough.
  Using if (p->high != NULL_TREE) ... else if (p->high != NULL_TREE)
  couldn't possibly be desired, so use Q in the else branch, as in
  the code slightly above.
  
  Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?
  
  2014-12-10  Marek Polacek  pola...@redhat.com
  
  PR tree-optimization/61686
  * tree-ssa-reassoc.c (range_entry_cmp): Use q->high instead of
  p->high.
 
 Ok for trunk/4.9/4.8.  Shouldn't we have a FE warning for this kind of thing?
 I mean
   if (conditionX)
     {
     }
   else if (conditionY)
     ...
 when the two conditions don't have side-effects and are operand_equal_p?

Yes, we should, I'll file a PR.  Not sure whether such a warning is
stage 3 material.

Thanks,

Marek
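
For readers unfamiliar with this bug class, here is a minimal sketch of the typo pattern discussed above. It is not the GCC code: the `struct range` type, the `cmp_int` helper and the `cmp_buggy`/`cmp_fixed` names are hypothetical, and a NULL `high` pointer merely stands in for a missing upper bound.

```c
#include <stddef.h>

/* Hypothetical stand-in for a range entry; a NULL HIGH means
   "no upper bound".  Not the GCC data structure.  */
struct range
{
  const int *high;
};

/* Three-way comparison of two ints.  */
static int
cmp_int (int a, int b)
{
  return (a > b) - (a < b);
}

/* Buggy shape: the else-if repeats the first condition, so that
   branch is dead and q->high is never consulted there.  */
static int
cmp_buggy (const struct range *p, const struct range *q)
{
  if (p->high != NULL)
    {
      if (q->high != NULL)
        return cmp_int (*p->high, *q->high);
      return -1; /* Bounded P sorts before unbounded Q.  */
    }
  else if (p->high != NULL) /* Typo: can never be true here.  */
    return 1;
  return 0;
}

/* Fixed shape: the else branch tests Q, mirroring the branch above.  */
static int
cmp_fixed (const struct range *p, const struct range *q)
{
  if (p->high != NULL)
    {
      if (q->high != NULL)
        return cmp_int (*p->high, *q->high);
      return -1;
    }
  else if (q->high != NULL)
    return 1; /* Unbounded P sorts after bounded Q.  */
  return 0;
}
```

With an unbounded P and a bounded Q, the buggy variant reports the two entries as equal, which is the kind of inconsistency a sort comparator must not have. The two conditions are side-effect free and identical, so this is also the shape the proposed front-end warning would flag.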

Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-12-10 Thread Jakub Jelinek

On Tue, Dec 09, 2014 at 03:32:33PM +0300, Ilya Verbin wrote:
 On 04 Dec 20:52, Jakub Jelinek wrote:
  On Thu, Dec 04, 2014 at 10:35:19PM +0300, Ilya Verbin wrote:
   This issue can be resolved by forcing output of such variables.
   Is this fix ok?  Should I add a testcase?
  
  Yes, with proper ChangeLog.  Yes.
 
 Here is updated patch, ok to commit?
 
 However, I don't see -flto option in the build log.  It seems that
 check_effective_target_lto isn't working inside libgomp/ directory.
 Maybe because ENABLE_LTO is defined only in gcc/configure.ac ?
 
 
 gcc/
   * varpool.c (varpool_node::get_create): Force output of vars with
   omp declare target attribute.
 libgomp/
   * testsuite/libgomp.c/target-9.c: New test.

Ok, though please try to find out why effective target lto check doesn't
work in libgomp.  Perhaps you just need to include some further *.exp
file?

Jakub

Re: [PING ^ 3][PATCH, AArch64] Add doloop_end pattern for -fmodulo-sched

2014-12-10 Thread Yangfei (Felix)

  --- gcc/config/aarch64/aarch64.c(revision 217394)
  +++ gcc/config/aarch64/aarch64.c(working copy)
  @@ -10224,6 +10224,9 @@ aarch64_use_by_pieces_infrastructure_p
 (unsigned i
#define TARGET_USE_BY_PIECES_INFRASTRUCTURE_P \
  aarch64_use_by_pieces_infrastructure_p
 
  +#undef TARGET_CAN_USE_DOLOOP_P
  +#define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost
  +
struct gcc_target targetm = TARGET_INITIALIZER;
 
#include "gt-aarch64.h"
 
 
 Hi Felix,
 
 This patch causes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64240
 when sms-3 is tested with -fPIC. It runs fine when I reverse this patch out.
 
 Please could you have a look?
 
 Thanks,
 Tejas.

OK, I have reproduced with -fPIC option. Will take a look.

Re: [SH][committed] Document FPSCR built-in functions

2014-12-10 Thread Oleg Endo

On Wed, 2014-12-10 at 01:24 +0100, Oleg Endo wrote:
 Hi,
 
 This documents the new SH FPSCR built-in functions.
 Tested with 'make info dvi pdf', committed as r218551.
 
 Cheers,
 Oleg
 
 gcc/ChangeLog:
   PR target/53513
   * doc/extend.texi (__builtin_sh_get_fpscr, __builtin_sh_get_fpscr):
   Document it.

... and here's the typo / copy-pasta fix.  One of those functions is
__builtin_sh_set_fpscr.  Committed as r218563.

Cheers,
Oleg

gcc/ChangeLog:
PR target/53513
* doc/extend.texi (__builtin_sh_set_fpscr): Fix typo.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 218559)
+++ gcc/doc/extend.texi	(working copy)
@@ -16906,7 +16906,7 @@
 Returns the value that is currently set in the @samp{FPSCR} register.
 @end deftypefn
 
-@deftypefn {Built-in Function} {void} __builtin_sh_get_fpscr (unsigned int @var{val})
+@deftypefn {Built-in Function} {void} __builtin_sh_set_fpscr (unsigned int @var{val})
 Sets the @samp{FPSCR} register to the specified value @var{val}, while
 preserving the current values of the FR, SZ and PR bits.
 @end deftypefn

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-10 Thread Richard Biener

On Tue, 9 Dec 2014, Marek Polacek wrote:

 The issue here is that TYPE_OVERFLOW_TRAPS, TYPE_OVERFLOW_UNDEFINED,
 and TYPE_OVERFLOW_WRAPS macros work on integral types only, yet we
 pass e.g. pointer_type/real_type to them.  This patch adds proper
 checking for these macros and adds missing guards to various places.
 This looks pretty straightforward, but I had to tweak a few places
 wrt vector_types (where I've used VECTOR_INTEGER_TYPE_P) so as not to
 regress folding - and I'm afraid I missed places that aren't tested
 in our testsuite :/.

Apart from what Marc already pointed out, I think that for vectors
and complex types of integral types the macros work ok (TYPE_UNSIGNED
is well-defined for those).  It would be wrong to disable the
tests for those (I probably misled you here).  Similar to FLOAT_TYPE_P
we probably want an ANY_INTEGRAL_TYPE_P () predicate (bah,
INTEGRAL_TYPE_P should really be SCALAR_INTEGRAL_TYPE_P...).

Thus the TYPE_OVERFLOW_* macros should be guarded with a tree check
checking for that ANY_INTEGRAL_TYPE_P instead.
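
A minimal sketch of what such a predicate could look like, written against a toy type representation rather than GCC's `tree`. Everything here (`struct type`, `enum type_code`, `scalar_integral_p`, `any_integral_p`, `overflow_wraps_p`) is hypothetical and only illustrates the idea of accepting integral types plus vectors and complex types with integral elements.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy type representation; not GCC's tree.  */
enum type_code { INTEGER_T, REAL_T, POINTER_T, VECTOR_T, COMPLEX_T };

struct type
{
  enum type_code code;
  const struct type *element; /* Element type for VECTOR_T/COMPLEX_T.  */
  bool wraps;                 /* Only meaningful for integral-ish types.  */
};

/* Scalar integral types only (the role INTEGRAL_TYPE_P plays).  */
static bool
scalar_integral_p (const struct type *t)
{
  return t->code == INTEGER_T;
}

/* Integral types, plus vectors and complex types whose element type
   is integral.  */
static bool
any_integral_p (const struct type *t)
{
  return scalar_integral_p (t)
         || ((t->code == VECTOR_T || t->code == COMPLEX_T)
             && t->element != NULL
             && scalar_integral_p (t->element));
}

/* Guarded query: consult the overflow flag only where it is defined,
   instead of reading it off pointer or floating-point types.  */
static bool
overflow_wraps_p (const struct type *t)
{
  return any_integral_p (t) && t->wraps;
}
```

In this sketch a vector of unsigned integers still answers the overflow query, while pointer and floating-point types are rejected up front, which is the behaviour asked for above.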

Sorry for not catching that initially,

Thanks,
Richard.

 Bootstrapped/regtested on ppc64-linux and x86_64-linux.
 
 2014-12-09  Marek Polacek  pola...@redhat.com
 
   * fold-const.c (negate_expr_p): Check for INTEGRAL_TYPE_P.
   (fold_negate_expr): Likewise.
   (extract_muldiv_1): Likewise.
   (maybe_canonicalize_comparison_1): Likewise.
   (fold_comparison): Likewise.
   (fold_binary_loc): Likewise.
   (tree_binary_nonnegative_warnv_p): Likewise.
   (tree_binary_nonzero_warnv_p): Likewise.
   * gimple-ssa-strength-reduction.c (legal_cast_p_1): Likewise.
   * tree-scalar-evolution.c (simple_iv): Likewise.
   (scev_const_prop): Likewise.
   * tree-ssa-loop-niter.c (expand_simple_operations): Likewise.
   * match.pd (X % -C): Likewise.
   (-A - 1 - ~A): Likewise.
   (~A + A - -1): Check for INTEGRAL_TYPE_P or VECTOR_INTEGER_TYPE_P.
   * tree-vect-generic.c (expand_vector_operation): Likewise.
   * tree.h (INTEGRAL_TYPE_CHECK): Define.
   (TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED, TYPE_OVERFLOW_TRAPS):
   Add INTEGRAL_TYPE_CHECK.
 
 diff --git gcc/fold-const.c gcc/fold-const.c
 index 0c4fe40..ff9d917 100644
 --- gcc/fold-const.c
 +++ gcc/fold-const.c
 @@ -426,7 +426,8 @@ negate_expr_p (tree t)
  
  case VECTOR_CST:
{
 - if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))
 + if (FLOAT_TYPE_P (TREE_TYPE (type))
 + || (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)))
 return true;
  
   int count = TYPE_VECTOR_SUBPARTS (type), i;
 @@ -558,7 +559,8 @@ fold_negate_expr (location_t loc, tree t)
  case INTEGER_CST:
tem = fold_negate_const (t, type);
if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
 -   || (!TYPE_OVERFLOW_TRAPS (type)
 +   || (INTEGRAL_TYPE_P (type)
 +       && !TYPE_OVERFLOW_TRAPS (type)
         && TYPE_OVERFLOW_WRAPS (type))
     || (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
   return tem;
 @@ -5951,7 +5953,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
 tree wide_type,
  || EXPRESSION_CLASS_P (op0))
 /* ... and has wrapping overflow, and its type is smaller
than ctype, then we cannot pass through as widening.  */
 -   && ((TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
 +   && (((INTEGRAL_TYPE_P (TREE_TYPE (op0))
 +         && TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0)))
          && (TYPE_PRECISION (ctype)
              > TYPE_PRECISION (TREE_TYPE (op0))))
 /* ... or this is a truncation (t is narrower than op0),
 @@ -5966,7 +5969,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
 tree wide_type,
 /* ... or has undefined overflow while the converted to
type has not, we cannot do the operation in the inner type
as that would introduce undefined overflow.  */
 -   || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))
 +   || ((INTEGRAL_TYPE_P (TREE_TYPE (op0))
 +        && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0)))
         && !TYPE_OVERFLOW_UNDEFINED (type))))
   break;
  
 @@ -6159,6 +6163,7 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
 tree wide_type,
if ((code == TRUNC_MOD_EXPR || code == CEIL_MOD_EXPR
  || code == FLOOR_MOD_EXPR || code == ROUND_MOD_EXPR)
 /* If the multiplication can overflow we cannot optimize this.  */
 +       && INTEGRAL_TYPE_P (TREE_TYPE (t))
         && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (t))
         && TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST
         && wi::multiple_of_p (op1, c, TYPE_SIGN (type)))
 @@ -6211,7 +6216,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
 tree wide_type,
  
If we have an unsigned type, we cannot do this since it will change
the result if the original computation overflowed.  */
 -  if (TYPE_OVERFLOW_UNDEFINED (ctype)
 +  if

Re: [PATCH 2/4] vldN_lane error message enhancements (D registers)

2014-12-10 Thread Christophe Lyon

On 9 December 2014 at 16:27,  charles.bay...@linaro.org wrote:
 From: Charles Baylis charles.bay...@linaro.org

 gcc/ChangeLog

 DATE  Charles Baylis  charles.bay...@linaro.org

 * config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Add explicit lane
 bounds check.
 (__LD3_LANE_FUNC): Likewise.
 (__LD4_LANE_FUNC): Likewise

 gcc/testsuite/ChangeLog:

 DATE  Charles Baylis  charles.bay...@linaro.org

 * gcc.target/aarch64/simd/vld4_lane.c: New test.

 Change-Id: Ia95fbed34b50cf710ea9032ff3428a5f1432e0aa
 ---
  gcc/config/aarch64/arm_neon.h |  6 ++
  gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c | 15 +++
  2 files changed, 21 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c

 diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
 index 8cff719..22df564 100644
 --- a/gcc/config/aarch64/arm_neon.h
 +++ b/gcc/config/aarch64/arm_neon.h
 @@ -17901,6 +17901,8 @@ vld2_lane_##funcsuffix (const ptrtype * __ptr, intype 
 __b, const int __c)  \
__o = __builtin_aarch64_set_qregoi##mode (__o,  \
(signedtype) __temp.val[1], \
1); \
 +  __builtin_aarch64_im_lane_boundsi (__c, \
 +sizeof (vectype) / sizeof (*__ptr));  \

Shouldn't the arguments be reversed? (I'm looking at
__AARCH64_LANE_CHECK: the lane index is the 2nd parameter)

__o =__builtin_aarch64_ld2_lane##mode (
  \
   (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
__b.val[0] = (vectype) __builtin_aarch64_get_dregoidi (__o, 0); \
 @@ -17991,6 +17993,8 @@ vld3_lane_##funcsuffix (const ptrtype * __ptr, intype 
 __b, const int __c)  \
__o = __builtin_aarch64_set_qregci##mode (__o,  \
(signedtype) __temp.val[2], \
2); \
 +  __builtin_aarch64_im_lane_boundsi (__c, \
 +sizeof (vectype) / sizeof (*__ptr));  \
__o =__builtin_aarch64_ld3_lane##mode (
  \
   (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
__b.val[0] = (vectype) __builtin_aarch64_get_dregcidi (__o, 0); \
 @@ -18089,6 +18093,8 @@ vld4_lane_##funcsuffix (const ptrtype * __ptr, intype 
 __b, const int __c)  \
__o = __builtin_aarch64_set_qregxi##mode (__o,  \
(signedtype) __temp.val[3], \
3); \
 +  __builtin_aarch64_im_lane_boundsi (__c, \
 +sizeof (vectype) / sizeof (*__ptr));  \
__o =__builtin_aarch64_ld4_lane##mode (
  \
   (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
__b.val[0] = (vectype) __builtin_aarch64_get_dregxidi (__o, 0); \
 diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c 
 b/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c
 new file mode 100644
 index 000..d14e6c1
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c
 @@ -0,0 +1,15 @@
 +/* Test error message when passing an invalid value as a lane index.  */
 +
 +/* { dg-do compile } */
 +
 +#include <arm_neon.h>
 +
 +int8x8x4_t
 +f_vld4_lane (int8_t * p, int8x8x4_t v)
 +{
 +  int8x8x4_t res;
 +  /* { dg-error "lane 8 out of range 0 - 7" "" { target *-*-* } 0 } */
 +  res = vld4_lane_s8 (p, v, 8);
 +  return res;
 +}
 +
 --
 1.9.1

[PATCH] Fix PR64191, 2nd part

2014-12-10 Thread Richard Biener


This fixes DCE to remove pointless clobbers, which in turn enables
DCE of empty loops (those containing just clobbers).  This is IMHO
important for getting rid of empty constructor-calling loops, which are
not uncommon.

The way this now works is to treat clobbers as not necessary - but
avoid removing them if required uses are not DCEd.  Thus they get
treated similar to debug stmts.
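
A toy sketch of that sweep rule, assuming a flat array of statements in place of GIMPLE. The `struct stmt` layout, the `NO_DEF` marker and the `sweep` function are hypothetical; the `necessary` flag of the defining statement plays the role that the `processed` bitmap test plays in the patch below.

```c
#include <stdbool.h>

#define NO_DEF (-1)

struct stmt
{
  bool necessary;  /* Set by the mark phase; never set for clobbers.  */
  bool is_clobber;
  int use;         /* Index of the stmt defining the used name, or NO_DEF
                      when nothing it uses can be removed.  */
  bool removed;    /* Result of the sweep.  */
};

/* Sweep phase: remove unnecessary stmts, but keep a clobber as long as
   the definition it uses survives.  Returns the number removed.  */
static int
sweep (struct stmt *stmts, int n)
{
  int removed = 0;
  for (int i = 0; i < n; i++)
    {
      struct stmt *s = &stmts[i];
      if (s->necessary)
        continue;
      /* Keep clobbers that can stay live.  */
      if (s->is_clobber
          && (s->use == NO_DEF || stmts[s->use].necessary))
        continue;
      s->removed = true;
      removed++;
    }
  return removed;
}
```

The point of the design is that a clobber never keeps anything else alive: it survives only when its operand would have survived anyway, much like a debug statement.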

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk
so far.

Richard.

2014-12-10  Richard Biener  rguent...@suse.de

PR tree-optimization/64191
* tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Do not
mark clobbers as necessary.
(eliminate_unnecessary_stmts): Keep clobbers live if we can.

* g++.dg/pr64191.C: Make sure we can DCE empty loops with
indirect clobbers.

Index: gcc/tree-ssa-dce.c
===
--- gcc/tree-ssa-dce.c  (revision 218479)
+++ gcc/tree-ssa-dce.c  (working copy)
@@ -292,8 +292,7 @@ mark_stmt_if_obviously_necessary (gimple
   break;
 
 case GIMPLE_ASSIGN:
-  if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
-      && TREE_CLOBBER_P (gimple_assign_rhs1 (stmt)))
+  if (gimple_clobber_p (stmt))
return;
   break;
 
@@ -1362,6 +1361,25 @@ eliminate_unnecessary_stmts (void)
  /* If GSI is not necessary then remove it.  */
  if (!gimple_plf (stmt, STMT_NECESSARY))
{
+ /* Keep clobbers that we can keep live live.  */
+ if (gimple_clobber_p (stmt))
+   {
+ ssa_op_iter iter;
+ use_operand_p use_p;
+ bool dead = false;
+ FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, SSA_OP_USE)
+   {
+ tree name = USE_FROM_PTR (use_p);
+ if (!SSA_NAME_IS_DEFAULT_DEF (name)
+     && !bitmap_bit_p (processed, SSA_NAME_VERSION (name)))
+   {
+ dead = true;
+ break;
+   }
+   }
+ if (!dead)
+   continue;
+   }
  if (!is_gimple_debug (stmt))
something_changed = true;
  remove_dead_stmt (gsi, bb);
Index: gcc/testsuite/g++.dg/pr64191.C
===
--- gcc/testsuite/g++.dg/pr64191.C  (revision 0)
+++ gcc/testsuite/g++.dg/pr64191.C  (working copy)
@@ -0,0 +1,25 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-cddce1" }
+
+struct Bar
+{
+  int i;
+  ~Bar() { }
+};
+void bar_dtor_loop(Bar* p, unsigned int n)
+{
+  if (p) {
+  Bar* e = p + n;
+  while (e > p) {
+ --e;
+ e->~Bar();
+  }
+  }
+}
+
+// The clobber in ~Bar should persist but those inlined into
+// bar_dtor_loop not, nor should the loop therein
+
+// { dg-final { scan-tree-dump-times "CLOBBER" 1 "cddce1" } }
+// { dg-final { scan-tree-dump-times "if" 0 "cddce1" } }
+// { dg-final { cleanup-tree-dump "cddce1" } }

Re: [PATCH] Fix PR42108

2014-12-10 Thread Richard Biener

On Tue, 9 Dec 2014, Richard Biener wrote:

 
 The following finally fixes PR42108 (well, hopefully...) by using
 range-information on SSA names to allow the integer divisions introduced
 by Fortran array lowering being hoisted out of loops, thus detecting
 them as not trapping.
 
 I chose to enhance tree_single_nonzero_warnv_p for this and adjusted
 operation_could_trap_helper_p to use this helper.

Unfortunately it can't be done this way, as range information is
dependent on the location of the definition - but if we change
predicates the way I tried, code motion optimizations (like LIM)
will happily move that definition before dominating conditions,
which may make the range information invalid.
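
A source-level analogy of the problem, as a sketch only: the function names and the shape of the loop are hypothetical, and the real issue arises on GIMPLE rather than in C source. The fact "d is nonzero" holds at the division only because of the guard in front of it.

```c
/* Original shape: the loop-invariant division k / d is evaluated
   only where the guard d != 0 holds, so "d is nonzero" is a valid
   fact at that position.  */
static int
scaled_sum (const int *a, int n, int k, int d)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    if (d != 0)
      sum += a[i] * (k / d);
  return sum;
}

/* A safe hoist keeps the guard in front of the moved division.
   Placing "int q = k / d;" above the d == 0 test instead would
   divide by zero for d == 0: the fact that justified treating the
   division as non-trapping no longer holds at the new position.  */
static int
scaled_sum_hoisted (const int *a, int n, int k, int d)
{
  if (d == 0)
    return 0;
  int q = k / d;
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += a[i] * q;
  return sum;
}
```

Both functions agree for every input, including d == 0. A predicate that answers "cannot trap" from position-dependent range information gives a code-motion pass no reason to keep the guard in front of the division.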

The patch caused

=== acats tests ===
FAIL:   c450001
FAIL:   cxg2023
FAIL:   cxg2024

in testing (not investigated).

 Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
 
 Richard.
 
 2014-12-09  Richard Biener  rguent...@suse.de
 
   PR tree-optimization/42108
   * fold-const.c (tree_single_nonzero_warnv_p): Use range
   information associated with SSA names.
   * tree-eh.c (operation_could_trap_helper_p): Use
   tree_single_nonzero_warnv_p to check for trapping non-fp
   operation.
   * tree-vrp.c (remove_range_assertions): Remove bogus assert.
 
   * gfortran.dg/pr42108.f90: Adjust testcase.
 
 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c  (revision 218479)
 +++ gcc/fold-const.c  (working copy)
 @@ -83,6 +83,8 @@ along with GCC; see the file COPYING3.
  #include "cgraph.h"
  #include "generic-match.h"
  #include "optabs.h"
 +#include "stringpool.h"
 +#include "tree-ssanames.h"
  
  /* Nonzero if we are folding constants inside an initializer; zero
 otherwise.  */
 @@ -15362,6 +15381,26 @@ tree_single_nonzero_warnv_p (tree t, boo
   }
break;
  
 +case SSA_NAME:
 +  if (INTEGRAL_TYPE_P (TREE_TYPE (t)))
 + {
 +   wide_int minv, maxv;
 +   enum value_range_type rtype = get_range_info (t, &minv, &maxv);
 +   if (rtype == VR_RANGE)
 +     {
 +       if (wi::lt_p (maxv, 0, TYPE_SIGN (TREE_TYPE (t)))
 +           || wi::gt_p (minv, 0, TYPE_SIGN (TREE_TYPE (t))))
 +         return true;
 +     }
 +   else if (rtype == VR_ANTI_RANGE)
 +     {
 +       if (wi::le_p (minv, 0, TYPE_SIGN (TREE_TYPE (t)))
 +           && wi::ge_p (maxv, 0, TYPE_SIGN (TREE_TYPE (t))))
 + return true;
 + }
 + }
 +  break;
 +
  default:
break;
  }
 Index: gcc/tree-eh.c
 ===
 --- gcc/tree-eh.c (revision 218479)
 +++ gcc/tree-eh.c (working copy)
 @@ -2440,13 +2440,16 @@ operation_could_trap_helper_p (enum tree
  case ROUND_MOD_EXPR:
  case TRUNC_MOD_EXPR:
  case RDIV_EXPR:
 -  if (honor_snans || honor_trapv)
 - return true;
 -  if (fp_operation)
 - return flag_trapping_math;
 -  if (!TREE_CONSTANT (divisor) || integer_zerop (divisor))
 -return true;
 -  return false;
 +  {
 + if (honor_snans || honor_trapv)
 +   return true;
 + if (fp_operation)
 +   return flag_trapping_math;
 + bool sop;
 + if (!tree_single_nonzero_warnv_p (divisor, &sop))
 +   return true;
 + return false;
 +  }
  
  case LT_EXPR:
  case LE_EXPR:
 Index: gcc/testsuite/gfortran.dg/pr42108.f90
 ===
 --- gcc/testsuite/gfortran.dg/pr42108.f90 (revision 218479)
 +++ gcc/testsuite/gfortran.dg/pr42108.f90 (working copy)
 @@ -1,5 +1,5 @@
  ! { dg-do compile }
 -! { dg-options "-O2 -fdump-tree-fre1" }
 +! { dg-options "-O2 -fdump-tree-fre1 -fdump-tree-lim1-details" }
  
  subroutine  eval(foo1,foo2,foo3,foo4,x,n,nnd)
implicit real*8 (a-h,o-z)
 @@ -21,7 +21,10 @@ subroutine  eval(foo1,foo2,foo3,foo4,x,n
end do
  end subroutine eval
  
 -! There should be only one load from n left
 +! There should be only one load from n left and the division should
 +! be hoisted out of the loop
  
  ! { dg-final { scan-tree-dump-times "\\*n_" 1 "fre1" } }
 +! { dg-final { scan-tree-dump-times "Moving statement" 5 "lim1" } }
  ! { dg-final { cleanup-tree-dump "fre1" } }
 +! { dg-final { cleanup-tree-dump "lim1" } }
 Index: gcc/tree-vrp.c
 ===
 --- gcc/tree-vrp.c(revision 218479)
 +++ gcc/tree-vrp.c(working copy)
 @@ -6866,12 +6866,9 @@ remove_range_assertions (void)
   tree lhs = gimple_assign_lhs (stmt);
   tree rhs = gimple_assign_rhs1 (stmt);
   tree var;
 - tree cond = fold (ASSERT_EXPR_COND (rhs));
   use_operand_p use_p;
   imm_use_iterator iter;
  
 - gcc_assert (cond != boolean_false_node);
 -
   var = ASSERT_EXPR_VAR (rhs);
   gcc_assert (TREE_CODE (var) == SSA_NAME);
  
 

-- 
Richard Biener

Re: [PATCH, REPOST] Fix PR fortran/60718

2014-12-10 Thread Tobias Burnus

Hi Bernd,

On Tue, 2 Dec 2014 11:25:42, Bernd Edlinger wrote:
 a long time ago, I posted this patch, but it got forgotten.

Sorry, but persistent pinging helps...

 However the described problem is still unsolved,
 so I thought my patch should be re-posted now.
 
 
 Boot-strapped and regression-tested on arm-linux-gnueabihf.
 OK for trunk?

OK. Thanks for the patch!

Tobias

Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1

2014-12-10 Thread Alan Hayward


On 02/12/2014 12:36, Alan Hayward alan.hayw...@arm.com wrote:


On 21/11/2014 14:08, Alan Hayward alan.hayw...@arm.com wrote:


On 14/11/2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:

This is a new version of my BE patch from a few weeks ago.
This is part 1 and covers rtlanal.c. The second part will be aarch64
specific.

When combined with the second patch, it fixes up movoi/ci/xi for Big
Endian, so that we end up with the lsb of a big-endian integer in
the low byte of the highest-numbered register.

This will apply cleanly by itself and no regressions were seen when
testing aarch64 and x86_64 on make check.


Changelog:

2014-11-14  Alan Hayward  alan.hayw...@arm.com

* rtlanal.c
(subreg_get_info): Exit early for simple and common cases


Alan.

Hi,

The second part to this patch (aarch64 specific) has been approved.


Could someone review this one please.


Thanks,
Alan.


Ping.


Thanks,
Alan.


Ping ping.

Thanks,
Alan.


0001-BE-fix-load-stores.-Common-code.patch
Description: Binary data

Re: [gomp4] acc enter/exit data

2014-12-10 Thread Thomas Schwinge

Hi!

On Thu, 30 Oct 2014 17:11:04 -0700, Cesar Philippidis ce...@codesourcery.com 
wrote:
 This patch adds support for OpenACC's enter/exit data directive. [...]

   gcc/
   * gimple.h (enum gf_mask): Add GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA.

In r218567, I committed the following to gomp-4_0-branch:

commit 86724db93ad780106102573f2cfadd6f884e8650
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Dec 10 09:52:14 2014 +

Fix OpenACC enter/exit data ICE.

[...]: In function 'f_acc_data':
[...]:4:1: internal compiler error: in expand_gimple_stmt_1, at 
cfgexpand.c:3413
 f_acc_data (void)
 ^
0x70cad3 expand_gimple_stmt_1
[...]/source-gcc/gcc/cfgexpand.c:3413
0x70cad3 expand_gimple_stmt
[...]/source-gcc/gcc/cfgexpand.c:3440
0x712b3d expand_gimple_basic_block
[...]/source-gcc/gcc/cfgexpand.c:5273
0x71479e execute
[...]/source-gcc/gcc/cfgexpand.c:5882

gcc/
* omp-low.c (build_omp_regions_1, make_gimple_omp_edges)
GIMPLE_OMP_TARGET: Handle
GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA the same as
GF_OMP_TARGET_KIND_OACC_UPDATE.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218567 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp   |  7 +++
 gcc/omp-low.c|  8 ++--
 gcc/testsuite/c-c++-common/goacc/nesting-2.c | 11 +++
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index af59ada..bece7c1 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,10 @@
+2014-12-10  Thomas Schwinge  tho...@codesourcery.com
+
+   * omp-low.c (build_omp_regions_1, make_gimple_omp_edges)
+   GIMPLE_OMP_TARGET: Handle
+   GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA the same as
+   GF_OMP_TARGET_KIND_OACC_UPDATE.
+
 2014-11-13  Cesar Philippidis  ce...@codesourcery.com
 
* omp-low.c (oacc_get_reduction_array_id): Fix whitespace.
diff --git gcc/omp-low.c gcc/omp-low.c
index 9af3b8a..6fed38f 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -9404,7 +9404,9 @@ build_omp_regions_1 (basic_block bb, struct omp_region 
*parent,
   else if (code == GIMPLE_OMP_TARGET
            && (gimple_omp_target_kind (stmt) == GF_OMP_TARGET_KIND_UPDATE
   || (gimple_omp_target_kind (stmt)
-  == GF_OMP_TARGET_KIND_OACC_UPDATE)))
+  == GF_OMP_TARGET_KIND_OACC_UPDATE)
+  || (gimple_omp_target_kind (stmt)
+  == GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA)))
new_omp_region (bb, code, parent);
   else
{
@@ -12270,7 +12272,9 @@ make_gimple_omp_edges (basic_block bb, struct 
omp_region **region,
   cur_region = new_omp_region (bb, code, cur_region);
   fallthru = true;
   if (gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_UPDATE
- || gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_OACC_UPDATE)
+ || gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_OACC_UPDATE
+ || (gimple_omp_target_kind (last)
+ == GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA))
cur_region = cur_region->outer;
   break;
 
diff --git gcc/testsuite/c-c++-common/goacc/nesting-2.c 
gcc/testsuite/c-c++-common/goacc/nesting-2.c
new file mode 100644
index 000..0d350c6
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/nesting-2.c
@@ -0,0 +1,11 @@
+int i;
+
+void
+f_acc_data (void)
+{
+#pragma acc data
+  {
+#pragma acc update host(i)
+#pragma acc enter data copyin(i)
+  }
+}


Grüße,
 Thomas



Re: OpenACC GIMPLE_OACC_* -- or not?

2014-12-10 Thread Thomas Schwinge

Hi!

On Wed, 12 Nov 2014 14:45:02 +0100, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Nov 12, 2014 at 02:33:43PM +0100, Thomas Schwinge wrote:
  Months later, with months' worth of GCC internals experience, I now came
  to realize that maybe this has not actually been a useful thing to do
  (and likewise for the GIMPLE_OACC_KERNELS also added later on,
  http://news.gmane.org/find-root.php?message_id=%3C1393579386-11666-1-git-send-email-thomas%40codesourcery.com%3E).
  All handling of GIMPLE_OACC_PARALLEL and GIMPLE_OACC_KERNELS closely
  follows that of GIMPLE_OMP_TARGET's GF_OMP_TARGET_KIND_REGION, with only
  minor divergence.  What I did not understand back then, has not been
  obvious to me, was that the underlying structure of all those codes will
  in fact be the same (as already made apparent by using the one
  GIMPLE_OMP_TARGET for all of: OpenMP target offloading regions, OpenMP
  target data regions, OpenMP target data maintenance executable
  statements), and any customization then happens via the clauses
  attached to GIMPLE_OMP_TARGET.
 
 I'm fine with merging them into kinds, [...]
 
  So, sanity check: should we now merge GIMPLE_OACC_PARALLEL and
  GIMPLE_OACC_KERNELS into being subtypes of GIMPLE_OMP_TARGET (like
  GF_OMP_TARGET_KIND_REGION), as already done for
  GF_OMP_TARGET_KIND_OACC_DATA (like GF_OMP_TARGET_KIND_DATA), and
  GF_OMP_TARGET_KIND_OACC_UPDATE and
  GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA (like GF_OMP_TARGET_KIND_UPDATE).
 
 Yep.

In r218568, I applied the following to gomp-4_0-branch:

commit 28629d718a63a782170cfb06a4d0278de0779039
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Dec 10 09:52:28 2014 +

Merge GIMPLE_OACC_KERNELS and GIMPLE_OACC_PARALLEL into GIMPLE_OMP_TARGET.

gcc/
* gimple.def (GIMPLE_OACC_KERNELS, GIMPLE_OACC_PARALLEL): Merge
into GIMPLE_OMP_TARGET.  Update all users.

gcc/
* cgraphbuild.c (pass_build_cgraph_edges::execute): Remove
handling of GIMPLE_OACC_PARALLEL.
* gimple-pretty-print.c (dump_gimple_omp_target): Dump a bit more
data, pretty-printing.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218568 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp|  11 ++
 gcc/cgraphbuild.c |  13 +-
 gcc/doc/gimple.texi   |  15 --
 gcc/gimple-low.c  |   2 -
 gcc/gimple-pretty-print.c | 118 +++
 gcc/gimple-walk.c |  32 ---
 gcc/gimple.c  |  40 +---
 gcc/gimple.def|  35 +---
 gcc/gimple.h  | 273 ++
 gcc/gimplify.c|   6 +-
 gcc/omp-low.c | 225 +++--
 gcc/testsuite/gfortran.dg/goacc/private-1.f95 |   2 +-
 gcc/tree-inline.c |   6 -
 gcc/tree-nested.c |  16 --
 14 files changed, 136 insertions(+), 658 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index bece7c1..06e8583 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,4 +1,15 @@
 2014-12-10  Thomas Schwinge  tho...@codesourcery.com
+   Bernd Schmidt  ber...@codesourcery.com
+
+   * gimple.def (GIMPLE_OACC_KERNELS, GIMPLE_OACC_PARALLEL): Merge
+   into GIMPLE_OMP_TARGET.  Update all users.
+
+2014-12-10  Thomas Schwinge  tho...@codesourcery.com
+
+   * cgraphbuild.c (pass_build_cgraph_edges::execute): Remove
+   handling of GIMPLE_OACC_PARALLEL.
+   * gimple-pretty-print.c (dump_gimple_omp_target): Dump a bit more
+   data, pretty-printing.
 
* omp-low.c (build_omp_regions_1, make_gimple_omp_edges)
GIMPLE_OMP_TARGET: Handle
diff --git gcc/cgraphbuild.c gcc/cgraphbuild.c
index 9b078bc..c72ceab 100644
--- gcc/cgraphbuild.c
+++ gcc/cgraphbuild.c
@@ -368,21 +368,14 @@ pass_build_cgraph_edges::execute (function *fun)
                                      bb->count, freq);
            }
          node->record_stmt_references (stmt);
-         if (gimple_code (stmt) == GIMPLE_OACC_PARALLEL
-             && gimple_oacc_parallel_child_fn (stmt))
-           {
-             tree fn = gimple_oacc_parallel_child_fn (stmt);
-             node->create_reference (cgraph_node::get_create (fn),
-                                     IPA_REF_ADDR, stmt);
-           }
-         else if (gimple_code (stmt) == GIMPLE_OMP_PARALLEL
-                  && gimple_omp_parallel_child_fn (stmt))
+         if (gimple_code (stmt) == GIMPLE_OMP_PARALLEL
+             && gimple_omp_parallel_child_fn (stmt))
            {
              tree fn = gimple_omp_parallel_child_fn (stmt);
              node->create_reference (cgraph_node::get_create (fn),
                                      IPA_REF_ADDR, stmt);
}

Re: [gomp4] acc enter/exit data

2014-12-10 Thread Jakub Jelinek

On Wed, Dec 10, 2014 at 10:54:13AM +0100, Thomas Schwinge wrote:
 --- gcc/omp-low.c
 +++ gcc/omp-low.c
 @@ -9404,7 +9404,9 @@ build_omp_regions_1 (basic_block bb, struct omp_region 
 *parent,
   else if (code == GIMPLE_OMP_TARGET
            && (gimple_omp_target_kind (stmt) == GF_OMP_TARGET_KIND_UPDATE
  || (gimple_omp_target_kind (stmt)
 -== GF_OMP_TARGET_KIND_OACC_UPDATE)))
 +== GF_OMP_TARGET_KIND_OACC_UPDATE)
 +|| (gimple_omp_target_kind (stmt)
 +== GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA)))
   new_omp_region (bb, code, parent);
else
   {
 @@ -12270,7 +12272,9 @@ make_gimple_omp_edges (basic_block bb, struct 
 omp_region **region,
cur_region = new_omp_region (bb, code, cur_region);
fallthru = true;
if (gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_UPDATE
 -   || gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_OACC_UPDATE)
 +   || gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_OACC_UPDATE
 +   || (gimple_omp_target_kind (last)
 +   == GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA))

I'd say that at this point a
  switch (gimple_omp_target_kind (last))
{
case GF_OMP_TARGET_KIND_UPDATE:
case GF_OMP_TARGET_KIND_OACC_UPDATE:
case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA:
  ...
default:
  ...
}
would be cleaner.  The first hunk is more questionable, because there is
else and it would require duplicating of the else body in default:, goto
or similar, but perhaps it would be better that way too.

Jakub
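
A minimal sketch of the switch-based form suggested above, using a toy enum instead of the real GF_OMP_TARGET_KIND_* flags. The `enum target_kind` values and the `standalone_directive_p` helper are hypothetical; in the patch the matching kinds are the ones for which the newly opened region is closed again right away.

```c
#include <stdbool.h>

/* Hypothetical subset of target kinds; the names echo the patch but
   the enum itself is illustrative only.  */
enum target_kind
{
  KIND_REGION,
  KIND_DATA,
  KIND_UPDATE,
  KIND_OACC_DATA,
  KIND_OACC_UPDATE,
  KIND_OACC_ENTER_EXIT_DATA
};

/* Return true for kinds whose region is closed immediately after it
   is opened.  Adding a kind is a one-line change, with no growing
   chain of || comparisons.  */
static bool
standalone_directive_p (enum target_kind kind)
{
  switch (kind)
    {
    case KIND_UPDATE:
    case KIND_OACC_UPDATE:
    case KIND_OACC_ENTER_EXIT_DATA:
      return true;
    default:
      return false;
    }
}
```

Factoring the test into a helper like this would also sidestep the concern about the first hunk, since the existing else branch could stay where it is.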

Nested OpenACC/OpenMP constructs (was: OpenACC GIMPLE_OACC_* -- or not?)

2014-12-10 Thread Thomas Schwinge

Hi!

On Wed, 12 Nov 2014 14:45:02 +0100, Jakub Jelinek ja...@redhat.com wrote:
 please make sure we'll have
 some tests on mixing OpenMP and OpenACC directives in the same functions
 (it is fine if we error out on combinations that don't make sense or are
 too hard to support).
 E.g. supporting OpenACC #pragma omp target counterpart inside
 of #pragma omp parallel or #pragma omp task should be presumably fine,
 supporting OpenACC inside of #pragma omp target should be IMHO just
 diagnosed, mixing target data and openacc is generically hard to diagnose,
 perhaps at runtime, supporting #pragma omp directives inside of OpenACC
 regions not needed (perhaps there are exceptions you want to support?).

We have not yet tested such nested OpenACC/OpenMP constructs (one thing
after the other), so I'm not confident enough to claim support for that, and
earlier on had already enabled checking for such nesting.  In r218569, I
have now committed to gomp-4_0-branch the following patch to rework this:

commit 30b2cd7ac340764d4f7eb14730b16a49e8799e32
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Wed Dec 10 09:52:42 2014 +

OpenACC: Rework nested constructs checking.

gcc/
* omp-low.c (scan_omp_target): Remove taskreg_nesting_level and
target_nesting_level assertions.
(check_omp_nesting_restrictions): Rework OpenACC constructs
handling.  Update and extend the relevant test cases.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@218569 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |   7 +
 gcc/omp-low.c  | 124 +++-
 .../c-c++-common/goacc-gomp/nesting-fail-1.c   | 212 +
 gcc/testsuite/c-c++-common/goacc/nesting-1.c   |  49 +
 gcc/testsuite/c-c++-common/goacc/nesting-2.c   |  11 --
 gcc/testsuite/c-c++-common/goacc/nesting-fail-1.c  |  22 ++-
 .../gfortran.dg/goacc/parallel-kernels-regions.f95 |  25 ++-
 7 files changed, 288 insertions(+), 162 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index 06e8583..970e744 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,4 +1,11 @@
 2014-12-10  Thomas Schwinge  tho...@codesourcery.com
+
+   * omp-low.c (scan_omp_target): Remove taskreg_nesting_level and
+   target_nesting_level assertions.
+   (check_omp_nesting_restrictions): Rework OpenACC constructs
+   handling.  Update and extend the relevant test cases.
+
+2014-12-10  Thomas Schwinge  tho...@codesourcery.com
Bernd Schmidt  ber...@codesourcery.com
 
* gimple.def (GIMPLE_OACC_KERNELS, GIMPLE_OACC_PARALLEL): Merge
diff --git gcc/omp-low.c gcc/omp-low.c
index 39e2f22..d16e2de 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -2647,12 +2647,6 @@ scan_omp_target (gimple stmt, omp_context *outer_ctx)
   tree name;
   bool offloaded = is_gimple_omp_offloaded (stmt);
 
-  if (is_gimple_omp_oacc_specifically (stmt))
-{
-  gcc_assert (taskreg_nesting_level == 0);
-  gcc_assert (target_nesting_level == 0);
-}
-
   ctx = new_omp_context (stmt, outer_ctx);
   ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0);
   ctx->default_kind = OMP_CLAUSE_DEFAULT_SHARED;
@@ -2706,46 +2700,26 @@ scan_omp_teams (gimple stmt, omp_context *outer_ctx)
   scan_omp (gimple_omp_body_ptr (stmt), ctx);
 }
 
-/* Check OpenMP nesting restrictions.  */
+/* Check nesting restrictions.  */
 static bool
 check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
 {
-  /* TODO: While the OpenACC specification does allow for certain kinds of
- nesting, we don't support many of these yet.  */
-  if (is_gimple_omp (stmt)
-      && is_gimple_omp_oacc_specifically (stmt))
+  /* TODO: Some OpenACC/OpenMP nesting should be allowed.  */
+
+  /* No nesting of non-OpenACC STMT (that is, an OpenMP one, or a GOMP builtin)
+ inside an OpenACC CTX.  */
+  if (!(is_gimple_omp (stmt)
+   && is_gimple_omp_oacc_specifically (stmt)))
 {
-  /* Regular handling of OpenACC loop constructs.  */
-  if (gimple_code (stmt) == GIMPLE_OMP_FOR
-      && gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_OACC_LOOP)
-   goto cont;
-  /* No nesting of OpenACC STMT inside any OpenACC or OpenMP CTX different
-from an OpenACC data construct.  */
-  for (omp_context *ctx_ = ctx; ctx_ != NULL; ctx_ = ctx_->outer)
-   if (is_gimple_omp (ctx_->stmt)
-       && !(gimple_code (ctx_->stmt) == GIMPLE_OMP_TARGET
-            && (gimple_omp_target_kind (ctx_->stmt)
-                == GF_OMP_TARGET_KIND_OACC_DATA)))
- {
-   error_at (gimple_location (stmt),
- "may not be nested");
-   return false;
- }
-}
-  else
-{
-  /* No nesting of non-OpenACC STMT (that is, an OpenMP one, or a GOMP
-builtin) inside any OpenACC CTX.  */
   for (omp_context *ctx_ = ctx; ctx_ != NULL; ctx_ =

Re: OpenACC GIMPLE_OACC_* -- or not?

2014-12-10 Thread Thomas Schwinge

Hi Jakub!

On Wed, 10 Dec 2014 10:57:58 +0100, I wrote:
 In r218568, I applied the following to gomp-4_0-branch:
 
 commit 28629d718a63a782170cfb06a4d0278de0779039
 Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
 Date:   Wed Dec 10 09:52:28 2014 +
 
 Merge GIMPLE_OACC_KERNELS and GIMPLE_OACC_PARALLEL into GIMPLE_OMP_TARGET.
 
   gcc/
   * gimple.def (GIMPLE_OACC_KERNELS, GIMPLE_OACC_PARALLEL): Merge
   into GIMPLE_OMP_TARGET.  Update all users.

Regarding this change:

 --- gcc/gimple-walk.c
 +++ gcc/gimple-walk.c
 @@ -304,36 +304,6 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op,
   return ret;
break;
  
 -case GIMPLE_OACC_KERNELS:
 -  ret = walk_tree (gimple_oacc_kernels_clauses_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  ret = walk_tree (gimple_oacc_kernels_child_fn_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  ret = walk_tree (gimple_oacc_kernels_data_arg_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  break;
 -
 -case GIMPLE_OACC_PARALLEL:
 -  ret = walk_tree (gimple_oacc_parallel_clauses_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  ret = walk_tree (gimple_oacc_parallel_child_fn_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  ret = walk_tree (gimple_oacc_parallel_data_arg_ptr (stmt), callback_op,
 -wi, pset);
 -  if (ret)
 - return ret;
 -  break;
 -
  case GIMPLE_OMP_CONTINUE:
ret = walk_tree (gimple_omp_continue_control_def_ptr (stmt),
  callback_op, wi, pset);

..., I noticed that GIMPLE_OMP_TARGET doesn't walk the child_fn and
data_arg.  Is that intentional, or should that be done?  If the latter
(but this doesn't seem to cause any ill effects -- why?), OK to commit
the following to trunk?

 gcc/gimple-walk.c | 8 
 1 file changed, 8 insertions(+)

diff --git gcc/gimple-walk.c gcc/gimple-walk.c
index bfa3532..1330c04 100644
--- gcc/gimple-walk.c
+++ gcc/gimple-walk.c
@@ -416,6 +416,14 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op,
   pset);
   if (ret)
return ret;
+  ret = walk_tree (gimple_omp_target_child_fn_ptr (stmt), callback_op, wi,
+  pset);
+  if (ret)
+   return ret;
+  ret = walk_tree (gimple_omp_target_data_arg_ptr (stmt), callback_op, wi,
+  pset);
+  if (ret)
+   return ret;
   break;
 
 case GIMPLE_OMP_TEAMS:
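
For illustration, the walking pattern this patch extends can be sketched outside of GCC as follows.  This is only a minimal sketch under stated assumptions: the types `operand`, `target_stmt` and `find_state` and the functions `walk_target_ops` and `find_id` are hypothetical stand-ins, not GCC's gimple or walk_tree API.  It shows the one property the diff relies on: each operand slot is handed to the callback in turn, and the walk stops as soon as the callback returns non-NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an operand and a statement; these are not
   GCC's real data structures.  */
typedef struct operand { int id; } operand;
typedef operand *(*walk_fn) (operand *op, void *data);

struct target_stmt
{
  operand clauses;
  operand child_fn;
  operand data_arg;
};

/* State for the example callback below.  */
struct find_state { int wanted; int visited; };

/* Example callback: count every visit and stop at the wanted id.  */
static operand *
find_id (operand *op, void *data)
{
  struct find_state *st = data;
  st->visited++;
  return op->id == st->wanted ? op : NULL;
}

/* Visit every operand slot of STMT in order.  The first non-NULL value
   returned by CALLBACK ends the walk and is passed back to the caller.  */
static operand *
walk_target_ops (struct target_stmt *stmt, walk_fn callback, void *data)
{
  operand *slots[] = { &stmt->clauses, &stmt->child_fn, &stmt->data_arg };

  assert (callback != NULL);
  for (size_t i = 0; i < sizeof slots / sizeof slots[0]; i++)
    {
      operand *ret = callback (slots[i], data);
      if (ret)
        return ret;
    }
  return NULL;
}
```

In this model, a slot that is left out of `slots` is simply never seen by any callback, which is the situation the question above is about for child_fn and data_arg.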


Grüße,
 Thomas



Re: OpenACC GIMPLE_OACC_* -- or not?

2014-12-10 Thread Jakub Jelinek

On Wed, Dec 10, 2014 at 11:07:37AM +0100, Thomas Schwinge wrote:
 ..., I noticed that GIMPLE_OMP_TARGET doesn't walk the child_fn and
 data_arg.  Is that intentional, or should that be done?  If the latter
 (but this doesn't seem to cause any ill effects -- why?), OK to commit
 the following to trunk?

Ok with proper ChangeLog.

  gcc/gimple-walk.c | 8 
  1 file changed, 8 insertions(+)
 
 diff --git gcc/gimple-walk.c gcc/gimple-walk.c
 index bfa3532..1330c04 100644
 --- gcc/gimple-walk.c
 +++ gcc/gimple-walk.c
 @@ -416,6 +416,14 @@ walk_gimple_op (gimple stmt, walk_tree_fn callback_op,
  pset);
if (ret)
   return ret;
 +  ret = walk_tree (gimple_omp_target_child_fn_ptr (stmt), callback_op, 
 wi,
 +pset);
 +  if (ret)
 + return ret;
 +  ret = walk_tree (gimple_omp_target_data_arg_ptr (stmt), callback_op, 
 wi,
 +pset);
 +  if (ret)
 + return ret;
break;
  
  case GIMPLE_OMP_TEAMS:

Jakub

Re: Nested OpenACC/OpenMP constructs (was: OpenACC GIMPLE_OACC_* -- or not?)

2014-12-10 Thread Thomas Schwinge

Hi Jakub!

On Wed, 10 Dec 2014 11:02:24 +0100, I wrote:
 OpenACC: Rework nested constructs checking.
 
   gcc/
   * omp-low.c (scan_omp_target): Remove taskreg_nesting_level and
   target_nesting_level assertions.
   (check_omp_nesting_restrictions): Rework OpenACC constructs
   handling.  Update and extend the relevant test cases.

Regarding the check_omp_nesting_restrictions rework:

 --- gcc/omp-low.c
 +++ gcc/omp-low.c
 @@ -2706,46 +2700,26 @@ scan_omp_teams (gimple stmt, omp_context *outer_ctx)
scan_omp (gimple_omp_body_ptr (stmt), ctx);
  }
  
 -/* Check OpenMP nesting restrictions.  */
 +/* Check nesting restrictions.  */
  static bool
  check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
  {
 -  /* TODO: While the OpenACC specification does allow for certain kinds of
 - nesting, we don't support many of these yet.  */
 -  if (is_gimple_omp (stmt)
 -      && is_gimple_omp_oacc_specifically (stmt))
 +  /* TODO: Some OpenACC/OpenMP nesting should be allowed.  */
 +
 +  /* No nesting of non-OpenACC STMT (that is, an OpenMP one, or a GOMP builtin)
 + inside an OpenACC CTX.  */
 +  if (!(is_gimple_omp (stmt)
 +   && is_gimple_omp_oacc_specifically (stmt)))
  {
 -  /* Regular handling of OpenACC loop constructs.  */
 -  if (gimple_code (stmt) == GIMPLE_OMP_FOR
 -      && gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_OACC_LOOP)
 - goto cont;
 -  /* No nesting of OpenACC STMT inside any OpenACC or OpenMP CTX different
 -  from an OpenACC data construct.  */
 -  for (omp_context *ctx_ = ctx; ctx_ != NULL; ctx_ = ctx_->outer)
 - if (is_gimple_omp (ctx_->stmt)
 -     && !(gimple_code (ctx_->stmt) == GIMPLE_OMP_TARGET
 -          && (gimple_omp_target_kind (ctx_->stmt)
 -              == GF_OMP_TARGET_KIND_OACC_DATA)))
 -   {
 - error_at (gimple_location (stmt),
 -   "may not be nested");
 - return false;
 -   }
 -}
 -  else
 -{
 -  /* No nesting of non-OpenACC STMT (that is, an OpenMP one, or a GOMP
 -  builtin) inside any OpenACC CTX.  */
   for (omp_context *ctx_ = ctx; ctx_ != NULL; ctx_ = ctx_->outer)
    if (is_gimple_omp (ctx_->stmt)
        && is_gimple_omp_oacc_specifically (ctx_->stmt))
  {
    error_at (gimple_location (stmt),
 -   "may not be nested");
 +   "non-OpenACC construct inside of OpenACC region");
   return false;
 }
  }
 - cont:
  
if (ctx != NULL)
  {
 @@ -3003,20 +2977,74 @@ check_omp_nesting_restrictions (gimple stmt, omp_context *ctx)
   break;
  case GIMPLE_OMP_TARGET:
   for (; ctx != NULL; ctx = ctx->outer)
 - if (gimple_code (ctx->stmt) == GIMPLE_OMP_TARGET
 -     && gimple_omp_target_kind (ctx->stmt) == GF_OMP_TARGET_KIND_REGION)
 -   {
 - const char *name;
 - switch (gimple_omp_target_kind (stmt))
 -   {
 -   case GF_OMP_TARGET_KIND_REGION: name = "target"; break;
 -   case GF_OMP_TARGET_KIND_DATA: name = "target data"; break;
 -   case GF_OMP_TARGET_KIND_UPDATE: name = "target update"; break;
 -   default: gcc_unreachable ();
 -   }
 - warning_at (gimple_location (stmt), 0,
 - "%s construct inside of target region", name);
 -   }
 + {
 +   if (gimple_code (ctx->stmt) != GIMPLE_OMP_TARGET)
 + {
 +   if (is_gimple_omp (stmt)
 +       && is_gimple_omp_oacc_specifically (stmt)
 +       && is_gimple_omp (ctx->stmt))
 + {
 +   error_at (gimple_location (stmt),
 + "OpenACC construct inside of non-OpenACC region");
 +   return false;
 + }
 +   continue;
 + }
 +
 +   const char *stmt_name, *ctx_stmt_name;
 +   switch (gimple_omp_target_kind (stmt))
 + {
 + case GF_OMP_TARGET_KIND_REGION: stmt_name = "target"; break;
 + case GF_OMP_TARGET_KIND_DATA: stmt_name = "target data"; break;
 + case GF_OMP_TARGET_KIND_UPDATE: stmt_name = "target update"; break;
 + case GF_OMP_TARGET_KIND_OACC_PARALLEL: stmt_name = "parallel"; break;
 + case GF_OMP_TARGET_KIND_OACC_KERNELS: stmt_name = "kernels"; break;
 + case GF_OMP_TARGET_KIND_OACC_DATA: stmt_name = "data"; break;
 + case GF_OMP_TARGET_KIND_OACC_UPDATE: stmt_name = "update"; break;
 + case GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA: stmt_name = "enter/exit data"; break;
 + default: gcc_unreachable ();
 + }
 +   switch (gimple_omp_target_kind (ctx->stmt))
 + {
 + case GF_OMP_TARGET_KIND_REGION: ctx_stmt_name = "target"; break;
 + case GF_OMP_TARGET_KIND_DATA: ctx_stmt_name = "target data"; break;
 + case GF_OMP_TARGET_KIND_OACC_PARALLEL: ctx_stmt_name = "parallel"; break;
 + case GF_OMP_TARGET_KIND_OACC_KERNELS: ctx_stmt_name = "kernels"; break;
 + case
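
As a rough model of the reworked check, the sketch below walks the chain of enclosing regions outward and rejects any mix of the two dialects, in either direction.  It is deliberately simplified and rests on assumptions: `enum dialect`, `struct region` and `check_nesting` are hypothetical names, only the two diagnostic strings are taken from the patch, and the finer rules of the real function (which constructs may nest within the same dialect, warnings versus errors) are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a construct; not a GCC type.  */
enum dialect { DIALECT_OPENMP, DIALECT_OPENACC };

/* One enclosing region, linked to the next one outwards.  */
struct region
{
  enum dialect dialect;
  const struct region *outer;
};

/* Return NULL if a statement of dialect STMT may appear inside CTX in
   this simplified model, or the diagnostic text otherwise.  */
static const char *
check_nesting (enum dialect stmt, const struct region *ctx)
{
  assert (stmt == DIALECT_OPENMP || stmt == DIALECT_OPENACC);

  for (const struct region *c = ctx; c != NULL; c = c->outer)
    if (c->dialect != stmt)
      return (stmt == DIALECT_OPENACC
              ? "OpenACC construct inside of non-OpenACC region"
              : "non-OpenACC construct inside of OpenACC region");
  return NULL;
}
```

The loop mirrors the `for (...; ctx_ != NULL; ctx_ = ctx_->outer)` walk in the diff: every enclosing context is inspected, not only the innermost one.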

Re: Nested OpenACC/OpenMP constructs (was: OpenACC GIMPLE_OACC_* -- or not?)

2014-12-10 Thread Jakub Jelinek

On Wed, Dec 10, 2014 at 11:10:19AM +0100, Thomas Schwinge wrote:
 --- /dev/null
 +++ gcc/testsuite/c-c++-common/gomp/nesting-1.c
 @@ -0,0 +1,77 @@
 +void
 +f_omp_parallel (void)
 +{
 +#pragma omp parallel
 +  {
 +int i;

Can you please use a global variable declared outside of
f_omp_parallel instead?

 +
 +#pragma omp parallel
 +;
 +
 +#pragma omp target
 +;
 +
 +#pragma omp target data
 +;
 +
 +#pragma omp target update to(i)

The thing is, if GCC tried harder, it could complain here,
because i can't really be mapped at this point and thus it would be always
undefined behavior.  If the var is global, it is possible
somebody uses
  #pragma omp target map(i)
  f_omp_parallel ();
and then it would be valid.  Similarly in other tests.

Otherwise LGTM.

Jakub

Re: [COMMITTED] [PING] [PATCH] [AArch64, NEON] More NEON intrinsics improvement

2014-12-10 Thread Yangfei (Felix)

   +__extension__ static __inline float32x2_t __attribute__
   +((__always_inline__))
   +vfms_f32 (float32x2_t __a, float32x2_t __b, float32x2_t __c) {
   +  return __builtin_aarch64_fmav2sf (-__b, __c, __a); }
   +
   +__extension__ static __inline float32x4_t __attribute__
   +((__always_inline__))
   +vfmsq_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) {
   +  return __builtin_aarch64_fmav4sf (-__b, __c, __a); }
   +
   +__extension__ static __inline float64x2_t __attribute__
   +((__always_inline__))
   +vfmsq_f64 (float64x2_t __a, float64x2_t __b, float64x2_t __c) {
   +  return __builtin_aarch64_fmav2df (-__b, __c, __a); }
   +
   +
  
  
   Thanks, the patch looks good. Just one comment:
   You could also add
   float32x2_t vfms_n_f32(float32x2_t a, float32x2_t b, float32_t n)
   and its Q-variant.
 
  You can, if you wish,  deal with Tejas' comment with a follow on patch
  rather than re-spinning this one.   Provided this patch has no
  regressions on a big endian and a little endian test run then you can 
  commit it.
  Thanks
  /Marcus
 
 
  No regressions for aarch64_be-linux-gnu target.  Committed as r218484.
  Will come up with a new patch to deal with Tejas' comment.  Thanks.
 
 My validations of trunk show that your new tests are incorrect: none of them
 compiles because the hfloat64_t type isn't defined.
 
 Also, keep in mind that the tests in this directory are executed by the 
 aarch32
 target too.
 
 Christophe


It seems that some code for the newly added testcases went missing when the
patch was generated.
I will fix them soon.  Thanks for pointing this out.

Re: [PATCH][rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1

2014-12-10 Thread Marcus Shawcroft

On 10 December 2014 at 09:51, Alan Hayward alan.hayw...@arm.com wrote:

This is a new version of my BE patch from a few weeks ago.
This is part 1 and covers rtlanal.c. The second part will be aarch64
specific.

When combined with the second patch, it fixes up movoi/ci/xi for Big
Endian, so that we end up with the lsb of a big-endian integer in
the low byte of the highest-numbered register.

This will apply cleanly by itself and no regressions were seen when
testing aarch64 and x86_64 on make check.

Hi Alan, I'm not a maintainer in this area of the compiler... however
I would suggest that the relevant maintainers of the middle end
rtlanal.c etc are more likely to pick up this review if the text of
the email focussed on how rtlanal.c is broken and why the proposed
patch is the right way to go.  You should also CC one of the
maintainers...

Cheers
/Marcus

Re: [PATCH 2/4] vldN_lane error message enhancements (D registers)

2014-12-10 Thread Alan Lawrence

Hmmm. Yes I think I may have switched that in the patch introducing 
__AARCH64_LANE_CHECK, and it was correct at time of Charles' writing. However, 
maybe we could (now) use __AARCH64_LANE_CHECK directly? (Referencing one of the 
component vectors in the blahLxMxN_t struct?)


--Alan

Christophe Lyon wrote:

On 9 December 2014 at 16:27,  charles.bay...@linaro.org wrote:

From: Charles Baylis charles.bay...@linaro.org

gcc/ChangeLog

DATE  Charles Baylis  charles.bay...@linaro.org

* config/aarch64/arm_neon.h (__LD2_LANE_FUNC): Add explicit lane
bounds check.
(__LD3_LANE_FUNC): Likewise.
(__LD4_LANE_FUNC): Likewise

gcc/testsuite/ChangeLog:

DATE  Charles Baylis  charles.bay...@linaro.org

* gcc.target/aarch64/simd/vld4_lane.c: New test.

Change-Id: Ia95fbed34b50cf710ea9032ff3428a5f1432e0aa
---
 gcc/config/aarch64/arm_neon.h |  6 ++
 gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c | 15 +++
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c

diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 8cff719..22df564 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -17901,6 +17901,8 @@ vld2_lane_##funcsuffix (const ptrtype * __ptr, intype 
__b, const int __c)  \
   __o = __builtin_aarch64_set_qregoi##mode (__o,  \
   (signedtype) __temp.val[1], \
   1); \
+  __builtin_aarch64_im_lane_boundsi (__c, \
+sizeof (vectype) / sizeof (*__ptr));  \


Shouldn't the arguments be reversed? (I'm looking at
__AARCH64_LANE_CHECK: the lane index is the 2nd parameter)


   __o =__builtin_aarch64_ld2_lane##mode (  
   \
  (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
   __b.val[0] = (vectype) __builtin_aarch64_get_dregoidi (__o, 0); \
@@ -17991,6 +17993,8 @@ vld3_lane_##funcsuffix (const ptrtype * __ptr, intype 
__b, const int __c)  \
   __o = __builtin_aarch64_set_qregci##mode (__o,  \
   (signedtype) __temp.val[2], \
   2); \
+  __builtin_aarch64_im_lane_boundsi (__c, \
+sizeof (vectype) / sizeof (*__ptr));  \
   __o =__builtin_aarch64_ld3_lane##mode (  
   \
  (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
   __b.val[0] = (vectype) __builtin_aarch64_get_dregcidi (__o, 0); \
@@ -18089,6 +18093,8 @@ vld4_lane_##funcsuffix (const ptrtype * __ptr, intype 
__b, const int __c)  \
   __o = __builtin_aarch64_set_qregxi##mode (__o,  \
   (signedtype) __temp.val[3], \
   3); \
+  __builtin_aarch64_im_lane_boundsi (__c, \
+sizeof (vectype) / sizeof (*__ptr));  \
   __o =__builtin_aarch64_ld4_lane##mode (  
   \
  (__builtin_aarch64_simd_##ptrmode *) __ptr, __o, __c);   \
   __b.val[0] = (vectype) __builtin_aarch64_get_dregxidi (__o, 0); \
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c 
b/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c
new file mode 100644
index 000..d14e6c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c
@@ -0,0 +1,15 @@
+/* Test error message when passing an invalid value as a lane index.  */
+
+/* { dg-do compile } */
+
+#include "arm_neon.h"
+
+int8x8x4_t
+f_vld4_lane (int8_t * p, int8x8x4_t v)
+{
+  int8x8x4_t res;
+  /* { dg-error "lane 8 out of range 0 - 7" "" { target *-*-* } 0 } */
+  res = vld4_lane_s8 (p, v, 8);
+  return res;
+}
+
--
1.9.1
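
To make the bound computed by `sizeof (vectype) / sizeof (*__ptr)` concrete, here is a small sketch in plain C.  It does not use arm_neon.h or any AArch64 builtin; `lane_count`, `lane_in_range` and `s8_lanes_d` are hypothetical helpers, and putting the bound first and the lane second is merely this sketch's choice, not a statement about the real builtin's or macro's parameter order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* D registers are 64 bits wide, Q registers 128 bits.  */
enum { D_REG_BYTES = 8, Q_REG_BYTES = 16 };

/* Number of lanes in a vector of VECTOR_BYTES holding elements of
   ELEMENT_BYTES each.  */
static size_t
lane_count (size_t vector_bytes, size_t element_bytes)
{
  assert (element_bytes > 0 && vector_bytes % element_bytes == 0);
  return vector_bytes / element_bytes;
}

/* True if LANE is a valid index into a vector with LANES lanes.  */
static bool
lane_in_range (size_t lanes, int lane)
{
  return lane >= 0 && (size_t) lane < lanes;
}

/* Lane count of an int8x8_t-like vector: 8 bytes of int8_t.  */
static size_t
s8_lanes_d (void)
{
  return lane_count (D_REG_BYTES, sizeof (int8_t));
}
```

With 8 lanes the valid indices are 0 to 7, which matches the "lane 8 out of range 0 - 7" message the new test expects.  Since both values are plain integers, swapping them at a call site still compiles, which is why the argument order is worth double-checking.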

Re: [PATCH 0/4] [AARCH64,SIMD] PR63870 Improve error messages for single lane load/store

2014-12-10 Thread Alan Lawrence


Thanks, Charles. A couple of thoughts.

I think the approach in patches 2+3+4 of using __builtin_aarch64_im_lane_boundsi 
is justified and works quite neatly. Modulo the question of argument ordering 
and __AARCH64_LANE_CHECK, those patches look good.


However, SIMD_ARG_STRUCT_LOAD_STORE_LANE_INDEX seems a lot of 
infrastructure to introduce if we are only going to use it in one place, and I 
think I might argue in favour of using ...__im_lane_bound or AARCH64_LANE_CHECK 
there also. Of course all of this palaver stems from using the same builtins for 
both D- and Q-reg intrinsics, and I suspect some cleanup may be due to those 
intrinsics *at some point*, but probably not in time for gcc 5.0. However, this 
does mean that if I use a D-reg intrinsic with a lane index that's out of bounds 
for the Q-reg too, I get a double error message: e.g. for testcase


int8x8x4_t
f_vld4_lane (int8_t * p, int8x8x4_t v)
{
  int8x8x4_t res;
  return vld4_lane_s8 (p, v, 18);
}

I get output:

In file included from gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c:5:0:
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h: In function 
'f_vld4_lane':
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h:18123:1: error: 
lane 18 out of range 0 - 7

 __LD4_LANE_FUNC (int8x8x4_t, int8x8_t, int8x16x4_t, int8_t, v16qi, qi, s8,
 ^
In function 'vld4_lane_s8',
inlined from 'f_vld4_lane' at 
gcc/testsuite/gcc.target/aarch64/simd/vld4_lane.c:12:7:
.../install/lib/gcc/aarch64-none-elf/5.0.0/include/arm_neon.h:18123:1: error: 
lane 18 out of range 0 - 15

 __LD4_LANE_FUNC (int8x8x4_t, int8x8_t, int8x16x4_t, int8_t, v16qi, qi, s8,
 ^

which (although not serious) could be mildly confusing.
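
The double report follows directly from the two ranges involved.  As a small sketch (the names `out_of_range` and `diagnostics_for_d_reg_lane` are hypothetical, and the model assumes exactly one check against the D-register lane count and one against the Q-register lane count, as in the output above):

```c
#include <assert.h>

/* Lane counts for 8-bit elements in D (64-bit) and Q (128-bit) registers.  */
enum { D_REG_LANES_S8 = 8, Q_REG_LANES_S8 = 16 };

/* 1 if LANE is not a valid index into a vector of LANES lanes, else 0.  */
static int
out_of_range (int lane, int lanes)
{
  assert (lanes > 0);
  return lane < 0 || lane >= lanes;
}

/* Number of diagnostics a D-register intrinsic produces in this model:
   one per failing check.  */
static int
diagnostics_for_d_reg_lane (int lane)
{
  return (out_of_range (lane, D_REG_LANES_S8)
          + out_of_range (lane, Q_REG_LANES_S8));
}
```

Lane 18 fails both checks and yields two messages, while lane 8 (the value used in the new vld4_lane.c test) fails only the D-register check.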

--Alan

charles.bay...@linaro.org wrote:

From: Charles Baylis charles.bay...@linaro.org

This patch series moves the checking of lane indices for vld[234](q?)_lane and
vst[234](q?)_lane intrinsics so that it occurs during builtin expansion.

The q register variants are checked directly, but since the d register variants
use the same intrinsics, these are checked in arm_neon.h using
__builtin_aarch64_im_lane_boundsi().

Tested with make check-gcc on aarch64-oe-linux, with no regressions.

Charles Baylis (4):
  vldN_lane error message enhancements (Q registers)
  vldN_lane error message enhancements (D registers)
  vstN_lane error message enhancements (Q register)
  vstN_lane error message enhancements (D registers)

 gcc/config/aarch64/aarch64-builtins.c | 32 +++-
 gcc/config/aarch64/aarch64-simd.md| 12 ++--
 gcc/config/aarch64/arm_neon.h | 12 
 3 files changed, 45 insertions(+), 11 deletions(-)

Re: [PATCH 2/3] Extended if-conversion

2014-12-10 Thread Yuri Rumyantsev

Richard,

Sorry that I forgot to delete the debug dump from my fix.
I have a few questions about your comments.

1. You wrote:
 You also still have two functions for PHI predication.  And the
 new extended variant doesn't commonize the 2-args and general
 path
 Did you mean that I must combine predicate_scalar_phi and
predicate_extended_scalar_phi into one function?
Please note that if the additional flag is not set (i.e.
aggressive_if_conv is false), extended predication requires more
compile time, since it builds a hash_map.

2. About critical edge splitting.

Did you mean that we should (1) perform it under the aggressive_if_conv
option only, and (2) split all critical edges?
Note that this leads to recomputing the topological order.

It is worth noting that in the current implementation, bbs with 2
predecessors that are both on critical edges are accepted without
the additional option.

Thanks in advance.
Yuri.
2014-12-09 18:20 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is updated patch2 with the following changes:
 1. Delete functions  phi_has_two_different_args and find_insertion_point.
 2. Use only one function for extended predication -
 predicate_extended_scalar_phi.
 3. Save gsi before insertion of predicate computations for a basic
 block if it has 2 predecessors and both incoming edges are critical,
 or if it has more than 2 predecessors and at least one incoming edge
 is critical. This saved iterator can be used by extended phi predication.

 Here is a motivating test-case which explains this point.
 The test-case is attached (t5.c) and it must be compiled with the -O2
 -ftree-loop-vectorize -fopenmp options.
 The problem phi is in bb-7:

   bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 })
   {
 bb 5:
 xmax_edge_18 = xmax_edge_36 + 1;
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 9;

   }
   bb_6 (preds = {bb_4 }, succs = {bb_7 bb_8 })
   {
 bb 6:
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 8;

   }
   bb_7 (preds = {bb_6 bb_5 }, succs = {bb_11 })
   {
 bb 7:
 # xmax_edge_30 = PHI xmax_edge_36(6), xmax_edge_18(5)
 xmax_edge_19 = xmax_edge_39 + 1;
 goto bb 11;

   }

 Note that both incoming edges to bb_7 are critical. If we comment out
 restoring gsi in predicate_all_scalar_phi:
 #if 0
  if ((EDGE_COUNT (bb->preds) == 2 && all_preds_critical_p (bb))
  || (EDGE_COUNT (bb->preds) > 2 && has_pred_critical_p (bb)))
gsi = bb_insert_point (bb);
  else
 #endif
gsi = gsi_after_labels (bb);

 we will get ICE:
 t5.c: In function 'foo':
 t5.c:9:6: error: definition in block 4 follows the use
  void foo (int n)
   ^
 for SSA_NAME: _1 in statement:
 _52 = _1  _3;
 t5.c:9:6: internal compiler error: verify_ssa failed

 since predicate computations were inserted in bb_7.

 The issue is obviously that the predicates have already been emitted
 in the target BB - that's of course the wrong place.  This is done
 by insert_gimplified_predicates.

 This just shows how edge predicate handling is broken - we don't
 seem to have a sequence of gimplified stmts for edge predicates
 but push those to e-dest which makes this really messy.

 Rather than having a separate phase where we insert all
 gimplified bb predicates we should do that on-demand when
 predicating a PHI.

 Your patch writes to stderr - that's bad - use dump_file and guard
 the printfs properly.

 You also still have two functions for PHI predication.  And the
 new extended variant doesn't commonize the 2-args and general
 paths.

 I'm not at all happy with this code.  It may be the existing if-conv code's
 fault, but making it even worse is not an option.

 Again - what's wrong with simply splitting critical edges if
 aggressive_if_conv?  I think that would very much simplify
 things here.  Or alternatively use gsi_insert_on_edge and
 commit edge insertions before merging the blocks.
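
 For reference, "critical" here has the usual meaning: an edge whose source block has more than one successor and whose destination block has more than one predecessor.  A minimal sketch, using a toy adjacency matrix rather than GCC's CFG API (`struct cfg`, `critical_edge_p`, `all_incoming_critical` and `example_cfg` are hypothetical names; the edges are the ones visible in the dump quoted above):

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/* Toy control-flow graph: edge[src][dst] is true if src -> dst exists.  */
struct cfg
{
  int n_blocks;
  bool edge[MAX_BLOCKS][MAX_BLOCKS];
};

static int
succ_count (const struct cfg *g, int src)
{
  int n = 0;
  for (int dst = 0; dst < g->n_blocks; dst++)
    n += g->edge[src][dst];
  return n;
}

static int
pred_count (const struct cfg *g, int dst)
{
  int n = 0;
  for (int src = 0; src < g->n_blocks; src++)
    n += g->edge[src][dst];
  return n;
}

/* An existing edge is critical if its source has several successors
   and its destination has several predecessors.  */
static bool
critical_edge_p (const struct cfg *g, int src, int dst)
{
  assert (g->edge[src][dst]);
  return succ_count (g, src) > 1 && pred_count (g, dst) > 1;
}

/* True if DST has at least one incoming edge and all of them are
   critical.  */
static bool
all_incoming_critical (const struct cfg *g, int dst)
{
  bool any = false;
  for (int src = 0; src < g->n_blocks; src++)
    if (g->edge[src][dst])
      {
        any = true;
        if (!critical_edge_p (g, src, dst))
          return false;
      }
  return any;
}

/* The edges shown in the dump: 4 -> {5, 6}, 5 -> {7, 9}, 6 -> {7, 8},
   7 -> 11.  */
static struct cfg
example_cfg (void)
{
  struct cfg g = { .n_blocks = 12 };
  g.edge[4][5] = g.edge[4][6] = true;
  g.edge[5][7] = g.edge[5][9] = true;
  g.edge[6][7] = g.edge[6][8] = true;
  g.edge[7][11] = true;
  return g;
}
```

 In this model both edges into block 7 are critical.  Splitting such an edge means placing a new, empty block on it, which gives statements that belong to that edge alone a place to live.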

 Thanks,
 Richard.

 ChangeLog is

 2014-12-09  Yuri Rumyantsev  ysrum...@gmail.com

 * tree-if-conv.c : Include hash-map.h.
 (struct bb_predicate_s): Add new field to save copy of gimple
 statement iterator.
 (bb_insert_point): New function.
 (set_bb_insert_point): New function.
 (has_pred_critical_p): New function.
 (if_convertible_bb_p): Allow bb has more than 2 predecessors if
 AGGRESSIVE_IF_CONV is true.
 (if_convertible_bb_p): Delete check that bb has at least one
 non-critical incoming edge.
 (is_cond_scalar_reduction): Add arguments ARG_0, ARG_1 and EXTENDED.
 Allow interchange PHI arguments if EXTENDED is false.
 Change check that block containing reduction statement candidate
 is predecessor of phi-block since phi may have more than two arguments.
 (predicate_scalar_phi): Add new arguments for call of
 is_cond_scalar_reduction.
 (get_predicate_for_edge): New function.
 (struct phi_args_hash_traits): New type.
 (phi_args_hash_traits::hash): New function.
 (phi_args_hash_traits::equal_keys): New function.

Re: [PATCH, x86] Fix pblendv expand.

2014-12-10 Thread Jakub Jelinek

On Wed, Dec 10, 2014 at 02:37:13AM +0300, Evgeny Stupachenko wrote:
 2014-12-10  Evgeny Stupachenko  evstu...@gmail.com

I went ahead and filed a PR, so we have something to refer to in the
ChangeLog and name the testcases.

 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -47546,6 +47546,7 @@ expand_vec_perm_pblendv (struct expand_vec_perm_d *d)
  dcopy.op0 = dcopy.op1 = d->op1;
   else
  dcopy.op0 = dcopy.op1 = d->op0;
 +  dcopy.target = gen_reg_rtx (vmode);
   dcopy.one_operand_p = true;
 
   for (i = 0; i < nelt; ++i)

This is incorrect if d->testing_p is true (can happen e.g. on the testcase
below); generally, when testing_p is true, we should not use gen_reg_rtx
because it can be called from GIMPLE optimizers before init_emit is called.
See PR57896 for details.
If d->testing_p is true, target is never the same as any of the operands,
all 3 are virtual registers, so there is no overlap (and even if there were,
it should not matter).
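
The shape of the fix can be sketched in isolation as follows.  This is not the i386.c code: `struct perm_request`, `new_pseudo` and `make_one_operand_copy` are hypothetical stand-ins, with an integer counter playing the role of pseudo-register allocation.  The point is only that a "would this work?" probe must not allocate anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a permutation request; registers are
   modelled as plain integers.  */
struct perm_request
{
  int target;
  int op0;
  int op1;
  bool testing_p;	/* True when only probing for feasibility.  */
};

static int next_pseudo = 100;
static int pseudos_created;

/* Models allocating a fresh pseudo; a side effect that a probe must
   avoid.  */
static int
new_pseudo (void)
{
  pseudos_created++;
  return next_pseudo++;
}

/* Build a one-operand copy of D.  A fresh target is allocated only when
   code is really being generated.  */
static struct perm_request
make_one_operand_copy (const struct perm_request *d, bool use_op1)
{
  assert (d != NULL);

  struct perm_request dcopy = *d;
  dcopy.op0 = dcopy.op1 = use_op1 ? d->op1 : d->op0;
  if (!d->testing_p)
    dcopy.target = new_pseudo ();
  return dcopy;
}
```

When testing_p is set the copy keeps the original target, which is harmless in this model because nothing is emitted for a probe.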

 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/vect/blend.c
 @@ -0,0 +1,63 @@
 +/* Test correctness of size 3 store groups permutation.  */
 +/* { dg-do run } */

The testcase as is does not fail even with unfixed compiler.  I had to add
some dg-additional-options.

 +  bar(2, q);
 +  for (i = 0; i < N; i++)
 +if (q[0].a[i].f != 0 || q[0].a[i].c != i || q[0].a[i].p != -1)
 +  return 1;

Furthermore, tests in the testsuite should fail through abort ()
/ __builtin_abort () instead of just returning non-zero exit code.

So, here is complete updated patch I'm going to bootstrap/regtest.

2014-12-10  Jakub Jelinek  ja...@redhat.com
Evgeny Stupachenko  evstu...@gmail.com

* config/i386/i386.c (expand_vec_perm_pblendv): If not testing_p,
set dcopy.target to a new pseudo.

* gcc.dg/vect/pr64252.c: New test.
* gcc.dg/pr64252.c: New test.
* gcc.target/i386/avx2-pr64252.c: New test.

--- gcc/config/i386/i386.c.jj   2014-12-10 09:45:15.0 +0100
+++ gcc/config/i386/i386.c  2014-12-10 11:38:12.530795610 +0100
@@ -47554,6 +47554,8 @@ expand_vec_perm_pblendv (struct expand_v
 dcopy.op0 = dcopy.op1 = d->op1;
   else
 dcopy.op0 = dcopy.op1 = d->op0;
+  if (!d->testing_p)
+dcopy.target = gen_reg_rtx (vmode);
   dcopy.one_operand_p = true;
 
   for (i = 0; i < nelt; ++i)
--- gcc/testsuite/gcc.dg/vect/pr64252.c.jj  2014-12-10 11:42:47.669991028 +0100
+++ gcc/testsuite/gcc.dg/vect/pr64252.c 2014-12-10 11:47:43.895818223 +0100
@@ -0,0 +1,66 @@
+/* PR target/64252 */
+/* Test correctness of size 3 store groups permutation.  */
+/* { dg-do run } */
+/* { dg-additional-options "-O3" } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */
+
+#include "tree-vect.h"
+
+#define N 50
+
+enum num3
+{
+  a, b, c
+};
+
+struct flags
+{
+  enum num3 f;
+  unsigned int c;
+  unsigned int p;
+};
+
+struct flagsN
+{
+  struct flags a[N];
+};
+
+void
+bar (int n, struct flagsN *ff)
+{
+  struct flagsN *fc;
+  for (fc = ff + 1; fc < (ff + n); fc++)
+{
+  int i;
+  for (i = 0; i < N; ++i)
+   {
+ ff->a[i].f = 0;
+ ff->a[i].c = i;
+ ff->a[i].p = -1;
+   }
+  for (i = 0; i < n; i++)
+   {
+ int j;
+ for (j = 0; j < N - n; ++j)
+   {
+ fc->a[i + j].f = 0;
+ fc->a[i + j].c = j + i;
+ fc->a[i + j].p = -1;
+   }
+   }
+}
+}
+
+struct flagsN q[2];
+
+int main()
+{
+  int i;
+  check_vect ();
+  bar(2, q);
+  for (i = 0; i < N; i++)
+if (q[0].a[i].f != 0 || q[0].a[i].c != i || q[0].a[i].p != -1)
+  abort ();
+  return 0;
+}
+/* { dg-final { cleanup-tree-dump "vect" } } */
--- gcc/testsuite/gcc.dg/pr64252.c.jj   2014-12-10 11:26:05.649467180 +0100
+++ gcc/testsuite/gcc.dg/pr64252.c  2014-12-10 11:25:27.057139759 +0100
@@ -0,0 +1,30 @@
+/* PR target/64252 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+typedef unsigned int V __attribute__((vector_size (32)));
+
+__attribute__((noinline, noclone)) void
+foo (V *a, V *b, V *c, V *d, V *e)
+{
+  V t = __builtin_shuffle (*a, *b, *c);
+  V v = __builtin_shuffle (t, (V) { ~0U, ~0U, ~0U, ~0U, ~0U, ~0U, ~0U, ~0U }, (V) { 0, 1, 8, 3, 4, 5, 9, 7 });
+  v = v + *d;
+  *e = v;
+}
+
+int
+main ()
+{
+  V a, b, c, d, e;
+  int i;
+  a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 };
+  b = (V) { 9, 10, 11, 12, 13, 14, 15, 16 };
+  c = (V) { 1, 3, 5, 7, 9, 11, 13, 15 };
+  d = (V) { 0, 0, 0, 0, 0, 0, 0, 0 };
+  foo (&a, &b, &c, &d, &e);
+  for (i = 0; i < 8; i++)
+if (e[i] != ((i == 2 || i == 6) ? ~0U : 2 + 2 * i))
+  __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/i386/avx2-pr64252.c.jj 2014-12-10 11:27:36.868877426 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr64252.c 2014-12-10 11:30:49.500520279 +0100
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx2" } */
+/* { dg-require-effective-target avx2 } */
+
+#include "avx2-check.h"
+
+#define main() do_main ()
+
+#include "../../gcc.dg/pr64252.c"
+
+static void

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-10 Thread Dodji Seketeli

Hello Tobias,

Thank you for this patch.  I have a few comments about it below.  Just
as a heads-up, I am asking Manuel some questions in there, as well as
referring to FX's comments.  Please read below.

Tobias Burnus bur...@net-b.de writes:

 This patch fixes a Fortran diagnostic regression.

 With the current common diagnostic, the width shown with caret
 diagnostic is determined by:

 case OPT_fmessage_length_:
   pp_set_line_maximum_length (dc->printer, value);
   diagnostic_set_caret_max_width (dc, value);

 plus

  diagnostic_set_caret_max_width (diagnostic_context *context, int value)
  {
    /* One minus to account for the leading empty space.  */
    value = value ? value - 1
      : (isatty (fileno (pp_buffer (context->printer)->stream))
         ? getenv_columns () - 1: INT_MAX);

    if (value <= 0)
      value = INT_MAX;

    context->caret_max_width = value;
  }

 where getenv_columns looks at the environment variable COLUMNS.

 Note that -fmessage-length= applies to the error message (wraps) _and_
 the caret diagnostic (truncates) while the COLUMNS variable _only_
 applies to the caret diagnostic. (BTW: The documentation currently
 does not mention COLUMNS.)

I guess we should adjust the documentation to mention COLUMNS.

Manuel, was there a particular reason to avoid mentioning the COLUMNS
environment variable in the documentation?

 On most terminals, which I tried, COLUMNS does not seem to be set. In
 Fortran, error.c's get_terminal_width has a similar check, but
 additionally it uses ioctl to determine the terminal width.

 I think with caret diagnostics, it is useful not to exceed the
 terminal width as having several empty lines before the ^ does not
 really improve the readability. Thus, I would propose to additionally
 use ioctl. Which raises two questions: (a) Should the COLUMNS
 environment variable or ioctl have a higher priority? [Fortran ranks
 ioctl higher; in the patch, for backward compatibility, I rank COLUMNS
 higher.]

I agree.

 (b) Should ioctl be always used or only for Fortran?

I'd go for using it in the common diagnostics framework, unless there is
a sound reason not to.  Manuel, do you remember why we didn't query the
TIOCGWINSZ ioctl property to get the terminal size when that capability
was available?
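
For concreteness, the lookup order discussed above (COLUMNS first, then the TIOCGWINSZ ioctl) could look roughly like the sketch below.  This is not the patch itself: `columns_from_env`, `columns_from_ioctl` and `caret_width_for` are hypothetical names, and the code assumes a POSIX system where `struct winsize` and `TIOCGWINSZ` are available through <sys/ioctl.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Width taken from the COLUMNS environment variable, or 0 if it is
   unset or does not parse as a positive number.  */
static int
columns_from_env (void)
{
  const char *s = getenv ("COLUMNS");
  if (s == NULL)
    return 0;
  long n = strtol (s, NULL, 10);
  return (n > 0 && n <= INT_MAX) ? (int) n : 0;
}

/* Width reported by the terminal behind FD, or 0 if unavailable.  */
static int
columns_from_ioctl (int fd)
{
#ifdef TIOCGWINSZ
  struct winsize ws;
  if (ioctl (fd, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0)
    return ws.ws_col;
#else
  (void) fd;
#endif
  return 0;
}

/* Maximum caret width for output going to FD: unlimited unless FD is a
   terminal; otherwise COLUMNS wins over the ioctl, and one column is
   reserved for the leading space.  */
static int
caret_width_for (int fd)
{
  if (!isatty (fd))
    return INT_MAX;

  int width = columns_from_env ();
  if (width == 0)
    width = columns_from_ioctl (fd);
  assert (width >= 0);	/* Both helpers report "unknown" as 0.  */
  return width > 1 ? width - 1 : INT_MAX;
}
```

Swapping the two lookups in caret_width_for would give the ranking that the Fortran front end is described as using.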

 Comments?

If the change comes with ChangeLog, passes bootstrap and nobody else
objects, I pre-approve this patch.

Thanks!

-- 
Dodji

Re: [PATCH][ARM] Fix names of some rounding intrinsics, implement vrndx_f32 and vrndxq_f32

2014-12-10 Thread Kyrill Tkachov


Ping.

Kyrill
On 01/12/14 11:43, Kyrill Tkachov wrote:

Ping.

Kyrill

On 21/11/14 11:30, Kyrill Tkachov wrote:

Ping again.

Thanks,
Kyrill

On 13/11/14 14:45, Kyrill Tkachov wrote:

Ping.

Kyrill

On 04/11/14 10:56, Kyrill Tkachov wrote:

Phew,

This one slipped through the cracks. Ping?
https://gcc.gnu.org/ml/gcc-patches/2014-09/msg01981.html

Thanks,
Kyrill

On 23/09/14 16:25, Kyrill Tkachov wrote:

On 23/09/14 16:07, Kyrill Tkachov wrote:

Hi all,

Some intrinsics had the wrong name (inconsistent with the NEON
intrinsics spec). This patch fixes that and adds the vrndx_f32 and
vrndxq_f32 intrinsics that were missing.

For reference, the NEON intrinsics spec can be found at:
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0073a/IHI0073A_arm_neon_intrinsics_ref.pdf

Kyrill


These map down to vrintx.f32 NEON instructions (d and q forms). We
already had builtins defined for them, just the intrinsics were not
wired up to them properly.

Tested arm-none-eabi

Ok for trunk?

2014-09-23  Kyrylo Tkachov  kyrylo.tkac...@arm.com

   * config/arm/arm_neon.h (vrndqn_f32): Rename to...
   (vrndnq_f32): ... this.
   (vrndqa_f32): Rename to...
   (vrndaq_f32): ... this.
   (vrndqp_f32): Rename to...
   (vrndpq_f32): ... this.
   (vrndqm_f32): Rename to...
   (vrndmq_f32): ... this.
   (vrndx_f32): New intrinsic.
   (vrndxq_f32): Likewise.

2014-09-23  Kyrylo Tkachov  kyrylo.tkac...@arm.com

   * gcc.target/arm/simd/neon-vrndx_f32_1.c: New test.
   * gcc.target/arm/simd/neon-vrndxq_f32_1.c: Likewise.
   * gcc.target/arm/neon/vrndqaf32.c: Rename to...
   * gcc.target/arm/neon/vrndaqf32.c: ... This. Update intrinsic names.
   * gcc.target/arm/neon/vrndqmf32.c: Rename to...
   * gcc.target/arm/neon/vrndmqf32.c: ... This. Update intrinsic names.
   * gcc.target/arm/neon/vrndqnf32.c: Rename to...
   * gcc.target/arm/neon/vrndnqf32.c: ... This. Update intrinsic names.
   * gcc.target/arm/neon/vrndqpf32.c: Rename to...
   * gcc.target/arm/neon/vrndpqf32.c: ... This. Update intrinsic names.

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-10 Thread Marek Polacek

On Tue, Dec 09, 2014 at 08:26:56PM +0100, Marc Glisse wrote:
 @@ -426,7 +426,8 @@ negate_expr_p (tree t)
 
 case VECTOR_CST:
   {
 -if (FLOAT_TYPE_P (TREE_TYPE (type)) || TYPE_OVERFLOW_WRAPS (type))
 +if (FLOAT_TYPE_P (TREE_TYPE (type))
 +|| (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)))
return true;
 
 type is a vector type, so INTEGRAL_TYPE_P (type) is always false I think.
 
I guess that's true.

 
  int count = TYPE_VECTOR_SUBPARTS (type), i;
 @@ -558,7 +559,8 @@ fold_negate_expr (location_t loc, tree t)
 case INTEGER_CST:
   tem = fold_negate_const (t, type);
   if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
 -  || (!TYPE_OVERFLOW_TRAPS (type)
 +  || (INTEGRAL_TYPE_P (type)
 
 Can that be false for an INTEGER_CST?
 
Seems that I was too eager here adding checks.  (Or maybe I really saw
an ICE here.)

 @@ -10074,7 +10085,8 @@ fold_binary_loc (location_t loc,
/* Reassociate (plus (plus (mult) (foo)) (mult)) as
   (plus (plus (mult) (mult)) (foo)) so that we can
   take advantage of the factoring cases below.  */
 -  if (TYPE_OVERFLOW_WRAPS (type)
 +  if (INTEGRAL_TYPE_P (type)
 +      && TYPE_OVERFLOW_WRAPS (type)
       && (((TREE_CODE (arg0) == PLUS_EXPR
             || TREE_CODE (arg0) == MINUS_EXPR)
            && TREE_CODE (arg1) == MULT_EXPR)
 
 Is there a particular reason to disable this for vectors? Note that we are
 already in a !FLOAT_TYPE_P block.

 Again, why disable this for vectors?
 (Btw tree_unary_nonnegative_warnv_p seems wrong for ABS of an integer vector)
 
 I am stopping here. I think by default you should allow integer vectors.
 A new macro that accepts both scalar and vector integer types would be
 helpful (like FLOAT_TYPE_P).

Yeah, let me prepare another version.  Thanks,

Marek
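
For readers following the review: below is a minimal sketch of the kind of predicate Marc suggests, one that accepts both scalar integer types and vectors of integers. It uses a toy type model rather than GCC trees; `sketch_type`, `sketch_integral_type_p` and `sketch_any_integral_type_p` are hypothetical names invented for this illustration and are not the macro that the follow-up patch adds.

```cpp
// Toy stand-in for a type node; this is NOT GCC's tree representation.
enum sketch_code { SK_INTEGER, SK_REAL, SK_VECTOR };

struct sketch_type
{
  sketch_code code;
  const sketch_type *element;  // element type for vectors, otherwise null
};

// Scalar-only check, playing the role INTEGRAL_TYPE_P plays in the thread.
inline bool
sketch_integral_type_p (const sketch_type *t)
{
  return t->code == SK_INTEGER;
}

// Accepts a scalar integer type or a vector whose elements are integers.
inline bool
sketch_any_integral_type_p (const sketch_type *t)
{
  return sketch_integral_type_p (t)
	 || (t->code == SK_VECTOR
	     && t->element != 0
	     && sketch_integral_type_p (t->element));
}
```

With a predicate of this shape, a guard written as "scalar integer and overflow wraps" no longer silently disables the transformation for integer vectors, which is the concern raised in the review.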

Re: [patch] Fix ICE on unaligned record field

2014-12-10 Thread Eric Botcazou

 I suppose that could be done by something like the following, which I
 have tested only very mildly so far, in particular I have not double
 checked that get_inner_reference is cfun-agnostic.

Thanks, this works fine on the testcase and I believe that get_inner_reference 
is indeed cfun-agnostic (for example it's called from front-ends).

 2014-12-03  Martin Jambor  mjam...@suse.cz
 
   * tree-sra.c (ipa_sra_check_caller_data): New type.
   (has_caller_p): Removed.
   (ipa_sra_check_caller): New function.
   (ipa_sra_preliminary_function_checks): Use it.

If Richard and you think it's the way to go, then fine by me.

-- 
Eric Botcazou

[PATCH] IPA ICF: refactoring + fix for PR ipa/63569

2014-12-10 Thread Martin Liška


Hello.

As suggested by Richard, I split compare_operand into several functions, each
handling a specific kind of comparison. Apart from that, I added a fast check of
the volatility flag, which fixes the miscompilation mentioned in PR63569.

The patch bootstraps on x86_64-linux-pc without any regressions, and I was
able to build Firefox with LTO.

Ready for trunk?
Thanks,
Martin
From 773308af2d2f93a3fca17f3c07030ec9762accc7 Mon Sep 17 00:00:00 2001
From: mliska mli...@suse.cz
Date: Wed, 10 Dec 2014 12:56:48 +0100
Subject: [PATCH] IPA ICF: compare_operand is split to multiple functions.

gcc/testsuite/ChangeLog:

2014-12-10  Martin Liska  mli...@suse.cz

	* gcc.dg/ipa/pr63569.c: New test.

gcc/ChangeLog:

2014-12-10  Martin Liska  mli...@suse.cz

	PR ipa/63569
	* ipa-icf-gimple.c (func_checker::compare_ssa_name): More complex
	comparison is moved to this function.
	(func_checker::compare_memory_operand): New function is responsible
	for comparison of a given memory operands.
	(func_checker::compare_operand): Global switch is reduced and more
	specific comparison functions are called.
	(func_checker::compare_cst_or_decl): New function compares declarations
	and constant types.
	* ipa-icf-gimple.h: Declaration of new function is added.
---
 gcc/ipa-icf-gimple.c   | 271 -
 gcc/ipa-icf-gimple.h   |   9 +-
 gcc/testsuite/gcc.dg/ipa/pr63569.c |  32 +
 3 files changed, 190 insertions(+), 122 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr63569.c

diff --git a/gcc/ipa-icf-gimple.c b/gcc/ipa-icf-gimple.c
index 8f2a438..4ee343d 100644
--- a/gcc/ipa-icf-gimple.c
+++ b/gcc/ipa-icf-gimple.c
@@ -110,6 +110,9 @@ func_checker::~func_checker ()
 bool
 func_checker::compare_ssa_name (tree t1, tree t2)
 {
+  gcc_assert (TREE_CODE (t1) == SSA_NAME);
+  gcc_assert (TREE_CODE (t2) == SSA_NAME);
+
   unsigned i1 = SSA_NAME_VERSION (t1);
   unsigned i2 = SSA_NAME_VERSION (t2);
 
@@ -123,6 +126,20 @@ func_checker::compare_ssa_name (tree t1, tree t2)
   else if (m_target_ssa_names[i2] != (int) i1)
 return false;
 
+  if (SSA_NAME_IS_DEFAULT_DEF (t1))
+{
+  tree b1 = SSA_NAME_VAR (t1);
+  tree b2 = SSA_NAME_VAR (t2);
+
+  if (b1 == NULL && b2 == NULL)
+	return true;
+
+  if (b1 == NULL || b2 == NULL || TREE_CODE (b1) != TREE_CODE (b2))
+	return return_false ();
+
+  return compare_cst_or_decl (b1, b2);
+}
+
   return true;
 }
 
@@ -178,9 +195,10 @@ func_checker::compare_decl (tree t1, tree t2)
 }
 
 /* Return true if types are compatible from perspective of ICF.  */
-bool func_checker::compatible_types_p (tree t1, tree t2,
-   bool compare_polymorphic,
-   bool first_argument)
+bool
+func_checker::compatible_types_p (tree t1, tree t2,
+  bool compare_polymorphic,
+  bool first_argument)
 {
   if (TREE_CODE (t1) != TREE_CODE (t2))
     return return_false_with_msg ("different tree types");
@@ -211,76 +229,17 @@ bool func_checker::compatible_types_p (tree t1, tree t2,
   return true;
 }
 
-/* Function responsible for comparison of handled components T1 and T2.
-   If these components, from functions FUNC1 and FUNC2, are equal, true
-   is returned.  */
+/* Function compare for equality given memory operands T1 and T2.  */
 
 bool
-func_checker::compare_operand (tree t1, tree t2)
+func_checker::compare_memory_operand (tree t1, tree t2)
 {
-  tree base1, base2, x1, x2, y1, y2, z1, z2;
-  HOST_WIDE_INT offset1 = 0, offset2 = 0;
-  bool ret;
-
-  if (!t1 && !t2)
-return true;
-  else if (!t1 || !t2)
-return false;
-
-  tree tt1 = TREE_TYPE (t1);
-  tree tt2 = TREE_TYPE (t2);
-
-  if (!func_checker::compatible_types_p (tt1, tt2))
-return false;
-
-  base1 = get_addr_base_and_unit_offset (t1, &offset1);
-  base2 = get_addr_base_and_unit_offset (t2, &offset2);
-
-  if (base1 && base2)
-{
-  if (offset1 != offset2)
-	return return_false_with_msg ("base offsets are different");
-
-  t1 = base1;
-  t2 = base2;
-}
+  bool ret = false;
 
-  if (TREE_CODE (t1) != TREE_CODE (t2))
-return return_false ();
+  tree x1, x2, y1, y2;
 
   switch (TREE_CODE (t1))
 {
-case CONSTRUCTOR:
-  {
-	unsigned length1 = vec_safe_length (CONSTRUCTOR_ELTS (t1));
-	unsigned length2 = vec_safe_length (CONSTRUCTOR_ELTS (t2));
-
-	if (length1 != length2)
-	  return return_false ();
-
-	for (unsigned i = 0; i < length1; i++)
-	  if (!compare_operand (CONSTRUCTOR_ELT (t1, i)->value,
-				CONSTRUCTOR_ELT (t2, i)->value))
-	    return return_false();
-
-	return true;
-  }
-case ARRAY_REF:
-case ARRAY_RANGE_REF:
-  x1 = TREE_OPERAND (t1, 0);
-  x2 = TREE_OPERAND (t2, 0);
-  y1 = TREE_OPERAND (t1, 1);
-  y2 = TREE_OPERAND (t2, 1);
-
-  if (!compare_operand (array_ref_low_bound (t1),
-			array_ref_low_bound (t2)))
-	return return_false_with_msg ("");
-  if (!compare_operand (array_ref_element_size (t1),
-			array_ref_element_size (t2)))
-	return return_false_with_msg ("");
-  if

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-10 Thread Manuel López-Ibáñez

On 10 December 2014 at 12:10, Dodji Seketeli do...@redhat.com wrote:

 Note that -fmessage-length= applies to the error message (wraps) _and_
 the caret diagnostic (truncates) while the COLUMNS variable _only_
 applies to the caret diagnostic. (BTW: The documentation currently
 does not mention COLUMNS.)

 I guess we should adjust the documentation to mention COLUMNS.

 Manuel, was there a particular reason to avoid mentioning the COLUMNS
 environment variable in the documentation?

Not that I remember. Perhaps the documentation should say something
like: The line is truncated to fit into n characters only if the
option -fmessage-length=n is given, or if the output is a TTY and the
COLUMNS environment variable is set.

 (b) Should ioctl be always used or only for Fortran?

 I'd go for using it in the common diagnostics framework, unless there is
 a sound motivated reason.  Manuel, do you remember why we didn't query the
 TIOCGWINSZ ioctl property to get the terminal size when that capability
 was available?

I was not aware this possibility even existed.

 Comments?

Note that Fortran has this:

#ifdef GWINSZ_IN_SYS_IOCTL
# include <sys/ioctl.h>
#endif

Not sure if this is needed for diagnostics.c or whether it needs some
configure magick.

I also agree with FX that the function should be named something like
get_terminal_width().

In fact, I would argue that the Fortran version should be a wrapper
around this one to make the output consistent between the new Fortran
diagnostics and the old ones (*_1 variants) while the transition is in
progress, even if that means changing the current ordering for
Fortran. So far, there does not seem to be any reason to prefer one
ordering over the other, but whatever it is chosen, it would be better
to be consistent. Thus something like:

Index: error.c
===
--- error.c(revision 218410)
+++ error.c(working copy)
@@ -78,7 +78,7 @@
 /* Determine terminal width (for trimming source lines in output).  */

 static int
-get_terminal_width (void)
+gfc_get_terminal_width (void)
 {
   /* Only limit the width if we're outputting to a terminal.  */
 #ifdef HAVE_UNISTD_H
@@ -85,26 +85,7 @@
   if (!isatty (STDERR_FILENO))
 return INT_MAX;
 #endif
-
-  /* Method #1: Use ioctl (not available on all systems).  */
-#ifdef TIOCGWINSZ
-  struct winsize w;
-  w.ws_col = 0;
-  if (ioctl (0, TIOCGWINSZ, &w) == 0 && w.ws_col > 0)
-return w.ws_col;
-#endif
-
-  /* Method #2: Query environment variable $COLUMNS.  */
-  const char *p = getenv ("COLUMNS");
-  if (p)
-{
-  int value = atoi (p);
-  if (value > 0)
-return value;
-}
-
-  /* If both fail, use reasonable default.  */
-  return 80;
+  return get_terminal_width ();
 }


@@ -113,7 +94,7 @@
 void
 gfc_error_init_1 (void)
 {
-  terminal_width = get_terminal_width ();
+  terminal_width = gfc_get_terminal_width ();
   errors = 0;
   warnings = 0;
   buffer_flag = 0;


The other conflict is that Fortran's default terminal_width if
everything fails is 80, whereas the common diagnostics defaults to
INT_MAX. If Fortran devs do not mind this new default (which is
already in place for all diagnostics using the new functions), then
the wrapper will make the output consistent. If they really want to
default to 80, then the Fortran code can still use the wrapper but do
"if (terminal_width == INT_MAX) terminal_width = 80" and call
diagnostic_set_caret_max_width(, terminal_width) to set the same value
for the new and old functions.

Finally, there is another use of getenv ("COLUMNS") in opts.c that
could use this new function:

Index: opts.c
===
--- opts.c(revision 218410)
+++ opts.c(working copy)
@@ -1231,18 +1231,8 @@
  the desired maximum width of the output.  */
   if (opts->x_help_columns == 0)
 {
-  const char *p;
-
-  p = getenv ("COLUMNS");
-  if (p != NULL)
-{
-  int value = atoi (p);
-
-  if (value > 0)
-opts->x_help_columns = value;
-}
-
-  if (opts->x_help_columns == 0)
+  opts->x_help_columns = get_terminal_width ();
+  if (opts->x_help_columns == INT_MAX)
 /* Use a reasonable default.  */
 opts->x_help_columns = 80;
 }
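
To make the ordering discussed in this thread concrete, here is a minimal sketch of a shared width query in which $COLUMNS takes precedence over the TIOCGWINSZ ioctl and INT_MAX is the fallback. It is an illustration only, not the patch under review: the `sketch_*` names are hypothetical, it assumes a POSIX system where `<sys/ioctl.h>` provides TIOCGWINSZ, and it takes the file descriptor as a parameter so that it can be exercised without a real terminal.

```cpp
#include <climits>
#include <cstdlib>
#include <sys/ioctl.h>
#include <unistd.h>

// Width from $COLUMNS, or 0 if the variable is unset or not positive.
static int
sketch_width_from_env ()
{
  const char *p = std::getenv ("COLUMNS");
  if (p == NULL)
    return 0;
  int value = std::atoi (p);
  return value > 0 ? value : 0;
}

// Width from the TIOCGWINSZ ioctl, or 0 if it is unavailable or fails.
static int
sketch_width_from_ioctl (int fd)
{
#ifdef TIOCGWINSZ
  struct winsize w;
  w.ws_col = 0;
  if (ioctl (fd, TIOCGWINSZ, &w) == 0 && w.ws_col > 0)
    return w.ws_col;
#else
  (void) fd;
#endif
  return 0;
}

// Only limit the width for terminals; $COLUMNS wins over the ioctl.
int
sketch_get_terminal_width (int fd)
{
  if (!isatty (fd))
    return INT_MAX;
  int width = sketch_width_from_env ();
  if (width == 0)
    width = sketch_width_from_ioctl (fd);
  return width > 0 ? width : INT_MAX;
}
```

A caller that wants the old Fortran default could map INT_MAX to 80 after the call, as described above for the wrapper.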

Re: [PATCH 8/9] Negative numbers added for sreal class.

2014-12-10 Thread Martin Liška


On 12/09/2014 03:15 PM, Richard Biener wrote:

On Mon, Dec 8, 2014 at 5:52 PM, Martin Liška mli...@suse.cz wrote:

On 11/28/2014 10:32 AM, Richard Biener wrote:


On Thu, Nov 27, 2014 at 6:08 PM, Martin Liška mli...@suse.cz wrote:


On 11/21/2014 04:21 PM, Martin Liška wrote:



On 11/21/2014 04:02 PM, Richard Biener wrote:



On Fri, Nov 21, 2014 at 3:39 PM, Martin Liška mli...@suse.cz wrote:


Hello.

Ok, this is simplified, one can use sreal a = 12345 and it works ;)


that's a  new API, right?  There is no max () and I think that using
LONG_MIN here is asking for trouble (host dependence).  The
comment in the file says the max should be
sreal (SREAL_MAX_SIG, SREAL_MAX_EXP) and the min
sreal (-SREAL_MAX_SIG, SREAL_MAX_EXP)?



Sure, sreal can store much bigger(smaller) numbers :)


Where do you need sreal::to_double?  The host shouldn't perform
double calculations so it can be only for dumping?  In which case
the user should have used sreal::dump (), maybe with extra
arguments.



That new function was requested by Honza, only for debugging purposes.
I agree that dump should do this kind of job.

If no other problem, I will run tests once more and commit it.
Thanks,
Martin




-#define SREAL_MAX_EXP (INT_MAX / 4)
+#define SREAL_MAX_EXP (INT_MAX / 8)

this change doesn't look necessary anymore?

Btw, it's also odd that...

#define SREAL_PART_BITS 32
...
#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
#define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)

thus all m_sig values fit in 32bits but we still use a uint64_t m_sig
...
(the implementation uses 64bit for internal computations, but still
the storage is wasteful?)

Of course the way normalize() works requires that storage to be
64bits to store unnormalized values.

I'd say ok with the SREAL_MAX_EXP change reverted.



Hi.

You are right, this change was done because I used one bit for
m_negative
(bitfield), not needed any more.

Final version attached.

Thank you,
Martin


Thanks,
Richard.





Otherwise looks good to me and sorry for not noticing the above
earlier.

Thanks,
Richard.


Thanks,
Martin



  };

  extern void debug (sreal &ref);
@@ -76,12 +133,12 @@ inline sreal &operator+= (sreal &a, const sreal &b)

  inline sreal &operator-= (sreal &a, const sreal &b)
  {
-return a = a - b;
+  return a = a - b;
  }

  inline sreal &operator/= (sreal &a, const sreal &b)
  {
-return a = a / b;
+  return a = a / b;
  }

  inline sreal &operator*= (sreal &a, const sreal &b)
--
2.1.2










Hello.

After IRC discussions, I decided to give sreal another refactoring where I
use int64_t for m_sig.

This approach looks much easier and more straightforward. I would like to
ask folks for comments.



I think you want to exclude LONG_MIN (how do you know that LONG_MIN
is min(int64_t)?) as a valid value for m_sig - after all SREAL_MIN_SIG
makes it symmetric.  Or simply use abs_hwi at



Hello.

I decided to use abs_hwi.


That will ICE if you do

   sreal x (- __LONG_MAX__ - 1);

maybe that's the only case though.

  sreal::normalize ()
  {
+  int64_t s = m_sig < 0 ? -1 : 1;
+  HOST_WIDE_INT sig = abs_hwi (m_sig);
+
if (m_sig == 0)
...
  }
+
+  m_sig = s * sig;
  }

it's a bit awkward to strip the sign and then put it back on this way.  Also
now using a signed 'sig' where it was unsigned before.  And keeping
the first test using m_sig instead of sig.

I'd simply have used 'unsigned HOST_WIDE_INT sig = absu_hwi (m_sig);'
instead.

The rest of the patch is ok with the above change.

Thanks,
Richard.
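
As an aside on why an unsigned absolute value avoids the problem with the most negative input: the sketch below converts to an unsigned type first and negates there, so no signed overflow can occur. The helper name `sketch_absu` is hypothetical; it only illustrates the idea behind the absu_hwi suggestion and is not a copy of GCC's implementation.

```cpp
#include <cstdint>

// Absolute value of a signed 64-bit integer, returned as unsigned.
// The conversion to uint64_t is well defined (modulo 2^64), and the
// negation happens in unsigned arithmetic, so the most negative value
// maps to 2^63 instead of overflowing.
inline std::uint64_t
sketch_absu (std::int64_t x)
{
  return x < 0
	 ? std::uint64_t (0) - static_cast<std::uint64_t> (x)
	 : static_cast<std::uint64_t> (x);
}
```

A signed abs on the same input has no representable result, which is the case the "sreal x (- __LONG_MAX__ - 1);" example above is pointing at.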


Hello.

You are right that an unsigned type would be more suitable.
I am sending the final version that I plan to commit.

Thanks,
Martin






+  int64_t s = m_sig < 0 ? -1 : 1;
+  uint64_t sig = m_sig == LONG_MIN ? LONG_MAX : std::abs (m_sig);

-#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 1))
-#define SREAL_MAX_SIG (((uint64_t) 1 << SREAL_PART_BITS) - 1)
+#define SREAL_MIN_SIG ((uint64_t) 1 << (SREAL_PART_BITS - 2))
+#define SREAL_MAX_SIG (((uint64_t) 1 << (SREAL_PART_BITS - 1)) - 1)

shouldn't this also be -2 in the last line?



It's correct: in the first line, I s/'SREAL_PART_BITS - 1'/'SREAL_PART_BITS - 2'/
and the second one is also decremented: s/'SREAL_PART_BITS'/'SREAL_PART_BITS - 1'/.



That is, you effectively use the MSB as a sign bit?



Yes. It uses signed integer with MSB as a sign bit.
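
The arithmetic behind that answer can be checked with a small sketch. It assumes SREAL_PART_BITS is 32, as quoted earlier in the thread, and that the garbled operator in the definitions is a left shift; the `SKETCH_*` names are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

const int SKETCH_PART_BITS = 32;

// Unsigned layout before the change: significands in [2^31, 2^32 - 1].
const std::uint64_t SKETCH_OLD_MIN_SIG
  = (std::uint64_t) 1 << (SKETCH_PART_BITS - 1);
const std::uint64_t SKETCH_OLD_MAX_SIG
  = ((std::uint64_t) 1 << SKETCH_PART_BITS) - 1;

// Signed layout after the change: one bit is given up for the sign,
// so magnitudes lie in [2^30, 2^31 - 1].
const std::uint64_t SKETCH_NEW_MIN_SIG
  = (std::uint64_t) 1 << (SKETCH_PART_BITS - 2);
const std::uint64_t SKETCH_NEW_MAX_SIG
  = ((std::uint64_t) 1 << (SKETCH_PART_BITS - 1)) - 1;
```

Both bounds move down by exactly one bit, which is why the second definition uses "- 1" rather than "- 2": it started from SREAL_PART_BITS, not from SREAL_PART_BITS - 1.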



Btw, a further change would be to make m_sig 'signed int', matching
the real storage requirements according to SREAL_PART_BITS.
This would of course still require temporaries used for computation
to be 64bits and it would require normalization to work on the
temporaries.  But then we'd get down to 8 bytes storage ...



Agreed, we can shrink the size in the future.

I've been running tests, I hope this implementation is much nicer than
having bool m_negative. Do you think it is acceptable for trunk?

Thanks,
Martin




Richard.



I am able to run profiled bootstrap on x86_64-linux-pc and ppc64-linux-pc
and new regression is

Re: [PATCH fortran/diagnostics] Move gfc_error (buffered) to common diagnostics

2014-12-10 Thread Dodji Seketeli

Hello Manuel,

Manuel López-Ibáñez lopeziba...@gmail.com writes:

 New version of the patch. Tobias noticed several problems with the
 previous version:

 * Due to the use of placement-new for the buffered output_buffers
 pp_warning_buffer and pp_error_buffer, the pretty-printer destructor
 may end up trying to free something that it can't. Fixed here by not
 using placement new.

So:

[...]


  /* Report the number of warnings and errors that occurred to the caller.  */
  
 @@ -1525,11 +1625,14 @@ gfc_diagnostics_init (void)
  {
diagnostic_starter (global_dc) = gfc_diagnostic_starter;
diagnostic_finalizer (global_dc) = gfc_diagnostic_finalizer;
diagnostic_format_decoder (global_dc) = gfc_format_decoder;
global_dc->caret_char = '^';
 -  new (pp_warning_buffer) output_buffer ();
 +  pp_warning_buffer = new output_buffer ();

When I look at the code of the destructor of the pretty_printer type
(pretty_printer::~pretty_printer) in gcc/pretty-print.c, I see that the
memory for the output buffer is de-allocated using XDELETE.  So I think
the memory for the output buffer should be allocated using XNEW and the
output_buffer type should be instantiated using a placement new operator
that uses that XNEWed memory.

Ultimately, I should sit down and make sure the memory management of the
pretty_printer type does away with this surprising placement new
business for good.

 +  pp_warning_buffer->flush_p = false;
 +  pp_error_buffer = new output_buffer ();
 +  pp_error_buffer->flush_p = false;
  }

Cheers,

-- 
Dodji

[PATCH][AARCH64]Fix AArch64 CLZ_DEFINED_AT_ZERO and CTZ_DEFINED_AT_ZERO definition.

2014-12-10 Thread Renlin Li


Hi all,

This patch updates the CTZ_DEFINED_VALUE_AT_ZERO definition to support
more modes. In addition, those two macros should both return 2 in the
aarch64 back-end.


Here are the explanations from GCC documentation:

CLZ_DEFINED_VALUE_AT_ZERO (mode, value)
CTZ_DEFINED_VALUE_AT_ZERO (mode, value)
A C expression that indicates whether the architecture defines a value
for @code{clz} or @code{ctz} with a zero operand.
A result of 0 indicates the value is undefined.
If the value is defined for only the RTL expression, the macro should
evaluate to 1; if the value applies also to the corresponding optab
entry (which is normally the case if it expands directly into
the corresponding RTL), then the macro should evaluate to 2.
In the cases where the value is defined, @var{value} should be set to
this value.
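
As an illustration of the convention described above, here is a minimal sketch of how a comma expression lets such a macro both store the defined value and evaluate to 2. The mode enum, `sketch_mode_bitsize` and the `SKETCH_*` macros are hypothetical stand-ins, not the aarch64.h definitions, and only two modes are modelled.

```cpp
// Hypothetical stand-ins for machine modes and their unit bit sizes.
enum sketch_mode { SKETCH_SImode, SKETCH_DImode };

inline int
sketch_mode_bitsize (sketch_mode m)
{
  return m == SKETCH_SImode ? 32 : 64;
}

// Without the comma: the expression evaluates to the stored value
// itself (32 or 64), which is not one of the documented results 0, 1, 2.
#define SKETCH_OLD_CLZ_AT_ZERO(MODE, VALUE) \
  ((VALUE) = sketch_mode_bitsize (MODE))

// With the comma: VALUE is still set, and the whole expression
// evaluates to 2, meaning the value also applies to the optab entry.
#define SKETCH_NEW_CLZ_AT_ZERO(MODE, VALUE) \
  ((VALUE) = sketch_mode_bitsize (MODE), 2)
```

A caller that tests the result against 2 would therefore not see the old form as "defined for the optab", even though it did store a value.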


aarch64-none-elf has been tested on the model, no new issues.
Okay for trunk?

Regards,
Renlin Li

gcc/ChangeLog:

2014-12-10 Renlin Li renlin...@arm.com

* config/aarch64/aarch64.h (CLZ_DEFINED_VALUE_AT_ZERO): Make it
return 2.
(CTZ_DEFINED_VALUE_AT_ZERO): Update to support more modes.

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index bbe33a9..3bc416a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -804,9 +804,9 @@ do {	 \
: reverse_condition (CODE))
 
 #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
-  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE))
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
 #define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
-  ((VALUE) = ((MODE) == SImode ? 32 : 64), 2)
+  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
 
 #define INCOMING_RETURN_ADDR_RTX gen_rtx_REG (Pmode, LR_REGNUM)

Re: [PATCH fortran/diagnostics] Move gfc_error (buffered) to common diagnostics

2014-12-10 Thread Manuel López-Ibáñez

On 10 December 2014 at 13:54, Dodji Seketeli do...@redhat.com wrote:
  /* Report the number of warnings and errors that occurred to the caller.  */

 @@ -1525,11 +1625,14 @@ gfc_diagnostics_init (void)
  {
diagnostic_starter (global_dc) = gfc_diagnostic_starter;
diagnostic_finalizer (global_dc) = gfc_diagnostic_finalizer;
diagnostic_format_decoder (global_dc) = gfc_format_decoder;
global_dc->caret_char = '^';
 -  new (pp_warning_buffer) output_buffer ();
 +  pp_warning_buffer = new output_buffer ();

 When I look at the code of the destructor of the pretty_printer type
 (pretty_printer::~pretty_printer) in gcc/pretty-print.c, I see that the
 memory for the output buffer is de-allocated using XDELETE.  So I think
 the memory for the output buffer should be allocated using XNEW and the
 output_buffer type should be instantiated using a placement new operator
 that uses that XNEWed memory.

The reason we use this placement-new stuff is precisely the use of
XNEW and XDELETE, since those do not call the cons/des-tructors.
However, a lot of code in GCC is directly using new/delete already. I
don't see any reason to not do so in the pretty-printer/diagnostics
code. In fact, David Malcolm recently did a similar change for
tree-pretty-print.c:
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03213.html

But I agree that using XNEW and placement-new is probably safer for
now. I'll try that and submit a new version.

Cheers,

Manuel.
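
For readers unfamiliar with the pattern under discussion, here is a minimal sketch of pairing raw allocation with placement new, and an explicit destructor call with raw deallocation. It uses malloc/free as stand-ins for XNEW/XDELETE and a toy `sketch_buffer` type in place of output_buffer; all names are hypothetical and nothing here is taken from pretty-print.c.

```cpp
#include <cstdlib>
#include <new>
#include <string>

// Toy type with a non-trivial constructor and destructor.
struct sketch_buffer
{
  std::string text;
  bool flush_p;
  sketch_buffer () : text (), flush_p (true) {}
};

// Raw allocation does not run the constructor, so construct in place.
sketch_buffer *
sketch_buffer_create ()
{
  void *raw = std::malloc (sizeof (sketch_buffer));
  if (raw == NULL)
    return NULL;
  return new (raw) sketch_buffer ();
}

// Raw deallocation does not run the destructor, so call it explicitly
// before handing the memory back.
void
sketch_buffer_destroy (sketch_buffer *b)
{
  if (b == NULL)
    return;
  b->~sketch_buffer ();
  std::free (b);
}
```

The point of keeping the two halves matched is that an object created with plain new must not be released through a raw free-style path, and vice versa, which is the mismatch Dodji is warning about.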

Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-12-10 Thread Richard Biener

On Tue, Dec 9, 2014 at 4:52 PM, Martin Liška mli...@suse.cz wrote:
 On 11/21/2014 01:23 PM, Richard Biener wrote:

 On Fri, Nov 21, 2014 at 12:52 PM, Martin Liška mli...@suse.cz wrote:

 On 11/20/2014 05:41 PM, Richard Biener wrote:


 On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:


 Hello.

 The following patch fixes an ICE in IPA ICF. The problem was that the number
 of non-debug statements in a BB can change (for instance by IPA split), so
 the number is now recomputed.



 Huh, so can it get different for both candidates?  I think the stmt compare
 loop should be terminated on gsi_end_p of either iterator and return
 false for any remaining non-debug-stmts on the other.

 Thus, not walk all stmts twice here.



 Hello.

 Sorry for the previous patch, you are right that it can be fixed in a cleaner
 way. Please take a look at the attached patch.


 As IPA split is run early I don't see how it should affect a real IPA
 pass though?





 Sorry for non precise information, the problematic BB is changed here:
 #0  gsi_split_seq_before (i=0x7fffd550, pnew_seq=0x7fffd528) at
 ../../gcc/gimple-iterator.c:429
 #1  0x00b95a2a in gimple_split_block (bb=0x76c41548,
 stmt=0x0)
 at ../../gcc/tree-cfg.c:5707
 #2  0x007563cf in split_block (bb=0x76c41548, i=i@entry=0x0)
 at
 ../../gcc/cfghooks.c:508
 #3  0x00756b44 in split_block_after_labels (bb=<optimized out>)
 at
 ../../gcc/cfghooks.c:549
 #4  make_forwarder_block (bb=<optimized out>,
 redirect_edge_p=redirect_edge_p@entry=0x75d4e0
 <mfb_keep_just(edge_def*)>,
 new_bb_cbk=new_bb_cbk@entry=0x0) at ../../gcc/cfghooks.c:842
 #5  0x0076085a in create_preheader (loop=0x76d56948,
 flags=<optimized out>) at ../../gcc/cfgloopmanip.c:1563
 #6  0x00760aea in create_preheaders (flags=1) at
 ../../gcc/cfgloopmanip.c:1613
 #7  0x009bc6b0 in apply_loop_flags (flags=15) at
 ../../gcc/loop-init.c:75
 #8  0x009bc7d3 in loop_optimizer_init (flags=15) at
 ../../gcc/loop-init.c:136
 #9  0x00957914 in estimate_function_body_sizes
 (node=0x76c47620,
 early=false) at ../../gcc/ipa-inline-analysis.c:2480
 #10 0x0095948b in compute_inline_parameters (node=0x76c47620,
 early=false) at ../../gcc/ipa-inline-analysis.c:2907
 #11 0x0095bd88 in inline_analyze_function (node=0x76c47620)
 at
 ../../gcc/ipa-inline-analysis.c:3994
 #12 0x0095bed3 in inline_generate_summary () at
 ../../gcc/ipa-inline-analysis.c:4045
 #13 0x00a70b71 in execute_ipa_summary_passes (ipa_pass=0x1dcb9e0)
 at


 So inline_summary is generated after IPA-ICF does its job?

 But the bug is obviously that an IPA analysis phase does a code transform
 (here initializes loops without AVOID_CFG_MANIPULATIONS).
 Honza - if that is really needed then I think we should make sure
 loops are initialized at the start of the IPA analysis phase, not randomly
 in between.

 Thanks,
 Richard.


 Hello.

 Even though the root of the problem is hidden somewhere in loop creation, I
 would like to apply the patch, which makes the iteration over non-debug
 gimple statements clearer.

 What do you think Richard?

Works for me - but we have to address the underlying issue.

Richard.

 Thanks,
 Martin



 ../../gcc/passes.c:2137
 #14 0x00777a15 in ipa_passes () at ../../gcc/cgraphunit.c:2074
 #15 symbol_table::compile (this=this@entry=0x76c3a000) at
 ../../gcc/cgraphunit.c:2187
 #16 0x00778bcd in symbol_table::finalize_compilation_unit
 (this=0x76c3a000) at ../../gcc/cgraphunit.c:2340
 #17 0x006580ee in c_write_global_declarations () at
 ../../gcc/c/c-decl.c:10777
 #18 0x00b5bb8b in compile_file () at ../../gcc/toplev.c:584
 #19 0x00b5def1 in do_compile () at ../../gcc/toplev.c:2041
 #20 0x00b5e0fa in toplev::main (this=0x7fffdc9f, argc=20,
 argv=0x7fffdd98) at ../../gcc/toplev.c:2138
 #21 0x0063f1d9 in main (argc=20, argv=0x7fffdd98) at
 ../../gcc/main.c:38


 Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
 Ready for trunk?


 Thanks,
 Martin


 Thanks,
 Richard.

 Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
 Ready for trunk?

 Thanks,
 Martin
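
Richard's suggestion earlier in this thread (stop the comparison loop when either iterator reaches its end, and fail if the other side still has non-debug statements) can be sketched as below. This is a toy model with arrays instead of GIMPLE iterators; `sketch_stmt`, `sketch_skip_debug` and `sketch_compare_bb` are hypothetical names, and comparing an integer code stands in for the real per-statement comparison.

```cpp
#include <cstddef>

// Toy statement: a code to compare plus a flag marking debug statements.
struct sketch_stmt
{
  int code;
  bool is_debug;
};

// Advance I past any debug statements, stopping at N.
static std::size_t
sketch_skip_debug (const sketch_stmt *s, std::size_t n, std::size_t i)
{
  while (i < n && s[i].is_debug)
    ++i;
  return i;
}

// Walk both sequences in lockstep, ignoring debug statements.  The walk
// ends as soon as either side is exhausted; the blocks match only if
// both sides are exhausted at that point.
bool
sketch_compare_bb (const sketch_stmt *a, std::size_t na,
		   const sketch_stmt *b, std::size_t nb)
{
  std::size_t i = sketch_skip_debug (a, na, 0);
  std::size_t j = sketch_skip_debug (b, nb, 0);
  while (i < na && j < nb)
    {
      if (a[i].code != b[j].code)
	return false;
      i = sketch_skip_debug (a, na, i + 1);
      j = sketch_skip_debug (b, nb, j + 1);
    }
  return i == na && j == nb;
}
```

Written this way, each statement is visited once and no precomputed count of non-debug statements is needed, so a count that has gone stale cannot cause an out-of-bounds walk.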

RE: [PATCH] Fix IRA register preferencing

2014-12-10 Thread Wilco Dijkstra

 Jeff Law wrote:
 On 12/09/14 12:21, Wilco Dijkstra wrote:
  With the fix it uses a floating point register as expected. Given a similar
  issue in https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it
  not be better to change the initialization values of reg_pref to illegal
  register classes so this kind of issue can be trivially found with an
  assert? Also would it not be a good idea to have a single register copy
  function that ensures all data is copied?
 But there are other times when you really don't want to copy, say when
 the original had a small class, but the copy can go into a larger class.
 
 I banged my head on this when I was doing similar work on range
 splitting a few years back and ended up recomputing the preferred and
 alternate class information.  That was much better than copying the
 classes.

If recomputing is best does that mean that record_reg_classes should not
give a boost to the preferred class in the 2nd pass? I don't understand
what purpose this has - if the preferred class is from the first pass, it
is already correct, so all it does is boost the preferred class further. 
And if the preferred class is wrong (eg. after live range splitting), it 
will boost the incorrect class even harder, so in the end you never get 
a different class.

 I pondered heuristics to expand the preferred class, but never
 implemented anything IIRC.  A trivial heuristic would be to bump to the
 next larger class if the given class was a singleton, otherwise copy the
 class.
 
 The obvious counter to that heuristic is an allocno that has two ranges
 (or N ranges) where we would prefer a different singleton class for each
 range.  In fact, I'm pretty sure I ran into this kind of situation and
 that led me down the just recompute it path.
 
 I'd hazard a guess that the simple heuristic would do better than what
 we're doing now with GENERAL_REGS though or what you're doing with copying.

From what you're saying, recomputing seems best, and I'd be happy to submit
a patch to remove all the preferred class code from record_reg_classes.

However there is still the possibility the preferred class is queried before
the recomputation happens (I think that is a case Renlin fixed). Either these
should be faulted and fixed by forcing recomputation, or we need to provide a 
correct preferred class. That should be a copy of the original class.

Wilco
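
The "trivial heuristic" Jeff describes (bump to the next larger class if the given class is a singleton, otherwise copy the class) can be sketched as follows. This is a toy model, not IRA code: register classes are plain bitmasks, "next larger" is assumed to mean the smallest strict superset, and every name (`sketch_reg_class`, `sketch_class_for_copy`, the class names in the usage note) is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy register class: a name and a bitmask of member registers.
struct sketch_reg_class
{
  const char *name;
  std::uint32_t regs;
};

// Number of registers in a mask.
static int
sketch_popcount (std::uint32_t x)
{
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

// Preferred class for a copy of a pseudo whose original class is ORIG:
// keep non-singleton classes as they are; widen a singleton class to
// the smallest class that strictly contains it, if there is one.
std::size_t
sketch_class_for_copy (const std::vector<sketch_reg_class> &classes,
		       std::size_t orig)
{
  const std::uint32_t regs = classes[orig].regs;
  if (sketch_popcount (regs) != 1)
    return orig;

  std::size_t best = orig;
  for (std::size_t i = 0; i < classes.size (); ++i)
    {
      const std::uint32_t cand = classes[i].regs;
      const bool strict_superset = (cand & regs) == regs && cand != regs;
      if (!strict_superset)
	continue;
      if (best == orig
	  || sketch_popcount (cand) < sketch_popcount (classes[best].regs))
	best = i;
    }
  return best;
}
```

For example, with classes A = {r0}, D = {r1}, AD = {r0, r1} and GENERAL = {r0..r7}, a copy of a pseudo preferring A would get AD, while one preferring AD keeps AD. As Jeff notes, this does not help an allocno whose ranges each want a different singleton class, which is the case that argues for recomputing instead.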

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-10 Thread Dodji Seketeli

Manuel López-Ibáñez lopeziba...@gmail.com writes:

[...]

 On 10 December 2014 at 12:10, Dodji Seketeli do...@redhat.com wrote:

[...]

 Manuel, was there a particular reason to avoid mentioning the COLUMNS
 environment variable in the documentation?

 Not that I remember. Perhaps the documentation should say something
 like: The line is truncated to fit into n characters only if the
 option -fmessage-length=n is given, or if the output is a TTY and the
 COLUMNS environment variable is set.

Agreed.  Thank you.

 (b) Should ioctl be always used or only for Fortran?

 I'd go for using it in the common diagnostics framework, unless there is
 a sound motivated reason.  Manuel, do you remember why we didn't query the
 TIOCGWINSZ ioctl property to get the terminal size when that capability
 was available?

 I was not aware this possibility even existed.

Ok :-) Let's go for this then.

[...]

 Note that Fortran has this:

 #ifdef GWINSZ_IN_SYS_IOCTL
 # include <sys/ioctl.h>
 #endif

 Not sure if this is needed for diagnostics.c or whether it needs some
 configure magick.

I would guess that it should work to use that same #ifdef
GWINSZ_IN_SYS_IOCTL ... #endif snippet in diagnostics.c because, looking
at gcc/configure.ac, I see that we use the autoconf macro
AC_HEADER_TIOCGWINSZ and the autoconf documentation for that macro
reads:

 -- Macro: AC_HEADER_TIOCGWINSZ
 If the use of `TIOCGWINSZ' requires `sys/ioctl.h', then define
 `GWINSZ_IN_SYS_IOCTL'.  Otherwise `TIOCGWINSZ' can be found in
 `termios.h'.

 Use:

  #ifdef HAVE_TERMIOS_H
  # include <termios.h>
  #endif

  #ifdef GWINSZ_IN_SYS_IOCTL
  # include <sys/ioctl.h>
  #endif

I am not sure why we were not using the termios.h case though.

 I also agree with FX that the function should be named something like
 get_terminal_width().

Agreed as well.

 In fact, I would argue that the Fortran version should be a wrapper
 around this one to make the output consistent between the new Fortran
 diagnostics and the old ones (*_1 variants) while the transition is in
 progress, even if that means changing the current ordering for
 Fortran. So far, there does not seem to be any reason to prefer one
 ordering over the other, but whatever it is chosen, it would be better
 to be consistent.

I think having the environment variable take precedence is better from a
usability standpoint, because it's easier for the user to force the
behaviour she wants by setting an environment variable than by messing
with some system-wide configuration that would change the output of
querying the TIOCGWINSZ property using an ioctl.

So the patch you (Manuel) are proposing looks fine to me, with the
environment variable taking precedence, *if* that is fine for Fortran,
of course.
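
To make the proposed ordering concrete, here is a rough, untested sketch
of what a get_terminal_width() along these lines could look like.  It
assumes a POSIX-like host where TIOCGWINSZ comes from sys/ioctl.h (a real
patch would need the configure-guarded includes discussed above), and the
helper name width_from_env is made up for illustration; it is not taken
from any posted patch.

```c
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/ioctl.h>

/* Hypothetical helper: parse a COLUMNS-style string.
   Returns the width, or 0 if the value is missing or not a
   positive decimal integer.  */
static int
width_from_env (const char *value)
{
  if (value == NULL)
    return 0;

  char *end;
  long n = strtol (value, &end, 10);
  if (end == value || *end != '\0' || n <= 0 || n > INT_MAX)
    return 0;
  return (int) n;
}

/* Sketch only: return the width of the terminal behind FD, or 0 if
   FD is not a terminal or the width cannot be determined.  COLUMNS
   is consulted first, the TIOCGWINSZ ioctl second.  */
int
get_terminal_width (int fd)
{
  if (!isatty (fd))
    return 0;

  int width = width_from_env (getenv ("COLUMNS"));
  if (width > 0)
    return width;

#ifdef TIOCGWINSZ
  struct winsize ws;
  if (ioctl (fd, TIOCGWINSZ, &ws) == 0 && ws.ws_col > 0)
    return ws.ws_col;
#endif

  return 0;
}
```

Returning 0 here stands for "no limit known", so a caller would leave the
message untruncated in that case.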

[...]

Cheers,

-- 
Dodji

Re: [PATCH] Do not download packages for graphite loop optimizations by default when using ./contrib/download_prerequisites

2014-12-10 Thread Richard Biener

On Wed, Dec 10, 2014 at 6:16 AM, Chung-Ju Wu jasonw...@gmail.com wrote:
 2014-12-09 21:16 GMT+08:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Dec 9, 2014 at 6:36 AM, Chung-Ju Wu jasonw...@gmail.com wrote:
 Hi, all,

 In the discussion thread last year:
   https://gcc.gnu.org/ml/gcc-patches/2013-05/msg01334.html

 I extended the script ./contrib/download_prerequisites so that it can
 download isl and cloog packages for graphite loop optimizations.
 The patch was proposed to use GRAPHITE_LOOP_OPT=no by default.
 However, the change I committed into trunk is setting GRAPHITE_LOOP_OPT=yes:
   https://gcc.gnu.org/viewcvs/gcc?view=revisionrevision=199297

 I am sorry about my carelessness and I would like to propose a new patch
 to fix it.  The plaintext ChangeLog and patch are as follows:

 I'd say keep it as =yes (now that we only need ISL) and adjust the comment
 instead.

 Richard.


 Thanks for the suggestion.
 The following is the proposed patch to adjust the comment:

 Index: contrib/ChangeLog
 ===
 --- contrib/ChangeLog   (revision 218558)
 +++ contrib/ChangeLog   (working copy)
 @@ -1,3 +1,7 @@
 +2014-12-10  Chung-Ju Wu  jasonw...@gmail.com
 +
 +   * download_prerequisites: Modify the comment for GRAPHITE_LOOP_OPT.
 +
  2014-12-09  Laurynas Biveinis  laurynas.bivei...@gmail.com
 Yury Gribov  y.gri...@samsung.com


 Index: contrib/download_prerequisites
 ===
 --- contrib/download_prerequisites  (revision 218558)
 +++ contrib/download_prerequisites  (working copy)
 @@ -19,9 +19,9 @@
  # You should have received a copy of the GNU General Public License
  # along with this program. If not, see http://www.gnu.org/licenses/.

 -# If you want to build GCC with the Graphite loop optimizations,
 -# set GRAPHITE_LOOP_OPT=yes to download optional prerequisties
 -# ISL Library and CLooG.
 +# If you want to disable Graphite loop optimizations while building GCC,
 +# DO NOT set GRAPHITE_LOOP_OPT as yes so that the ISL package will not
 +# be downloaded.
  GRAPHITE_LOOP_OPT=yes


 Is this OK for trunk?

Ok.

Thanks,
Richard.


 Best regards,
 jasonwucj

Re: [PATCH] Fix PR 61225

2014-12-10 Thread Segher Boessenkool

On Tue, Dec 09, 2014 at 12:15:30PM -0700, Jeff Law wrote:
 @@ -3323,7 +3396,11 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn
 *i1, rtx_insn *i0,
   rtx old = newpat;
   total_sets = 1 + extra_sets;
   newpat = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (total_sets));
 - XVECEXP (newpat, 0, 0) = old;
 +
 + if (to_combined_insn)
 +   XVECEXP (newpat, 0, --total_sets) = old;
 + else
 +   XVECEXP (newpat, 0, 0) = old;
 }
 
 Is this correct?  If so, it needs a big fat comment, because it is
 not exactly obvious :-)
 
 Also, it doesn't handle at all the case where the new pattern already is
 a PARALLEL; can that never happen?
 I'd convinced myself it was.  But yes, a comment here would be good.
 
 Presumably you're thinking about a PARALLEL that satisfies single_set_p?

I wasn't thinking about anything in particular; this code does not handle
a PARALLEL newpat with to_combined_insn correctly, and it doesn't say it
cannot happen.

But yes, I don't see why it could not happen?  E.g. a parallel of multiple
sets with all but one of those dead?

Why should it be single_set here anyway?  (Maybe I need more coffee, sorry
if so).


Segher

Re: [PATCH PR62178]Improve candidate selecting in IVOPT, 2nd try.

2014-12-10 Thread Richard Biener

On Fri, Dec 5, 2014 at 1:15 PM, Bin Cheng bin.ch...@arm.com wrote:
 Hi,
 Though PR62178 is hidden by recent cost change in aarch64 backend, the ivopt
 issue still exists.

 Current candidate selecting algorithm tends to select fewer candidates given
 below reasons:
   1) to better handle loops with many induction uses but the best choice is
 one generic basic induction variable;
   2) to keep compilation time low.

 One fundamental weakness of the strategy is the opposite situation can't be
 handled properly sometimes.  For these cases the best choice is each
 induction variable has its own candidate.
 This patch fixes the problem by shuffling candidate set after fix-point is
 reached by current implementation.  The reason why this strategy works is it
 replaces candidate set by selecting local optimal candidate for some
 induction uses, and the new candidate set (has lower cost) is exact what we
 want in the mentioned case.  Instrumentation data shows this can find better
 candidates set for ~6% loops in spec2006 on x86_64, and ~4% on aarch64.

 This patch actually is extension to the first version patch posted at
 https://gcc.gnu.org/ml/gcc-patches/2014-09/msg02620.html, that only adds
 another selecting pass with special seed set (more or less like the shuffled
 set in this patch).  Data also confirms this patch can find optimal sets for
 most loops found by the first one, as well as optimal sets for many new
 loops.

 Bootstrap and test on x86_64, no regression on benchmarks.  Bootstrap and
 test on aarch64.
 Since this patch only selects candidate set with lower cost, any regressions
 revealed are latent bugs of other components in GCC.
 I also collected GCC bootstrap time on x86_64, no regression either.
 Is this OK?

The algorithm seems to be quadratic in the number of IV candidates
(at least):

+ for (i = 0; i  n_iv_cands (data); i++)
+   {
...
+ iv_ca_replace (data, ivs, cand, act_delta, tmp_delta);
...

and

+static void
+iv_ca_replace (struct ivopts_data *data, struct iv_ca *ivs,
+  struct iv_cand *cand, struct iv_ca_delta *act_delta,
+  struct iv_ca_delta **delta)
+{
...
+  for (i = 0; i  ivs-upto; i++)
+{
...
+  if (data-consider_all_candidates)
+   {
+ for (j = 0; j  n_iv_cands (data); j++)
+   {

possibly cubic if ivs-upto is of similar value.

I wonder if it is possible to restrict this to the single IV with
the largest delta?  After all we are iterating try_improve_iv_set.
Alternatively, move the handling out of the iteration completely,
thus into the caller of try_improve_iv_set?

Note that compile-time issues always arise in auto-generated code,
not during GCC bootstrap.
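
To put numbers on the nesting quoted above, here is a small standalone
sketch (plain C, not GCC code) that only counts how often the innermost
body would run.  The function name and parameters are made up, the parts
of the patch elided with "..." are not modelled, and the else branch is
assumed to be constant work, which may not match the real patch.

```c
#include <stddef.h>

/* Count innermost iterations for the loop nest quoted above:
   an outer walk over all candidates, a walk over the uses
   (ivs->upto) inside iv_ca_replace, and, when all candidates are
   considered, another walk over all candidates.  */
unsigned long
count_replace_steps (size_t n_cands, size_t n_uses,
                     int consider_all_candidates)
{
  unsigned long steps = 0;

  for (size_t i = 0; i < n_cands; i++)        /* try_improve_iv_set */
    for (size_t u = 0; u < n_uses; u++)       /* iv_ca_replace, per use */
      {
        if (consider_all_candidates)
          for (size_t j = 0; j < n_cands; j++)  /* per candidate again */
            steps++;
        else
          steps++;  /* assumption: the elided branch is O(1) */
      }

  return steps;
}
```

With 100 candidates and 100 uses this gives 1000000 steps per call when
all candidates are considered, which is the cubic behaviour I am worried
about for auto-generated code.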

Richard.


 2014-12-03  Bin Cheng  bin.ch...@arm.com

   PR tree-optimization/62178
   * tree-ssa-loop-ivopts.c (iv_ca_replace): New function.
   (try_improve_iv_set): Shuffle candidates set in order to handle
   case in which candidate wrto each iv use should be selected.

 gcc/testsuite/ChangeLog
 2014-12-03  Bin Cheng  bin.ch...@arm.com

   PR tree-optimization/62178
   * gcc.target/aarch64/pr62178.c: New test.

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-10 Thread FX

 So the patch you (Manuel) are proposing looks fine to me, with the
 environment variable taking precedence, *if* that is fine for Fortran,
 of course.

That seems fine to me, from the Fortran standpoint. COLUMNS is a bit of a 
special environment variable, which the shell (when it provides it) keeps 
synchronized to the terminal width anyway (as is the case for bash and zsh).

FX

Re: [PATCH] PR ipa/63909 ICE: SIGSEGV in ipa_icf_gimple::func_checker::compare_bb()

2014-12-10 Thread Martin Liška


On 12/10/2014 02:26 PM, Richard Biener wrote:

On Tue, Dec 9, 2014 at 4:52 PM, Martin Liška mli...@suse.cz wrote:

On 11/21/2014 01:23 PM, Richard Biener wrote:


On Fri, Nov 21, 2014 at 12:52 PM, Martin Liška mli...@suse.cz wrote:


On 11/20/2014 05:41 PM, Richard Biener wrote:



On Thu, Nov 20, 2014 at 5:30 PM, Martin Liška mli...@suse.cz wrote:



Hello.

The following patch fixes an ICE in IPA ICF. The problem was that the number
of non-debug statements in a BB can change (for instance by IPA split), so
the number is now recomputed.




Huh, so can it get different for both candidates?  I think the stmt compare
loop should be terminated on gsi_end_p of either iterator and return
false for any remaining non-debug-stmts on the other.

Thus, not walk all stmts twice here.
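
Roughly the shape I have in mind, as a standalone sketch in plain C rather
than with the real gimple iterators.  The struct and function names are
invented for illustration, and comparing a single "code" field stands in
for whatever the real statement comparison does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a statement: CODE is what gets compared, IS_DEBUG
   marks statements that must be ignored.  */
struct stmt
{
  int code;
  bool is_debug;
};

/* Advance I past any debug statements in S[0..N).  */
static size_t
skip_debug (const struct stmt *s, size_t n, size_t i)
{
  while (i < n && s[i].is_debug)
    i++;
  return i;
}

/* Walk both sequences in lockstep, once.  Stop when either runs out
   and report a mismatch if the other still has non-debug statements.  */
bool
compare_bb_stmts (const struct stmt *a, size_t na,
                  const struct stmt *b, size_t nb)
{
  size_t i = skip_debug (a, na, 0);
  size_t j = skip_debug (b, nb, 0);

  while (i < na && j < nb)
    {
      if (a[i].code != b[j].code)
        return false;
      i = skip_debug (a, na, i + 1);
      j = skip_debug (b, nb, j + 1);
    }

  /* Equal only if both sequences are exhausted.  */
  return i == na && j == nb;
}
```

No statement count is needed up front, so a count that went stale cannot
cause trouble.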




Hello.

Sorry for the previous patch; you are right, it can be fixed in a cleaner way.
Please take a look at the attached patch.



As IPA split is run early I don't see how it should affect a real IPA
pass though?






Sorry for the imprecise information; the problematic BB is changed here:
#0  gsi_split_seq_before (i=0x7fffd550, pnew_seq=0x7fffd528) at
../../gcc/gimple-iterator.c:429
#1  0x00b95a2a in gimple_split_block (bb=0x76c41548,
stmt=0x0)
at ../../gcc/tree-cfg.c:5707
#2  0x007563cf in split_block (bb=0x76c41548, i=i@entry=0x0)
at
../../gcc/cfghooks.c:508
#3  0x00756b44 in split_block_after_labels (bb=optimized out)
at
../../gcc/cfghooks.c:549
#4  make_forwarder_block (bb=optimized out,
redirect_edge_p=redirect_edge_p@entry=0x75d4e0
mfb_keep_just(edge_def*),
new_bb_cbk=new_bb_cbk@entry=0x0) at ../../gcc/cfghooks.c:842
#5  0x0076085a in create_preheader (loop=0x76d56948,
flags=optimized out) at ../../gcc/cfgloopmanip.c:1563
#6  0x00760aea in create_preheaders (flags=1) at
../../gcc/cfgloopmanip.c:1613
#7  0x009bc6b0 in apply_loop_flags (flags=15) at
../../gcc/loop-init.c:75
#8  0x009bc7d3 in loop_optimizer_init (flags=15) at
../../gcc/loop-init.c:136
#9  0x00957914 in estimate_function_body_sizes
(node=0x76c47620,
early=false) at ../../gcc/ipa-inline-analysis.c:2480
#10 0x0095948b in compute_inline_parameters (node=0x76c47620,
early=false) at ../../gcc/ipa-inline-analysis.c:2907
#11 0x0095bd88 in inline_analyze_function (node=0x76c47620)
at
../../gcc/ipa-inline-analysis.c:3994
#12 0x0095bed3 in inline_generate_summary () at
../../gcc/ipa-inline-analysis.c:4045
#13 0x00a70b71 in execute_ipa_summary_passes (ipa_pass=0x1dcb9e0)
at



So inline_summary is generated after IPA-ICF does its job?

But the bug is obviously that an IPA analysis phase does a code transform
(here initializes loops without AVOID_CFG_MANIPULATIONS).
Honza - if that is really needed then I think we should make sure
loops are initialized at the start of the IPA analysis phase, not randomly
inbetween.

Thanks,
Richard.



Hello.

Even though the root of the problem is hidden somewhere in loop creation, I
would like to apply the patch, which makes the iteration over non-debug gimple
statements clearer.

What do you think Richard?


Works for me - but we have to address the underlying issue.

Richard.


Thank you, I've just created another issue that is related to inliner analysis,
where the transformation occurs: PR64253.

Thanks,
Martin





Thanks,
Martin





../../gcc/passes.c:2137
#14 0x00777a15 in ipa_passes () at ../../gcc/cgraphunit.c:2074
#15 symbol_table::compile (this=this@entry=0x76c3a000) at
../../gcc/cgraphunit.c:2187
#16 0x00778bcd in symbol_table::finalize_compilation_unit
(this=0x76c3a000) at ../../gcc/cgraphunit.c:2340
#17 0x006580ee in c_write_global_declarations () at
../../gcc/c/c-decl.c:10777
#18 0x00b5bb8b in compile_file () at ../../gcc/toplev.c:584
#19 0x00b5def1 in do_compile () at ../../gcc/toplev.c:2041
#20 0x00b5e0fa in toplev::main (this=0x7fffdc9f, argc=20,
argv=0x7fffdd98) at ../../gcc/toplev.c:2138
#21 0x0063f1d9 in main (argc=20, argv=0x7fffdd98) at
../../gcc/main.c:38


Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
Ready for trunk?


Thanks,
Martin



Thanks,
Richard.


Patch can bootstrap on x86_64-linux-pc and no regression has been seen.
Ready for trunk?

Thanks,
Martin

[PATCH][AARCH64][4.9]Backport Use selected cpu's tuning when no tuning parameter is specified.

2014-12-10 Thread Renlin Li


On 04/12/14 10:27, Marcus Shawcroft wrote:

On 27 November 2014 at 11:27, Renlin Li renlin...@arm.com wrote:


gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

 * config/aarch64/aarch64.c (aarch64_parse_cpu): Don't define
selected_tune.
 (aarch64_override_options): Use selected_cpu's tuning.


OK and this is also broken in 4.9, could you prepare a backport please. /Marcus



This is a backport patch of 
https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00287.html


aarch64-none-elf has been built and tested on the model, no issue.
Okay for branch 4.9?

Regards,
Renlin Li


gcc/ChangeLog:

2014-12-10 Renlin Li renlin...@arm.com

* config/aarch64/aarch64.c (aarch64_parse_cpu): Remove selected_tune
assignment as this will be done later.
(aarch64_override_options): Use selected_cpu's tuning.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 1809513..0a8c303 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6613,7 +6613,6 @@ aarch64_parse_cpu (void)
   if (strlen (cpu-name) == len  strncmp (cpu-name, str, len) == 0)
 	{
 	  selected_cpu = cpu;
-	  selected_tune = cpu;
 	  aarch64_isa_flags = selected_cpu-flags;
 
 	  if (ext != NULL)
@@ -6709,9 +6708,8 @@ aarch64_override_options (void)
 
   gcc_assert (selected_cpu);
 
-  /* The selected cpu may be an architecture, so lookup tuning by core ID.  */
   if (!selected_tune)
-selected_tune = all_cores[selected_cpu-core];
+selected_tune = selected_cpu;
 
   aarch64_tune_flags = selected_tune-flags;
   aarch64_tune = selected_tune-core;

Re: [PATCH 06/10] rs6000: New add/subf carry insns

2014-12-10 Thread Segher Boessenkool

On Mon, Dec 08, 2014 at 10:53:21AM -0500, David Edelsohn wrote:
 As we both noticed, there are a few problems with this patch, so I'll
 wait for a revised version.

Here it is.  It took a bit longer because of a latent problem in combine
(ugh!) that caused mysterious failures in guality (double ugh).

For the carry_in patterns I now have an expander that expands to the
_0 and _m1 cases directly.

Okay for mainline?

Cheers,


Segher



2014-12-10  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR target/64180
* config/rs6000/predicates.md (adde_operand): New.
* config/rs6000/rs6000.md (addmode3_carry): New.
(*addmode3_imm_carry_pos): New.
(*addmode3_imm_carry_0): New.
(*addmode3_imm_carry_m1): New.
(*addmode3_imm_carry_neg): New.
(addmode3_carry_in): New.
(*addmode3_carry_in): New.
(addmode3_carry_in_0): New.
(addmode3_carry_in_m1): New.
(subfmode3_carry): New.
(*subfmode3_imm_carry_0): New.
(*subfmode3_imm_carry_m1): New.
(subfmode3_carry_in): New.
(*subfmode3_carry_in): New.
(subfmode3_carry_in_0): New.
(subfmode3_carry_in_m1): New.
(subfmode3_carry_in_xx): New.

---
 gcc/config/rs6000/predicates.md |   6 ++
 gcc/config/rs6000/rs6000.md | 200 
 2 files changed, 206 insertions(+)

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index ea230a5..a19cb2f 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -788,6 +788,12 @@ (define_predicate add_operand
 || satisfies_constraint_L (op))
 (match_operand 0 gpc_reg_operand)))
 
+;; Return 1 if the operand is either a non-special register, or 0, or -1.
+(define_predicate adde_operand
+  (if_then_else (match_code const_int)
+(match_test INTVAL (op) == 0 || INTVAL (op) == -1)
+(match_operand 0 gpc_reg_operand)))
+
 ;; Return 1 if OP is a constant but not a valid add_operand.
 (define_predicate non_add_cint_operand
   (and (match_code const_int)
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index dcdb7c1..63ca3c2 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -1634,6 +1634,115 @@ (define_split
 FAIL;
 })
 
+
+(define_insn addmode3_carry
+  [(set (match_operand:P 0 gpc_reg_operand =r)
+   (plus:P (match_operand:P 1 gpc_reg_operand r)
+   (match_operand:P 2 reg_or_short_operand rI)))
+   (set (reg:P CA_REGNO)
+   (ltu:P (plus:P (match_dup 1)
+  (match_dup 2))
+  (match_dup 1)))]
+  
+  add%I2c %0,%1,%2
+  [(set_attr type add)])
+
+(define_insn *addmode3_imm_carry_pos
+  [(set (match_operand:P 0 gpc_reg_operand =r)
+   (plus:P (match_operand:P 1 gpc_reg_operand r)
+   (match_operand:P 2 short_cint_operand n)))
+   (set (reg:P CA_REGNO)
+   (geu:P (match_dup 1)
+  (match_operand:P 3 const_int_operand n)))]
+  INTVAL (operands[2])  0
+INTVAL (operands[2]) + INTVAL (operands[3]) == 0
+  addic %0,%1,%2
+  [(set_attr type add)])
+
+(define_insn *addmode3_imm_carry_0
+  [(set (match_operand:P 0 gpc_reg_operand =r)
+   (match_operand:P 1 gpc_reg_operand r))
+   (set (reg:P CA_REGNO)
+   (const_int 0))]
+  
+  addic %0,%1,0
+  [(set_attr type add)])
+
+(define_insn *addmode3_imm_carry_m1
+  [(set (match_operand:P 0 gpc_reg_operand =r)
+   (plus:P (match_operand:P 1 gpc_reg_operand r)
+   (const_int -1)))
+   (set (reg:P CA_REGNO)
+   (ne:P (match_dup 1)
+ (const_int 0)))]
+  
+  addic %0,%1,-1
+  [(set_attr type add)])
+
+(define_insn *addmode3_imm_carry_neg
+  [(set (match_operand:P 0 gpc_reg_operand =r)
+   (plus:P (match_operand:P 1 gpc_reg_operand r)
+   (match_operand:P 2 short_cint_operand n)))
+   (set (reg:P CA_REGNO)
+   (gtu:P (match_dup 1)
+  (match_operand:P 3 const_int_operand n)))]
+  INTVAL (operands[2])  0
+INTVAL (operands[2]) + INTVAL (operands[3]) == -1
+  addic %0,%1,%2
+  [(set_attr type add)])
+
+
+(define_expand addmode3_carry_in
+  [(parallel [
+ (set (match_operand:GPR 0 gpc_reg_operand)
+ (plus:GPR (plus:GPR (match_operand:GPR 1 gpc_reg_operand)
+ (match_operand:GPR 2 adde_operand))
+   (reg:GPR CA_REGNO)))
+ (clobber (reg:GPR CA_REGNO))])]
+  
+{
+  if (operands[2] == const0_rtx)
+{
+  emit_insn (gen_addmode3_carry_in_0 (operands[0], operands[1]));
+  DONE;
+}
+  if (operands[2] == constm1_rtx)
+{
+  emit_insn (gen_addmode3_carry_in_m1 (operands[0], operands[1]));
+  DONE;
+}
+})
+
+(define_insn *addmode3_carry_in
+  [(set (match_operand:GPR 0 gpc_reg_operand =r)
+   (plus:GPR (plus:GPR (match_operand:GPR 1 gpc_reg_operand r)
+   (match_operand:GPR 2 gpc_reg_operand r))
+ (reg:GPR CA_REGNO)))
+   (clobber

Re: locales fixes

2014-12-10 Thread Jonathan Wakely


On 09/12/14 23:24 +, Jonathan Wakely wrote:

On 09/12/14 20:34 +0100, Marc Glisse wrote:

On Tue, 9 Dec 2014, Jonathan Wakely wrote:


On 08/12/14 23:53 +0100, François Dumont wrote:
 After having installed all necessary locales on my system I 
end up with 4 failures. Here is a patch to fix them all.


Did you discover why only you are seeing failures?


Not just him, anyone with a debian-based system (assuming we are 
talking about the same thing). I believe it was documented 
somewhere, Paolo should know.


Ah OK. I've requested the locale data to be installed on gcc20, which
is the only Debian machine I have access to.


And I can see the failures now.

Re: [patch] Fix ICE on unaligned record field

2014-12-10 Thread Richard Biener

On Wed, Dec 3, 2014 at 3:02 PM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 On Mon, Dec 01, 2014 at 12:00:14PM +0100, Richard Biener wrote:
 On Fri, Nov 28, 2014 at 5:20 PM, Eric Botcazou ebotca...@adacore.com wrote:
  Hi,
 
  the attached Ada testcase triggers an assertion in the RTL expander for the
  address operator because the operator has been applied to a non-byte-aligned
  record field.  The problematic ADDR_EXPR is built by ipa_modify_call_arguments
  which has a hole when get_addr_base_and_unit_offset returns NULL_TREE: the
  variable offset case is handled but not the non-byte-aligned case, which can
  routinely happen in Ada, hence the proposed fix.
 
  Tested on x86_64-suse-linux, OK for the mainline?

 Umm.  So you are doing a possibly aggregate copy here?  Or how
 are we sure we are dealing with a register op only?  (the function
 is always a twisted maze to me...)

 That said - I suppose this is called from IPA-SRA?  In that case,
 can't we please avoid doing the transform in the first place?


 I suppose that could be done by something like the following, which I
 have tested only very mildly so far, in particular I have not double
 checked that get_inner_reference is cfun-agnostic.

 Hope it helps,

 Martin



 2014-12-03  Martin Jambor  mjam...@suse.cz

 * tree-sra.c (ipa_sra_check_caller_data): New type.
 (has_caller_p): Removed.
 (ipa_sra_check_caller): New function.
 (ipa_sra_preliminary_function_checks): Use it.

 diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
 index f213c80..900f3c3 100644
 --- a/gcc/tree-sra.c
 +++ b/gcc/tree-sra.c
 @@ -4977,13 +4977,54 @@ modify_function (struct cgraph_node *node, 
 ipa_parm_adjustment_vec adjustments)
return cfg_changed;
  }

 -/* If NODE has a caller, return true.  */
 +/* Means of communication between ipa_sra_check_caller and
 +   ipa_sra_preliminary_function_checks.  */
 +
 +struct ipa_sra_check_caller_data
 +{
 +  bool has_callers;
 +  bool bad_arg_alignment;
 +};
 +
 +/* If NODE has a caller, mark that fact in DATA which is pointer to
 +   ipa_sra_check_caller_data.  Also check all aggregate arguments in all 
 known
 +   calls if they are unit aligned and if not, set the appropriate flag in 
 DATA
 +   too. */

  static bool
 -has_caller_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED)
 +ipa_sra_check_caller (struct cgraph_node *node, void *data)
  {
 -  if (node-callers)
 -return true;
 +  if (!node-callers)
 +return false;
 +
 +  struct ipa_sra_check_caller_data *iscc;
 +  iscc = (struct ipa_sra_check_caller_data *) data;
 +  iscc-has_callers = true;
 +
 +  for (cgraph_edge *cs = node-callers; cs; cs = cs-next_caller)
 +{
 +  gimple call_stmt = cs-call_stmt;
 +  unsigned count = gimple_call_num_args (call_stmt);
 +  for (unsigned i = 0; i  count; i++)
 +   {
 + tree arg = gimple_call_arg (call_stmt, i);
 + if (is_gimple_reg (arg))

!handled_component_p (arg)

would better match what you are checking below.

Note that I think the place of the check is unfortunate as you for example
will not remove the argument if it is unused.  In fact I'm not yet sure
what transform exactly we are disabling.  I am guessing we are
passing an aggregate by value that resides at a bit-aligned offset
of some outer object:

  foo (x.aggr);

and the function then does

foo (Aggr a)
{
  int i = a.foo;
...
}

thus use only a part of the aggregate.  Then IPA SRA would like to
pass x.aggr.foo instead of x.aggr and thus tries to materialize a
load from x.aggr.foo at all callers but fails to do that in a valid way.

Eric's fix did, at all callers

  Aggr tem = x.aggr;
  foo (tem.foo);

?

While we should be able to simply do

  foo (BIT_FIELD_REF x.aggr, .)

with the appropriate bit offset and size?  (if that's of register type
you need to do the load in a separate stmt of course).

Thus similar to Eric's fix but avoiding the aggregate copy.

But maybe I am not really understanding the issue (didn't look at the
testcase).
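
To illustrate what I mean by reading just the needed bits instead of
copying the whole aggregate, here is a standalone sketch in plain C.  It
is not how BIT_FIELD_REF is implemented; the function name is made up and
the bit numbering (bit 0 is the least significant bit of byte 0) is an
assumption that will not match every target.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only: read BITSIZE bits (at most 64) starting at bit
   position BITPOS from the object at BASE, without copying the
   surrounding aggregate.  Assumes bit 0 is the LSB of BASE[0].  */
uint64_t
extract_bits (const unsigned char *base, size_t bitpos, unsigned bitsize)
{
  uint64_t value = 0;

  for (unsigned k = 0; k < bitsize && k < 64; k++)
    {
      size_t bit = bitpos + k;
      uint64_t b = (base[bit / 8] >> (bit % 8)) & 1u;
      value |= b << k;
    }

  return value;
}
```

The point is only that a bit offset and a size are enough to name the
piece the callee uses, even when bitpos is not a multiple of
BITS_PER_UNIT.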

Richard.

 + continue;
 +
 + tree offset;
 + HOST_WIDE_INT bitsize, bitpos;
 + machine_mode mode;
 + int unsignedp, volatilep = 0;
 + get_inner_reference (arg, bitsize, bitpos, offset, mode,
 +  unsignedp, volatilep, false);
 + if (bitpos % BITS_PER_UNIT)
 +   {
 + iscc-bad_arg_alignment = true;
 + return true;
 +   }
 +   }
 +}
 +
return false;
  }

 @@ -5038,14 +5079,6 @@ ipa_sra_preliminary_function_checks (struct 
 cgraph_node *node)
return false;
  }

 -  if (!node-call_for_symbol_thunks_and_aliases (has_caller_p, NULL, true))
 -{
 -  if (dump_file)
 -   fprintf (dump_file,
 -Function has no callers in this compilation unit.\n);
 -  return false;
 -}
 -
if (cfun-stdarg)
  {
if (dump_file)
 @@ -5064,6 +5097,25 @@ ipa_sra_preliminary_function_checks

Reducing the amount of builtins by merging named patterns

2014-12-10 Thread Blumental Maxim

Hello everyone.

I'm working on reducing the number of builtins in i386. My approach
is to merge similar patterns into one with a fake argument which
determines which instruction to print.

We have ~30 groups of similar (i.e. having similar sets of attributes)
named patterns. These groups together include ~230 template (i.e.
having substitution attributes in their names) named patterns in
total. So, we can reduce the number of template named patterns by ~200
at best. Those template named patterns correspond to several specific
named patterns each. E.g. in my patch (see attached patch) I merged
two template named patterns into one and that allowed me to replace
four builtins with only two.

Should I continue to work in that direction?


named_patts_merge.patch
Description: Binary data

[PATCH] Fix LIM not clearing range-info

2014-12-10 Thread Richard Biener


While working on PR42108 I noticed that LIM fails to clear range-info
on SSA names whose defining statements it moves, possibly causing
wrong code generation later on.

The following fixes that.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2014-12-10  Richard Biener  rguent...@suse.de

 * tree-ssa-loop-im.c
 (move_computations_dom_walker::before_dom_children): Clear
 SSA_NAME_RANGE_INFO on moved stmts.

Index: gcc/tree-ssa-loop-im.c
===
--- gcc/tree-ssa-loop-im.c  (revision 218515)
+++ gcc/tree-ssa-loop-im.c  (working copy)
@@ -1232,6 +1286,11 @@ move_computations_dom_walker::before_dom
  COND_EXPR, t, arg0, arg1);
  todo_ |= TODO_cleanup_cfg;
}
+  if (INTEGRAL_TYPE_P (TREE_TYPE (gimple_assign_lhs (new_stmt)))
+  (!ALWAYS_EXECUTED_IN (bb)
+ || (ALWAYS_EXECUTED_IN (bb) != level
+  !flow_loop_nested_p (ALWAYS_EXECUTED_IN (bb), level
+   SSA_NAME_RANGE_INFO (gimple_assign_lhs (new_stmt)) = NULL;
   gsi_insert_on_edge (loop_preheader_edge (level), new_stmt);
   remove_phi_node (bsi, false);
 }
@@ -1291,6 +1350,13 @@ move_computations_dom_walker::before_dom
}
}
   gsi_remove (bsi, false);
+  if (gimple_has_lhs (stmt)
+  TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME
+  INTEGRAL_TYPE_P (TREE_TYPE (gimple_get_lhs (stmt)))
+  (!ALWAYS_EXECUTED_IN (bb)
+ || !(ALWAYS_EXECUTED_IN (bb) == level
+  || flow_loop_nested_p (ALWAYS_EXECUTED_IN (bb), level
+   SSA_NAME_RANGE_INFO (gimple_get_lhs (stmt)) = NULL;
   /* In case this is a stmt that is not unconditionally executed
  when the target loop header is executed and the stmt may
 invoke undefined integer or pointer overflow rewrite it to

Re: [patch] gdb python pretty printer for DIEs

2014-12-10 Thread David Malcolm

On Tue, 2014-12-09 at 13:10 -0800, Aldy Hernandez wrote:
 From: Aldy Hernandez al...@redhat.com
 To: jason merrill ja...@redhat.com
 Cc: David Malcolm dmalc...@redhat.com, gcc-patches gcc-patches@gcc.gnu.org
 Subject: [patch] gdb python pretty printer for DIEs
 Date: Tue, 09 Dec 2014 13:10:57 -0800 (12/09/2014 04:10:57 PM)

 I am tired of dumping entire DIEs just to see what type they are.
 With this patch, we get:

 (gdb) print context_die
 $5 = dw_die_ref 0x76de0230 DW_TAG_module parent=0x76de 
 DW_TAG_compile_unit

 I know it's past the end of stage1, but this debugging aid can help
 in fixing bugs in stage = 3.

 I am committing this to the [debug-early] branch, but I am hoping I
 can also commit it to mainline and avoid dragging it along.

 OK for mainline?

 --- a/gcc/gdbhooks.py
 +++ b/gcc/gdbhooks.py
 @@ -253,6 +253,26 @@ class CGraphNodePrinter:
  return result

  ##
 +# Dwarf DIE pretty-printers
 +##
 +
 +class DWDieRefPrinter:
 +def __init__(self, gdbval):
 +self.gdbval = gdbval
 +
 +def to_string (self):
 +result = 'dw_die_ref 0x%x' % long(self.gdbval)

A minor nit: for the NULL case, you're doing slightly more work than
necessary: you start building result above...

 +if long(self.gdbval) == 0:
 +return 'dw_die_ref 0x0'

...then discard it at the return here.  You could move the result
= ... line to after the if ... == 0 conditional.

 +result += ' %s' % self.gdbval['die_tag']
 +if long(self.gdbval['die_parent']) != 0:
 +result += ' parent=0x%x %s' %
 (long(self.gdbval['die_parent']),
 +
 self.gdbval['die_parent']['die_tag'])
 + 
 +result += ''
 +return result

Re: [PATCH 2/3] Extended if-conversion

2014-12-10 Thread Richard Biener

On Wed, Dec 10, 2014 at 11:54 AM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Sorry that I forgot to delete debug dump from my fix.
 I have few questions about your comments.

 1. You wrote :
 You also still have two functions for PHI predication.  And the
 new extended variant doesn't commonize the 2-args and general
 path
  Did you mean that I must combine predicate_scalar_phi and
 predicate_extended scalar phi to one function?
 Please note that if additional flag was not set up (i.e.
 aggressive_if_conv is false) extended predication is required more
 compile time since it builds hash_map.

Its compile-time complexity is reasonable enough even for
non-aggressive if-conversion.

 2. About critical edge splitting.

 Did you mean that we should perform it (1) under aggressive_if_conv
 option only; (2) should we split all critical edges.
 Note that this leads to recomputing of topological order.

Well, I don't mind splitting all critical edges unconditionally, thus
do something like

Index: gcc/tree-if-conv.c
===
--- gcc/tree-if-conv.c  (revision 218515)
+++ gcc/tree-if-conv.c  (working copy)
@@ -2235,12 +2235,21 @@ pass_if_conversion::execute (function *f
   if (number_of_loops (fun) = 1)
 return 0;

+  bool critical_edges_split_p = false;
   FOR_EACH_LOOP (loop, 0)
 if (flag_tree_loop_if_convert == 1
|| flag_tree_loop_if_convert_stores == 1
|| ((flag_tree_loop_vectorize || loop-force_vectorize)
 !loop-dont_vectorize))
-  todo |= tree_if_conversion (loop);
+  {
+   if (!critical_edges_split_p)
+ {
+   split_critical_edges ();
+   critical_edges_split_p = true;
+   todo |= TODO_cleanup_cfg;
+ }
+   todo |= tree_if_conversion (loop);
+  }

 #ifdef ENABLE_CHECKING
   {

 It is worth noting that in current implementation bb's with 2
 predecessors and both are on critical edges are accepted without
 additional option.

Yes, I know.

tree-if-conv.c is a mess right now and if we can avoid adding more
to it and even fix the critical edge missed optimization with splitting
critical edges then I am all for that solution.
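
For reference, "critical" here means an edge whose source block has more
than one successor and whose destination block has more than one
predecessor.  A standalone sketch in plain C, using the bb_4 to bb_11
fragment from the dump quoted below as data; the struct and function
names are invented for illustration and this is not the GCC CFG API.

```c
#include <stdbool.h>
#include <stddef.h>

/* A CFG edge between two basic block indices.  */
struct cfg_edge
{
  int src;
  int dest;
};

/* Number of edges in EDGES[0..N) leaving block BB.  */
static size_t
count_succs (const struct cfg_edge *edges, size_t n, int bb)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (edges[i].src == bb)
      count++;
  return count;
}

/* Number of edges in EDGES[0..N) entering block BB.  */
static size_t
count_preds (const struct cfg_edge *edges, size_t n, int bb)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (edges[i].dest == bb)
      count++;
  return count;
}

/* An edge is critical if its source has several successors and its
   destination has several predecessors.  */
bool
edge_critical_p (const struct cfg_edge *edges, size_t n, size_t idx)
{
  return count_succs (edges, n, edges[idx].src) > 1
         && count_preds (edges, n, edges[idx].dest) > 1;
}
```

In that fragment both edges into bb_7 (from bb_5 and bb_6) are critical,
while bb_4 to bb_5 and bb_7 to bb_11 are not.  Splitting puts a new empty
block on each critical edge.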

Richard.

 Thanks ahead.
 Yuri.
 2014-12-09 18:20 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is updated patch2 with the following changes:
 1. Delete functions  phi_has_two_different_args and find_insertion_point.
 2. Use only one function for extended predication -
 predicate_extended_scalar_phi.
 3. Save gsi before insertion of predicate computations for basic
 blocks if it has 2 predecessors and
 both incoming edges are critical or it has more than 2 predecessors
 and at least one incoming edge
 is critical. This saved iterator can be used by extended phi predication.

 Here is motivated test-case which explains this point.
 Test-case is attached (t5.c) and it must be compiled with -O2
 -ftree-loop-vectorize -fopenmp options.
 The problem phi is in bb-7:

   bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 })
   {
 bb 5:
 xmax_edge_18 = xmax_edge_36 + 1;
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 9;

   }
   bb_6 (preds = {bb_4 }, succs = {bb_7 bb_8 })
   {
 bb 6:
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 8;

   }
   bb_7 (preds = {bb_6 bb_5 }, succs = {bb_11 })
   {
 bb 7:
 # xmax_edge_30 = PHI xmax_edge_36(6), xmax_edge_18(5)
 xmax_edge_19 = xmax_edge_39 + 1;
 goto bb 11;

   }

 Note that both incoming edges to bb_7 are critical. If we comment out
 restoring gsi in predicate_all_scalar_phi:
 #if 0
  if ((EDGE_COUNT (bb->preds) == 2 && all_preds_critical_p (bb))
  || (EDGE_COUNT (bb->preds) > 2 && has_pred_critical_p (bb)))
gsi = bb_insert_point (bb);
  else
 #endif
gsi = gsi_after_labels (bb);

 we will get ICE:
 t5.c: In function 'foo':
 t5.c:9:6: error: definition in block 4 follows the use
  void foo (int n)
   ^
 for SSA_NAME: _1 in statement:
 _52 = _1  _3;
 t5.c:9:6: internal compiler error: verify_ssa failed

 since predicate computations were inserted in bb_7.

 The issue is obviously that the predicates have already been emitted
 in the target BB - that's of course the wrong place.  This is done
 by insert_gimplified_predicates.

 This just shows how edge predicate handling is broken - we don't
 seem to have a sequence of gimplified stmts for edge predicates
 but push those to e->dest which makes this really messy.

 Rather than having a separate phase where we insert all
 gimplified bb predicates we should do that on-demand when
 predicating a PHI.

 Your patch writes to stderr - that's bad - use dump_file and guard
 the printfs properly.

 You also still have two functions for PHI predication.  And the
 new extended variant doesn't commonize the 2-args and general
 paths.

 I'm not at all happy with this code.  It may be

Re: [PATCH 06/10] rs6000: New add/subf carry insns

2014-12-10 Thread David Edelsohn

On Wed, Dec 10, 2014 at 9:00 AM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Mon, Dec 08, 2014 at 10:53:21AM -0500, David Edelsohn wrote:
 As we both noticed, there are a few problems with this patch, so I'll
 wait for a revised version.

 Here it is.  It took a bit longer because of a latent problem in combine
 (ugh!) that caused mysterious failures in guality (double ugh).

 For the carry_in patterns I now have an expander that expands to the
 _0 and _m1 cases directly.

 Okay for mainline?

 Cheers,


 Segher



 2014-12-10  Segher Boessenkool  seg...@kernel.crashing.org

 gcc/
 PR target/64180
 * config/rs6000/predicates.md (adde_operand): New.
 * config/rs6000/rs6000.md (add<mode>3_carry): New.
 (*add<mode>3_imm_carry_pos): New.
 (*add<mode>3_imm_carry_0): New.
 (*add<mode>3_imm_carry_m1): New.
 (*add<mode>3_imm_carry_neg): New.
 (add<mode>3_carry_in): New.
 (*add<mode>3_carry_in): New.

Please name this *add<mode>3_carry_in_internal

 (add<mode>3_carry_in_0): New.
 (add<mode>3_carry_in_m1): New.
 (subf<mode>3_carry): New.
 (*subf<mode>3_imm_carry_0): New.
 (*subf<mode>3_imm_carry_m1): New.
 (subf<mode>3_carry_in): New.
 (*subf<mode>3_carry_in): New.

Please name this *subf<mode>3_carry_in_internal

 (subf<mode>3_carry_in_0): New.
 (subf<mode>3_carry_in_m1): New.
 (subf<mode>3_carry_in_xx): New.

Okay with those changes.

Thanks for all of the great improvements!

- David

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Andreas Schwab

Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

xgcc: error: unrecognized command line option '-mmpx'

FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx 

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-10 Thread Marek Polacek

On Wed, Dec 10, 2014 at 09:42:58AM +0100, Richard Biener wrote:
 On Tue, 9 Dec 2014, Marek Polacek wrote:
 
  The issue here is that TYPE_OVERFLOW_TRAPS, TYPE_OVERFLOW_UNDEFINED,
  and TYPE_OVERFLOW_WRAPS macros work on integral types only, yet we
  pass e.g. pointer_type/real_type to them.  This patch adds proper
  checking for these macros and adds missing guards to various places.
  This looks pretty straightforward, but I had to tweak a few places
  wrt vector_types (where I've used VECTOR_INTEGER_TYPE_P) so as not to
  regress folding - and I'm afraid I missed places that aren't tested
  in our testsuite :/.
 
 Apart from what Marc already pointed out I think that for vectors
 and complex types of integral types the macros work ok (TYPE_UNSIGNED
 is well-defined for those).  It would be wrong to disable the
 tests for those (I probably misled you here).  Similar to FLOAT_TYPE_P
 we probably want an ANY_INTEGRAL_TYPE_P () predicate (bah,
 INTEGRAL_TYPE_P should really be SCALAR_INTEGRAL_TYPE_P...).
 
 Thus the TYPE_OVERFLOW_* macros should be guarded with a tree check
 checking for that ANY_INTEGRAL_TYPE_P instead.

Ok, sounds good.  Should be implemented in the following.
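
To illustrate the guard being discussed, here is a minimal sketch on a toy
type representation.  Everything named toy_* is hypothetical; the real
predicate and checks operate on GCC trees and are defined in tree.h by the
patch below.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy type representation, hypothetical; GCC's 'tree' is far richer.  */
enum toy_code { TOY_INTEGER, TOY_REAL, TOY_POINTER, TOY_VECTOR, TOY_COMPLEX };

struct toy_type
{
  enum toy_code code;
  const struct toy_type *elem;  /* element type for vector/complex, else NULL */
  bool wraps;                   /* true if overflow wraps for this type */
};

/* Scalar integral check, in the spirit of INTEGRAL_TYPE_P.  */
bool
toy_integral_p (const struct toy_type *t)
{
  return t != NULL && t->code == TOY_INTEGER;
}

/* Integral, or a vector/complex type whose element type is integral.  */
bool
toy_any_integral_p (const struct toy_type *t)
{
  if (t == NULL)
    return false;
  if (toy_integral_p (t))
    return true;
  return (t->code == TOY_VECTOR || t->code == TOY_COMPLEX)
         && toy_integral_p (t->elem);
}

/* Guarded query: the overflow property is only consulted for types the
   predicate accepts, so pointer and floating-point types never reach it.  */
bool
toy_overflow_wraps_p (const struct toy_type *t)
{
  return toy_any_integral_p (t) && t->wraps;
}
```

The call sites in the patch follow the same shape: test the predicate first,
then ask the overflow question.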

Bootstrapped/regtested on x86_64-linux and ppc64-linux, ok for trunk?

2014-12-10  Marek Polacek  pola...@redhat.com

* fold-const.c (fold_negate_expr): Add ANY_INTEGRAL_TYPE_P check.
(extract_muldiv_1): Likewise.
(maybe_canonicalize_comparison_1): Likewise.
(fold_comparison): Likewise.
(tree_binary_nonnegative_warnv_p): Likewise.
(tree_binary_nonzero_warnv_p): Likewise.
* gimple-ssa-strength-reduction.c (legal_cast_p_1): Likewise.
* tree-scalar-evolution.c (simple_iv): Likewise.
(scev_const_prop): Likewise.
* tree-ssa-loop-niter.c (expand_simple_operations): Likewise.
* tree-vect-generic.c (expand_vector_operation): Likewise.
* tree.h (ANY_INTEGRAL_TYPE_CHECK): Define.
(ANY_INTEGRAL_TYPE_P): Define.
(TYPE_OVERFLOW_WRAPS, TYPE_OVERFLOW_UNDEFINED, TYPE_OVERFLOW_TRAPS):
Add ANY_INTEGRAL_TYPE_CHECK.
(any_integral_type_check): New function.

diff --git gcc/fold-const.c gcc/fold-const.c
index 0c4fe40..6840bde 100644
--- gcc/fold-const.c
+++ gcc/fold-const.c
@@ -558,7 +558,8 @@ fold_negate_expr (location_t loc, tree t)
 case INTEGER_CST:
   tem = fold_negate_const (t, type);
   if (TREE_OVERFLOW (tem) == TREE_OVERFLOW (t)
- || (!TYPE_OVERFLOW_TRAPS (type)
+ || (ANY_INTEGRAL_TYPE_P (type)
+  && !TYPE_OVERFLOW_TRAPS (type)
   && TYPE_OVERFLOW_WRAPS (type))
  || (flag_sanitize & SANITIZE_SI_OVERFLOW) == 0)
return tem;
@@ -5951,7 +5952,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
tree wide_type,
   || EXPRESSION_CLASS_P (op0))
  /* ... and has wrapping overflow, and its type is smaller
 than ctype, then we cannot pass through as widening.  */
-  && ((TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0))
+  && (((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
+ && TYPE_OVERFLOW_WRAPS (TREE_TYPE (op0)))
 && (TYPE_PRECISION (ctype)
 > TYPE_PRECISION (TREE_TYPE (op0))))
  /* ... or this is a truncation (t is narrower than op0),
@@ -5966,7 +5968,8 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, 
tree wide_type,
  /* ... or has undefined overflow while the converted to
 type has not, we cannot do the operation in the inner type
 as that would introduce undefined overflow.  */
- || (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))
+ || ((ANY_INTEGRAL_TYPE_P (TREE_TYPE (op0))
+   && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0)))
   && !TYPE_OVERFLOW_UNDEFINED (type))))
break;
 
@@ -8497,7 +8500,8 @@ maybe_canonicalize_comparison_1 (location_t loc, enum 
tree_code code, tree type,
 
   /* Match A +- CST code arg1 and CST code arg1.  We can change the
  first form only if overflow is undefined.  */
-  if (!((TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))
+  if (!(((ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
 /* In principle pointers also have undefined overflow behavior,
but that causes problems elsewhere.  */
  && !POINTER_TYPE_P (TREE_TYPE (arg0))
@@ -8712,7 +8716,9 @@ fold_comparison (location_t loc, enum tree_code code, 
tree type,
 
   /* Transform comparisons of the form X +- C1 CMP C2 to X CMP C2 -+ C1.  */
   if ((TREE_CODE (arg0) == PLUS_EXPR || TREE_CODE (arg0) == MINUS_EXPR)
-   && (equality_code || TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0)))
+   && (equality_code
+ || (ANY_INTEGRAL_TYPE_P (TREE_TYPE (arg0))
+  && TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (arg0))))
&& TREE_CODE (TREE_OPERAND (arg0, 1)) == INTEGER_CST
&& !TREE_OVERFLOW (TREE_OPERAND (arg0, 1))
TREE_CODE

Merge from trunk to gccgo branch

2014-12-10 Thread Ian Lance Taylor

I merged trunk revision 218558 to the gccgo branch.

Ian

[PATCH] Fix forwprop-29.c testcase

2014-12-10 Thread Richard Biener


The testcase relies on inlining two functions but if ICF unifies
them (they are equal) then for some reason that doesn't happen.
Even more weird on x86_64 ICF doesn't unify them.

Well, the testcase isn't about ICF so simply disable it.

Richard.

2014-12-10  Richard Biener  rguent...@suse.de

* gcc.dg/tree-ssa/forwprop-29.c: Add -fno-ipa-icf.

Index: gcc/testsuite/gcc.dg/tree-ssa/forwprop-29.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/forwprop-29.c (revision 218515)
+++ gcc/testsuite/gcc.dg/tree-ssa/forwprop-29.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-options "-O2" } */
+/* { dg-options "-O2 -fno-ipa-icf" } */
 
 void runtime_error (void) __attribute__ ((noreturn));
 void compiletime_error (void) __attribute__ ((noreturn, error ()));

Re: [PATCH 2/3] Extended if-conversion

2014-12-10 Thread Yuri Rumyantsev

Richard,

Thanks for your reply!

I didn't understand your point:

Well, I don't mind splitting all critical edges unconditionally

but you do it unconditionally in the proposed patch. I also assume that
calling split_critical_edges() can break SSA; for example, it can
split loop headers, loop exit blocks, etc. I would prefer something
more loop-specific, e.g. calling edge_split() for critical edges
leaving a bb that ends with a GIMPLE_COND stmt (assuming that the edge's
destination bb belongs to the loop).


2014-12-10 17:31 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Wed, Dec 10, 2014 at 11:54 AM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Sorry that I forgot to delete debug dump from my fix.
 I have few questions about your comments.

 1. You wrote :
 You also still have two functions for PHI predication.  And the
 new extended variant doesn't commonize the 2-args and general
 path
  Did you mean that I must combine predicate_scalar_phi and
 predicate_extended_scalar_phi into one function?
 Please note that if the additional flag is not set (i.e.
 aggressive_if_conv is false), extended predication requires more
 compile time since it builds a hash_map.

 Its compile-time complexity is reasonable enough even for
 non-aggressive if-conversion.

 2. About critical edge splitting.

 Did you mean that we should perform it (1) under aggressive_if_conv
 option only; (2) should we split all critical edges.
 Note that this leads to recomputing of topological order.

 Well, I don't mind splitting all critical edges unconditionally, thus
 do something like

 Index: gcc/tree-if-conv.c
 ===
 --- gcc/tree-if-conv.c  (revision 218515)
 +++ gcc/tree-if-conv.c  (working copy)
 @@ -2235,12 +2235,21 @@ pass_if_conversion::execute (function *f
if (number_of_loops (fun) <= 1)
  return 0;

 +  bool critical_edges_split_p = false;
FOR_EACH_LOOP (loop, 0)
  if (flag_tree_loop_if_convert == 1
 || flag_tree_loop_if_convert_stores == 1
 || ((flag_tree_loop_vectorize || loop->force_vectorize)
  && !loop->dont_vectorize))
 -  todo |= tree_if_conversion (loop);
 +  {
 +   if (!critical_edges_split_p)
 + {
 +   split_critical_edges ();
 +   critical_edges_split_p = true;
 +   todo |= TODO_cleanup_cfg;
 + }
 +   todo |= tree_if_conversion (loop);
 +  }

  #ifdef ENABLE_CHECKING
{

 It is worth noting that in current implementation bb's with 2
 predecessors and both are on critical edges are accepted without
 additional option.

 Yes, I know.

 tree-if-conv.c is a mess right now and if we can avoid adding more
 to it and even fix the critical edge missed optimization with splitting
 critical edges then I am all for that solution.

 Richard.

 Thanks ahead.
 Yuri.
 2014-12-09 18:20 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Tue, Dec 9, 2014 at 2:11 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Richard,

 Here is updated patch2 with the following changes:
 1. Delete functions  phi_has_two_different_args and find_insertion_point.
 2. Use only one function for extended predication -
 predicate_extended_scalar_phi.
 3. Save gsi before insertion of predicate computations for basic
 blocks if it has 2 predecessors and
 both incoming edges are critical or it has more than 2 predecessors
 and at least one incoming edge
 is critical. This saved iterator can be used by extended phi predication.

 Here is a motivating test case which explains this point.
 Test-case is attached (t5.c) and it must be compiled with -O2
 -ftree-loop-vectorize -fopenmp options.
 The problem phi is in bb-7:

   bb_5 (preds = {bb_4 }, succs = {bb_7 bb_9 })
   {
 bb 5:
 xmax_edge_18 = xmax_edge_36 + 1;
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 9;

   }
   bb_6 (preds = {bb_4 }, succs = {bb_7 bb_8 })
   {
 bb 6:
 if (xmax_17 == xmax_27)
   goto bb 7;
 else
   goto bb 8;

   }
   bb_7 (preds = {bb_6 bb_5 }, succs = {bb_11 })
   {
 bb 7:
 # xmax_edge_30 = PHI xmax_edge_36(6), xmax_edge_18(5)
 xmax_edge_19 = xmax_edge_39 + 1;
 goto bb 11;

   }

 Note that both incoming edges to bb_7 are critical. If we comment out
 restoring gsi in predicate_all_scalar_phi:
 #if 0
  if ((EDGE_COUNT (bb->preds) == 2 && all_preds_critical_p (bb))
  || (EDGE_COUNT (bb->preds) > 2 && has_pred_critical_p (bb)))
gsi = bb_insert_point (bb);
  else
 #endif
gsi = gsi_after_labels (bb);

 we will get ICE:
 t5.c: In function 'foo':
 t5.c:9:6: error: definition in block 4 follows the use
  void foo (int n)
   ^
 for SSA_NAME: _1 in statement:
 _52 = _1  _3;
 t5.c:9:6: internal compiler error: verify_ssa failed

 since predicate computations were inserted in bb_7.

 The issue is obviously that the predicates have already been emitted
 in the target BB - that's of course the wrong place.  This is done
 by insert_gimplified_predicates.

[PATCH][AArch64] Fix usage of +no in error message for aarch64_parse_extension

2014-12-10 Thread Kyrill Tkachov


Hi all,

The error message when parsing feature modifiers can be improved.
Currently, if the user gives something like -march=armv8-a+ the error 
message will read:

error: missing feature modifier after '+no' even though '+no' was not given.

With this patch we will now say:
error: missing feature modifier after '+'
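
A minimal, self-contained sketch of the intended behaviour, assuming a
hypothetical helper toy_check_modifier that looks at a single "+..."
modifier and formats the diagnostic into a caller-supplied buffer (the real
aarch64_parse_extension calls error () and walks the whole string):

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the real aarch64_parse_extension.  STR points
   at the '+' introducing one modifier; MSG_SIZE must be nonzero.
   Returns 0 if a feature name follows, -1 (with MSG filled in) if the
   name is missing.  */
int
toy_check_modifier (const char *str, char *msg, size_t msg_size)
{
  const char *name = str + 1;             /* skip the '+' */
  const char *next = strchr (name, '+');  /* start of the next modifier */
  size_t len = next ? (size_t) (next - name) : strlen (name);
  int adding_ext = 1;

  /* A leading "no" means the extension is being removed.  */
  if (len >= 2 && strncmp (name, "no", 2) == 0)
    {
      adding_ext = 0;
      len -= 2;
    }

  if (len == 0)
    {
      /* Mention "+no" only when the user actually wrote it.  */
      snprintf (msg, msg_size, "missing feature modifier after '%s'",
                adding_ext ? "+" : "+no");
      return -1;
    }

  msg[0] = '\0';
  return 0;
}
```

With this shape, "+" and "+no" each report the prefix the user typed, while
"+crc" or "+nocrc" pass the check.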

Tested aarch64-none-elf.

Ok for trunk?

Thanks,
Kyrill

2014-12-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/aarch64/aarch64.c (aarch64_parse_extension): Update error
message to say +no only when removing extension.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e682edd..cbf0842 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6506,7 +6506,8 @@ aarch64_parse_extension (char *str)
 
   if (len == 0)
 	{
-	  error ("missing feature modifier after %qs", "+no");
+	  error ("missing feature modifier after %qs", adding_ext ? "+"
+	  : "+no");
 	  return;
 	}

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Ilya Enkovich

2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

What is the target?

Ilya


 Andreas.

 --
 Andreas Schwab, SUSE Labs, sch...@suse.de
 GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
 And now for something completely different.

[PATCH] Fix PR64048

2014-12-10 Thread Richard Biener


Committed.

Richard.

2014-12-10  Richard Biener  rguent...@suse.de

PR testsuite/64048
* gcc.dg/tree-prof/peel-1.c: Update dump scanning.

Index: gcc/testsuite/gcc.dg/tree-prof/peel-1.c
===
--- gcc/testsuite/gcc.dg/tree-prof/peel-1.c (revision 218582)
+++ gcc/testsuite/gcc.dg/tree-prof/peel-1.c (working copy)
@@ -20,7 +20,7 @@ main()
 t();
   return 0;
 }
-/* { dg-final-use { scan-rtl-dump "Considering simply peeling loop" 
"loop2_unroll" } } */
+/* { dg-final-use { scan-tree-dump "Peeled loop ., 2 times" "cunroll" } } */
 /* In fact one peeling is enough; we however mispredict number of iterations 
of the loop
at least until loop_ch is schedule ahead of profiling pass.  */
-/* { dg-final-use { cleanup-rtl-dump "loop2_unroll" } } */
+/* { dg-final-use { cleanup-tree-dump "cunroll" } } */

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Andreas Schwab

Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

 What is the target?

Pick any.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

Re: Reducing the amount of builtins by merging named patterns

2014-12-10 Thread Richard Henderson

On 12/10/2014 06:08 AM, Blumental Maxim wrote:
 We have ~30 groups of similar (i.e. having similar sets of attributes)
 named patterns. These groups together include ~230 template (i.e.
 having substitution attributes in their names) named patterns in
 total. So, we can reduce the amount of template named patterns by ~200
 at best. Those template named patterns correspond to several specific
 named patterns each. E.g. in my patch (see attached patch) I merged
 two template named patterns into one and that allowed me to replace
 four builtin's with only two.

I don't find this particularly readable or maintainable.
What do you hope to gain here?


r~

[PATCH x86] Add march/mtune=knl

2014-12-10 Thread Ilya Tocar

Hi,

Patch below adds march/mtune/attribute=knl.
For now this is just silvermont tuning and avx/avx2/avx512 support.
Ok for trunk?
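
As a rough illustration of the -march=native detection order used in the
driver hunk of this patch, here is a sketch.  The struct and function names
are hypothetical, and the "unknown" fallback is a placeholder; the real
host_detect_local_cpu continues with further checks.

```c
#include <stdbool.h>

/* Hypothetical feature flags, standing in for the driver's has_* locals.  */
struct toy_cpu_features
{
  bool has_avx512f;
  bool has_adx;
  bool has_avx2;
};

/* Guess a CPU name for an unrecognized family 0x6 model.  The most
   specific feature is tested first: a part with AVX-512F also reports
   ADX and AVX2, so checking those earlier would misclassify it.  */
const char *
toy_guess_family6_cpu (const struct toy_cpu_features *f)
{
  if (f->has_avx512f)
    return "knl";
  if (f->has_adx)
    return "broadwell";
  if (f->has_avx2)
    return "haswell";
  return "unknown";  /* placeholder for the remaining fallbacks */
}
```

The ordering is the point: each new check has to go in front of the checks
for features it implies.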

gcc/
* config.gcc: Support knl.
* config/i386/driver-i386.c (host_detect_local_cpu): Detect knl.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
PROCESSOR_KNL.
* config/i386/i386.c (m_KNL): Define.
(processor_target_table): Add knl.
(PTA_KNL): Define.
(ix86_issue_rate): Add PROCESSOR_KNL.
(ix86_adjust_cost): Ditto.
(ia32_multipass_dfa_lookahead): Ditto.
(get_builtin_code_for_version): Handle knl.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_KNL): Define.
(processor_type): Add PROCESSOR_KNL.
* config/i386/i386.md (attr cpu): Add knl.
* config/i386/x86-tune.def: Add m_KNL.

gcc/testsuite/
* gcc.target/i386/funcspec-5.c: Test avx512f and knl.

---
 gcc/config.gcc |  3 +-
 gcc/config/i386/driver-i386.c  |  6 +++-
 gcc/config/i386/i386-c.c   |  7 +
 gcc/config/i386/i386.c | 17 ++-
 gcc/config/i386/i386.h |  2 ++
 gcc/config/i386/i386.md|  2 +-
 gcc/config/i386/x86-tune.def   | 47 +++---
 gcc/testsuite/gcc.target/i386/funcspec-5.c |  3 ++
 8 files changed, 60 insertions(+), 27 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index fa3e1fc..8541274 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -591,7 +591,8 @@ pentium4 pentium4m pentiumpro prescott
 x86_64_archs=amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
 bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
 core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
-sandybridge ivybridge haswell broadwell bonnell silvermont x86-64 native
+sandybridge ivybridge haswell broadwell bonnell silvermont knl x86-64 \
+native
 
 # Additional x86 processors supported by --with-cpu=.  Each processor
 # MUST be separated by exactly one space.
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index a2248ce..69ebebd 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -747,7 +747,11 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
  if (arch)
{
  /* This is unknown family 0x6 CPU.  */
- if (has_adx)
+ /* Assume Knl.  */
+ if (has_avx512f)
+   cpu = "knl";
+ /* Assume Broadwell.  */
+ else if (has_adx)
 cpu = "broadwell";
  else if (has_avx2)
/* Assume Haswell.  */
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 3ad7d49..1c604fc3 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -171,6 +171,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__silvermont");
   def_or_undef (parse_in, "__silvermont__");
   break;
+case PROCESSOR_KNL:
+  def_or_undef (parse_in, "__knl");
+  def_or_undef (parse_in, "__knl__");
+  break;
 /* use PROCESSOR_max to not set/unset the arch macro.  */
 case PROCESSOR_max:
   break;
@@ -277,6 +281,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
   def_or_undef (parse_in, "__tune_slm__");
   def_or_undef (parse_in, "__tune_silvermont__");
   break;
+case PROCESSOR_KNL:
+  def_or_undef (parse_in, "__tune_knl__");
+  break;
 case PROCESSOR_INTEL:
 case PROCESSOR_GENERIC:
   break;
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1e1716e..f0cbe48 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2040,6 +2040,7 @@ const struct processor_costs *ix86_cost = pentium_cost;
 #define m_CORE_ALL (m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_HASWELL)
 #define m_BONNELL (1<<PROCESSOR_BONNELL)
 #define m_SILVERMONT (1<<PROCESSOR_SILVERMONT)
+#define m_KNL (1<<PROCESSOR_KNL)
 #define m_INTEL (1<<PROCESSOR_INTEL)
 
 #define m_GEODE (1<<PROCESSOR_GEODE)
@@ -2505,6 +2506,7 @@ static const struct ptt 
processor_target_table[PROCESSOR_max] =
   {"haswell", &core_cost, 16, 10, 16, 10, 16},
   {"bonnell", &atom_cost, 16, 15, 16, 7, 16},
   {"silvermont", &slm_cost, 16, 15, 16, 7, 16},
+  {"knl", &slm_cost, 16, 15, 16, 7, 16},
   {"intel", &intel_cost, 16, 15, 16, 7, 16},
   {"geode", &geode_cost, 0, 0, 0, 0, 0},
   {"k6", &k6_cost, 32, 7, 32, 7, 32},
@@ -3178,6 +3180,8 @@ ix86_option_override_internal (bool main_args_p,
| PTA_FMA | PTA_MOVBE | PTA_HLE)
 #define PTA_BROADWELL \
   (PTA_HASWELL | PTA_ADX | PTA_PRFCHW | PTA_RDSEED)
+#define PTA_KNL \
+  (PTA_BROADWELL | PTA_AVX512PF | PTA_AVX512ER | PTA_AVX512F | PTA_AVX512CD)
 #define PTA_BONNELL \
   (PTA_CORE2 | PTA_MOVBE)
 #define PTA_SILVERMONT \
@@ -3241,6 +3245,7 @@ ix86_option_override_internal (bool main_args_p,
   {"atom", PROCESSOR_BONNELL,

Re: [PATCH, x86] Fix pblendv expand.

2014-12-10 Thread Uros Bizjak

On Wed, Dec 10, 2014 at 12:03 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Wed, Dec 10, 2014 at 02:37:13AM +0300, Evgeny Stupachenko wrote:
 2014-12-10  Evgeny Stupachenko  evstu...@gmail.com

 I went ahead and filed a PR, so we have something to refer to in the
 ChangeLog and name the testcases.

Thanks!

 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -47546,6 +47546,7 @@ expand_vec_perm_pblendv (struct expand_vec_perm_d *d)
  dcopy.op0 = dcopy.op1 = d->op1;
else
  dcopy.op0 = dcopy.op1 = d->op0;
 +  dcopy.target = gen_reg_rtx (vmode);
dcopy.one_operand_p = true;

for (i = 0; i < nelt; ++i)

 This is incorrect if d->testing_p is true (can happen e.g. on the testcase
 below); generally, when testing_p is true, we should not use gen_reg_rtx
 because it can be called from GIMPLE optimizers before init_emit is called.
 See PR57896 for details.

Yes, this was also my concern. We've had some nasty failures occurring
when vReg was generated after reload.
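
A small sketch of the testing_p convention under discussion.  The toy_*
names are hypothetical stand-ins, not the i386 backend's expand_vec_perm_d
or gen_reg_rtx: in "testing" mode the routine may only answer whether the
expansion would work, and must not allocate anything.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the permutation descriptor.  */
struct toy_perm
{
  int target;      /* register number holding the result */
  bool testing_p;  /* true: only ask whether expansion would succeed */
};

/* Hypothetical stand-in for the pseudo-register allocator.  */
int toy_next_reg = 1;

int
toy_new_reg (void)
{
  return toy_next_reg++;
}

/* Prepare a copy of D with a fresh target.  When D is only being tested,
   the copy keeps D's target and no register is allocated, because the
   allocator may not be usable at that point.  */
bool
toy_expand_copy (const struct toy_perm *d, struct toy_perm *dcopy)
{
  *dcopy = *d;
  if (!d->testing_p)
    dcopy->target = toy_new_reg ();
  return true;
}
```

Calling toy_expand_copy on a descriptor with testing_p set leaves
toy_next_reg untouched, which is the property the fix relies on.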

 If d->testing_p is true, target is never the same as any of the operands,
 all 3 are virtual registers, so there is no overlap (and even if there would
 be, it should not matter).

 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/vect/blend.c
 @@ -0,0 +1,63 @@
 +/* Test correctness of size 3 store groups permutation.  */
 +/* { dg-do run } */

 The testcase as is does not fail even with unfixed compiler.  I had to add
 some dg-additional-options.

 +  bar(2, q);
 +  for (i = 0; i < N; i++)
 +if (q[0].a[i].f != 0 || q[0].a[i].c != i || q[0].a[i].p != -1)
 +  return 1;

 Furthermore, tests in the testsuite should fail through abort ()
 / __builtin_abort () instead of just returning non-zero exit code.

 So, here is complete updated patch I'm going to bootstrap/regtest.

 2014-12-10  Jakub Jelinek  ja...@redhat.com
 Evgeny Stupachenko  evstu...@gmail.com

 * config/i386/i386.c (expand_vec_perm_pblendv): If not testing_p,
 set dcopy.target to a new pseudo.

 * gcc.dg/vect/pr64252.c: New test.
 * gcc.dg/pr64252.c: New test.
 * gcc.target/i386/avx2-pr64252.c: New test.

OK.

Thanks,
Uros.


 --- gcc/config/i386/i386.c.jj   2014-12-10 09:45:15.0 +0100
 +++ gcc/config/i386/i386.c  2014-12-10 11:38:12.530795610 +0100
 @@ -47554,6 +47554,8 @@ expand_vec_perm_pblendv (struct expand_v
  dcopy.op0 = dcopy.op1 = d->op1;
else
  dcopy.op0 = dcopy.op1 = d->op0;
 +  if (!d->testing_p)
 +dcopy.target = gen_reg_rtx (vmode);
dcopy.one_operand_p = true;

for (i = 0; i < nelt; ++i)
 --- gcc/testsuite/gcc.dg/vect/pr64252.c.jj  2014-12-10 11:42:47.669991028 
 +0100
 +++ gcc/testsuite/gcc.dg/vect/pr64252.c 2014-12-10 11:47:43.895818223 +0100
 @@ -0,0 +1,66 @@
 +/* PR target/64252 */
 +/* Test correctness of size 3 store groups permutation.  */
 +/* { dg-do run } */
 +/* { dg-additional-options "-O3" } */
 +/* { dg-additional-options "-mavx" { target avx_runtime } } */
 +
 +#include "tree-vect.h"
 +
 +#define N 50
 +
 +enum num3
 +{
 +  a, b, c
 +};
 +
 +struct flags
 +{
 +  enum num3 f;
 +  unsigned int c;
 +  unsigned int p;
 +};
 +
 +struct flagsN
 +{
 +  struct flags a[N];
 +};
 +
 +void
 +bar (int n, struct flagsN *ff)
 +{
 +  struct flagsN *fc;
 +  for (fc = ff + 1; fc < (ff + n); fc++)
 +{
 +  int i;
 +  for (i = 0; i < N; ++i)
 +   {
 + ff->a[i].f = 0;
 + ff->a[i].c = i;
 + ff->a[i].p = -1;
 +   }
 +  for (i = 0; i < n; i++)
 +   {
 + int j;
 + for (j = 0; j < N - n; ++j)
 +   {
 + fc->a[i + j].f = 0;
 + fc->a[i + j].c = j + i;
 + fc->a[i + j].p = -1;
 +   }
 +   }
 +}
 +}
 +
 +struct flagsN q[2];
 +
 +int main()
 +{
 +  int i;
 +  check_vect ();
 +  bar(2, q);
 +  for (i = 0; i < N; i++)
 +if (q[0].a[i].f != 0 || q[0].a[i].c != i || q[0].a[i].p != -1)
 +  abort ();
 +  return 0;
 +}
 +/* { dg-final { cleanup-tree-dump vect } } */
 --- gcc/testsuite/gcc.dg/pr64252.c.jj   2014-12-10 11:26:05.649467180 +0100
 +++ gcc/testsuite/gcc.dg/pr64252.c  2014-12-10 11:25:27.057139759 +0100
 @@ -0,0 +1,30 @@
 +/* PR target/64252 */
 +/* { dg-do run } */
 +/* { dg-options "-O2" } */
 +
 +typedef unsigned int V __attribute__((vector_size (32)));
 +
 +__attribute__((noinline, noclone)) void
 +foo (V *a, V *b, V *c, V *d, V *e)
 +{
 +  V t = __builtin_shuffle (*a, *b, *c);
 +  V v = __builtin_shuffle (t, (V) { ~0U, ~0U, ~0U, ~0U, ~0U, ~0U, ~0U, ~0U 
 }, (V) { 0, 1, 8, 3, 4, 5, 9, 7 });
 +  v = v + *d;
 +  *e = v;
 +}
 +
 +int
 +main ()
 +{
 +  V a, b, c, d, e;
 +  int i;
 +  a = (V) { 1, 2, 3, 4, 5, 6, 7, 8 };
 +  b = (V) { 9, 10, 11, 12, 13, 14, 15, 16 };
 +  c = (V) { 1, 3, 5, 7, 9, 11, 13, 15 };
 +  d = (V) { 0, 0, 0, 0, 0, 0, 0, 0 };
 +  foo (&a, &b, &c, &d, &e);
 +  for (i = 0; i < 8; i++)
 +if (e[i] != ((i == 2 || i == 6) ? ~0U : 2 + 2 * i))
 +  __builtin_abort ();
 +  return 0;
 +}
 --- gcc/testsuite/gcc.target/i386/avx2-pr64252.c.jj 2014-12-10

[PATCH][AARCH64]Use AARCH64_FL_FPSIMD flags for all cores in aarch64-cores.def

2014-12-10 Thread Renlin Li


Hi all,

This patch changes the AARCH64_FL_FPSIMD flags in aarch64-cores.def to 
AARCH64_FL_FOR_ARCH8 for all cores. In addition, the duplicated flags in the 
AARCH64_CORE expansion macro are removed.


After the change, the flags for a core should be defined only in the 
aarch64-cores.def file.
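
To show the idea in isolation, here is a minimal X-macro sketch.  All TOY_*
names are hypothetical, the list macro stands in for the .def file so that
the example is self-contained, and treating the architecture baseline as
FP plus SIMD is an assumption made only for this toy.

```c
#include <stddef.h>

/* Hypothetical feature bits.  */
#define TOY_FL_FP      (1u << 0)
#define TOY_FL_SIMD    (1u << 1)
#define TOY_FL_CRC     (1u << 2)
#define TOY_FL_CRYPTO  (1u << 3)

/* Baseline flags for the architecture (assumed here to be FP + SIMD).  */
#define TOY_FL_FOR_ARCH8 (TOY_FL_FP | TOY_FL_SIMD)

/* Stand-in for the .def file: the single place where a core's complete
   flag set is written down.  */
#define TOY_CORE_LIST(X) \
  X ("core-a", TOY_FL_FOR_ARCH8 | TOY_FL_CRC) \
  X ("core-b", TOY_FL_FOR_ARCH8 | TOY_FL_CRC | TOY_FL_CRYPTO)

struct toy_core
{
  const char *name;
  unsigned flags;
};

/* The expansion macro copies FLAGS verbatim and adds nothing of its own,
   so the table above is the only source of truth.  */
const struct toy_core toy_all_cores[] =
{
#define TOY_CORE(NAME, FLAGS) { NAME, FLAGS },
  TOY_CORE_LIST (TOY_CORE)
#undef TOY_CORE
  { "generic", TOY_FL_FOR_ARCH8 },
  { NULL, 0 }
};
```

The design choice is that a reader of the core list sees every flag a core
gets, without having to inspect the macro that consumes the list.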



aarch64-none-elf has been tested on the model. No new issue.
Okay for trunk?

Regards,
Renlin Li


gcc/ChangeLog:

2014-12-10  Renlin Li  renlin...@arm.com

* config/aarch64/aarch64-cores.def: Change all AARCH64_FL_FPSIMD to
AARCH64_FL_FOR_ARCH8.
* config/aarch64/aarch64.c (all_cores): Use FLAGS from
aarch64-cores.def file only.

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 312941f..110b41f 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -34,10 +34,10 @@
 
 /* V8 Architecture Processors.  */
 
-AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC, cortexa53)
-AARCH64_CORE("cortex-a57",  cortexa15, cortexa15, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC, cortexa57)
-AARCH64_CORE("thunderx",    thunderx,  thunderx, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx)
+AARCH64_CORE("cortex-a53",  cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa53)
+AARCH64_CORE("cortex-a57",  cortexa15, cortexa15, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57)
+AARCH64_CORE("thunderx",    thunderx,  thunderx, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx)
 
 /* V8 big.LITTLE implementations.  */
 
-AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8,  AARCH64_FL_FPSIMD | AARCH64_FL_CRC, cortexa57)
+AARCH64_CORE("cortex-a57.cortex-a53",  cortexa57cortexa53, cortexa53, 8,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC, cortexa57)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index e682edd..0e3b2be 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -374,12 +374,10 @@ struct processor
 static const struct processor all_cores[] =
 {
 #define AARCH64_CORE(NAME, IDENT, SCHED, ARCH, FLAGS, COSTS) \
-  {NAME, SCHED, #ARCH, ARCH,\
-FLAGS | AARCH64_FL_FOR_ARCH##ARCH, &COSTS##_tunings},
+  {NAME, SCHED, #ARCH, ARCH, FLAGS, &COSTS##_tunings},
 #include "aarch64-cores.def"
 #undef AARCH64_CORE
-  {"generic", cortexa53, "8", 8,\
-AARCH64_FL_FPSIMD | AARCH64_FL_FOR_ARCH8, &generic_tunings},
+  {"generic", cortexa53, "8", 8, AARCH64_FL_FOR_ARCH8, &generic_tunings},
   {NULL, aarch64_none, NULL, 0, 0, NULL}
 };

Re: [Ping] Port of VTV for Cygwin and MinGW

2014-12-10 Thread Patrick Wollgast

Ping.

https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03368.html

On 27.11.2014 10:42, Patrick Wollgast wrote:
 On 12.11.2014 19:40, Kai Tietz wrote:
  TerminateProcess is actually bad, as it doesn't call any of the atexit
  handlers.  You simply nuke the process.  For cygwin this behavior
  is unacceptable.  Why does a classical abort or a classical exit call
  cause issues for you?  It seems to me more related to some other
  thing you are trying to paper over with this.

 
 It turns out the test program made some trouble. I rewrote it to the
 attached program (virtual_func_test_min_AW.cpp). I changed obstack.c and
 vtv_rts.cc to the C-runtime functions. For testing I used a program just
 containing an abort and all three tests in the attached test program.
 The call stack, passed parameters and behavior matched at the crucial
 parts (tested again on MinGW 32/64bit).
 

 Regarding the question, why I reimplemented mprotect, I also haven't
 changed anything in the patch but answered the question.

  And this doesn't make it better.  It is present in the static part of
  libgcc.  Have you tried to declare it with extern "C" (for the C++ case)
  and simply use it?
  Cygwin provides its own version too.  So there seems to me to be no real
  need to re-implement it.

 
 You're right. I was stuck with the idea of importing it dynamically, but
 changed it to extern "C" now.
 
 Regards,
 Patrick

Re: [PATCH x86] Add march/mtune=knl

2014-12-10 Thread Uros Bizjak

On Wed, Dec 10, 2014 at 5:20 PM, Ilya Tocar tocarip.in...@gmail.com wrote:
 Hi,

 The patch below adds march/mtune/attribute=knl.
 For now this is just silvermont tuning and avx/avx2/avx512 support.
 Ok for trunk?

 gcc/
 * config.gcc: Support knl.
 * config/i386/driver-i386.c (host_detect_local_cpu): Detect knl.
 * config/i386/i386-c.c (ix86_target_macros_internal): Handle
 PROCESSOR_KNL.
 * config/i386/i386.c (m_KNL): Define.
 (processor_target_table): Add knl.
 (PTA_KNL): Define.
 (ix86_issue_rate): Add PROCESSOR_KNL.
 (ix86_adjust_cost): Ditto.
 (ia32_multipass_dfa_lookahead): Ditto.
 (get_builtin_code_for_version): Handle knl.
 (fold_builtin_cpu): Ditto.
 * config/i386/i386.h (TARGET_KNL): Define.
 (processor_type): Add PROCESSOR_KNL.
 * config/i386/i386.md (attr cpu): Add knl.
 * config/i386/x86-tune.def: Add m_KNL.

 gcc/testsuite/
 * gcc.target/i386/funcspec-5.c: Test avx512f and knl.

OK with a small comment nit below.

Thanks,
Uros.
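
The ordering of the feature checks in the driver hunk quoted below can be
sketched as follows. This is only an illustration under assumptions: the
function name guess_family6_cpu and the "unknown" fallback are hypothetical,
and the real driver continues with further checks for older cores. The point
is that the most specific feature has to be tested first, because a CPU that
reports AVX-512F will normally report AVX2 as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pick a CPU name for an unknown family 0x6 part from its feature bits.
   The order mirrors the quoted hunk: avx512f, then adx, then avx2.  */
const char *guess_family6_cpu(bool has_avx512f, bool has_adx, bool has_avx2)
{
    if (has_avx512f)
        return "knl";         /* Assume Knights Landing.  */
    if (has_adx)
        return "broadwell";   /* Assume Broadwell.  */
    if (has_avx2)
        return "haswell";     /* Assume Haswell.  */
    return "unknown";         /* Placeholder for the remaining checks.  */
}

/* Small self-check: avx512f wins even when the other bits are set too.  */
void check_guess_order(void)
{
    assert(strcmp(guess_family6_cpu(true, true, true), "knl") == 0);
    assert(strcmp(guess_family6_cpu(false, true, true), "broadwell") == 0);
}
```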


 ---
  gcc/config.gcc |  3 +-
  gcc/config/i386/driver-i386.c  |  6 +++-
  gcc/config/i386/i386-c.c   |  7 +
  gcc/config/i386/i386.c | 17 ++-
  gcc/config/i386/i386.h |  2 ++
  gcc/config/i386/i386.md|  2 +-
  gcc/config/i386/x86-tune.def   | 47 +++---
  gcc/testsuite/gcc.target/i386/funcspec-5.c |  3 ++
  8 files changed, 60 insertions(+), 27 deletions(-)

 diff --git a/gcc/config.gcc b/gcc/config.gcc
 index fa3e1fc..8541274 100644
 --- a/gcc/config.gcc
 +++ b/gcc/config.gcc
 @@ -591,7 +591,8 @@ pentium4 pentium4m pentiumpro prescott
  x86_64_archs=amdfam10 athlon64 athlon64-sse3 barcelona bdver1 bdver2 \
  bdver3 bdver4 btver1 btver2 k8 k8-sse3 opteron opteron-sse3 nocona \
  core2 corei7 corei7-avx core-avx-i core-avx2 atom slm nehalem westmere \
 -sandybridge ivybridge haswell broadwell bonnell silvermont x86-64 native
 +sandybridge ivybridge haswell broadwell bonnell silvermont knl x86-64 \
 +native

  # Additional x86 processors supported by --with-cpu=.  Each processor
  # MUST be separated by exactly one space.
 diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
 index a2248ce..69ebebd 100644
 --- a/gcc/config/i386/driver-i386.c
 +++ b/gcc/config/i386/driver-i386.c
 @@ -747,7 +747,11 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   if (arch)
 {
   /* This is unknown family 0x6 CPU.  */
 - if (has_adx)
 + /* Assume Knl.  */

/* Assume Knights Landing.  */

 + if (has_avx512f)
 +   cpu = "knl";
 + /* Assume Broadwell.  */
 + else if (has_adx)
 cpu = "broadwell";
   else if (has_avx2)
 /* Assume Haswell.  */
 diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
 index 3ad7d49..1c604fc3 100644
 --- a/gcc/config/i386/i386-c.c
 +++ b/gcc/config/i386/i386-c.c
 @@ -171,6 +171,10 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
def_or_undef (parse_in, "__silvermont");
def_or_undef (parse_in, "__silvermont__");
break;
 +case PROCESSOR_KNL:
 +  def_or_undef (parse_in, "__knl");
 +  def_or_undef (parse_in, "__knl__");
 +  break;
  /* use PROCESSOR_max to not set/unset the arch macro.  */
  case PROCESSOR_max:
break;
 @@ -277,6 +281,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
def_or_undef (parse_in, "__tune_slm__");
def_or_undef (parse_in, "__tune_silvermont__");
break;
 +case PROCESSOR_KNL:
 +  def_or_undef (parse_in, "__tune_knl__");
 +  break;
  case PROCESSOR_INTEL:
  case PROCESSOR_GENERIC:
break;
 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 1e1716e..f0cbe48 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -2040,6 +2040,7 @@ const struct processor_costs *ix86_cost = pentium_cost;
  #define m_CORE_ALL (m_CORE2 | m_NEHALEM  | m_SANDYBRIDGE | m_HASWELL)
  #define m_BONNELL (1<<PROCESSOR_BONNELL)
  #define m_SILVERMONT (1<<PROCESSOR_SILVERMONT)
 +#define m_KNL (1<<PROCESSOR_KNL)
  #define m_INTEL (1<<PROCESSOR_INTEL)

  #define m_GEODE (1<<PROCESSOR_GEODE)
 @@ -2505,6 +2506,7 @@ static const struct ptt processor_target_table[PROCESSOR_max] =
{"haswell", &core_cost, 16, 10, 16, 10, 16},
{"bonnell", &atom_cost, 16, 15, 16, 7, 16},
{"silvermont", &slm_cost, 16, 15, 16, 7, 16},
 +  {"knl", &slm_cost, 16, 15, 16, 7, 16},
{"intel", &intel_cost, 16, 15, 16, 7, 16},
{"geode", &geode_cost, 0, 0, 0, 0, 0},
{"k6", &k6_cost, 32, 7, 32, 7, 32},
 @@ -3178,6 +3180,8 @@ ix86_option_override_internal (bool main_args_p,
 | PTA_FMA | PTA_MOVBE | PTA_HLE)
  #define PTA_BROADWELL \
(PTA_HASWELL | PTA_ADX | PTA_PRFCHW | PTA_RDSEED)

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Ilya Enkovich

2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

 What is the target?

 Pick any.

Here it is.

./gcc/xgcc -v
Using built-in specs.
COLLECT_GCC=./gcc/xgcc
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc/configure --enable-languages=c,c++,fortran
--disable-bootstrap
Thread model: posix
gcc version 5.0.0 20141210 (experimental) (GCC)
make check RUNTESTFLAGS="--target_board='unix{-m32,}' lto.exp=chkp-privatize*"
...

=== gcc Summary ===

# of expected passes            6
...
grep PASS ./gcc/testsuite/gcc/gcc.log
PASS: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,
-fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
PASS: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_1.o assemble,
-fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
PASS: gcc.dg/lto/chkp-privatize
c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o link,  -fPIC -flto
-flto-partition=max -fcheck-pointer-bounds -mmpx
PASS: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,
-fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
PASS: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_1.o assemble,
-fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
PASS: gcc.dg/lto/chkp-privatize
c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o link,  -fPIC -flto
-flto-partition=max -fcheck-pointer-bounds -mmpx
grep FAIL ./gcc/testsuite/gcc/gcc.log

Ilya


 Andreas.

 --
 Andreas Schwab, SUSE Labs, sch...@suse.de
 GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
 And now for something completely different.

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Andreas Schwab

Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

 What is the target?

 Pick any.

 Here it is.

Try again.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

Re: [PATCH x86] Enable v64qi permutations.

2014-12-10 Thread Richard Henderson

On 12/04/2014 01:49 AM, Ilya Tocar wrote:
 +  if (!TARGET_AVX512BW || !(d->vmode == V64QImode))

Please don't over-complicate the expression.
Use x != y instead of !(x == y).


r~
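
For reference, the two spellings under discussion are logically identical;
the request is purely about readability. A minimal sketch, assuming a
hypothetical two-value mode enum in place of GCC's machine_mode and a plain
bool in place of TARGET_AVX512BW:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the machine modes involved.  */
enum toy_mode { TOY_V32QImode, TOY_V64QImode };

/* The condition as written in the patch: !X || !(a == b).  */
bool reject_original(bool target_avx512bw, enum toy_mode vmode)
{
    return !target_avx512bw || !(vmode == TOY_V64QImode);
}

/* The condition as suggested in review: !X || a != b.  */
bool reject_suggested(bool target_avx512bw, enum toy_mode vmode)
{
    return !target_avx512bw || vmode != TOY_V64QImode;
}

/* Exhaustively confirm that both forms agree on every input.  */
void check_equivalence(void)
{
    for (int bw = 0; bw <= 1; bw++)
        for (int m = TOY_V32QImode; m <= TOY_V64QImode; m++)
            assert(reject_original(bw != 0, (enum toy_mode) m)
                   == reject_suggested(bw != 0, (enum toy_mode) m));
}
```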

[PATCH][ARM][doc] Remove mention of Advanced RISC Machines

2014-12-10 Thread Kyrill Tkachov


Hi all,

The company ARM is not Advanced RISC Machines anymore and anyway the ARM 
architecture is distinct from the company.

This patch adjusts the documentation accordingly.

Ok?

Thanks,
Kyrill

2014-12-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* doc/invoke.texi (ARM options): Remove mention of Advanced RISC
Machines.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index f8fe15f..baf3c9f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -12667,8 +12667,7 @@ Replaced by @samp{-mmultcost}.
 @subsection ARM Options
 @cindex ARM options
 
-These @samp{-m} options are defined for Advanced RISC Machines (ARM)
-architectures:
+These @samp{-m} options are defined for the ARM port:
 
 @table @gcctabopt
 @item -mabi=@var{name}

Re: [PATCH x86] Enable v64qi permutations.

2014-12-10 Thread Robert Dewar


On 12/10/2014 11:49 AM, Richard Henderson wrote:

On 12/04/2014 01:49 AM, Ilya Tocar wrote:

+  if (!TARGET_AVX512BW || !(d->vmode == V64QImode))


Please don't over-complicate the expression.
Use x != y instead of !(x == y).


To me the original reads more clearly, since it
is of the parallel form !X or !Y, I don't see it
as somehow more complicated???



r~

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Ilya Enkovich

2014-12-10 19:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

 What is the target?

 Pick any.

 Here it is.

 Try again.

Same result.  And see no errors for this test in gcc-regression.

Ilya


 Andreas.

 --
 Andreas Schwab, SUSE Labs, sch...@suse.de
 GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
 And now for something completely different.

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Andreas Schwab

Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 19:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

 2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
 Ilya Enkovich enkovich@gmail.com writes:

   * gcc.dg/lto/lto.exp: Load mpx-dg.exp.
   * gcc.dg/lto/chkp-privatize_0.c: New.
   * gcc.dg/lto/chkp-privatize_1.c: New.

 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  
 -fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx

 What is the target?

 Pick any.

 Here it is.

 Try again.

 Same result.  And see no errors for this test in gcc-regression.

Try again.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.

Re: [patch c++]: Fix PR/64127 ICE on invalid: tree check: expected identifier_node, have template_id_expr in cp_parser_diagnose_invalid_type_name

2014-12-10 Thread Paolo Carlini


Hi,

On 12/04/2014 07:17 PM, Kai Tietz wrote:

So I added a testcase for this PR (it's C++98 only)

So:

ChangeLog testsuite

2014-12-04  Kai Tietz  kti...@redhat.com

   PR c++/64127
   * g++.dg/cpp/pr64127.C: New file.

Tested on x86_64-unknown-linux-gnu.

Ok to apply prior posted patch plus this new testcase?

The testcase is now in and is spuriously failing for everybody: in the
testsuite, and thus with -pedantic-errors, in one case an error is emitted
instead of a warning. In addition, the dg- directives are unnecessarily
complicated and slightly wrong. Barring objections, I mean to commit the
patch below pretty soon.


Thanks,
Paolo.

/
Index: g++.dg/cpp/pr64127.C
===
--- g++.dg/cpp/pr64127.C(revision 218586)
+++ g++.dg/cpp/pr64127.C(working copy)
@@ -1,9 +1,4 @@
 /* { dg-do compile { target c++98_only } } */
 
-template 0 int __copy_streambufs_eof; // { dg-error  }
-// { dg-error numeric constant  { target *-*-* } 3 }
-// { dg-warning variable templates  { target *-*-* } 3 }
-__copy_streambufs_eof  // { dg-error  }
-// { dg-error parse error  { target *-*-* } 6 }
-// { dg-error not name a type  { target *-*-* } 6 }
-
+template 0 int __copy_streambufs_eof; // { dg-error expected identifier|numeric constant|variable templates }
+__copy_streambufs_eof  // { dg-error template argument|parse error|not name a type }

[PATCH][libstdc++][testsuite][2/2] Mark tests that don't fit into memory as UNSUPPORTED

2014-12-10 Thread Kyrill Tkachov


Hi all,

Here is the second part that includes the new target-utils.exp in the 
libstdc++ testsuite and uses the check_unsupported_p procedure to mark 
tests that are too large for memory as unsupported.


Ok?

Thanks,
Kyrill
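
As a rough illustration of what the check does, here is a sketch in C of the
output scan. It is not the Tcl procedure itself: the regexps are replaced by
plain substring searches on each line, the SPU "exceeds local store" case is
left out, and the names line_has_both and check_unsupported are hypothetical.
The message texts searched for are the ones that appear in the existing
gcc-defs.exp procedure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if some single line of OUTPUT contains FIRST followed, later on the
   same line, by SECOND.  An empty SECOND always matches.  */
static bool line_has_both(const char *output, const char *first,
                          const char *second)
{
    size_t flen = strlen(first), slen = strlen(second);
    const char *line = output;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t) (end - line) : strlen(line);

        for (size_t i = 0; i + flen <= len; i++) {
            if (strncmp(line + i, first, flen) != 0)
                continue;
            /* FIRST found; look for SECOND in the rest of this line.  */
            const char *rest = line + i + flen;
            size_t rlen = len - i - flen;
            for (size_t j = 0; j + slen <= rlen; j++)
                if (strncmp(rest + j, second, slen) == 0)
                    return true;
            break;
        }
        if (end == NULL)
            break;
        line = end + 1;
    }
    return false;
}

/* Return "memory full" if the tool output suggests the program did not fit
   into memory, otherwise the empty string.  */
const char *check_unsupported(const char *output, bool target_is_tiny)
{
    assert(output != NULL);
    if (line_has_both(output, ": region ", " is full"))
        return "memory full";
    if (target_is_tiny
        && line_has_both(output, ": relocation truncated to fit", ""))
        return "memory full";
    return "";
}
```

A caller would treat a non-empty result as the reason to report the test as
UNSUPPORTED rather than FAIL.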

2014-12-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* testsuite/lib/libstdc++.exp: Include target-utils.exp.
(v3_target_compile): Check if test is unsupported.
(v3_target_compile_as_c): Likewise.

commit 13abe3bbb6deab3de44935dea6c5fd9d62509ae7
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Dec 3 10:33:44 2014 +

[libstdc++][testsuite] Check for programs not fitting into tiny memory models

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
index 3d9913b..56649bb 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -57,6 +57,7 @@ load_gcc_lib target-libpath.exp
 load_gcc_lib timeout.exp
 load_gcc_lib timeout-dg.exp
 load_gcc_lib wrapper.exp
+load_gcc_lib target-utils.exp
 
 # Useful for debugging.  Pass the name of a variable and the verbosity
 # threshold (number of -v's on the command line).
@@ -455,6 +457,7 @@ proc v3_target_compile { source dest type options } {
 global cxxldflags
 global includes
 global STATIC_LIBCXXFLAGS
+global tool
 
 if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
 lappend options "libs=${gluefile}"
@@ -483,7 +486,14 @@ proc v3_target_compile { source dest type options } {
 lappend options compiler=$cxx_final
 lappend options timeout=[timeout_value]
 
-return [target_compile $source $dest $type $options]
+set comp_output [target_compile $source $dest $type $options]
+set unsupported_message [${tool}_check_unsupported_p $comp_output]
+
+if { $unsupported_message != "" } {
+  unsupported "$dest: $unsupported_message"
+  return ""
+}
+return $comp_output
 }
 
 
@@ -498,6 +508,7 @@ proc v3_target_compile_as_c { source dest type options } {
 global cc
 global cxxflags
 global STATIC_LIBCXXFLAGS
+global tool
 
 if { [target_info needs_status_wrapper] != "" && [info exists gluefile] } {
 lappend options "libs=${gluefile}"
@@ -551,7 +562,14 @@ proc v3_target_compile_as_c { source dest type options } {
 lappend options compiler=$cc_final
 lappend options timeout=[timeout_value]
 
-return [target_compile $source $dest $type $options]
+set comp_output [target_compile $source $dest $type $options]
+set unsupported_message [${tool}_check_unsupported_p $comp_output]
+
+if { $unsupported_message != "" } {
+  unsupported "$dest: $unsupported_message"
+  return ""
+}
+return $comp_output
 }
 
 # Build the support objects linked in with the libstdc++ tests.  In

Re: [PATCH][libstdc++][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model

2014-12-10 Thread Kyrill Tkachov



On 09/12/14 20:14, Mike Stump wrote:

On Dec 9, 2014, at 3:17 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

In the gcc and g++ testsuite we already catch such cases and mark them as 
UNSUPPORTED. In libstdc++.exp there is no such functionality.
I'm not very happy that it had to be copied, but I couldn't find a way to 
include the gcc definition sanely.
Is this a sane approach to what I'm trying to solve?

Ok.

If you would like, you can try and pull the common parts out into a new file, 
and include (load) that file from the two places that currently do that.  If 
they are exactly identical, should be trivial enough.  If not exactly the same, 
I’d do two patches, once to make them the same, then, the second one to split 
them out.


Thanks for the guidance. I've moved the definitions into a separate file 
and included that in the places that use it (more than 2 places by my 
count). This is the patch attached.


The second patch (will send shortly after this) adds the logic to libstdc++.

Ok?

Kyrill


2014-12-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* lib/target-utils.exp: New file.
* lib/gcc-defs.exp (${tool}_check_unsupported_p): Delete.
Include target-utils.exp.
* lib/objc.exp: Likewise.
* lib/mike-gcc.exp: Load target-utils.exp.
* lib/mike-g++.exp: Likewise.
* lib/go-torture.exp: Likewise.
* lib/fortran-torture.exp: Likewise.
* lib/c-torture.exp: Likewise.

commit e0a9ae608b48901cc97efa21ac330c6b0dcf8110
Author: Kyrylo Tkachov kyrylo.tkac...@arm.com
Date:   Wed Dec 3 10:33:44 2014 +

[libstdc++][testsuite] Check for programs not fitting into tiny memory models

diff --git a/gcc/testsuite/lib/c-torture.exp b/gcc/testsuite/lib/c-torture.exp
index fde76fd..3e33962 100644
--- a/gcc/testsuite/lib/c-torture.exp
+++ b/gcc/testsuite/lib/c-torture.exp
@@ -19,6 +19,7 @@
 load_lib target-supports.exp
 load_lib file-format.exp
 load_lib target-libpath.exp
+load_lib target-utils.exp
 
 # The default option list can be overridden by
 # TORTURE_OPTIONS={ { list1 } ... { listN } }
diff --git a/gcc/testsuite/lib/fortran-torture.exp b/gcc/testsuite/lib/fortran-torture.exp
index e7abac8..cbc3427 100644
--- a/gcc/testsuite/lib/fortran-torture.exp
+++ b/gcc/testsuite/lib/fortran-torture.exp
@@ -22,6 +22,7 @@
 
 load_lib target-supports.exp
 load_lib fortran-modules.exp
+load_lib target-utils.exp
 
 # Return the list of options to use for fortran torture tests.
 # The default option list can be overridden by
diff --git a/gcc/testsuite/lib/gcc-defs.exp b/gcc/testsuite/lib/gcc-defs.exp
index d479667..a9c0d61 100644
--- a/gcc/testsuite/lib/gcc-defs.exp
+++ b/gcc/testsuite/lib/gcc-defs.exp
@@ -18,6 +18,8 @@ load_lib target-libpath.exp
 
 load_lib wrapper.exp
 
+load_lib target-utils.exp
+
 #
 # ${tool}_check_compile -- Reports and returns pass/fail for a compilation
 #
@@ -145,34 +147,6 @@ proc ${tool}_exit { } {
 	unset gluefile
 }
 }
-
-#
-# ${tool}_check_unsupported_p -- Check the compiler(/assembler/linker) output 
-#	for text indicating that the testcase should be marked as unsupported
-#
-# Utility used by mike-gcc.exp and c-torture.exp.
-# When dealing with a large number of tests, it's difficult to weed out the
-# ones that are too big for a particular cpu (eg: 16 bit with a small amount
-# of memory).  There are various ways to deal with this.  Here's one.
-# Fortunately, all of the cases where this is likely to happen will be using
-# gld so we can tell what the error text will look like.
-#
-
-proc ${tool}_check_unsupported_p { output } {
-if [regexp "(^|\n)\[^\n\]*: region \[^\n\]* is full" $output] {
-	return "memory full"
-}
-if { [regexp "(^|\n)\[^\n\]*: relocation truncated to fit" $output]
-   && [check_effective_target_tiny] } {
-return "memory full"
- }
-
-if { [istarget spu-*-*] && \
-	 [string match "*exceeds local store*" $output] } {
-	return "memory full"
-}
-return ""
-}
 
 #
 # runtest_file_p -- Provide a definition for older dejagnu releases
diff --git a/gcc/testsuite/lib/go-torture.exp b/gcc/testsuite/lib/go-torture.exp
index d37d475..fc2f559 100644
--- a/gcc/testsuite/lib/go-torture.exp
+++ b/gcc/testsuite/lib/go-torture.exp
@@ -22,6 +22,8 @@
 
 load_lib target-supports.exp
 
+load_lib target-utils.exp
+
 # The default option list can be overridden by
 # TORTURE_OPTIONS={ { list1 } ... { listN } }
 
diff --git a/gcc/testsuite/lib/mike-g++.exp b/gcc/testsuite/lib/mike-g++.exp
index d5f31a8..e60dff8 100644
--- a/gcc/testsuite/lib/mike-g++.exp
+++ b/gcc/testsuite/lib/mike-g++.exp
@@ -16,6 +16,8 @@
 
 # This file was written by Mike Stump m...@cygnus.com
 
+load_lib target-utils.exp
+
 #
 # mike_cleanup -- remove any files that are created by the testcase
 #
diff --git a/gcc/testsuite/lib/mike-gcc.exp b/gcc/testsuite/lib/mike-gcc.exp
index 68cca23..b2705e6 100644
--- a/gcc/testsuite/lib/mike-gcc.exp
+++ b/gcc/testsuite/lib/mike-gcc.exp
@@ -16,6 +16,8 @@
 
 # This file was derived from mike-g++.exp written by

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread James Greenhalgh

On Wed, Dec 10, 2014 at 05:13:22PM +, Andreas Schwab wrote:
 Ilya Enkovich enkovich@gmail.com writes:
 
  2014-12-10 19:49 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
  2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
  2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
* gcc.dg/lto/lto.exp: Load mpx-dg.exp.
* gcc.dg/lto/chkp-privatize_0.c: New.
* gcc.dg/lto/chkp-privatize_1.c: New.
 
  xgcc: error: unrecognized command line option '-mmpx'
 
  FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  
  -fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
 
  What is the target?
 
  Pick any.
 
  Here it is.
 
  Try again.
 
  Same result.  And see no errors for this test in gcc-regression.
 
 Try again.

In the interests of breaking us out of this loop...

At least: {arm-none-linux-gnueabihf, arm-none-eabi,
aarch64-none-linux-gnueabi, aarch64-none-elf, aarch64_be-none-elf}
fail for me.

Cheers,
James

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Ilya Enkovich

2014-12-10 21:07 GMT+03:00 James Greenhalgh james.greenha...@arm.com:
 On Wed, Dec 10, 2014 at 05:13:22PM +, Andreas Schwab wrote:
 Ilya Enkovich enkovich@gmail.com writes:

  2014-12-10 19:49 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
  2014-12-10 18:53 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
  2014-12-10 17:49 GMT+03:00 Andreas Schwab sch...@suse.de:
  Ilya Enkovich enkovich@gmail.com writes:
 
* gcc.dg/lto/lto.exp: Load mpx-dg.exp.
* gcc.dg/lto/chkp-privatize_0.c: New.
* gcc.dg/lto/chkp-privatize_1.c: New.
 
  xgcc: error: unrecognized command line option '-mmpx'
 
  FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  
  -fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
 
  What is the target?
 
  Pick any.
 
  Here it is.
 
  Try again.
 
  Same result.  And see no errors for this test in gcc-regression.

 Try again.

 In the interests of breaking us out of this loop...

 At least: {arm-none-linux-gnueabihf, arm-none-eabi,
 aarch64-none-linux-gnueabi, aarch64-none-elf, aarch64_be-none-elf}
 fail for me.

Hello James,

Could you please attach a gcc.log file for one of your runs?

Thanks,
Ilya


 Cheers,
 James

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread James Greenhalgh

On Wed, Dec 10, 2014 at 06:19:08PM +, Ilya Enkovich wrote:
 2014-12-10 21:07 GMT+03:00 James Greenhalgh james.greenha...@arm.com:
  In the interests of breaking us out of this loop...
 
  At least: {arm-none-linux-gnueabihf, arm-none-eabi,
  aarch64-none-linux-gnueabi, aarch64-none-elf, 
  aarch64_be-none-elf}
  fail for me.
 
 Hello James,
 
 Could you please attach a gcc.log file for one of your runs?

Sure, see below.

However, the problem is your dg-require-effective-target line for the test.

From the documentation:

{ dg-require-effective-target keyword [{ selector }] }
Skip the test if the test target, including current multilib flags, is not
covered by the effective-target keyword. If the directive includes the
optional ‘{ selector }’ then the effective-target test is only performed if
the target system matches the selector. 

So what you've written is really the opposite of what you meant :)

i.e.

/* { dg-require-effective-target mpx { target { i?86-*-* x86_64-*-* } } } */

Should look more like:

/* { dg-require-effective-target mpx } */

Which gets me the UNSUPPORTED I expect to see for this test.

Cheers,
James
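
The behaviour described in the documentation excerpt above can be modelled as
a small decision function. This is only a sketch of the semantics as quoted,
not DejaGnu code; the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of { dg-require-effective-target keyword [{ selector }] }.
   Returns true if the test is skipped (reported as UNSUPPORTED).  */
bool test_is_skipped(bool has_selector, bool target_matches_selector,
                     bool target_supports_keyword)
{
    /* Without a selector, or when the selector matches, the
       effective-target check is performed.  */
    if (!has_selector || target_matches_selector)
        return !target_supports_keyword;
    /* The selector does not match: the check is not performed at all,
       so the test is not skipped and goes on to run.  */
    return false;
}

/* The situation from this thread: an AArch64 target without MPX.  */
void check_thread_case(void)
{
    /* With the x86-only selector, the test still runs (and FAILs).  */
    assert(!test_is_skipped(true, false, false));
    /* Without the selector, the test is skipped as intended.  */
    assert(test_is_skipped(false, false, false));
}
```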

---
Executing on host: /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/  
-fno-diagnostics-show-caret -fdiagnostics-color=never   -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx   -c -specs=aem-ve.specs
-mcmodel=small  -o c_lto_chkp-privatize_0.o 
/work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c(timeout 
= 300)
spawn /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/ 
-fno-diagnostics-show-caret -fdiagnostics-color=never -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx -c -specs=aem-ve.specs 
-mcmodel=small -o c_lto_chkp-privatize_0.o 
/work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
xgcc: error: unrecognized command line option '-mmpx'
compiler exited with status 1
output is:
xgcc: error: unrecognized command line option '-mmpx'

FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx 
Executing on host: /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/  
-fno-diagnostics-show-caret -fdiagnostics-color=never   -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx   -c -specs=aem-ve.specs
-mcmodel=small  -o c_lto_chkp-privatize_1.o 
/work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c(timeout 
= 300)
spawn /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
-B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/ 
-fno-diagnostics-show-caret -fdiagnostics-color=never -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx -c -specs=aem-ve.specs 
-mcmodel=small -o c_lto_chkp-privatize_1.o 
/work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c
xgcc: error: unrecognized command line option '-mmpx'
compiler exited with status 1
output is:
xgcc: error: unrecognized command line option '-mmpx'

FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_1.o assemble,  -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx 
UNRESOLVED: gcc.dg/lto/chkp-privatize 
c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o link  -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx 
UNRESOLVED: gcc.dg/lto/chkp-privatize 
c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o execute  -fPIC -flto 
-flto-partition=max -fcheck-pointer-bounds -mmpx

[PATCH, committed] Document the JIT C++ bindings (libgccjit++.h)

2014-12-10 Thread David Malcolm

The JIT documentation only covered libgccjit.h, not
libgccjit++.h.

The following patch expands the JIT docs to cover these C++ bindings. 

These docs began as a copy of the C documentation, but have numerous
changes and additional material to reflect the differences between the
APIs (e.g. overloaded operators, default params, use of actual
inheritance, etc).

I also ported the examples from the C tutorial to the C++ API, so e.g.
for tut02-square.c there's now also a tut02-square.cc showing the same
material but using the C++ API.

The patch also contains some fixes for the C documentation I noticed at
the same time.

I've verified the build of HTML and PDF docs.

Committed to trunk as r218588.

2014-12-10  David Malcolm  dmalc...@redhat.com

* docs/cp/index.rst: New file.
* docs/cp/intro/index.rst: New file.
* docs/cp/intro/tutorial01.rst: New file.
* docs/cp/intro/tutorial02.rst: New file.
* docs/cp/intro/tutorial03.rst: New file.
* docs/cp/intro/tutorial04.rst: New file.
* docs/cp/topics/contexts.rst: New file.
* docs/cp/topics/expressions.rst: New file.
* docs/cp/topics/functions.rst: New file.
* docs/cp/topics/index.rst: New file.
* docs/cp/topics/locations.rst: New file.
* docs/cp/topics/objects.rst: New file.
* docs/cp/topics/results.rst: New file.
* docs/cp/topics/types.rst: New file.
* docs/examples/tut01-hello-world.cc: New file.
* docs/examples/tut02-square.c: Fix missing newline in output.
* docs/examples/tut02-square.cc: New file.
* docs/examples/tut03-sum-of-squares.cc: New file.
* docs/examples/tut04-toyvm/toyvm.cc: New file.
* docs/index.rst: Move summary to above the table of contents.
Add text about the C vs C++ APIs.
* docs/topics/contexts.rst: Fix a typo.

* docs/_build/texinfo/libgccjit.texi: Regenerate.
* docs/_build/texinfo/factorial1.png: New file.
* docs/_build/texinfo/sum-of-squares1.png: New file.



r218588.patch.gz
Description: GNU Zip compressed data

Re: locales fixes

2014-12-10 Thread François Dumont


On 10/12/2014 00:24, Jonathan Wakely wrote:

I still say the change to #if __GLIBC__ > 2 || __GLIBC_MINOR__ >= 20
is simply wrong. It works for me with 2.18, but if it doesn't work
with 2.19 on Debian then why should it work any better with 2.20?


Yes, it was indeed a wild guess because I didn't understand what 
the relation should be between the glibc version and the locale data 
returned by the system. I shouldn't have done that.


But what should we do then? Consider those tests as simply unsupported on 
systems where the glibc version doesn't give any info regarding what to expect?


François

Re: [PATCH] TYPE_OVERFLOW_* cleanup

2014-12-10 Thread Marc Glisse


On Wed, 10 Dec 2014, Marek Polacek wrote:


@@ -482,6 +487,15 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
   || TREE_CODE (TYPE) == BOOLEAN_TYPE \
   || TREE_CODE (TYPE) == INTEGER_TYPE)

+/* Nonzero if TYPE represents an integral type, including complex
+   and vector integer types.  */
+
+#define ANY_INTEGRAL_TYPE_P(TYPE)  \
+  (INTEGRAL_TYPE_P (TYPE)  \
+   || ((TREE_CODE (TYPE) == COMPLEX_TYPE   \
+|| VECTOR_TYPE_P (TYPE))   \
+   && INTEGRAL_TYPE_P (TREE_TYPE (TYPE))))
+
/* Nonzero if TYPE represents a non-saturating fixed-point type.  */

#define NON_SAT_FIXED_POINT_TYPE_P(TYPE) \
@@ -771,7 +785,7 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
/* True if overflow wraps around for the given integral type.  That
   is, TYPE_MAX + 1 == TYPE_MIN.  */
#define TYPE_OVERFLOW_WRAPS(TYPE) \
-  (TYPE_UNSIGNED (TYPE) || flag_wrapv)
+  (ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag || flag_wrapv)

/* True if overflow is undefined for the given integral type.  We may
   optimize on the assumption that values in the type never overflow.
@@ -781,13 +795,14 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
   it will be appropriate to issue the warning immediately, and in
   other cases it will be appropriate to simply set a flag and let the
   caller decide whether a warning is appropriate or not.  */
-#define TYPE_OVERFLOW_UNDEFINED(TYPE) \
-  (!TYPE_UNSIGNED (TYPE) && !flag_wrapv && !flag_trapv && flag_strict_overflow)
+#define TYPE_OVERFLOW_UNDEFINED(TYPE)  \
+  (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag\
+   && !flag_wrapv && !flag_trapv && flag_strict_overflow)

/* True if overflow for the given integral type should issue a
   trap.  */
#define TYPE_OVERFLOW_TRAPS(TYPE) \
-  (!TYPE_UNSIGNED (TYPE) && flag_trapv)
+  (!ANY_INTEGRAL_TYPE_CHECK(TYPE)->base.u.bits.unsigned_flag && flag_trapv)

/* True if an overflow is to be preserved for sanitization.  */
#define TYPE_OVERFLOW_SANITIZED(TYPE)   \
@@ -2990,6 +3005,20 @@ omp_clause_elt_check (tree __t, int __i,
  return &__t->omp_clause.ops[__i];
}

+/* These checks have to be special cased.  */
+
+inline tree
+any_integral_type_check (tree __t, const char *__f, int __l, const char *__g)
+{
+  if (!(INTEGRAL_TYPE_P (__t)
+   || ((TREE_CODE (__t) == COMPLEX_TYPE
+|| VECTOR_TYPE_P (__t))
+   && INTEGRAL_TYPE_P (TREE_TYPE (__t)))))
+tree_check_failed (__t, __f, __l, __g, BOOLEAN_TYPE, ENUMERAL_TYPE,
+  INTEGER_TYPE, 0);
+  return __t;
+}


Is there a particular reason why you are avoiding ANY_INTEGRAL_TYPE_P in
any_integral_type_check?

--
Marc Glisse
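
To make the question concrete, here is a toy model of the predicate and of a
checker that reuses it instead of repeating the condition. The types and
names (toy_type, any_integral_p, any_integral_check) are hypothetical
stand-ins, not GCC's tree API, and the checker returns NULL where GCC would
call tree_check_failed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum type_kind {
    KIND_BOOLEAN, KIND_ENUMERAL, KIND_INTEGER,
    KIND_REAL, KIND_COMPLEX, KIND_VECTOR
};

/* A type is a kind plus, for complex and vector types, an element type.  */
struct toy_type {
    enum type_kind kind;
    const struct toy_type *elem;
};

/* Counterpart of INTEGRAL_TYPE_P: enumeral, boolean or integer.  */
static bool integral_p(const struct toy_type *t)
{
    return t->kind == KIND_ENUMERAL || t->kind == KIND_BOOLEAN
           || t->kind == KIND_INTEGER;
}

/* Counterpart of ANY_INTEGRAL_TYPE_P: integral, or a complex or vector
   type whose element type is integral.  */
bool any_integral_p(const struct toy_type *t)
{
    assert(t != NULL);
    return integral_p(t)
           || ((t->kind == KIND_COMPLEX || t->kind == KIND_VECTOR)
               && t->elem != NULL && integral_p(t->elem));
}

/* Checker written in terms of the predicate, so the condition lives in
   one place only.  */
const struct toy_type *any_integral_check(const struct toy_type *t)
{
    return any_integral_p(t) ? t : NULL;
}
```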

Re: [PATCH, CHKP] Fix instrumentation clones privatization

2014-12-10 Thread Ilya Enkovich

2014-12-10 21:29 GMT+03:00 James Greenhalgh james.greenha...@arm.com:
 On Wed, Dec 10, 2014 at 06:19:08PM +, Ilya Enkovich wrote:
 2014-12-10 21:07 GMT+03:00 James Greenhalgh james.greenha...@arm.com:
  In the interests of breaking us out of this loop...
 
  At least: {arm-none-linux-gnueabihf, arm-none-eabi,
  aarch64-none-linux-gnueabi, aarch64-none-elf, 
  aarch64_be-none-elf}
  fail for me.

 Hello James,

 Could you please attach a gcc.log file for one of your runs?

 Sure, see below.

 However, the problem is your dg-require-effective-target line for the test.

 From the documentation:

 { dg-require-effective-target keyword [{ selector }] }
 Skip the test if the test target, including current multilib flags, is not
 covered by the effective-target keyword. If the directive includes the
 optional ‘{ selector }’ then the effective-target test is only performed if
 the target system matches the selector.

 So what you've written is really the opposite of what you meant :)

Yep, you are right!  I wanted to have this target selector in
dg-lto-do, but it doesn't support selectors and thus I moved it
assuming effect would be the same.  Thanks a lot for help!
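
The rule from the quoted documentation can be modelled in a few lines.  This is a sketch under assumptions, not DejaGnu code: the function name and the "run"/"unsupported" strings are made up for illustration, and fnmatch-style globbing only approximates how target triplets are matched.

```python
from fnmatch import fnmatchcase


def require_effective_target(target, keyword_ok, selector=None):
    """Outcome of a dg-require-effective-target line under the quoted rule.

    target     -- target triplet, e.g. "aarch64-none-elf"
    keyword_ok -- whether the effective-target keyword holds on this target
    selector   -- optional list of glob patterns; when given, the
                  effective-target check is performed only if one matches
    """
    if selector is not None:
        if not any(fnmatchcase(target, pat) for pat in selector):
            # The check is not performed at all, so the test is NOT skipped.
            return "run"
    return "run" if keyword_ok else "unsupported"


x86_only = ["i?86-*-*", "x86_64-*-*"]

# With the selector, a non-x86 target bypasses the check and the test runs
# (and then fails on -mmpx); without it, the test is skipped.
with_selector = require_effective_target("aarch64-none-elf", False, x86_only)
without_selector = require_effective_target("aarch64-none-elf", False)
```

So the selector narrows where the check is applied, not where the test is allowed to run, which is why removing it gives the expected UNSUPPORTED on non-x86 targets.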

Committed as obvious:

2014-12-10  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/lto/chkp-privatize_0.c: Remove unneeded selector
from target check.

diff --git a/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c b/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
index 4c899e8..ad9fdaa 100644
--- a/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
+++ b/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
@@ -1,5 +1,5 @@
 /* { dg-lto-do link } */
-/* { dg-require-effective-target mpx { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-require-effective-target mpx } */
 /* { dg-lto-options { { -fPIC -flto -flto-partition=max -fcheck-pointer-bounds -mmpx } } } */

 static int


Thanks,
Ilya


 i.e.

 /* { dg-require-effective-target mpx { target { i?86-*-* x86_64-*-* } } } */

 Should look more like:

 /* { dg-require-effective-target mpx } */

 Which gets me the UNSUPPORTED I expect to see for this test.

 Cheers,
 James

 ---
 Executing on host: /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
 -B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/  
 -fno-diagnostics-show-caret -fdiagnostics-color=never   -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx   -c -specs=aem-ve.specs
 -mcmodel=small  -o c_lto_chkp-privatize_0.o 
 /work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
 (timeout = 300)
 spawn /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
 -B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/ 
 -fno-diagnostics-show-caret -fdiagnostics-color=never -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx -c -specs=aem-ve.specs 
 -mcmodel=small -o c_lto_chkp-privatize_0.o 
 /work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
 xgcc: error: unrecognized command line option '-mmpx'
 compiler exited with status 1
 output is:
 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_0.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
 Executing on host: /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
 -B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/  
 -fno-diagnostics-show-caret -fdiagnostics-color=never   -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx   -c -specs=aem-ve.specs
 -mcmodel=small  -o c_lto_chkp-privatize_1.o 
 /work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c
 (timeout = 300)
 spawn /work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/xgcc 
 -B/work/gcc-clean/build-aarch64-none-elf/obj/gcc2/gcc/ 
 -fno-diagnostics-show-caret -fdiagnostics-color=never -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx -c -specs=aem-ve.specs 
 -mcmodel=small -o c_lto_chkp-privatize_1.o 
 /work/gcc-clean/src/gcc/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c
 xgcc: error: unrecognized command line option '-mmpx'
 compiler exited with status 1
 output is:
 xgcc: error: unrecognized command line option '-mmpx'

 FAIL: gcc.dg/lto/chkp-privatize c_lto_chkp-privatize_1.o assemble,  -fPIC 
 -flto -flto-partition=max -fcheck-pointer-bounds -mmpx
 UNRESOLVED: gcc.dg/lto/chkp-privatize 
 c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o link  -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx
 UNRESOLVED: gcc.dg/lto/chkp-privatize 
 c_lto_chkp-privatize_0.o-c_lto_chkp-privatize_1.o execute  -fPIC -flto 
 -flto-partition=max -fcheck-pointer-bounds -mmpx

[C++ Patch] PR 60955

2014-12-10 Thread Paolo Carlini


Hi,

this regression, a spurious warning about taking the address of a 
register parameter, happens in C++14 mode due to the use of 
force_paren_expr, called by finish_parenthesized_expr, which ends up 
calling build_static_cast. Manuel mentioned in the audit trail that 
TREE_NO_WARNING can be used for DECLs too, and indeed I noticed today 
that we are *already* using it for EXPRs, at the beginning of 
finish_parenthesized_expr. I experimented a bit with restricting the 
setting, eg, to PARM_DECLs only, but in fact we have an identical issue 
for, eg, register VAR_DECLs. Tested x86_64-linux.


Thanks,
Paolo.


/cp
2014-12-10  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60955
* semantics.c (finish_parenthesized_expr): Set TREE_NO_WARNING on
non-EXPRs too.
* typeck.c (cxx_mark_addressable): Check TREE_NO_WARNING.

/testsuite
2014-12-10  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60955
* g++.dg/warn/register-parm-1.C: New.
Index: cp/semantics.c
===
--- cp/semantics.c  (revision 218586)
+++ cp/semantics.c  (working copy)
@@ -1674,10 +1674,13 @@ force_paren_expr (tree expr)
 tree
 finish_parenthesized_expr (tree expr)
 {
-  if (EXPR_P (expr))
-/* This inhibits warnings in c_common_truthvalue_conversion.  */
-TREE_NO_WARNING (expr) = 1;
+  if (expr == error_mark_node)
+return error_mark_node;
 
+  /* Inhibit warnings in c_common_truthvalue_conversion and in
+ cxx_mark_addressable.  */
+  TREE_NO_WARNING (expr) = 1;
+
   if (TREE_CODE (expr) == OFFSET_REF
   || TREE_CODE (expr) == SCOPE_REF)
 /* [expr.unary.op]/3 The qualified id of a pointer-to-member must not be
Index: cp/typeck.c
===
--- cp/typeck.c (revision 218586)
+++ cp/typeck.c (working copy)
@@ -6068,7 +6068,8 @@ cxx_mark_addressable (tree exp)
  ("address of explicit register variable %qD requested", x);
return false;
  }
-   else if (extra_warnings)
+   else if (extra_warnings
+ && !TREE_NO_WARNING (x))
  warning
(OPT_Wextra, "address requested for %qD, which is declared %<register%>", x);
  }
Index: testsuite/g++.dg/warn/register-parm-1.C
===
--- testsuite/g++.dg/warn/register-parm-1.C (revision 0)
+++ testsuite/g++.dg/warn/register-parm-1.C (working copy)
@@ -0,0 +1,9 @@
+// PR c++/60955
+// { dg-options "-Wextra" }
+
+unsigned int erroneous_warning(register int a) {
+    if ((a) & 0xff) return 1; else return 0;
+}
+unsigned int no_erroneous_warning(register int a) {
+    if (a & 0xff) return 1; else return 0;
+}

[PATCH] jit.exp: support C++ testcases

2014-12-10 Thread David Malcolm

There is an ugly kludge here in jit.exp, but it works.

Is there a better way to do this? (see Kludge alert at top of
jit.exp)

gcc/jit/ChangeLog:
* TODO.rst (Test suite): Remove item about running C++ testcases.
* docs/internals/index.rst (Working on the JIT library): Add
c++ to the enabled languages in the suggested configure
invocation, and add a description of why this is necessary.

gcc/testsuite/ChangeLog:
* jit.dg/jit.exp: Load wrapper.exp with %{tool} set to g++
rather than jit.  Load g++.exp, and call g++_init.
Run test-*.cc files within the testsuite and *.cc files within
docs/examples.
(jit-dg-test): Drop the addition of -fgnu89-inline to
DEFAULT_CFLAGS in favor of adding it to additional_flags, only
doing it when compiling C testcases (since g++ does not handle
it).
---
 gcc/jit/TODO.rst |  2 --
 gcc/jit/docs/internals/index.rst | 13 --
 gcc/testsuite/jit.dg/jit.exp | 54 
 3 files changed, 55 insertions(+), 14 deletions(-)

diff --git a/gcc/jit/TODO.rst b/gcc/jit/TODO.rst
index 09c4d9d..ca0ddbb 100644
--- a/gcc/jit/TODO.rst
+++ b/gcc/jit/TODO.rst
@@ -81,8 +81,6 @@ Bugs
 
 Test suite
 ==
-* get DejaGnu to build and run C++ testcases
-
 * measure code coverage in testing of libgccjit.so
 
 Future milestones
diff --git a/gcc/jit/docs/internals/index.rst b/gcc/jit/docs/internals/index.rst
index 1d46818..50c55b0 100644
--- a/gcc/jit/docs/internals/index.rst
+++ b/gcc/jit/docs/internals/index.rst
@@ -31,7 +31,7 @@ the JIT library like this:
   cd build
   ../src/configure \
  --enable-host-shared \
- --enable-languages=jit \
+ --enable-languages=jit,c++ \
  --disable-bootstrap \
  --enable-checking=release \
  --prefix=$PREFIX
@@ -54,11 +54,20 @@ Here's what those configuration options mean:
   position-independent code, which incurs a slight performance hit,
   but it necessary for a shared library.
 
-.. option:: --enable-languages=jit
+.. option:: --enable-languages=jit,c++
 
   This specifies which frontends to build.  The JIT library looks like
   a frontend to the rest of the code.
 
+  The C++ portion of the JIT test suite requires the C++ frontend to be
+  enabled at configure-time, or you may see errors like this when
+  running the test suite:
+
+  .. code-block:: console
+
+xgcc: error: /home/david/jit/src/gcc/testsuite/jit.dg/test-quadratic.cc: C++ compiler not installed on this system
+c++: error trying to exec 'cc1plus': execvp: No such file or directory
+
 .. option:: --disable-bootstrap
 
   For hacking on the jit subdirectory, performing a full
diff --git a/gcc/testsuite/jit.dg/jit.exp b/gcc/testsuite/jit.dg/jit.exp
index a37ccc7..454e656 100644
--- a/gcc/testsuite/jit.dg/jit.exp
+++ b/gcc/testsuite/jit.dg/jit.exp
@@ -14,6 +14,20 @@
 # up into the Tcl world, reporting a summary of all results
 # across all of the executables.
 
+# Kludge alert:
+# We need g++_init so that it can find the stdlib include path.
+#
+# g++_init (in lib/g++.exp) uses g++_maybe_build_wrapper,
+# which normally comes from the definition of
+# ${tool}_maybe_build_wrapper within lib/wrapper.exp.
+#
+# However, for us, ${tool} is jit.
+# Hence we load wrapper.exp with tool == g++, so that
+# g++_maybe_build_wrapper is defined.
+set tool g++
+load_lib wrapper.exp
+set tool jit
+
 load_lib dg.exp
 load_lib prune.exp
 load_lib target-supports.exp
@@ -21,6 +35,7 @@ load_lib gcc-defs.exp
 load_lib timeout.exp
 load_lib target-libpath.exp
 load_lib gcc.exp
+load_lib g++.exp
 load_lib dejagnu.exp
 
 # Look for lines of the form:
@@ -264,17 +279,25 @@ if ![info exists GCC_UNDER_TEST] {
 set GCC_UNDER_TEST [find_gcc]
 }
 
+g++_init
+
 # Initialize dg.
 dg-init
 
 # Gather a list of all tests.
 
-# Tests within the testsuite: gcc/testsuite/jit.dg/test-*.c
-set tests [lsort [find $srcdir/$subdir test-*.c]]
+# C tests within the testsuite: gcc/testsuite/jit.dg/test-*.c
+set tests [find $srcdir/$subdir test-*.c]
+
+# C++ tests within the testsuite: gcc/testsuite/jit.dg/test-*.cc
+set tests [concat $tests [find $srcdir/$subdir test-*.cc]]
 
 # We also test the examples within the documentation, to ensure that
 # they compile:
-set tests [lsort [concat $tests [find $srcdir/../jit/docs/examples *.c]]]
+set tests [concat $tests [find $srcdir/../jit/docs/examples *.c]]
+set tests [concat $tests [find $srcdir/../jit/docs/examples *.cc]]
+
+set tests [lsort $tests]
 
 verbose "tests: $tests"
 
@@ -306,8 +329,24 @@ proc jit-dg-test { prog do_what extra_tool_flags } {
 verbose "output_file: $output_file"
 
 # Create the test executable:
-set comp_output [gcc_target_compile $prog $output_file $do_what \
-   {additional_flags=$extra_tool_flags}]
+set extension [file extension $prog]
+if {$extension == ".cc"} {
+   set compilation_function g++_target_compile
+   set options

[PATCH] combine: Do not allow asm as I2 in a special case

2014-12-10 Thread Segher Boessenkool

My rs6000 patch putting a clobber of the carry in every asm regressed
guality/pr41353-1.c.  This is because the asm (in f3 in that testcase,
for example) now is a PARALLEL, and the special case for I2 a parallel
and I3 a register move now triggers.  Before, when the asm was not a
parallel, can_combine_p disallowed combining I2.  The effect of the
change is that some debug info becomes invalid and is deleted later,
causing the testsuite regression.

Let's not allow combining an asm in this special case either.
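
As a rough illustration of the check being added, here is a toy model in Python.  It is not combine.c: the Rtx tuple and its field names are hypothetical, and each element of the PARALLEL is reduced to its code plus the code of its source.

```python
from collections import namedtuple

# Hypothetical, heavily simplified stand-in for one element of a PARALLEL.
Rtx = namedtuple("Rtx", ["code", "src_code"])


def parallel_has_asm(elements):
    """True if any element is a SET whose source is an ASM_OPERANDS."""
    for elt in elements:
        if elt.code == "SET" and elt.src_code == "ASM_OPERANDS":
            return True
    return False


# An asm with an added carry clobber becomes a two-element PARALLEL.
asm_with_clobber = [Rtx("SET", "ASM_OPERANDS"), Rtx("CLOBBER", None)]
ordinary_parallel = [Rtx("SET", "PLUS"), Rtx("SET", "COMPARE")]
```

In this model the first list is rejected for the I2-parallel/I3-move special case and the second is still allowed.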

Bootstrapped and tested on powerpc64-linux; okay for mainline?


Segher


2014-12-10  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (try_combine): Do not allow combining a PARALLEL I2
with a register move I3 if that I2 is an asm.


---
 gcc/combine.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/combine.c b/gcc/combine.c
index f5ade9e..8995c1d3 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -2748,6 +2748,13 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, rtx_insn *i0,
                                            SET_DEST (XVECEXP (p2, 0, i))))
              break;
 
+      /* Make sure this PARALLEL is not an asm.  We do not allow combining
+         that usually (see can_combine_p), so do not here either.  */
+      for (i = 0; i < XVECLEN (p2, 0); i++)
+        if (GET_CODE (XVECEXP (p2, 0, i)) == SET
+            && GET_CODE (SET_SRC (XVECEXP (p2, 0, i))) == ASM_OPERANDS)
+          break;
+
       if (i == XVECLEN (p2, 0))
         for (i = 0; i < XVECLEN (p2, 0); i++)
  if (GET_CODE (XVECEXP (p2, 0, i)) == SET
-- 
1.8.1.4

[PATCH, libiberty, libcpp]: Introduce xvasprintf to libiberty and use it in libcpp

2014-12-10 Thread Uros Bizjak

On Tue, Dec 9, 2014 at 11:24 PM, Joseph Myers jos...@codesourcery.com wrote:

 Attached patch checks the return value and sets ptr to NULL in this case.

 2014-12-09  Uros Bizjak  ubiz...@gmail.com

 * directives.c (cpp_define_formatted): Check return value of
 vasprintf and in case of error set ptr to NULL.

 Bootstrapped on x86_64-linux-gnu.

 OK for mainline?

 No, this will just continue to pass NULL into cpp_define, and so into
 strlen, where it isn't a valid argument.  You need to give an error
 message for allocation failure and exit, much like xmalloc does (or put
 xvasprintf in libiberty and use that here - see
 https://gcc.gnu.org/ml/gcc-patches/2009-11/msg01448.html and
 https://gcc.gnu.org/ml/gcc-patches/2009-11/msg01449.html - I don't know
 if that's the latest version).

Thanks for the pointers to the above patches. I have adapted the
referred patches to introduce xvasprintf to libiberty in order to use
it in libcpp.

libiberty/ChangeLog:

2014-12-10  Uros Bizjak  ubiz...@gmail.com
Ben Elliston  b...@au.ibm.com
Manuel Lopez-Ibanez  m...@gcc.gnu.org

* xvasprintf.c: New file.
* vprintf-support.h: Likewise.
* vprintf-support.c: Likewise.
* Makefile.in (CFILES): Add vprintf-support.c, xvasprintf.c.
(REQUIRED_OFILES): Add vprintf-support.$(objext), xvasprintf.$(objext).
(vprintf-support.$(objext), xvasprintf.$(objext)): New targets.
* functions.texi: Updated with documentation for xvasprintf.
* vasprintf.c (int_vasprintf): Use libiberty_vprintf_buffer_size.

include/ChangeLog:

2014-12-10  Uros Bizjak  ubiz...@gmail.com
Ben Elliston  b...@au.ibm.com
Manuel Lopez-Ibanez  m...@gcc.gnu.org

* libiberty.h (xvasprintf): Declare.

libcpp/ChangeLog:

2014-12-10  Uros Bizjak  ubiz...@gmail.com

* directives.c (cpp_define_formatted): Use xvasprintf.

Bootstrapped without warning on x86_64-linux-gnu and alphaev68-linux-gnu.

OK for mainline?
Index: include/libiberty.h
===
--- include/libiberty.h (revision 218585)
+++ include/libiberty.h (working copy)
@@ -636,6 +636,11 @@
 extern int vasprintf (char **, const char *, va_list) ATTRIBUTE_PRINTF(2,0);
 #endif
 
+/* Like vasprintf but allocates memory without fail. This works like
+   xmalloc.  */
+
+extern char * xvasprintf (const char *, va_list) ATTRIBUTE_MALLOC ATTRIBUTE_PRINTF(1,0);
+
 #if defined(HAVE_DECL_SNPRINTF) && !HAVE_DECL_SNPRINTF
 /* Like sprintf but prints at most N characters.  */
 extern int snprintf (char *, size_t, const char *, ...) ATTRIBUTE_PRINTF_3;
Index: libcpp/directives.c
===
--- libcpp/directives.c (revision 218592)
+++ libcpp/directives.c (working copy)
@@ -2404,11 +2404,11 @@
 void
 cpp_define_formatted (cpp_reader *pfile, const char *fmt, ...)
 {
-  char *ptr = NULL;
+  char *ptr;
 
   va_list ap;
   va_start (ap, fmt);
-  vasprintf (&ptr, fmt, ap);
+  ptr = xvasprintf (fmt, ap);
   va_end (ap);
 
   cpp_define (pfile, ptr);
Index: libiberty/Makefile.in
===
--- libiberty/Makefile.in   (revision 218585)
+++ libiberty/Makefile.in   (working copy)
@@ -155,10 +155,11 @@
 strtoll.c strtoul.c strtoull.c strndup.c strnlen.c \
 strverscmp.c timeval-utils.c tmpnam.c  \
unlink-if-ordinary.c\
-   vasprintf.c vfork.c vfprintf.c vprintf.c vsnprintf.c vsprintf.c \
+   vasprintf.c vfork.c vfprintf.c vprintf.c vprintf-support.c  \
+vsnprintf.c vsprintf.c \
waitpid.c   \
xatexit.c xexit.c xmalloc.c xmemdup.c xstrdup.c xstrerror.c \
-xstrndup.c
+xstrndup.c xvasprintf.c
 
 # These are always included in the library.  The first four are listed
 # first and by compile time to optimize parallel builds.
@@ -180,7 +181,7 @@
./obstack.$(objext) \
./partition.$(objext) ./pexecute.$(objext) ./physmem.$(objext)  \
./pex-common.$(objext) ./pex-one.$(objext)  \
-   ./@pexecute@.$(objext)  \
+   ./@pexecute@.$(objext) ./vprintf-support.$(objext)  \
./safe-ctype.$(objext)  \
./simple-object.$(objext) ./simple-object-coff.$(objext)\
./simple-object-elf.$(objext) ./simple-object-mach-o.$(objext)  \
@@ -191,7 +192,7 @@
./timeval-utils.$(objext) ./unlink-if-ordinary.$(objext)\
./xatexit.$(objext) ./xexit.$(objext) ./xmalloc.$(objext)   \
./xmemdup.$(objext) ./xstrdup.$(objext) ./xstrerror.$(objext)   \
-   ./xstrndup.$(objext)
+   ./xstrndup.$(objext)

Go patch committed: Don't lower multi-valued arguments into temporaries

2014-12-10 Thread Ian Lance Taylor

This patch from Chris Manghane fixes a compiler crash when lowering a
multi-valued temporary (as in f(g()) when g returns multiple values).
This is GCC PR 61316.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.

Ian
diff -r fc51bd51a948 go/expressions.cc
--- a/go/expressions.cc Tue Dec 09 16:54:11 2014 -0800
+++ b/go/expressions.cc Wed Dec 10 11:40:53 2014 -0800
@@ -8525,6 +8525,7 @@
   || fntype->is_builtin()))
{
  Call_expression* call = this->args_->front()->call_expression();
+ call->set_is_multi_value_arg();
  Expression_list* args = new Expression_list;
  for (size_t i = 0; i < rc; ++i)
args->push_back(Expression::make_call_result(call, i));
diff -r fc51bd51a948 go/expressions.h
--- a/go/expressions.h  Tue Dec 09 16:54:11 2014 -0800
+++ b/go/expressions.h  Wed Dec 10 11:40:53 2014 -0800
@@ -1632,7 +1632,7 @@
   fn_(fn), args_(args), type_(NULL), results_(NULL), call_(NULL),
   call_temp_(NULL), expected_result_count_(0), is_varargs_(is_varargs),
   varargs_are_lowered_(false), types_are_determined_(false),
-  is_deferred_(false), issued_error_(false)
+  is_deferred_(false), issued_error_(false), is_multi_value_arg_(false)
   { }
 
   // The function to call.
@@ -1703,6 +1703,17 @@
   bool
   issue_error();
 
+  // Whether this call returns multiple results that are used as an
+  // multi-valued argument.
+  bool
+  is_multi_value_arg() const
+  { return this->is_multi_value_arg_; }
+
+  // Note this call is used as a multi-valued argument.
+  void
+  set_is_multi_value_arg()
+  { this->is_multi_value_arg_ = true; }
+
  protected:
   int
   do_traverse(Traverse*);
@@ -1806,6 +1817,8 @@
   // results and uses.  This is to avoid producing multiple errors
   // when there are multiple Call_result_expressions.
   bool issued_error_;
+  // True if this call is used as an argument that returns multiple results.
+  bool is_multi_value_arg_;
 };
 
 // An expression which represents a pointer to a function.
diff -r fc51bd51a948 go/statements.cc
--- a/go/statements.cc  Tue Dec 09 16:54:11 2014 -0800
+++ b/go/statements.cc  Wed Dec 10 11:40:53 2014 -0800
@@ -726,6 +726,17 @@
 
   if ((*pexpr)->must_eval_in_order())
 {
+  Call_expression* call = (*pexpr)->call_expression();
+  if (call != NULL && call->is_multi_value_arg())
+   {
+ // A call expression which returns multiple results as an argument
+ // to another call must be handled specially.  We can't create a
+ // temporary because there is no type to give it.  Instead, group
+ // the caller and this multi-valued call argument and use a temporary
+ // variable to hold them.
+ return TRAVERSE_SKIP_COMPONENTS;
+   }
+
   Location loc = (*pexpr)->location();
   Temporary_statement* temp = Statement::make_temporary(NULL, *pexpr, loc);
   this->block_->add_statement(temp);

Re: [PATCH 3/n] OpenMP 4.0 offloading infrastructure: offload tables

2014-12-10 Thread Ilya Verbin

On 10 Dec 09:22, Jakub Jelinek wrote:
 On Tue, Dec 09, 2014 at 03:32:33PM +0300, Ilya Verbin wrote:
  However, I don't see -flto option in the build log.  It seems that
  check_effective_target_lto isn't working inside libgomp/ directory.
  Maybe because ENABLE_LTO is defined only in gcc/configure.ac ?
  
  gcc/
  * varpool.c (varpool_node::get_create): Force output of vars with
  omp declare target attribute.
  libgomp/
  * testsuite/libgomp.c/target-9.c: New test.
 
 Ok, though please try to find out why effective target lto check doesn't
 work in libgomp.  Perhaps you just need to include some further *.exp
 file?

It lives in gcc/testsuite/lib/target-supports.exp, which is already included
into libgomp/testsuite/lib/libgomp.exp

proc check_effective_target_lto { } {
global ENABLE_LTO
if { [istarget nvptx-*-*] } {
return 0;
}
return [info exists ENABLE_LTO]
}

I'm not sure how it works, but ENABLE_LTO is defined only in gcc/configure.ac .
Maybe it's possible to move it to top-level configure, or to check for -flto
support instead.
However, I will be able to fix this only in late Dec, I'm going on vacation
without access to the computer :)

  -- Ilya

Re: [PATCH] Fix IRA register preferencing

2014-12-10 Thread Jeff Law


On 12/10/14 06:26, Wilco Dijkstra wrote:


If recomputing is best does that mean that record_reg_classes should not
give a boost to the preferred class in the 2nd pass?

Perhaps.  I haven't looked deeply at this part of IRA.  I was relaying
my experiences with (ab)using the ira-reload callbacks to handle
allocation after splitting -- where getting the costs and classes
updated in a reasonable manner was clearly important to getting good
code.  One could probably argue I should have kept testcases from that
work :-)

I don't understand what purpose this has - if the preferred class is from
the first pass, it is already correct, so all it does is boost the
preferred class further.  And if the preferred class is wrong (eg. after
live range splitting), it will boost the incorrect class even harder, so
in the end you never get a different class.

It may be historical from the old regclass code, not really sure.


From what you're saying, recomputing seems best, and I'd be happy to submit
a patch to remove all the preferred class code from record_reg_classes.

Recomputing certainly helped the cases I was looking at.


However there is still the possibility the preferred class is queried before
the recomputation happens (I think that is a case Renlin fixed). Either these
should be faulted and fixed by forcing recomputation, or we need to provide a
correct preferred class. That should be a copy of the original class.

I believe I had copied the original classes, then recomputed them to
avoid any uninitialized memory reads and the like.  But looking at the
old branch, I don't see the recomputation for classes (though I do see
it for costs).  Of course all the backwards walk stuff isn't there
either, so there's clearly code I worked on extensively, but never
committed...


Jeff

Re: [PATCH, libiberty, libcpp]: Introduce xvasprintf to libiberty and use it in libcpp

2014-12-10 Thread Joseph Myers

On Wed, 10 Dec 2014, Uros Bizjak wrote:

 libcpp/ChangeLog:
 
 2014-12-10  Uros Bizjak  ubiz...@gmail.com
 
 * directives.c (cpp_define_formatted): Use xvasprintf.

The libcpp patch is OK once the libiberty patch is in.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [RFC] diagnostics.c: For terminals, restrict messages to terminal width?

2014-12-10 Thread Tobias Burnus


Hi all,

I have now updated the patch, based on all comments and Manuel's patch,
and bootstrapped it on x86-64-gnu-linux. If there are no more comments,
I intend to commit it tomorrow. If it gets approved earlier, I will
commit it earlier ;-)


Some comments from me are below.

FX wrote:

So the patch you (Manuel) are proposing looks fine to me, with the
environment variable taking precedence, *if* that is fine for Fortran,
of course.

That seems fine to me, from the Fortran standpoint. COLUMNS is a bit of a 
special environment variable, which the shell (when it provides it) keeps 
synchronized to the terminal width anyway (as is the case for bash and zsh).


Ditto for me, although my feeling is that COLUMNS is only rarely used, as
it does not seem to get automatically exported by the shells.
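
The lookup order under discussion (COLUMNS first, then the terminal itself, then "no limit") can be sketched as follows.  This is a Python approximation rather than the patch's C code: INT_MAX is hard-coded for a 32-bit int, int() is stricter than atoi about trailing garbage, and the fcntl/termios modules are assumed to be available only on Unix-like systems.

```python
import os
import struct

INT_MAX = 2**31 - 1  # assumption: 32-bit int, as on common GCC hosts


def get_terminal_width(fd=0):
    """Return COLUMNS if it is a positive integer, else the width reported
    by the terminal on file descriptor fd, else INT_MAX."""
    s = os.environ.get("COLUMNS")
    if s is not None:
        try:
            n = int(s)
        except ValueError:
            n = 0
        if n > 0:
            return n

    try:
        import fcntl
        import termios
        # struct winsize is four unsigned shorts: rows, cols, xpixel, ypixel.
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
        _rows, cols, _xpix, _ypix = struct.unpack("HHHH", packed)
        if cols > 0:
            return cols
    except (ImportError, AttributeError, OSError):
        # No such ioctl on this platform, or fd is not a terminal.
        pass

    return INT_MAX
```

Running it as `COLUMNS=100 python3 script.py` versus with COLUMNS unset shows the precedence: the environment variable wins whenever it holds a positive integer.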



Manuel López-Ibáñez wrote:

The other conflict is that Fortran's default terminal_width if
everything fails is 80, whereas the common diagnostics defaults to
INT_MAX. If Fortran devs do not mind this new default (which is
already in place for all diagnostics using the new functions), then
the wrapper will make the output consistent.


I think that's fine. Most Fortran source code is < 132 characters wide, as
at most 132 characters are permitted by the standard (for free-form
source code);* the number of cases where it is longer should be quite
small, and on average it should be even shorter. And terminal/editor
widths of > 80 are also common, such that viewing a log there should
also work without wrapping.


(* gfortran doesn't count comments exceeding that limit.)

Tobias
2014-12-06  Tobias Burnus  bur...@net-b.de
	Manuel López-Ibáñez  m...@gcc.gnu.org

gcc/
	* diagnostic.c (get_terminal_width): Renamed from getenv_columns,
	removed static, and additionally use ioctl to get width.
	(diagnostic_set_caret_max_width): Update call.
	* diagnostic.h (get_terminal_width): Add prototype.
	* opts.c (print_specific_help): Use it for x_help_columns.
	* doc/invoke.texi (fdiagnostics-show-caret): Document how the
	width is set.

gcc/fortran/
	* error.c (gfc_get_terminal_width): Renamed from
	get_terminal_width and use same-named common function.
	(gfc_error_init_1): Update call.

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 541e2fb..41101a2 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -33,6 +33,14 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic.h"
 #include "diagnostic-color.h"
 
+#ifdef HAVE_TERMIOS_H
+# include <termios.h>
+#endif
+
+#ifdef GWINSZ_IN_SYS_IOCTL
+# include <sys/ioctl.h>
+#endif
+
 #include <new> // For placement new.
 
 #define pedantic_warning_kind(DC)			\
@@ -81,9 +89,10 @@ file_name_as_prefix (diagnostic_context *context, const char *f)
 
 
 /* Return the value of the getenv("COLUMNS") as an integer. If the
-   value is not set to a positive integer, then return INT_MAX.  */
-static int
-getenv_columns (void)
+   value is not set to a positive integer, use ioctl to get the
+   terminal width. If it fails, return INT_MAX.  */
+int
+get_terminal_width (void)
 {
   const char * s = getenv ("COLUMNS");
   if (s != NULL) {
@@ -91,6 +100,14 @@ getenv_columns (void)
     if (n > 0)
   return n;
   }
+
+#ifdef TIOCGWINSZ
+  struct winsize w;
+  w.ws_col = 0;
+  if (ioctl (0, TIOCGWINSZ, &w) == 0 && w.ws_col > 0)
+return w.ws_col;
+#endif
+
   return INT_MAX;
 }
 
@@ -101,7 +118,7 @@ diagnostic_set_caret_max_width (diagnostic_context *context, int value)
   /* One minus to account for the leading empty space.  */
   value = value ? value - 1 
  : (isatty (fileno (pp_buffer (context->printer)->stream))
-   ? getenv_columns () - 1: INT_MAX);
+   ? get_terminal_width () - 1: INT_MAX);
   
   if (value <= 0) 
 value = INT_MAX;
diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index 807ce91..e699db8 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -298,6 +298,8 @@ void diagnostic_action_after_output (diagnostic_context *, diagnostic_t);
 
 void diagnostic_file_cache_fini (void);
 
+int get_terminal_width (void);
+
 /* Expand the location of this diagnostic. Use this function for consistency. */
 
 static inline expanded_location
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d2f3c79..40eb8b6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3187,7 +3187,10 @@ option is known to the diagnostic machinery).  Specifying the
 @opindex fdiagnostics-show-caret
 By default, each diagnostic emitted includes the original source line
 and a caret '^' indicating the column.  This option suppresses this
-information.
+information.  The source line is truncated to @var{n} characters, if
+the @option{-fmessage-length=n} is given.  When the output is done
+to the terminal, the width is limited to the width given by the
+@env{COLUMNS} environment variable or, if not set, to the terminal width.
 
 @end table
 
diff --git a/gcc/fortran/error.c b/gcc/fortran/error.c
index 541a799..75407dc 100644
--- a/gcc/fortran/error.c

Re: [PATCH][libstdc++][testsuite] Mark as UNSUPPORTED tests that don't fit into tiny memory model

2014-12-10 Thread Mike Stump

On Dec 10, 2014, at 10:05 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Thanks for the guidance. I've moved the definitions into a separate file and 
 included that in the places that use it (more than 2 places in my count). 
 This is the patch attached.

 The second patch (will send shortly after this) adds the logic to libstdc++.
 
 Ok?

Ok.

If anyone else wants to refactor annoying to maintain code into a single place… 
 certainly the legacy of cut-n-paste programming is alive and well in the *.exp 
files.  It was never a design goal to replicate annoying to maintain code.  :-)

Re: [PATCH] combine: Do not allow asm as I2 in a special case

2014-12-10 Thread Jeff Law


On 12/10/14 13:12, Segher Boessenkool wrote:

My rs6000 patch putting a clobber of the carry in every asm regressed
guality/pr41353-1.c.  This is because the asm (in f3 in that testcase,
for example) now is a PARALLEL, and the special case for I2 a parallel
and I3 a register move now triggers.  Before, when the asm was not a
parallel, can_combine_p disallowed combining I2.  The effect of the
change is that some debug info becomes invalid and is deleted later,
causing the testsuite regression.

Let's not allow combining an asm in this special case either.

Bootstrapped and tested on powerpc64-linux; okay for mainline?


Segher


2014-12-10  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
* combine.c (try_combine): Do not allow combining a PARALLEL I2
with a register move I3 if that I2 is an asm.

OK.
jeff

Re: [patch] gdb python pretty printer for DIEs

2014-12-10 Thread Aldy Hernandez


On 12/10/14 06:21, David Malcolm wrote:

On Tue, 2014-12-09 at 13:10 -0800, Aldy Hernandez wrote:

From: Aldy Hernandez al...@redhat.com
To: jason merrill ja...@redhat.com
Cc: David Malcolm dmalc...@redhat.com, gcc-patches gcc-patches@gcc.gnu.org
Subject: [patch] gdb python pretty printer for DIEs
Date: Tue, 09 Dec 2014 13:10:57 -0800 (12/09/2014 04:10:57 PM)

I am tired of dumping entire DIEs just to see what type they are.  With
this patch, we get:

(gdb) print context_die
$5 = <dw_die_ref 0x76de0230 DW_TAG_module <parent=0x76de DW_TAG_compile_unit>>

I know it's past the end of stage1, but this debugging aid can help in
fixing bugs in stage >= 3.

I am committing this to the [debug-early] branch, but I am hoping I can
also commit it to mainline and avoid dragging it along.

OK for mainline?




--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -253,6 +253,26 @@ class CGraphNodePrinter:
  return result

  ##
+# Dwarf DIE pretty-printers
+##
+
+class DWDieRefPrinter:
+    def __init__(self, gdbval):
+        self.gdbval = gdbval
+
+    def to_string (self):
+        result = 'dw_die_ref 0x%x' % long(self.gdbval)


A minor nit: for the NULL case, you're doing slightly more work than
necessary: you start building result above...


Thanks.  Tom also pointed out the same thing.  Oops.

I'm committing the attached patch.

Aldy

commit d3f07cb1cbae79b10dc01affb863302e3f5e444d
Author: Aldy Hernandez al...@redhat.com
Date:   Tue Dec 9 13:04:46 2014 -0800

* gdbhooks.py (class DWDieRefPrinter): New class.
(build_pretty_printer): Register dw_die_ref's.

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index a74e712..6d9e41e 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -253,6 +253,26 @@ class CGraphNodePrinter:
         return result
 
 ##
+# Dwarf DIE pretty-printers
+##
+
+class DWDieRefPrinter:
+    def __init__(self, gdbval):
+        self.gdbval = gdbval
+
+    def to_string (self):
+        if long(self.gdbval) == 0:
+            return 'dw_die_ref 0x0'
+        result = 'dw_die_ref 0x%x' % long(self.gdbval)
+        result += ' %s' % self.gdbval['die_tag']
+        if long(self.gdbval['die_parent']) != 0:
+            result += ' parent=0x%x %s' % (long(self.gdbval['die_parent']),
+                                           self.gdbval['die_parent']['die_tag'])
+
+        result += ''
+        return result
+
+##
 
 class GimplePrinter:
     def __init__(self, gdbval):
@@ -455,6 +475,8 @@ def build_pretty_printer():
                              'tree', TreePrinter)
     pp.add_printer_for_types(['cgraph_node *'],
                              'cgraph_node', CGraphNodePrinter)
+    pp.add_printer_for_types(['dw_die_ref'],
+                             'dw_die_ref', DWDieRefPrinter)
     pp.add_printer_for_types(['gimple', 'gimple_statement_base *',
 
   # Keep this in the same order as gimple.def:

Re: Reducing the amount of builtins by merging named patterns

2014-12-10 Thread Andi Kleen

Richard Henderson r...@redhat.com writes:

 On 12/10/2014 06:08 AM, Blumental Maxim wrote:
 We have ~30 groups of similar (i.e. having similar sets of attributes)
 named patterns. These groups together include ~230 template (i.e.
 having substitution attributes in their names) named patterns in
 total. So, we can reduce the amount of template named patterns by ~200
 at best. Those template named patterns correspond to several specific
 named patterns each. E.g. in my patch (see attached patch) I merged
 two template named patterns into one and that allowed me to replace
 four builtins with only two.

 I don't find this particularly readable or maintainable.
 What do you hope to gain here?

I assume it would make the compiler faster and maybe smaller?

But also users may be using the existing builtin names,
so some compat macros would be needed.

-Andi

Re: [PATCH, libiberty, libcpp]: Introduce xvasprintf to libiberty and use it in libcpp

2014-12-10 Thread Ian Lance Taylor

On Wed, Dec 10, 2014 at 12:17 PM, Uros Bizjak ubiz...@gmail.com wrote:

 libiberty/ChangeLog:

 2014-12-10  Uros Bizjak  ubiz...@gmail.com
 Ben Elliston  b...@au.ibm.com
 Manuel Lopez-Ibanez  m...@gcc.gnu.org

 * xvasprintf.c: New file.
 * vprintf-support.h: Likewise.
 * vprintf-support.c: Likewise.
 * Makefile.in (CFILES): Add vprintf-support.c, xvasprintf.c.
 (REQUIRED_OFILES): Add vprintf-support.$(objext), xvasprintf.$(objext).
 (vprintf-support.$(objext), xvasprintf.$(objext)): New targets.
 * functions.texi: Updated with documentation for xvasprintf.
 * vasprintf.c (int_vasprintf): Use libiberty_vprintf_buffer_size.

 include/ChangeLog:

 2014-12-10  Uros Bizjak  ubiz...@gmail.com
 Ben Elliston  b...@au.ibm.com
 Manuel Lopez-Ibanez  m...@gcc.gnu.org

 * libiberty.h (xvasprintf): Declare.

 libcpp/ChangeLog:

 2014-12-10  Uros Bizjak  ubiz...@gmail.com

 * directives.c (cpp_define_formatted): Use xvasprintf.

 Bootstrapped without warning on x86_64-linux-gnu and alphaev68-linux-gnu.

 OK for mainline?

 +extern char * xvasprintf (const char *, va_list) ATTRIBUTE_MALLOC 
 ATTRIBUTE_PRINTF(1,0);

No space after '*' -- see other declarations in the file.


 +  char * result;

No space after '*'.

+  int total_width =  libiberty_vprintf_buffer_size (format, args);

Only one space after '='.


The libiberty/include patches are OK with those changes.

Thanks.

Ian
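
For readers following the review without the patch at hand, here is a minimal sketch of what an xvasprintf-style helper does: format into a freshly allocated buffer and abort rather than return NULL on failure. The names sketch_xvasprintf and sketch_xasprintf are hypothetical, and the sizing step uses C99 vsnprintf as an assumption; the hunk quoted above calls libiberty_vprintf_buffer_size instead, so this is not the libiberty implementation.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for an xvasprintf-style helper: never returns
   NULL, aborts on any failure.  Sizing via vsnprintf is an assumption
   of this sketch, not what the patch under review does.  */
static char *
sketch_xvasprintf (const char *format, va_list args)
{
  va_list copy;
  char *result;
  int len;

  /* The sizing pass consumes a copy so ARGS can be reused below.  */
  va_copy (copy, args);
  len = vsnprintf (NULL, 0, format, copy);
  va_end (copy);
  if (len < 0)
    abort ();

  result = malloc ((size_t) len + 1);
  if (result == NULL)
    abort ();

  /* Second pass writes the formatted text plus the terminating NUL.  */
  vsnprintf (result, (size_t) len + 1, format, args);
  return result;
}

/* Variadic convenience wrapper around the va_list version.  */
static char *
sketch_xasprintf (const char *format, ...)
{
  va_list args;
  char *result;

  va_start (args, format);
  result = sketch_xvasprintf (format, args);
  va_end (args);
  return result;
}
```

A caller owns the returned buffer and releases it with free. Note that the declarations follow the style requested in the review: no space after '*' and a single space after '='.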

Re: [AArch64, NEON] Improve vmulX intrinsics

2014-12-10 Thread Jiangjiji

Hi, Christophe Lyon
These testcases are not covered by the glorious testsuite.
If these cases are in your todo list, I will exclude them.

Thanks.

-----Original Message-----
From: Christophe Lyon [mailto:christophe.l...@linaro.org]
Sent: December 9, 2014 21:43
To: Jiangjiji
Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw; Yangfei (Felix); Marcus Shawcroft
Subject: Re: [AArch64, NEON] Improve vmulX intrinsics

On 9 December 2014 at 13:52, Jiangjiji jiangj...@huawei.com wrote:
 Hi,
  This patch converts more intrinsics to use builtin functions instead of
 the previous inline assembly syntax.
  Passed the glorious testsuite of Christophe Lyon.

  Three testcases are added for the testing of intrinsics which are not
 covered by the testsuite:
  gcc.target/aarch64/vmull_high.c
  gcc.target/aarch64/vmull_high_lane.c
  gcc.target/aarch64/vmull_high_n.c


As I said here:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01934.html
I am in the process of converting my existing testsuite to GCC/Dejagnu.
Please do not duplicate work.


  Regtested with aarch64-linux-gnu on QEMU.
  This patch has no regressions for aarch64_be-linux-gnu big-endian
 target too.
  OK for the trunk?
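
For readers who do not have the intrinsic semantics at hand: the vmull_high family multiplies the upper halves of two 128-bit vectors and widens each product to twice the element width. Below is a minimal scalar sketch for the s8 variant. The function name model_vmull_high_s8 is hypothetical, and plain arrays stand in for int8x16_t and int16x8_t; it is a reference model only, not code from the patch or from arm_neon.h.

```c
#include <stdint.h>

/* Scalar model of vmull_high_s8: take lanes 8..15 of A and B,
   multiply them pairwise, and store each product as a 16-bit value.
   The product of two int8_t values always fits in int16_t
   (the extreme case is -128 * -128 = 16384).  */
static void
model_vmull_high_s8 (const int8_t a[16], const int8_t b[16],
                     int16_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = (int16_t) ((int16_t) a[i + 8] * (int16_t) b[i + 8]);
}
```

A model like this can serve as an oracle when checking expected-result tables such as the ones in the testcases, since it does not depend on either the inline assembly or the builtin implementation.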



 Index: gcc/ChangeLog
 ===
 --- gcc/ChangeLog   (revision 218464)
 +++ gcc/ChangeLog   (working copy)
 @@ -1,3 +1,38 @@
 +2014-12-09  Felix Yang  felix.y...@huawei.com
 +Jiji Jiang  jiangj...@huawei.com
 +
 +   * config/aarch64/aarch64-simd.md (aarch64_mul_nmode,
 +   aarch64_sumull_nmode, aarch64_sumullmode,
 +   aarch64_simd_sumull2_nmode, aarch64_sumull2_nmode,
 +   aarch64_sumull_lanemode, aarch64_sumull2_lanemode_internal,
 +   aarch64_sumull_laneqmode,
 aarch64_sumull2_laneqmode_internal,
 +   aarch64_smull2_lanemode, aarch64_umull2_lanemode,
 +   aarch64_smull2_laneqmode, aarch64_umull2_laneqmode,
 +   aarch64_fmulxmode, aarch64_fmulxmode, aarch64_fmulx_lanemode,
 +   aarch64_pmull2v16qi, aarch64_pmullv8qi): New patterns.
 +   * config/aarch64/aarch64-simd-builtins.def (vec_widen_smult_hi_,
 +   vec_widen_umult_hi_, umull, smull, smull_n, umull_n, mul_n,
 smull2_n,
 +   umull2_n, smull_lane, umull_lane, smull_laneq, umull_laneq, pmull,
 +   umull2_lane, smull2_laneq, umull2_laneq, fmulx, fmulx_lane, pmull2,
 +   smull2_lane): New builtins.
 +   * config/aarch64/arm_neon.h (vmul_n_f32, vmul_n_s16, vmul_n_s32,
 +   vmul_n_u16, vmul_n_u32, vmulq_n_f32, vmulq_n_f64, vmulq_n_s16,
 +   vmulq_n_s32, vmulq_n_u16, vmulq_n_u32, vmull_high_lane_s16,
 +   vmull_high_lane_s32, vmull_high_lane_u16, vmull_high_lane_u32,
 +   vmull_high_laneq_s16, vmull_high_laneq_s32, vmull_high_laneq_u16,
 +   vmull_high_laneq_u32, vmull_high_n_s16, vmull_high_n_s32,
 +   vmull_high_n_u16, vmull_high_n_u32, vmull_high_p8, vmull_high_s8,
 +   vmull_high_s16, vmull_high_s32, vmull_high_u8, vmull_high_u16,
 +   vmull_high_u32, vmull_lane_s16, vmull_lane_s32, vmull_lane_u16,
 +   vmull_lane_u32, vmull_laneq_s16, vmull_laneq_s32, vmull_laneq_u16,
 +   vmull_laneq_u32, vmull_n_s16, vmull_n_s32, vmull_n_u16, vmull_n_u32,
 +   vmull_p8, vmull_s8, vmull_s16, vmull_s32, vmull_u8, vmull_u16,
 +   vmull_u32, vmulx_f32, vmulx_lane_f32, vmulxd_f64, vmulxq_f32,
 +   vmulxq_f64, vmulxq_lane_f32, vmulxq_lane_f64, vmulxs_f32): Rewrite
 +   using builtin functions.
 +   * config/aarch64/iterators.md (UNSPEC_FMULX, UNSPEC_FMULX_LANE,
 +   VDQF_Q): New unspec and int iterator.
 +
  2014-12-07  Felix Yang  felix.y...@huawei.com
 Shanyao Chen  chenshan...@huawei.com
  Index: gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c
 ===
 --- gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c
 (revision 0)
 +++ gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmull_high.c
 (revision 0)
 @@ -0,0 +1,111 @@
 +#include <arm_neon.h>
 +#include arm-neon-ref.h
 +#include compute-ref-data.h
 +
 +
 +/* Expected results.  */
 +VECT_VAR_DECL(expected,int,16,8) [] = { 0xfc48, 0xfcbf, 0xfd36, 0xfdad,
 +0xfe24, 0xfe9b, 0xff12, 0xff89 };
 +VECT_VAR_DECL(expected,int,32,4) [] = { 0xf9a0, 0xfa28,
 +0xfab0, 0xfb38 };
 +VECT_VAR_DECL(expected,int,64,2) [] = { 0xf7a2,
 +0xf83b };
 +VECT_VAR_DECL(expected,uint,16,8) [] = { 0xa4b0, 0xa55a, 0xa604, 0xa6ae,
 + 0xa758, 0xa802, 0xa8ac, 0xa956 };
 +VECT_VAR_DECL(expected,uint,32,4) [] = { 0xbaf73c, 0xbaf7f7,
 + 0xbaf8b2, 0xbaf96d };
 +VECT_VAR_DECL(expected,uint,64,2) [] = { 0xcbf4d8,
 + 0xcbf5a4};
 +VECT_VAR_DECL(expected,poly,16,8) [] = { 0x6530, 0x659a, 0x6464,

Merge from trunk@218484 to debug-early branch

2014-12-10 Thread Aldy Hernandez

It's been a while since the last merge, so I encountered more pain than 
usual in this merge.


I noticed quite a few target library failures, but those seem to have 
been there already, thanks to my less than stellar testing on my last 
big set of patches in October :(.


I will be fixing the target library problems in subsequent patches.

Aldy
