date:20111031

Re: [google] ThreadSanitizer instrumentation pass (issue 5303083)

2011-10-31 Thread davidxl


I have not finished reviewing yet. This is the first batch.

David


http://codereview.appspot.com/5303083/diff/1/gcc/passes.c
File gcc/passes.c (right):

http://codereview.appspot.com/5303083/diff/1/gcc/passes.c#newcode1423
gcc/passes.c:1423: NEXT_PASS (pass_tsan);
Move this to the same place as asan. Otherwise TARGET_MEM_REF won't be
handled.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c
File gcc/tree-tsan.c (right):

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode56
gcc/tree-tsan.c:56: The instrumentation module mainintains shadow call
stacks
s/mainintains/maintains/

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode60
gcc/tree-tsan.c:60: Instrumentation for shadow stack maintainance is as
follows:
s/maintainance/maintenance/

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode94
gcc/tree-tsan.c:94: #define RTL_STACK __tsan_shadow_stack
Please change RTL_ prefix to TSAN_. It is confusing to use RTL_

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode100
gcc/tree-tsan.c:100: enum tsan_ignore_e
better to be tsan_ignore_type or tsan_ignore_kind.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode110
gcc/tree-tsan.c:110: enum bb_state_e
A new empty line is needed. Same for other comments leading a decl, or
function.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode110
gcc/tree-tsan.c:110: enum bb_state_e
bb_state_e --> bb_state

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode119
gcc/tree-tsan.c:119: struct bb_data_t
_t suffix is better removed. Same for other types with _t suffix.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode161
gcc/tree-tsan.c:161: tree __attribute__((weak))
Explain this.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode169
gcc/tree-tsan.c:169: extern __thread void **__tsan_shadow_stack; */
Need two spaces before */.  Same for other instances.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode182
gcc/tree-tsan.c:182:
Better use varpool_get_node interface.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode186
gcc/tree-tsan.c:186: TREE_STATIC (def) = 1;
Why mark TREE_STATIC (def) = 1? Should the variable be defined in tsan
library?

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode189
gcc/tree-tsan.c:189: DECL_TLS_MODEL (def) = decl_default_tls_model
(def);
Check targetm.have_tls here, though on targets without TLS tsan won't be
used anyway.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode200
gcc/tree-tsan.c:200: {
Refactor the code so that it can be shared with the above one.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode228
gcc/tree-tsan.c:228: {
The name of the function is very confusing. Change it to
get_tsan_mop_handler_decl or something like that.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode251
gcc/tree-tsan.c:251: /* Adds new ignore definition to the global list */
Add documentation on function parameters (in upper case) such as  TYPE
is the ignore type, and NAME is the name of the function to be ignored.
If there is return value, document it too.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode257
gcc/tree-tsan.c:257: desc = (struct tsan_ignore_desc_t*)xmalloc (sizeof
(*desc));
Use XCNEW to clear.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode264
gcc/tree-tsan.c:264: /* Checks as to whether identifier 'str' matches
template 'templ'.
Use STR instead of 'str'. 'templ' --> TEMPL.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode291
gcc/tree-tsan.c:291: if (spos == NULL)
Move the check up right after spos is computed.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode349
gcc/tree-tsan.c:349: printf ("failed to open ignore file '%s'\n",
flag_tsan_ignore);
Use error (..)

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode360
gcc/tree-tsan.c:360: if (line [sz-1] == '\r' || line [sz-1] == '\n')
sz-1 --> sz - 1

Change other instances

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode391
gcc/tree-tsan.c:391: src_name =
expand_location(cfun->function_start_locus).file;
space before (

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode413
gcc/tree-tsan.c:413: static const char *
Missing documentation.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode443
gcc/tree-tsan.c:443: tree rtl_stack;
Do not use rtl_ prefix. Same for other instances.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode459
gcc/tree-tsan.c:459: s = NULL;
MODIFY_EXPR?  directly use gimple_build_assign.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode725
gcc/tree-tsan.c:725:
This is wrong. SSA_NAME expr should be skipped.

http://codereview.appspot.com/5303083/diff/1/gcc/tree-tsan.c#newcode730
gcc/tree-tsan.c:730: {
remove {} Same for
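
Two of the review points above (zero-initialised allocation via XCNEW, and the `sz - 1` end-of-line trimming) can be illustrated with a small standalone sketch. This is not the code under review: `struct ignore_desc`, `trim_eol` and `new_ignore_desc` are hypothetical names, and plain `calloc` stands in for GCC's XCNEW, which likewise returns cleared memory.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical descriptor, loosely modelled on the reviewed code.  */
struct ignore_desc
{
  struct ignore_desc *next;
  int type;
  char *name;
};

/* Strip trailing CR/LF characters from LINE in place and return the
   new length.  */
size_t
trim_eol (char *line)
{
  size_t sz = strlen (line);

  while (sz > 0 && (line[sz - 1] == '\r' || line[sz - 1] == '\n'))
    line[--sz] = '\0';
  return sz;
}

/* Allocate a descriptor with every field cleared, then fill in TYPE
   and a copy of NAME.  Returns NULL if allocation fails.  */
struct ignore_desc *
new_ignore_desc (int type, const char *name)
{
  struct ignore_desc *desc = calloc (1, sizeof (*desc));
  size_t len;

  if (!desc)
    return NULL;
  len = strlen (name) + 1;
  desc->name = malloc (len);
  if (!desc->name)
    {
      free (desc);
      return NULL;
    }
  memcpy (desc->name, name, len);
  desc->type = type;
  return desc;
}
```

Because the memory is cleared on allocation, the `next` field is already NULL without an explicit assignment, which is the point of preferring a clearing allocator here.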

[PATCH] Slight improvements to vec_init code gen on sparc.

2011-10-31 Thread David Miller


There is definitely more that can be done in this area, but at least
this is a start.

Next we can start trying to use the ASI_FL{8,16,32}_P short floating
point loads, which zero-extend an 8-, 16-, or 32-bit integer value into
a double-precision float register.

gcc/

* config/sparc/sparc.c (vector_init_bshuffle): New function.
(vector_init_fpmerge): New function.
(sparc_expand_vector_init): Use them to improve non-const cases.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180696 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog|4 ++
 gcc/config/sparc/sparc.c |  105 ++
 2 files changed, 109 insertions(+), 0 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 037138a..a851ba1 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2011-10-30  David S. Miller  da...@davemloft.net
 
+   * config/sparc/sparc.c (vector_init_bshuffle): New function.
+   (vector_init_fpmerge): New function.
+   (sparc_expand_vector_init): Use them to improve non-const cases.
+
* dwarf2out.c (dwarf2out_var_location): When processing several
consecutive location notes, cache the result of next_real_insn().
 
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 3883dbd..fd1b190 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -11279,6 +11279,67 @@ output_v8plus_mult (rtx insn, rtx *operands, const char *name)
 }
 }
 
+static void
+vector_init_bshuffle (rtx target, rtx elt, enum machine_mode mode,
+ enum machine_mode inner_mode)
+{
+  rtx t1, final_insn;
+  int bmask;
+
+  t1 = gen_reg_rtx (mode);
+
+  elt = convert_modes (SImode, inner_mode, elt, true);
+  emit_move_insn (gen_lowpart(SImode, t1), elt);
+
+  switch (mode)
+   {
+   case V2SImode:
+ final_insn = gen_bshufflev2si_vis (target, t1, t1);
+ bmask = 0x45674567;
+ break;
+   case V4HImode:
+ final_insn = gen_bshufflev4hi_vis (target, t1, t1);
+ bmask = 0x67676767;
+ break;
+   case V8QImode:
+ final_insn = gen_bshufflev8qi_vis (target, t1, t1);
+ bmask = 0x;
+ break;
+   default:
+ gcc_unreachable ();
+   }
+
+  emit_insn (gen_bmasksi_vis (gen_reg_rtx (SImode), CONST0_RTX (SImode),
+ force_reg (SImode, GEN_INT (bmask))));
+  emit_insn (final_insn);
+}
+
+static void
+vector_init_fpmerge (rtx target, rtx elt, enum machine_mode inner_mode)
+{
+  rtx t1, t2, t3, t3_low;
+
+  t1 = gen_reg_rtx (V4QImode);
+  elt = convert_modes (SImode, inner_mode, elt, true);
+  emit_move_insn (gen_lowpart (SImode, t1), elt);
+
+  t2 = gen_reg_rtx (V4QImode);
+  emit_move_insn (t2, t1);
+
+  t3 = gen_reg_rtx (V8QImode);
+  t3_low = gen_lowpart (V4QImode, t3);
+
+  emit_insn (gen_fpmerge_vis (t3, t1, t2));
+  emit_move_insn (t1, t3_low);
+  emit_move_insn (t2, t3_low);
+
+  emit_insn (gen_fpmerge_vis (t3, t1, t2));
+  emit_move_insn (t1, t3_low);
+  emit_move_insn (t2, t3_low);
+
+  emit_insn (gen_fpmerge_vis (gen_lowpart (V8QImode, target), t1, t2));
+}
+
 void
 sparc_expand_vector_init (rtx target, rtx vals)
 {
@@ -11286,13 +11347,18 @@ sparc_expand_vector_init (rtx target, rtx vals)
   enum machine_mode inner_mode = GET_MODE_INNER (mode);
   int n_elts = GET_MODE_NUNITS (mode);
   int i, n_var = 0;
+  bool all_same;
   rtx mem;
 
+  all_same = true;
   for (i = 0; i < n_elts; i++)
 {
   rtx x = XVECEXP (vals, 0, i);
   if (!CONSTANT_P (x))
n_var++;
+
+  if (i > 0 && !rtx_equal_p (x, XVECEXP (vals, 0, 0)))
+   all_same = false;
 }
 
   if (n_var == 0)
@@ -11301,6 +11367,45 @@ sparc_expand_vector_init (rtx target, rtx vals)
   return;
 }
 
+  if (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (mode))
+{
+  if (GET_MODE_SIZE (inner_mode) == 4)
+   {
+ emit_move_insn (gen_lowpart (SImode, target),
+ gen_lowpart (SImode, XVECEXP (vals, 0, 0)));
+ return;
+   }
+  else if (GET_MODE_SIZE (inner_mode) == 8)
+   {
+ emit_move_insn (gen_lowpart (DImode, target),
+ gen_lowpart (DImode, XVECEXP (vals, 0, 0)));
+ return;
+   }
+}
+  else if (GET_MODE_SIZE (inner_mode) == GET_MODE_SIZE (word_mode)
+   && GET_MODE_SIZE (mode) == 2 * GET_MODE_SIZE (word_mode))
+{
+  emit_move_insn (gen_highpart (word_mode, target),
+ gen_lowpart (word_mode, XVECEXP (vals, 0, 0)));
+  emit_move_insn (gen_lowpart (word_mode, target),
+ gen_lowpart (word_mode, XVECEXP (vals, 0, 1)));
+  return;
+}
+
+  if (all_same && GET_MODE_SIZE (mode) == 8)
+{
+  if (TARGET_VIS2)
+   {
+ vector_init_bshuffle (target, XVECEXP (vals, 0, 0), mode, inner_mode);
+ return;
+   }
+  if (mode == V8QImode)
+   {
+
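
For readers unfamiliar with VIS2, the bmask constants in vector_init_bshuffle can be read as eight 4-bit byte selectors. The following is a host-side sketch in plain C, not GCC or SPARC code. `model_bshuffle` is a hypothetical helper, and it assumes the usual BSHUFFLE behaviour: the two 8-byte sources are concatenated into bytes 0..15, counted from the most significant byte, and each nibble of the 32-bit mask, most significant first, selects the source byte for the next destination byte.

```c
#include <stdint.h>

/* Hypothetical model of a VIS2-style byte shuffle, for illustration
   only.  SRC1 supplies bytes 0..7 and SRC2 bytes 8..15, counted from
   the most significant byte.  Each 4-bit field of MASK, taken from
   the most significant nibble down, picks the source byte for the
   next destination byte.  */
uint64_t
model_bshuffle (uint64_t src1, uint64_t src2, uint32_t mask)
{
  uint8_t bytes[16];
  uint64_t result = 0;
  int i;

  for (i = 0; i < 8; i++)
    {
      bytes[i] = (uint8_t) (src1 >> (56 - 8 * i));
      bytes[8 + i] = (uint8_t) (src2 >> (56 - 8 * i));
    }

  for (i = 0; i < 8; i++)
    {
      unsigned int sel = (mask >> (28 - 4 * i)) & 0xf;

      result = (result << 8) | bytes[sel];
    }

  return result;
}
```

Under this model, 0x45674567 replicates the low 32-bit word of the source twice and 0x67676767 replicates the low 16 bits four times, which is consistent with the V2SI and V4HI cases in the patch. The V8QI constant is truncated in the archived message, so it is not reproduced here.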

Re: C++ PATCH to add -std=c++11 ??

2011-10-31 Thread Gabriel Dos Reis

On Mon, Oct 31, 2011 at 12:26 AM, Jason Merrill ja...@redhat.com wrote:
 Here's my start at adjusting things to use the C++11 name; feel free to run
 with it.

 Looking at it again, I think adding __GXX_EXPERIMENTAL_CXX11__ is a mistake,
 we should just set __cplusplus to the C++11 value.

I tend to agree.  Too many macros to control C++11 may not necessarily
be a feature.  Tricky.

RE: [patch tree-optimization]: Improve handling of conditional-branches on targets with high branch costs

2011-10-31 Thread Jiangning Liu

 -Original Message-
 From: Kai Tietz [mailto:ktiet...@googlemail.com]
 Sent: Thursday, October 27, 2011 5:36 PM
 To: Jiangning Liu
 Cc: Michael Matz; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org;
 Richard Henderson
 Subject: Re: [patch tree-optimization]: Improve handling of
 conditional-branches on targets with high branch costs

 2011/10/27 Jiangning Liu jiangning@arm.com:

  -Original Message-
  From: Michael Matz [mailto:m...@suse.de]
  Sent: Wednesday, October 26, 2011 11:47 PM
  To: Kai Tietz
  Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-
 patc...@gcc.gnu.org;
  Richard Henderson
  Subject: Re: [patch tree-optimization]: Improve handling of
  conditional-branches on targets with high branch costs

  Hi,

  On Wed, 26 Oct 2011, Kai Tietz wrote:

   So you would mean that memory dereferencing shouldn't be
 considered
  as
   side-effect at all?

  No.  I haven't said this at all.  Of course it's a side-effect, but
  we're
  allowed to remove existing ones (under some circumstances).  We're
 not
  allowed to introduce new ones, which means that this ...

   So we would happily cause a crash with code like 'if (i && *i != 0)', as
   memory-dereference has for you no side-effect?

  ... is not allowed.  But in the original example the memread was on the
  left side, hence occurred always, therefore we can move it to the right
  side, even though it might occur less often.

   In your special case it might be valid to combine it, whether or not the
   first condition might trap (and C-fold-const doesn't know if the
   side-effect condition is really the first, as it might be a sub-sequence
   of a condition).  But branching has to cover the general cases.  If you
   find a way to determine that the left-hand operand in fold_const's
   branching code is really the left-most condition in the chain, then we
   can add such a special case, but I don't see an easy way to determine it
   here.

  Hmm?  I don't see why it's necessary to check if it's the left-most
  condition in a chain.  If the left hand of '&&' is a memread it can always
  be moved to the right side (or the operator transformed into '&' which can
  have the same effect), of course only if the original rhs is free of side
  effects, but then independent of whether the && was part of a larger
  expression.  The memread will possibly be done fewer times than originally,
  but as said, that's okay.

  Agree. The point is that for the small case I gave, the RHS doesn't have
  side effects at all, so the optimization of changing it to AND doesn't
  violate the C specification. We need to recover something for this case,
  although it did improve a lot for some particular benchmarks.

  Thanks,
  -Jiangning

  Ciao,
  Michael.

 Hmm, so we can allow merging to AND, if the left-hand-side might trap
 but has no-side-effects and rhs has neither trapping nor side-effects.
 As for the case that left-hand side has side-effects but right-hand
 not, we aren't allowed to do this AND/OR merge.  For example, for
 'if ((f = foo ()) != 0 && f < 24)' we aren't allowed to make this
 transformation.

 This shouldn't be that hard.  We need to provide to simple_operand_p_2
 an additional argument for checking trapping or not.

Would it be OK if I file a tracker in bugzilla against this?

 Regards,
 Kai
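
The distinction drawn in this thread can be illustrated with a small standalone C sketch. It is not GCC code, and the function names are made up for the example. In the first function the short-circuit must be kept, because the right-hand side would dereference a null pointer if it were evaluated unconditionally. In the second, the load is on the left and therefore always executes in the original, and the right-hand side neither traps nor has side effects, so evaluating both operands and combining them with '&' gives the same result.

```c
#include <stddef.h>

/* Must stay short-circuit: *p may only be read when p is non-null.  */
int
guarded_deref (const int *p)
{
  return p && *p != 0;
}

/* Original form: the load on the left always executes, and the
   right-hand side is a plain comparison with no side effects.  */
int
load_then_test (const int *p, int x)
{
  return *p != 0 && x > 0;
}

/* Merged form: both operands are evaluated unconditionally and
   combined with a bitwise AND, so no branch is needed.  */
int
load_then_test_merged (const int *p, int x)
{
  return (*p != 0) & (x > 0);
}
```

The two `load_then_test` variants agree for every valid input, whereas rewriting `guarded_deref` the same way would introduce a load that the original never performed.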

Re: resent2 [PATCH] Fix ICE in redirect_jump, at jump.c:1497 PR50496

2011-10-31 Thread Chung-Lin Tang

On 2011/10/25 02:04 AM, Bernd Schmidt wrote:
 On 10/24/11 20:02, Chung-Lin Tang wrote:
 On 2011/10/18 04:03 PM, Eric Botcazou wrote:
 thread_prologue_and_epilogue_insns should detect all cases where a
 return insn can be created. So any CFG cleanup that runs before it does
 not need this functionality.

 So we're left with CFG cleanups that run after it and could forward edges 
 to an 
 edge from a return insn to the exit block in order to build a new return 
 insn.
 
 We have no testcases to suggest that this ever happens.

Which does mean that, at least through the two call sites that my
original patch modified, it may be hard to ever find out later if the
patch is applied.

 Bernd, why can't we simply remove the assertion? The pre-reload case
 will fail at validation and return 0, matching pre-reload,
 pre-shrink-wrap behavior, while any possible remaining post-reload
 redirection to the exit block can just use 'ret_rtx' as the rare
 fallback
 
 No, after prologue insertion we have to distinguish between ret_rtx and
 simple_return_rtx.

I'm suggesting a new patch, as attached. Before reload_completed, we
directly return 0 upon nlabel == NULL, which should be identical with
old behavior, while asserting fail if after reload (where we assume the
simple_return/return distinction is required).

This should ensure better that, if a post-prologue case of redirecting
to the exit block ever happens we will more easily know (by some future
PR :P)

Bootstrapped and tested on i686, and cross tested on ARM using QEMU.
Eric, is this approach okay?

Thanks,
Chung-Lin

2011-10-31  Chung-Lin Tang  clt...@codesourcery.com

* jump.c (redirect_jump): Assert fail on nlabel == NULL_RTX
only after reload. Add comments.
Index: jump.c
===
--- jump.c  (revision 180421)
+++ jump.c  (working copy)
@@ -1495,8 +1495,19 @@ redirect_jump (rtx jump, rtx nlabel, int delete_un
 {
   rtx olabel = JUMP_LABEL (jump);
 
-  gcc_assert (nlabel != NULL_RTX);
+  if (!nlabel)
+{
+  /* For nlabel == NULL_RTX cases, if reload_completed == 0,
+return/simple_return are not yet creatable, thus we return 0
+immediately;  if reload_completed, we do not accept !nlabel
+at all, either a non-null label, or return/simple_return RTX.
+In that case assert fail.  */
 
+  if (!reload_completed)
+   return 0;
+  gcc_unreachable ();
+}
+
   if (nlabel == olabel)
 return 1;

Re: resent2 [PATCH] Fix ICE in redirect_jump, at jump.c:1497 PR50496

2011-10-31 Thread Eric Botcazou

 I'm suggesting a new patch, as attached. Before reload_completed, we
 directly return 0 upon nlabel == NULL, which should be identical with
 old behavior, while asserting fail if after reload (where we assume the
 simple_return/return distinction is required).

 This should ensure better that, if a post-prologue case of redirecting
 to the exit block ever happens we will more easily know (by some future
 PR :P)

 Bootstrapped and tested on i686, and cross tested on ARM using QEMU.
 Eric, is this approach okay?

Don't you want epilogue_completed instead of reload_completed?  Otherwise,
yes, the approach is fine with me, but wait for Bernd's input.

 2011-10-31  Chung-Lin Tang  clt...@codesourcery.com

   * jump.c (redirect_jump): Assert fail on nlabel == NULL_RTX
   only after reload. Add comments.

Minor rewording of the comment below:

+  if (!nlabel)
+{

/* If there is no label, we are asked to redirect to the EXIT block.  Now,
   before the epilogue is emitted, return/simple_return cannot be created
   so we return 0 immediately.  After the epilogue is emitted, we always
   expect a label, either a non-null label, or a return/simple_return RTX.  */
 
+  if (!reload_completed)
+   return 0;
+  gcc_unreachable ();
+}

-- 
Eric Botcazou

[PATCH] Re: vector shift regression on sparc

2011-10-31 Thread Jakub Jelinek

On Sun, Oct 30, 2011 at 12:38:32AM -0400, David Miller wrote:
 gcc.dg/pr48616.c segfaults on sparc as of a day or two ago
 
 vectorizable_shift() crashes because op1_vectype is NULL and
 we hit this code path:
 
   /* Vector shifted by vector.  */
   if (!scalar_shift_arg)
 {
   optab = optab_for_tree_code (code, vectype, optab_vector);
   if (vect_print_dump_info (REPORT_DETAILS))
   fprintf (vect_dump, "vector/vector shift/rotate found.");
 =>    if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
 
 dt[1] is vect_external_def and slp_node is non-NULL.
 
 Indeed, when the 'dt' arg to vect_is_simple_use_1() is
 vect_external_def *vectype will be set to NULL.

Here is a fix for that (and other issues that show up on these
testcases with -O3 -mxop if I disable all vector/scalar shift expanders
in sse.md).
For SLP it currently gives up more often than for loop vectorization,
I assume we could handle all dt[1] == vect_constant_def
and dt[2] == vect_external_def cases for SLP (and at least the former
even if the constants differ between nodes) by building the vectors by hand,
though the current vect_get_vec_defs/vect_get_vec_defs_for_stmt_copy can't
be used for that as is.

2011-10-28  Jakub Jelinek  ja...@redhat.com

* tree-vect-stmts.c (vectorizable_shift): If op1 is vect_external_def
in a loop and has different type from op0, cast it to op0's type
before the loop first.  For slp give up.  Don't crash if op1_vectype
is NULL.

* gcc.dg/vshift-3.c: New test.
* gcc.dg/vshift-4.c: New test.
* gcc.dg/vshift-5.c: New test.

--- gcc/tree-vect-stmts.c.jj2011-10-28 16:21:06.0 +0200
+++ gcc/tree-vect-stmts.c   2011-10-31 10:27:57.0 +0100
@@ -2446,7 +2446,10 @@ vectorizable_shift (gimple stmt, gimple_
   optab = optab_for_tree_code (code, vectype, optab_vector);
   if (vect_print_dump_info (REPORT_DETAILS))
 fprintf (vect_dump, "vector/vector shift/rotate found.");
-  if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
+  if (!op1_vectype)
+   op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);
+  if (op1_vectype == NULL_TREE
+ || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
{
  if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "unusable type for last operand in"
@@ -2480,9 +2483,28 @@ vectorizable_shift (gimple stmt, gimple_
   /* Unlike the other binary operators, shifts/rotates have
  the rhs being int, instead of the same type as the lhs,
  so make sure the scalar is the right type if we are
- dealing with vectors of short/char.  */
+dealing with vectors of long long/long/short/char.  */
   if (dt[1] == vect_constant_def)
 op1 = fold_convert (TREE_TYPE (vectype), op1);
+ else if (!useless_type_conversion_p (TREE_TYPE (vectype),
+  TREE_TYPE (op1)))
+   {
+ if (slp_node
+  && TYPE_MODE (TREE_TYPE (vectype))
+!= TYPE_MODE (TREE_TYPE (op1)))
+   {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "unusable type for last operand in"
+  " vector/vector shift/rotate.");
+   return false;
+   }
+ if (vec_stmt && !slp_node)
+   {
+ op1 = fold_convert (TREE_TYPE (vectype), op1);
+ op1 = vect_init_vector (stmt, op1,
+ TREE_TYPE (vectype), NULL);
+   }
+   }
 }
 }
 }
--- gcc/testsuite/gcc.dg/vshift-3.c.jj  2011-10-31 10:00:57.0 +0100
+++ gcc/testsuite/gcc.dg/vshift-3.c 2011-10-31 10:00:42.0 +0100
@@ -0,0 +1,136 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#include <stdlib.h>
+
+#define N 64
+
+#ifndef TYPE1
+#define TYPE1 int
+#define TYPE2 long long
+#endif
+
+signed TYPE1 a[N], b, g[N];
+unsigned TYPE1 c[N], h[N];
+signed TYPE2 d[N], e, j[N];
+unsigned TYPE2 f[N], k[N];
+
+#ifndef S
+#define S(x) x
+#endif
+
+__attribute__((noinline)) void
+f1 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] << S (b);
+}
+
+__attribute__((noinline)) void
+f2 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    g[i] = a[i] >> S (b);
+}
+
+__attribute__((noinline)) void
+f3 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    h[i] = c[i] >> S (b);
+}
+
+__attribute__((noinline)) void
+f4 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] << S (e);
+}
+
+__attribute__((noinline)) void
+f5 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    j[i] = d[i] >> S (e);
+}
+
+__attribute__((noinline)) void
+f6 (void)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    k[i] = f[i] >> S (e);
+}
+

Re: [PR50869] don't attempt to expand CFA within cselib

2011-10-31 Thread Jakub Jelinek

On Fri, Oct 28, 2011 at 07:07:18PM -0200, Alexandre Oliva wrote:
 for  gcc/ChangeLog
 from  Alexandre Oliva  aol...@redhat.com
 
   PR debug/50869
   * cselib.c (cfa_base_preserved_regno): Initialize.
   (cselib_expand_value_rtx_1): Don't expand it.
   * var-tracking.c (vt_expand_var_loc_chain): Initialize depth.
   Check it's only zero if result is NULL.

Ok for trunk, thanks.

Jakub

Re: C++ PATCH to add -std=c++11 ??

2011-10-31 Thread Paolo Carlini


On 10/31/2011 06:26 AM, Jason Merrill wrote:
Here's my start at adjusting things to use the C++11 name; feel free 
to run with it.
Great. When you commit it, you can as well add 'PR c++/50920' to the 
ChangeLog!


Paolo.

Re: [ARM] Fix PR49641

2011-10-31 Thread Sebastian Huber


On 10/25/2011 06:56 PM, Richard Earnshaw wrote:

On 24/10/11 14:30, Sebastian Huber wrote:

Hello,

what about the attached patch based on the original patch provided by Bernd
Schmidt with modifications suggested by Richard Earnshaw.



pr49641.patch


* config/arm/arm.c (store_multiple_sequence): Avoid cases where
the base reg is stored iff compiling for Thumb1.

* gcc.target/arm/pr49641.c: New test.


OK.

R.



Would someone mind committing it?  Thanks.

--
Sebastian Huber, embedded brains GmbH

Address : Obere Lagerstr. 30, D-82178 Puchheim, Germany
Phone   : +49 89 18 90 80 79-6
Fax : +49 89 18 90 80 79-9
E-Mail  : sebastian.hu...@embedded-brains.de
PGP : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.

Re: [PATCH] Handle many consecutive location notes more efficiently in dwarf2.

2011-10-31 Thread Jakub Jelinek

On Sun, Oct 30, 2011 at 09:55:42PM -0400, David Miller wrote:
 --- a/gcc/dwarf2out.c
 +++ b/gcc/dwarf2out.c
 @@ -20149,7 +20151,35 @@ dwarf2out_var_location (rtx loc_note)
   if (var_loc_p && !DECL_P (NOTE_VAR_LOCATION_DECL (loc_note)))
  return;
  
 -  next_real = next_real_insn (loc_note);
 +  /* Optimize processing a large consecutive sequence of location
 + notes so we don't spend too much time in next_real_insn.  If the
 + next insn is another location note, remember the next_real_insn
 + calculation for next time.  */
 +  next_real = cached_next_real_insn;
 +  if (next_real)
 +{
 +  if (expected_next_loc_note != loc_note)
 + next_real = NULL_RTX;
 +}
 +
 +  next_note = NEXT_INSN (loc_note);
 +  if (! next_note
 +  || INSN_DELETED_P (next_note)
 +  || GET_CODE (next_note) != NOTE
 +  || (NOTE_KIND (next_note) != NOTE_INSN_VAR_LOCATION

I think for next_note being NOTE_INSN_VAR_LOCATION you want to set
next_note to NULL_RTX if !DECL_P (NOTE_VAR_LOCATION_DECL (next_note)).

Otherwise you risk that the above
  if (var_loc_p && !DECL_P (NOTE_VAR_LOCATION_DECL (loc_note)))
return;
will not clear the cache, you reach end of function and in the
next function when dwarf2out_var_location is called for the first time,
cached_next_real_insn will be non-NULL and if you have really bad luck
it will be called on insn that has the same address as
expected_next_loc_note (GC collection could happen in between).
Or alternatively you could remove the whole if (! !next_note ...) next_note = 
NULL_RTX;
stmt and move your cache to a global var and clear it when reaching end of
function (like e.g. last_var_location_insn is cleared in
dwarf2out_end_epilogue).

Jakub
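
The hazard described above can be sketched with a toy model in plain C. This is not the dwarf2out.c code: `struct insn`, `lookup_next_real` and `reset_next_real_cache` are hypothetical names, and the list walk stands in for next_real_insn. The sketch shows a one-entry cache keyed on the note that is expected to be processed next, together with an explicit reset that would be called at the end of each function so that a stale pointer cannot match an unrelated insn later.

```c
#include <stddef.h>

/* Toy insn: either a location note or a real instruction.  */
struct insn
{
  int is_note;
  struct insn *next;
};

static struct insn *cached_next_real;
static struct insn *expected_next_note;

/* Counts the notes skipped by the slow path, to show the saving.  */
unsigned long walk_steps;

/* Slow path: skip forward to the first insn that is not a note.  */
static struct insn *
next_real_insn_model (struct insn *insn)
{
  for (insn = insn->next; insn && insn->is_note; insn = insn->next)
    walk_steps++;
  return insn;
}

/* Return the next real insn after NOTE, reusing the cached answer
   when NOTE is the note the previous call predicted.  */
struct insn *
lookup_next_real (struct insn *note)
{
  struct insn *next_real;

  if (cached_next_real && expected_next_note == note)
    next_real = cached_next_real;
  else
    next_real = next_real_insn_model (note);

  /* Remember the answer only if the next insn is another note.  */
  if (note->next && note->next->is_note)
    {
      cached_next_real = next_real;
      expected_next_note = note->next;
    }
  else
    {
      cached_next_real = NULL;
      expected_next_note = NULL;
    }
  return next_real;
}

/* Intended to be called at the end of each function.  */
void
reset_next_real_cache (void)
{
  cached_next_real = NULL;
  expected_next_note = NULL;
}
```

In this model every return path either refreshes or clears the cache. The concern raised in the message is an early return that does neither, which is why an unconditional reset at function end is the more robust of the two alternatives.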

Re: [PATCH] Re: vector shift regression on sparc

2011-10-31 Thread Ira Rosen

On 31 October 2011 11:53, Jakub Jelinek ja...@redhat.com wrote:
 On Sun, Oct 30, 2011 at 12:38:32AM -0400, David Miller wrote:
 gcc.dg/pr48616.c segfaults on sparc as of a day or two ago

 vectorizable_shift() crashes because op1_vectype is NULL and
 we hit this code path:

   /* Vector shifted by vector.  */
   if (!scalar_shift_arg)
     {
       optab = optab_for_tree_code (code, vectype, optab_vector);
       if (vect_print_dump_info (REPORT_DETAILS))
       fprintf (vect_dump, "vector/vector shift/rotate found.");
 =>    if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))

 dt[1] is vect_external_def and slp_node is non-NULL.

 Indeed, when the 'dt' arg to vect_is_simple_use_1() is
 vect_external_def *vectype will be set to NULL.

 Here is a fix for that (and other issues that show up on these
 testcases with -O3 -mxop if I disable all vector/scalar shift expanders
 in sse.md).
 For SLP it currently gives up more often than for loop vectorization,
 I assume we could handle all dt[1] == vect_constant_def
 and dt[2] == vect_external_def cases for SLP (and at least the former
 even if the constants differ between nodes) by building the vectors by hand,
 though the current vect_get_vec_defs/vect_get_vec_defs_for_stmt_copy can't
 be used for that as is.

 2011-10-28  Jakub Jelinek  ja...@redhat.com

        * tree-vect-stmts.c (vectorizable_shift): If op1 is vect_external_def
        in a loop and has different type from op0, cast it to op0's type
        before the loop first.  For slp give up.  Don't crash if op1_vectype
        is NULL.

        * gcc.dg/vshift-3.c: New test.
        * gcc.dg/vshift-4.c: New test.
        * gcc.dg/vshift-5.c: New test.

 --- gcc/tree-vect-stmts.c.jj    2011-10-28 16:21:06.0 +0200
 +++ gcc/tree-vect-stmts.c       2011-10-31 10:27:57.0 +0100
 @@ -2446,7 +2446,10 @@ vectorizable_shift (gimple stmt, gimple_
       optab = optab_for_tree_code (code, vectype, optab_vector);
       if (vect_print_dump_info (REPORT_DETAILS))
         fprintf (vect_dump, vector/vector shift/rotate found.);
 -      if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
 +      if (!op1_vectype)
 +       op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);
 +      if (op1_vectype == NULL_TREE
 +         || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
        {
          if (vect_print_dump_info (REPORT_DETAILS))
            fprintf (vect_dump, unusable type for last operand in
 @@ -2480,9 +2483,28 @@ vectorizable_shift (gimple stmt, gimple_
               /* Unlike the other binary operators, shifts/rotates have
                  the rhs being int, instead of the same type as the lhs,
                  so make sure the scalar is the right type if we are
 -                 dealing with vectors of short/char.  */
 +                dealing with vectors of long long/long/short/char.  */
               if (dt[1] == vect_constant_def)
                 op1 = fold_convert (TREE_TYPE (vectype), op1);
 +             else if (!useless_type_conversion_p (TREE_TYPE (vectype),
 +                                                  TREE_TYPE (op1)))

What happens in case dt[1] == vect_internal_def?

Thanks,
Ira

 +               {
 +                 if (slp_node
 +                     && TYPE_MODE (TREE_TYPE (vectype))
 +                        != TYPE_MODE (TREE_TYPE (op1)))
 +                   {
 +                     if (vect_print_dump_info (REPORT_DETAILS))
 +                     fprintf (vect_dump, "unusable type for last operand in"
 +                                          " vector/vector shift/rotate.");
 +                       return false;
 +                   }
 +                 if (vec_stmt && !slp_node)
 +                   {
 +                     op1 = fold_convert (TREE_TYPE (vectype), op1);
 +                     op1 = vect_init_vector (stmt, op1,
 +                                             TREE_TYPE (vectype), NULL);
 +                   }
 +               }
             }
         }
     }



        Jakub

Re: [PATCH] Re: vector shift regression on sparc

2011-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2011 at 01:14:25PM +0200, Ira Rosen wrote:
  --- gcc/tree-vect-stmts.c.jj    2011-10-28 16:21:06.0 +0200
  +++ gcc/tree-vect-stmts.c       2011-10-31 10:27:57.0 +0100
  @@ -2446,7 +2446,10 @@ vectorizable_shift (gimple stmt, gimple_
        optab = optab_for_tree_code (code, vectype, optab_vector);
        if (vect_print_dump_info (REPORT_DETAILS))
          fprintf (vect_dump, vector/vector shift/rotate found.);
  -      if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
  +      if (!op1_vectype)
  +       op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);
  +      if (op1_vectype == NULL_TREE
  +         || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
         {
           if (vect_print_dump_info (REPORT_DETAILS))
             fprintf (vect_dump, unusable type for last operand in
  @@ -2480,9 +2483,28 @@ vectorizable_shift (gimple stmt, gimple_
                /* Unlike the other binary operators, shifts/rotates have
                   the rhs being int, instead of the same type as the lhs,
                   so make sure the scalar is the right type if we are
  -                 dealing with vectors of short/char.  */
  +                dealing with vectors of long long/long/short/char.  */
                if (dt[1] == vect_constant_def)
                  op1 = fold_convert (TREE_TYPE (vectype), op1);
  +             else if (!useless_type_conversion_p (TREE_TYPE (vectype),
  +                                                  TREE_TYPE (op1)))
 
 What happens in case dt[1] == vect_internal_def?

For !slp_node we can't reach this with dt[1] == vect_internal_def,
because of:
  if (dt[1] == vect_internal_def && !slp_node)
    scalar_shift_arg = false;
And for slp_node I'm just giving up if type modes don't match:

  +               {
  +                 if (slp_node
  +                     && TYPE_MODE (TREE_TYPE (vectype))
  +                        != TYPE_MODE (TREE_TYPE (op1)))
  +                   {
  +                     if (vect_print_dump_info (REPORT_DETAILS))
  +                     fprintf (vect_dump, "unusable type for last operand in"
  +                                          " vector/vector shift/rotate.");
  +                       return false;
  +                   }
  +                   }

BTW, even the pre-existing if (dt[1] == vect_constant_def) doesn't seem to
be 100% correct for slp_node != NULL, I think vect_get_constant_vectors
will in that case create a VECTOR_CST with the desirable vector type
(same type mode as op0's vector type mode), but the constants in the
VECTOR_CST will have a wrong type (say V4DImode VECTOR_CST with
SImode constants in its constructor).  The expander doesn't ICE on it
though.

Jakub
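
The type issue behind this patch can be shown with a scalar-only sketch in plain C. It is not vectorizer code: `shift_scalar` and `shift_vector_like` are hypothetical names, and `LANES` is an arbitrary stand-in for the vector width. The source loop shifts 64-bit elements by an `int` count. A vector/vector shift needs the count in the element type, so the sketch converts it once before the loop and broadcasts it, which roughly corresponds to the fold_convert plus vect_init_vector step in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 2

/* Scalar reference: 64-bit elements shifted left by an int count.
   The shift is done on the unsigned type to avoid signed overflow.  */
void
shift_scalar (int64_t *dst, const int64_t *src, size_t n, int count)
{
  size_t i;

  for (i = 0; i < n; i++)
    dst[i] = (int64_t) ((uint64_t) src[i] << count);
}

/* Vector-like form: the count is converted to the element type once,
   before the loop, and replicated into every lane.  */
void
shift_vector_like (int64_t *dst, const int64_t *src, size_t n, int count)
{
  int64_t count_vec[LANES];
  size_t i, lane;

  for (lane = 0; lane < LANES; lane++)
    count_vec[lane] = (int64_t) count;

  /* Whole groups of LANES elements, one lane at a time.  */
  for (i = 0; i + LANES <= n; i += LANES)
    for (lane = 0; lane < LANES; lane++)
      dst[i + lane] = (int64_t) ((uint64_t) src[i + lane] << count_vec[lane]);

  /* Scalar tail for the leftover elements.  */
  for (; i < n; i++)
    dst[i] = (int64_t) ((uint64_t) src[i] << count);
}
```

Both functions produce the same results. The sketch only makes visible where the conversion of the shift count happens, which is the step the patch adds for counts that are neither constants nor already of the element type.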

Re: [PATCH] Re: vector shift regression on sparc

2011-10-31 Thread Ira Rosen

On 31 October 2011 13:23, Jakub Jelinek ja...@redhat.com wrote:
 On Mon, Oct 31, 2011 at 01:14:25PM +0200, Ira Rosen wrote:
  --- gcc/tree-vect-stmts.c.jj    2011-10-28 16:21:06.0 +0200
  +++ gcc/tree-vect-stmts.c       2011-10-31 10:27:57.0 +0100
  @@ -2446,7 +2446,10 @@ vectorizable_shift (gimple stmt, gimple_
        optab = optab_for_tree_code (code, vectype, optab_vector);
        if (vect_print_dump_info (REPORT_DETAILS))
          fprintf (vect_dump, "vector/vector shift/rotate found.");
  -      if (TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
  +      if (!op1_vectype)
  +       op1_vectype = get_same_sized_vectype (TREE_TYPE (op1), vectype_out);
  +      if (op1_vectype == NULL_TREE
  +         || TYPE_MODE (op1_vectype) != TYPE_MODE (vectype))
         {
           if (vect_print_dump_info (REPORT_DETAILS))
             fprintf (vect_dump, "unusable type for last operand in"
  @@ -2480,9 +2483,28 @@ vectorizable_shift (gimple stmt, gimple_
                /* Unlike the other binary operators, shifts/rotates have
                   the rhs being int, instead of the same type as the lhs,
                   so make sure the scalar is the right type if we are
  -                 dealing with vectors of short/char.  */
  +                dealing with vectors of long long/long/short/char.  */
                if (dt[1] == vect_constant_def)
                  op1 = fold_convert (TREE_TYPE (vectype), op1);
  +             else if (!useless_type_conversion_p (TREE_TYPE (vectype),
  +                                                  TREE_TYPE (op1)))

 What happens in case dt[1] == vect_internal_def?

 For !slp_node we can't reach this with dt[1] == vect_internal_def,
 because of:
  if (dt[1] == vect_internal_def && !slp_node)
    scalar_shift_arg = false;
 And for slp_node I'm just giving up if type modes don't match:

  +               {
  +                 if (slp_node
  +                     && TYPE_MODE (TREE_TYPE (vectype))
  +                        != TYPE_MODE (TREE_TYPE (op1)))
  +                   {
  +                     if (vect_print_dump_info (REPORT_DETAILS))
  +                     fprintf (vect_dump, "unusable type for last operand in"
  +                                         " vector/vector shift/rotate.");
  +                       return false;
  +                   }


Ah, OK.

 BTW, even the pre-existing if (dt[1] == vect_constant_def) doesn't seem to
 be 100% correct for slp_node != NULL, I think vect_get_constant_vectors
 will in that case create a VECTOR_CST with the desirable vector type
 (same type mode as op0's vector type mode), but the constants in the
 VECTOR_CST will have a wrong type (say V4DImode VECTOR_CST with
 SImode constants in its constructor).  The expander doesn't ICE on it
 though.

Right. As you wrote before, we should probably change shift vectors
creation for SLP.

The patch is OK.

Thanks,
Ira


        Jakub

Re: [patch, Fortran] Fix PR 50690

2011-10-31 Thread Tobias Burnus


Tobias Burnus wrote:
I had also a glance at the patch - and it looks reasonable; in 
particular, I failed to generate a failing test case.


Actually, the test case is *not* OK.

If one compiles the original test case of the PR (or your 
workshare2.f90) with -O and looks at -fdump-tree-original, one finds:


#pragma omp parallel default(shared)
  {
    {
      real(kind=4) __var_1;
      {
        #pragma omp single
          {
            __var_1 = __builtin_cosf (b[0])
          }
...
        #pragma omp for schedule(static) nowait
        for (S.1 = 1; S.1 <= 5; S.1 = S.1 + 1)
          {
            a[S.1 + -1] = a[S.1 + -1] * D.1730 + a[S.1 + -1] * D.1731;


Thus, __var_1 is a thread-local variable; however, COS() is not executed 
in all threads but only in one due to the omp single: "The single 
construct specifies that the associated structured block is executed by 
only one of the threads in the team" (2.5.3 single Construct, OpenMP 3.1).
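
To make the problem concrete, here is a small C sketch of the same pattern. It is not derived from the Fortran front end's output; compute_once, UNSET and threads_seeing_value are hypothetical names, and compute_once merely stands in for the cos() call. A variable declared inside the parallel region is private to each thread, and the single block assigns it on only one of them.

```c
#include <assert.h>

#define UNSET (-1.0f)

/* Hypothetical stand-in for cos (B(1)); only assumed never to return
   UNSET for the inputs used here.  */
static float
compute_once (float x)
{
  return x * 0.5f + 1.0f;
}

/* Count how many threads observe the value assigned inside the single
   block.  Build with -fopenmp to get real threads; without it the
   pragmas are ignored and the body runs once.  */
int
threads_seeing_value (float b0)
{
  int seen = 0;

  assert (compute_once (b0) != UNSET);
#pragma omp parallel default(shared)
  {
    float var = UNSET;		/* One private copy per thread.  */
#pragma omp single
    {
      var = compute_once (b0);	/* Executed by exactly one thread.  */
    }
    /* The implicit barrier after single does not copy VAR around.  */
    if (var != UNSET)
      {
#pragma omp atomic
	seen++;
      }
  }
  return seen;
}
```

However many threads the team has, the function returns 1: every other thread still holds UNSET, which is the situation the dump above shows for __var_1.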


Jakub remarks that omp single is what we expand omp workshare to if it 
is not simple enough for us.


 * * *

With the test case below, the dump looks OK, but the FE optimization 
does not combine the two cos() calls; I have no idea why. The dump 
looks like this:


  #pragma omp parallel default(shared)
{
D.1743 = __builtin_cosf (b[0]);
D.1745 = __builtin_cosf (b[0]);
...
  #pragma omp for schedule(static) nowait
  for (S.2 = 1; S.2 <= 10; S.2 = S.2 + 1)
  a[S.2 + D.1750] = a[S.2 + D.1748] * D.1743 + a[S.2 + D.1749] * D.1745;


Tobias

PS: The test case is:

program workshare
  implicit none
  real, parameter :: eps = 3e-7
  integer :: j
  real :: A(10,5), B(5)
  B(1) = 3.344
  call random_number(a)
  !$omp parallel default(shared)
  !$omp workshare
  forall (j=1:5)
    A(:,j) = A(:,j)*cos(B(1))+A(:,j)*cos(B(1))
  end forall
  !$omp end workshare
  !$omp end parallel
  print *, A
end program workshare

subroutine parallel_workshare
  implicit none
  real, parameter :: eps = 3e-7
  integer :: j
  real :: A(10,5), B(5)
  B(1) = 3.344
  call random_number(a)
  !$omp parallel workshare default(shared)
  forall (j=1:5)
    A(:,j) = A(:,j)*cos(B(1))+A(:,j)*cos(B(1))
  end forall
  !$omp end parallel workshare
  print *, A
end subroutine parallel_workshare

fixes after the review (issue5303083)

2011-10-31 Thread Dmitriy Vyukov

Fixes after davidxl review.
The patch is for google/main branch.

2011-10-31   Dmitriy Vyukov  dvyu...@google.com

* gcc/doc/invoke.texi:
* gcc/tree-tsan.c (enum tsan_ignore_type):
(struct bb_data):
(struct mop_desc):
(struct tsan_ignore_desc):
(lookup_name):
(build_var_decl):
(get_shadow_stack_decl):
(get_thread_ignore_decl):
(get_handle_mop_decl):
(ignore_append):
(ignore_match):
(ignore_load):
(tsan_ignore):
(decl_name):
(build_stack_op):
(build_rec_ignore_op):
(build_stack_assign):
(instr_mop):
(instr_vptr_store):
(instr_func):
(set_location):
(is_dtor_vptr_store):
(is_vtbl_read):
(is_load_of_const):
(handle_expr):
(handle_gimple):
(instrument_bblock):
(instrument_mops):
(instrument_function):
(tsan_pass):
(tsan_gate):
* gcc/tree-pass.h:
* gcc/testsuite/gcc.dg/tsan-ignore.ignore:
* gcc/testsuite/gcc.dg/tsan.h (__tsan_init):
(__tsan_expect_mop):
(__tsan_handle_mop):
* gcc/testsuite/gcc.dg/tsan-ignore.c (foo):
(int bar):
(int baz):
(int bla):
(int xxx):
(main):
* gcc/testsuite/gcc.dg/tsan-ignore.h (in_tsan_ignore_header):
* gcc/testsuite/gcc.dg/tsan-stack.c (foobar):
* gcc/testsuite/gcc.dg/tsan-mop.c:
* gcc/common.opt:
* gcc/Makefile.in:
* gcc/passes.c:

Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi (revision 180522)
+++ gcc/doc/invoke.texi (working copy)
@@ -308,6 +308,7 @@
 -fdump-tree-ssa@r{[}-@var{n}@r{]} -fdump-tree-pre@r{[}-@var{n}@r{]} @gol
 -fdump-tree-ccp@r{[}-@var{n}@r{]} -fdump-tree-dce@r{[}-@var{n}@r{]} @gol
 -fdump-tree-gimple@r{[}-raw@r{]} -fdump-tree-mudflap@r{[}-@var{n}@r{]} @gol
+-fdump-tree-tsan@r{[}-@var{n}@r{]} @gol
 -fdump-tree-dom@r{[}-@var{n}@r{]} @gol
 -fdump-tree-dse@r{[}-@var{n}@r{]} @gol
 -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol
@@ -381,8 +382,8 @@
 -floop-parallelize-all -flto -flto-compression-level @gol
 -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
--fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
--fno-default-inline @gol
+-fmove-loop-invariants -fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
+-ftsan -ftsan-ignore -fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
 -fno-inline -fno-math-errno -fno-peephole -fno-peephole2 @gol
 -fno-sched-interblock -fno-sched-spec -fno-signed-zeros @gol
@@ -5896,6 +5897,11 @@
 Dump each function after adding mudflap instrumentation.  The file name is
 made by appending @file{.mudflap} to the source file name.
 
+@item tsan
+@opindex fdump-tree-tsan
+Dump each function after adding ThreadSanitizer instrumentation.  The file name is
+made by appending @file{.tsan} to the source file name.
+
 @item sra
 @opindex fdump-tree-sra
 Dump each function after performing scalar replacement of aggregates.  The
@@ -6674,6 +6680,12 @@
 some protection against outright memory corrupting writes, but allows
 erroneously read data to propagate within a program.
 
+@item -ftsan -ftsan-ignore
+@opindex ftsan
+@opindex ftsan-ignore
+Add ThreadSanitizer instrumentation. Use @option{-ftsan-ignore} to specify
+an ignore file. Refer to http://go/tsan for details.
+
 @item -fthread-jumps
 @opindex fthread-jumps
 Perform optimizations where we check to see if a jump branches to a
Index: gcc/tree-tsan.c
===
--- gcc/tree-tsan.c (revision 0)
+++ gcc/tree-tsan.c (revision 0)
@@ -0,0 +1,1125 @@
+/* ThreadSanitizer instrumentation pass.
+   http://code.google.com/p/data-race-test
+   Copyright (C) 2011
+   Free Software Foundation, Inc.
+   Contributed by Dmitry Vyukov dvyu...@google.com
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "intl.h"
+#include "tm.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "function.h"
+#include "tree-flow.h"
+#include "tree-pass.h"
+#include "cfghooks.h"
+#include

Re: [PATCH] Optimize in RTL vector AND { -1, -1, ... }, IOR { -1, -1, ... } and XOR { -1, -1, ... } (take 2)

2011-10-31 Thread Henderson, Stuart

2011-09-26  Jakub Jelinek  ja...@redhat.com

   * rtl.h (const_tiny_rtx): Change into array of 4 x MAX_MACHINE_MODE
   from 3 x MAX_MACHINE_MODE.
   (CONSTM1_RTX): Define.
   * emit-rtl.c (const_tiny_rtx): Change into array of 4 x MAX_MACHINE_MODE
   from 3 x MAX_MACHINE_MODE.
   (gen_rtx_CONST_VECTOR): Use CONSTM1_RTX if all inner constants are
   CONSTM1_RTX.
   (init_emit_once): Initialize CONSTM1_RTX for MODE_INT and
   MODE_VECTOR_INT modes.
   * simplify-rtx.c (simplify_binary_operation_1) <case IOR, XOR, AND>:
   Optimize if one operand is CONSTM1_RTX.
   * config/i386/i386.c (ix86_expand_sse_movcc): Optimize mask ? -1 : x
   into mask | x.
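
As a scalar illustration of the identities this ChangeLog relies on (a sketch only; select_m1 and ior_m1 are hypothetical names, not functions from the patch): when each mask element is either 0 or all ones, selecting -1 under the mask is the same as OR-ing the mask in.

```c
#include <assert.h>
#include <stdint.h>

/* The form before the optimization: pick -1 where the mask is set.
   The mask is assumed to be 0 or all ones, as an SSE compare result
   element would be.  */
int32_t
select_m1 (int32_t mask, int32_t x)
{
  assert (mask == 0 || mask == -1);
  return mask ? -1 : x;
}

/* The form after the optimization: a single inclusive OR.  */
int32_t
ior_m1 (int32_t mask, int32_t x)
{
  assert (mask == 0 || mask == -1);
  return mask | x;
}
```

The simplify-rtx part rests on the matching facts for an all-ones operand: x & -1 is x, x | -1 is -1, and x ^ -1 is ~x.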

FYI - this patch (179238) breaks the Blackfin compiler build with an internal 
compiler error during configure of libgcc:
conftest.c:1:0: internal compiler error: in gen_const_vector, at emit-rtl.c:5491

which is the:
  gcc_assert (const_tiny_rtx[constant][(int) inner]);


gcc configured with:
../gcc-4.7/configure --build=x86_64-unknown-linux-gnu 
--host=x86_64-unknown-linux-gnu --target=bfin-elf 
--prefix=/home/shender/gnu/toolchain/bfin-elf --disable-libstdcxx-pch 
--enable-languages=c,c++ --with-newlib --enable-clocale=generic 
--disable-symvers --disable-libssp --disable-libffi --disable-libgcj 
--enable-version-specific-runtime-libs --enable-__cxa_atexit

Stu

Re: [RFC PATCH] update to libtool-2.4.2 and regenerate

2011-10-31 Thread Rainer Orth

Markus Trippelsdorf mar...@trippelsdorf.de writes:

 By popular demand, I've prepared a patch that updates the in-tree
 libtool to version 2.4.2. It is needed for lto-bootstrap with
 -fno-fat-lto-objects and FreeBSD10.x versions. 
 It's a pretty big update as you can see by the following diffstat. I
 cannot attach the patch even as a gzip file, because of its size:

  417745 Oct 28 00:47 0001-update-to-libtool-2.4.2-and-regenerate.patch.gz

 Bootstrapped on x86_64-pc-linux-gnu. 

 Comments? Stage 1 will end soon and it would be nice to get this in.

I've tried your patch on i386-pc-solaris2.11 this weekend in a variety
of configurations:

* using ld or gld 2.21.1,

* with the 32-bit default configuration (i386-pc-solaris2.11) and the
  64-bit default configuration (amd64-pc-solaris2.11).

This revealed a couple of problems:

* If Go support is included (off by default), bootstrap breaks like this
  while building libgo:

libtool: Version mismatch error.  This is libtool 2.4.2, but the
libtool: definition of this LT_INIT comes from libtool 2.2.7a.
libtool: You should recreate aclocal.m4 with macros from libtool 2.4.2
libtool: and run autoconf again.
make[4]: *** [go-assert.lo] Error 63

  To avoid this, I've run all bootstraps without Go.

* When building the 64-bit default gld configuration, building the
  64-bit libjava fails like this:

Error libtool: compile: not configured to build any kind of library
libtool: compile: See the libtool documentation for more information.
make[5]: libtool: compile: Fatal configuration error.

  I had already patched the copy of libtool.m4 to deal with the new
  configuration and now also submitted it upstream:

Support 64-bit default GCC on Solaris/x86
http://lists.gnu.org/archive/html/libtool-patches/2011-10/msg00021.html

  After applying the patch and regenerating all affected configure
  scripts, the bootstrap completed.

With those two changes, all four bootstraps completed without regressions.

I made a quick comparison of the libtool.m4 in libgo/config with the
2.4.2 version: the only relevant change seems to be an instance of
AC_PROG_GO, which also lives in go.m4.  Ian will know why that
additional copy is necessary.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-31 Thread Jakub Jelinek

On Sat, Oct 29, 2011 at 03:53:37PM +0200, Toon Moene wrote:
 I wonder whether it will work with the attached Fortran routine - it
 sure would mean a boost to the 18%+ heaviest CPU user in our code.

It didn't do anything, but only because I used a bad approach in
vect_check_gather.  I have been using DR_BASE_ADDRESS/DR_OFFSET/DR_INIT,
into which dr_analyze_innermost splits the reference.  But that split is
meant to be usable for alias analysis: it separates the variable and
constant offsets using split_constant_offset, which in your testcase would
create largish expressions that add many values together.  For gather
vectorization we'd either have to gimplify such expressions back before the
load and allow them to be vectorized, or, as done in this incremental patch,
do something similar to what dr_analyze_innermost does but with a different
goal: split the reference into a loop invariant that can be computed before
the loop (and will be put into the scalar part of the gather), and an
SSA_NAME defined in the loop which contains the rest (plus optionally
sign/zero extending that into a wider type and/or scaling by 2/4/8).  With
this incremental patch I get 4 loops in this testcase vectorized with
-O3 -mavx2 (compared to 0 before), with 262 vgather* insns.
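
The split described above can be sketched by hand in plain C. This is only an illustration of the address decomposition, not what vect_check_gather does internally; loop_direct and loop_split are hypothetical names. The address is divided into a loop-invariant base, an offset computed inside the loop, and a scale of 1/2/4/8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reference loop: each load is from q + (idx[i] * s1 + s3) elements.  */
void
loop_direct (float *p, const float *q, const int *idx, size_t n,
	     long s1, long s3)
{
  size_t i;
  for (i = 0; i < n; i++)
    p[i] = q[idx[i] * s1 + s3];
}

/* The same loop with the address split by hand into
   base + off * scale, the shape a gather instruction wants.  */
void
loop_split (float *p, const float *q, const int *idx, size_t n,
	    long s1, long s3)
{
  const char *base = (const char *) (q + s3);	/* Loop invariant.  */
  const int scale = (int) sizeof (float);	/* Byte scale of the index.  */
  size_t i;

  assert (scale == 1 || scale == 2 || scale == 4 || scale == 8);
  for (i = 0; i < n; i++)
    {
      long off = idx[i] * s1;	/* Variable part, defined in the loop.  */
      memcpy (&p[i], base + off * scale, sizeof (float));
    }
}
```

Both loops read the same elements; the second just makes explicit which part could be hoisted out of the loop and which part has to become the vector of indexes.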

On x86_64 unfortunately this doesn't figure out that it could do all the
additions for the variable index in 32-bit type and then sign extend:
idx_202 = *kp_201(D)[D.1941_200];
idy_209 = *kq_208(D)[D.1941_200];
ilev_216 = *kr_215(D)[D.1941_200];
D.1955_229 = *pgama_228(D)[D.1941_200];
D.1960_237 = *pbeta_236(D)[D.1941_200];
D.1965_245 = *palfa_244(D)[D.1941_200];
D.1966_246 = ilev_216 + -1;
D.1967_247 = (integer(kind=8)) D.1966_246;
D.1968_248 = D.1967_247 * stride.32_141;
D.1969_249 = D.1968_248 + offset.33_155;
D.1970_250 = idy_209 + -1;
D.1971_251 = (integer(kind=8)) D.1970_250;
D.1972_252 = D.1971_251 * stride.30_129;
D.1973_253 = D.1969_249 + D.1972_252;
D.1974_254 = idx_202 + -1;
D.1975_255 = (integer(kind=8)) D.1974_254;
D.1976_256 = D.1973_253 + D.1975_255;
D.1977_258 = *parg_257(D)[D.1976_256];
so for -m64 it emits vgatherqps instructions (V4DImode indexes, loads
V4SFmode values) and then merges those, while for -m32 it emits
just 131 vgatherdps instructions (V8SImode indexes, V8SFmode values).

Would be nice to cut down slightly this testcase into just one or two loops
that are vectorized and turn it into a runtime testcase which verifies
the vectorization was correct.

2011-10-31  Jakub Jelinek  ja...@redhat.com

* tree-vect-stmts.c (vectorizable_load): Don't add
DR_INIT (dr) to ptr.
* tree-vect-data-refs.c (vect_check_gather): Rewritten not to use
DR_BASE_ADDRESS or DR_OFFSET, instead call get_inner_reference
and try to separate in between base and off.

--- gcc/tree-vect-stmts.c.jj2011-10-31 12:13:45.0 +0100
+++ gcc/tree-vect-stmts.c   2011-10-31 13:21:13.0 +0100
@@ -4452,8 +4452,6 @@ vectorizable_load (gimple stmt, gimple_s
   vec_dest = vect_create_destination_var (scalar_dest, vectype);
 
   ptr = fold_convert (ptrtype, gather_base);
-  ptr = fold_build2 (POINTER_PLUS_EXPR, ptrtype, ptr,
-fold_convert (sizetype, DR_INIT (dr)));
   if (!is_gimple_min_invariant (ptr))
{
  ptr = force_gimple_operand (ptr, seq, true, NULL_TREE);
--- gcc/tree-vect-data-refs.c.jj2011-10-31 12:13:45.0 +0100
+++ gcc/tree-vect-data-refs.c   2011-10-31 14:53:18.0 +0100
@@ -2504,109 +2504,156 @@ tree
 vect_check_gather (gimple stmt, loop_vec_info loop_vinfo, tree *basep,
   tree *offp, int *scalep)
 {
-  HOST_WIDE_INT scale = 1;
+  HOST_WIDE_INT scale = 1, pbitpos, pbitsize;
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   tree offtype = NULL_TREE;
-  tree base = DR_BASE_ADDRESS (dr);
-  tree off = DR_OFFSET (dr);
-  tree decl;
-
-  if (TREE_CODE (base) == POINTER_PLUS_EXPR
-      && integer_zerop (off)
-      && TREE_CODE (TREE_OPERAND (base, 0)) == SSA_NAME
-      && !chrec_contains_symbols_defined_in_loop (TREE_OPERAND (base, 0),
-                                                  loop->num))
+  tree decl, base, off;
+  enum machine_mode pmode;
+  int punsignedp, pvolatilep;
+
+  base = get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &off,
+                              &pmode, &punsignedp, &pvolatilep, false);
+  gcc_assert (base != NULL_TREE && (pbitpos % BITS_PER_UNIT) == 0);
+
+  if (TREE_CODE (base) == MEM_REF)
 {
-  off = TREE_OPERAND (base, 1);
+  if (!integer_zerop (TREE_OPERAND (base, 1)))
+   {
+ if (off == NULL_TREE)
+   {
+ double_int moff = mem_ref_offset (base);
+ off = double_int_to_tree (sizetype, moff);
+   }
+ else
+   off = size_binop (PLUS_EXPR, off, TREE_OPERAND (base, 1));
+   }
   base =

Re: PowerPC shrink-wrap support 3 of 3

2011-10-31 Thread Alan Modra

So I'm at the point where I'm reasonably happy with this work.  This
patch doesn't do anything particularly clever regarding our
shrink-wrap implementation.  We still only insert one copy of the
prologue, and one of the epilogue in thread_prologue_and_epilogue.
All it really does is replace Bernd's !last_bb_active code (allowing
one tail block with no active insns to be shared by paths needing a
prologue and paths not needing a prologue), with what I think is
conceptually simpler, duplicating a shared tail block.  Then I extend
this to duplicating a chain of tail blocks.
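
As a rough source-level picture of what a shared tail block looks like (an illustrative sketch, not a testcase from this patch; sum and total_plus_one are hypothetical names): the early-out path needs no prologue, the other path does, and both fall into the same final block.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the part of the function that needs a frame.  */
static long
sum (const long *v, size_t n)
{
  long s = 0;
  size_t i;

  assert (v != NULL);
  for (i = 0; i < n; i++)
    s += v[i];
  return s;
}

long
total_plus_one (const long *v, size_t n)
{
  long r = 0;

  if (n != 0)
    r = sum (v, n);	/* Path that may need a prologue.  */
  return r + 1;		/* Tail block reached from both paths.  */
}
```

Duplicating the "return r + 1" block gives the n == 0 path its own copy, so that path never has to pass through a prologue or epilogue. Whether a given compiler actually shrink-wraps this particular function depends on target and options.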

That leads to some simplification as all the special cases and
restrictions of !last_bb_active disappear.  For example,
convert_jumps_to_returns looks much like the code in gcc-4.6.  We also
get many more functions being shrink-wrapped.  Some numbers from my
latest gcc bootstraps:

powerpc-linux
.../gcc-virgin/gcc$ grep 'Performing shrink' *.pro_and_epilogue | wc -l
453
.../gcc-curr/gcc$ grep 'Performing shrink' *.pro_and_epilogue | wc -l
648

i686-linux
.../gcc-virgin/gcc$ grep 'Performing shrink' *pro_and_epilogue | wc -l
329
.../gcc-curr/gcc$ grep 'Performing shrink' *.pro_and_epilogue | wc -l
416

Bits left to do
- limit size of duplicated tails
- don't duplicate sibling call blocks, but instead split the block
  after the sibling call epilogue has been added, redirecting
  non-prologue paths past the epilogue.

Is this OK to apply as is?

* function.c (bb_active_p): Delete.
(dup_block_and_redirect, active_insn_between): New functions.
(convert_jumps_to_returns, emit_return_for_exit): New functions,
split out from..
(thread_prologue_and_epilogue_insns): ..here.  Delete
shadowing variables.  Don't do prologue register clobber tests
when shrink wrapping already failed.  Delete all last_bb_active
code.  Instead compute tail block candidates for duplicating
exit path.  Remove these from antic set.  Duplicate tails when
reached from both blocks needing a prologue/epilogue and
blocks not needing such.

Index: gcc/function.c
===
*** gcc/function.c  (revision 180588)
--- gcc/function.c  (working copy)
*** set_return_jump_label (rtx returnjump)
*** 5514,5535 
  JUMP_LABEL (returnjump) = ret_rtx;
  }
  
! /* Return true if BB has any active insns.  */
  static bool
! bb_active_p (basic_block bb)
  {
rtx label;
  
!   /* Test whether there are active instructions in BB.  */
!   label = BB_END (bb);
!   while (label && !LABEL_P (label))
  {
!   if (active_insn_p (label))
!   break;
!   label = PREV_INSN (label);
  }
!   return BB_HEAD (bb) != label || !LABEL_P (label);
  }
  
  /* Generate the prologue and epilogue RTL if the machine supports it.  Thread
 this into place with notes indicating where the prologue ends and where
--- 5514,5698 
  JUMP_LABEL (returnjump) = ret_rtx;
  }
  
! #ifdef HAVE_simple_return
! /* Create a copy of BB instructions and insert at BEFORE.  Redirect
!preds of BB to COPY_BB if they don't appear in NEED_PROLOGUE.  */
! static void
! dup_block_and_redirect (basic_block bb, basic_block copy_bb, rtx before,
!   bitmap_head *need_prologue)
! {
!   edge_iterator ei;
!   edge e;
!   rtx insn = BB_END (bb);
! 
!   /* We know BB has a single successor, so there is no need to copy a
!  simple jump at the end of BB.  */
!   if (simplejump_p (insn))
! insn = PREV_INSN (insn);
! 
!   start_sequence ();
!   duplicate_insn_chain (BB_HEAD (bb), insn);
!   if (dump_file)
! {
!   unsigned count = 0;
!   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
!   if (active_insn_p (insn))
! ++count;
!   fprintf (dump_file, "Duplicating bb %d to bb %d, %u active insns.\n",
!  bb->index, copy_bb->index, count);
! }
!   insn = get_insns ();
!   end_sequence ();
!   emit_insn_before (insn, before);
! 
!   /* Redirect all the paths that need no prologue into copy_bb.  */
!   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
! if (!bitmap_bit_p (need_prologue, e->src->index))
!   {
!   redirect_edge_and_branch_force (e, copy_bb);
!   continue;
!   }
! else
!   ei_next (ei);
! }
! #endif
! 
! #if defined (HAVE_return) || defined (HAVE_simple_return)
! /* Return true if there are any active insns between HEAD and TAIL.  */
  static bool
! active_insn_between (rtx head, rtx tail)
! {
!   while (tail)
! {
!   if (active_insn_p (tail))
!   return true;
!   if (tail == head)
!   return false;
!   tail = PREV_INSN (tail);
! }
!   return false;
! }
! 
! /* LAST_BB is a block that exits, and empty of active instructions.
!Examine its predecessors for jumps that can be converted to
!(conditional) returns.  */
! static VEC (edge, heap) *
! convert_jumps_to_returns (basic_block last_bb, bool simple_p,
!

Re: [PATCH] Miscompilation of attribute((constructor)) functions.

2011-10-31 Thread Paul Brook

 Ok if you move the clearing to after
 
   /* Generate a new name for the new version. */
   DECL_NAME (new_decl) = clone_function_name (old_decl, clone_name);
   SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
   SET_DECL_RTL (new_decl, NULL);
 
 using new_decl directly, thus add
 
   /* When the old decl was a con-/destructor make sure the clone isn't.  */
   DECL_STATIC_CONSTRUCTOR(new_decl) = 0;
   DECL_STATIC_DESTRUCTOR(new_decl) = 0;

Done, and applied.

Paul

Re: [C++ preview patch] PR 44277

2011-10-31 Thread Jason Merrill


How does it work to warn in convert_like_real instead?

Jason

Re: [Patch, libfortran, 3/3] Update file position lazily

2011-10-31 Thread Janne Blomqvist

On Sun, Oct 30, 2011 at 01:29, Janne Blomqvist
blomqvist.ja...@gmail.com wrote:
 On Sat, Oct 29, 2011 at 18:35, Mikael Morin mikael.mo...@sfr.fr wrote:
 On Saturday 29 October 2011 14:43:22 Mikael Morin wrote:
  FWIW, it seems ifort 12.0 uses UNDEFINED in this case; I suppose a
  case could be made for using the same. Comments?

 Let's go for UNDEFINED then.
 On second thought, UNSPECIFIED is better as UNDEFINED is for another case.

 Hmm, indeed, on second thought I agree as well.

I just committed all the 3 parts of this patch series. Parts 1 and 2
verbatim, and 3 also verbatim except with the following for
inquire_5.f90:

Index: gcc/testsuite/gfortran.dg/inquire_5.f90
===
--- gcc/testsuite/gfortran.dg/inquire_5.f90 (revision 180700)
+++ gcc/testsuite/gfortran.dg/inquire_5.f90 (working copy)
@@ -1,11 +1,10 @@
 ! { dg-do run { target fd_truncate } }
-! { dg-options "-std=legacy" }
 !
 ! pr19314 inquire(..position=..) segfaults
 ! test by thomas.koe...@online.de
 ! bdavis9...@comcast.net
   implicit none
-  character*20 chr
+  character(len=20) chr
   open(7,STATUS='SCRATCH')
   inquire(7,position=chr)
   if (chr.NE.'ASIS') CALL ABORT
@@ -31,7 +30,7 @@
   write(7,*)'this is another record'
   backspace(7)
   inquire(7,position=chr)
-  if (chr.NE.'ASIS') CALL ABORT
+  if (chr .NE. 'UNSPECIFIED') CALL ABORT
   rewind(7)
   inquire(7,position=chr)
   if (chr.NE.'REWIND') CALL ABORT


(That is, test the returned value explicitly rather than test for
standards conformance as in the original patch)


-- 
Janne Blomqvist

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2011 at 03:23:32PM +0100, Jakub Jelinek wrote:
 Would be nice to cut down slightly this testcase into just one or two loops
 that are vectorized and turn it into a runtime testcase which verifies
 the vectorization was correct.

Here is one such testcase (though in your case there are no loads
for the indexes; on the other hand, you have 3 of the IVs, each multiplied
by some loop invariant, all added together, and there are far more
expressions in your case).

2011-10-31  Jakub Jelinek  ja...@redhat.com

* gcc.target/i386/avx2-gather-4.c: New test.

--- gcc/testsuite/gcc.target/i386/avx2-gather-4.c.jj    2011-10-31 15:58:57.0 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-gather-4.c       2011-10-31 15:59:44.0 +0100
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-options "-O3 -mavx2" } */
+
+#include "avx2-check.h"
+
+#define N 1024
+int a[N], b[N], c[N], d[N];
+
+__attribute__((noinline, noclone)) void
+foo (float *__restrict p, float *__restrict q, float *__restrict r,
+ long s1, long s2, long s3)
+{
+  int i;
+  for (i = 0; i < N; i++)
+    p[i] = q[a[i] * s1 + b[i] * s2 + s3] * r[c[i] * s1 + d[i] * s2 + s3];
+}
+
+static void
+avx2_test (void)
+{
+  int i;
+  float e[N], f[N], g[N];
+  for (i = 0; i < N; i++)
+    {
+      a[i] = (i * 7) & (N / 8 - 1);
+      b[i] = (i * 13) & (N / 8 - 1);
+      c[i] = (i * 23) & (N / 8 - 1);
+      d[i] = (i * 5) & (N / 8 - 1);
+      e[i] = 16.5 + i;
+      f[i] = 127.5 - i;
+    }
+  foo (g, e, f, 3, 2, 4);
+  for (i = 0; i < N; i++)
+    if (g[i] != (float) ((20.5 + a[i] * 3 + b[i] * 2)
+                         * (123.5 - c[i] * 3 - d[i] * 2)))
+      abort ();
+}


Jakub

Re: Go patch committed: Update Go library

2011-10-31 Thread Rainer Orth

Ian Lance Taylor i...@google.com writes:

 This patch updates the Go library to the most recent weekly release.  I
 think the only potential portability issues here are the use of the
 ipv6_mreq struct.  I'm not entirely sure the new exp/terminal package is
 portable, but it might be.

 I have not included the entire patch here, because it is too large and
 it's just copying changes anyhow.  I've included all patches to files
 which are specific to the Go frontend version.

After this change, I'm seeing another issue: most 32-bit go execution
tests fail like this on Solaris 11/x86:

/vol/gcc/src/hg/trunk/local/libgo/runtime/malloc.goc:366: libgo assertion 
failure
FAIL: go.go-torture/execute/array-1.go execution,  -O0 

Running the test under truss, I find:

14261:  mmap(0xFF00, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0) 
Err#12 ENOMEM

With truss -u (user function tracing), I see:

14285/1@1:  - libgo:runtime_mallocinit()
14285/1@1:- libgo:runtime_InitSizes()
14285/1@1:- libgo:runtime_InitSizes() = 2
14285/1@1:- libgo:runtime_SysReserve()
14285/1:mmap(0xFF00, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, 
-1, 0) Err#12 ENOMEM
14285/1@1:- libgo:runtime_SysReserve() = -1
14285/1@1:- libgo:__go_assert_fail()

If I remove the adjustment in runtime/malloc.goc (runtime_mallocinit),
the test passes:

14445/1:mmap(0xFEF78114, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, 
-1, 0) = 0xCE00

This stuff seems incredibly fragile, and I don't exactly understand
why.  Besides, the assertion failure above is strange/wrong in two ways:

/vol/gcc/src/hg/trunk/local/libgo/runtime/malloc.goc:366: libgo assertion 
failure

* I'd expect to see the message from runtime_throw() here, not just
  `libgo assertion failure'.

* The message points to the wrong line due to a broken test: malloc.goc
  has:

p = runtime_SysReserve((void*)(0x00f8ULL<<32), bitmap_size + arena_size);
if(p == nil)
        runtime_throw("runtime: cannot reserve arena virtual address space");

  On failure, p will be MAP_FAILED ((void *)-1), not nil, so the wrong
  assertion is thrown.
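
  For reference, a minimal sketch of a reservation helper that tests for
  the right failure value. This is not libgo's runtime_SysReserve;
  reserve_or_null and release are hypothetical names, and mapping
  MAP_FAILED to NULL is just one possible convention that would let a
  caller's "p == nil" test work.

```c
#define _DEFAULT_SOURCE 1	/* For MAP_ANON under strict -std modes.  */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve SIZE bytes of address space near HINT without committing
   memory.  mmap reports failure as MAP_FAILED, i.e. (void *) -1, never
   as a null pointer, so translate it for callers that test for NULL.  */
void *
reserve_or_null (void *hint, size_t size)
{
  void *p = mmap (hint, size, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);

  if (p == MAP_FAILED)
    return NULL;
  return p;
}

/* Give a reservation back.  */
void
release (void *p, size_t size)
{
  int rc = munmap (p, size);

  assert (rc == 0);
  (void) rc;
}
```

  A zero-length request is a convenient way to see the failure path:
  POSIX requires mmap to fail for it, and the helper then returns NULL
  rather than (void *) -1.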

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [C++ preview patch] PR 44277

2011-10-31 Thread Paolo Carlini


Hi,

How does it work to warn in convert_like_real instead?
The problem is that (expr, totype) can be a lot of different things for 
which we want to warn: it can be a zero and a pointer for assignments, but 
when totype is a BOOLEAN_TYPE, expr can be an EQ_EXPR or NE_EXPR, and 
then the operands can be various things depending on whether we are looking 
for a pointer, a data member pointer, etc., on the left or on the right of 
the == or != sign. In other words, the pattern matching doesn't seem to be 
a matter of a few lines. I'm annoyed by this. Also, for the assignment case, 
I'm getting duplicate warnings, though maybe that can be fixed.


Do you think there is no neat way to implement my idea of avoiding 
generating those implicit 0s in the first place? The internals of the 
front-end seem very C++98-ish for null pointers ;)


Paolo.

Re: C++ PATCH to add -std=c++11 ??

2011-10-31 Thread Jason Merrill


On 10/31/2011 06:39 AM, Paolo Carlini wrote:

Great. When you commit it, you can as well add 'PR c++/50920' to the
ChangeLog!


OK, here's what I'm checking in.  There are a lot more instances of 
C++0x in comments and cxx_dialect checks, but I'm not going to worry 
about those now.
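
A small sketch of why the remaining cxx0x checks can be left alone for now. The enum mirrors the one in the c-common.h hunk of this patch; is_cxx11_or_later is a hypothetical helper added only for illustration. Since the new enumerators are aliases of the old ones, comparisons written against either spelling agree.

```c
#include <assert.h>

/* Same shape as the patched enum: the new names alias the old ones.  */
enum cxx_dialect
{
  cxx98,
  cxx03 = cxx98,
  cxx0x,
  cxx11 = cxx0x
};

/* Hypothetical helper: an existing "cxx_dialect >= cxx0x" style check
   behaves identically when spelled with cxx11.  */
int
is_cxx11_or_later (enum cxx_dialect d)
{
  assert (d == cxx98 || d == cxx11);
  return d >= cxx11;
}
```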


Tested x86_64-pc-linux-gnu.
commit 12395569015d26ee38609653bf9b589961f546e2
Author: Jason Merrill ja...@redhat.com
Date:   Fri Aug 12 17:09:47 2011 -0400

	PR c++/50920
gcc/c-family
	* c-common.h (cxx_dialect): Add cxx11 and cxx03.
	* c.opt: Add -std=c++11, -std=gnu++11, -std=gnu++03,
	and -Wc++11-compat.
	* c-opts.c (set_std_cxx11): Rename from set_std_cxx0x.
gcc/cp
	* class.c (check_field_decl): Change c++0x in diags to c++11.
	* error.c (maybe_warn_cpp0x): Likewise.
	* parser.c (cp_parser_diagnose_invalid_type_name): Likewise.
	* pt.c (check_default_tmpl_args): Likewise.
libcpp
	* include/cpplib.h (enum c_lang): Rename CLK_CXX0X to CLK_CXX11,
	CLK_GNUCXX0X to CLK_GNUCXX11.
libstdc++-v3
	* include/bits/c++0x_warning.h: Change -std=c++0x to -std=c++11.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index be9d729..71746a9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -643,11 +643,12 @@ extern int flag_use_repository;
 /* The supported C++ dialects.  */
 
 enum cxx_dialect {
-  /* C++98  */
+  /* C++98 with TC1  */
   cxx98,
-  /* Experimental features that are likely to become part of
- C++0x.  */
-  cxx0x
+  cxx03 = cxx98,
+  /* C++11  */
+  cxx0x,
+  cxx11 = cxx0x
 };
 
 /* The C++ dialect being used. C++98 is the default.  */
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index 6869d5c..b56aec7 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -110,7 +110,7 @@ static size_t include_cursor;
 
 static void handle_OPT_d (const char *);
 static void set_std_cxx98 (int);
-static void set_std_cxx0x (int);
+static void set_std_cxx11 (int);
 static void set_std_c89 (int, int);
 static void set_std_c99 (int);
 static void set_std_c1x (int);
@@ -775,10 +775,10 @@ c_common_handle_option (size_t scode, const char *arg, int value,
 	set_std_cxx98 (code == OPT_std_c__98 /* ISO */);
   break;
 
-case OPT_std_c__0x:
-case OPT_std_gnu__0x:
+case OPT_std_c__11:
+case OPT_std_gnu__11:
   if (!preprocessing_asm_p)
-	set_std_cxx0x (code == OPT_std_c__0x /* ISO */);
+	set_std_cxx11 (code == OPT_std_c__11 /* ISO */);
   break;
 
 case OPT_std_c90:
@@ -1501,18 +1501,18 @@ set_std_cxx98 (int iso)
   cxx_dialect = cxx98;
 }
 
-/* Set the C++ 0x working draft standard (without GNU extensions if ISO).  */
+/* Set the C++ 2011 standard (without GNU extensions if ISO).  */
 static void
-set_std_cxx0x (int iso)
+set_std_cxx11 (int iso)
 {
-  cpp_set_lang (parse_in, iso ? CLK_CXX0X: CLK_GNUCXX0X);
+  cpp_set_lang (parse_in, iso ? CLK_CXX11: CLK_GNUCXX11);
   flag_no_gnu_keywords = iso;
   flag_no_nonansi_builtin = iso;
   flag_iso = iso;
-  /* C++0x includes the C99 standard library.  */
+  /* C++11 includes the C99 standard library.  */
   flag_isoc94 = 1;
   flag_isoc99 = 1;
-  cxx_dialect = cxx0x;
+  cxx_dialect = cxx11;
 }
 
 /* Args to -d specify what to dump.  Silently ignore
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 693f191..336a75a 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -289,7 +289,11 @@ Warn about C constructs that are not in the common subset of C and C++
 
 Wc++0x-compat
 C++ ObjC++ Var(warn_cxx0x_compat) Warning
-Warn about C++ constructs whose meaning differs between ISO C++ 1998 and ISO C++ 200x
+Deprecated in favor of -Wc++11-compat
+
+Wc++11-compat
+C++ ObjC++ Warning Alias(Wc++0x-compat)
+Warn about C++ constructs whose meaning differs between ISO C++ 1998 and ISO C++ 2011
 
 Wcast-qual
 C ObjC C++ ObjC++ Var(warn_cast_qual) Warning
@@ -1175,12 +1179,13 @@ std=c++03
 C++ ObjC++ Alias(std=c++98)
 Conform to the ISO 1998 C++ standard revised by the 2003 technical corrigendum
 
+std=c++11
+C++ ObjC++
+Conform to the ISO 2011 C++ standard (experimental and incomplete support)
+
 std=c++0x
-C++ ObjC++
-Conform to the ISO 1998 C++ standard, with extensions that are likely to
-become a part of the upcoming ISO C++ standard, dubbed C++0x. Note that the
-extensions enabled by this mode are experimental and may be removed in
-future releases of GCC.
+C++ ObjC++ Alias(std=c++11)
+Deprecated in favor of -std=c++11
 
 std=c1x
 C ObjC
@@ -1204,14 +1209,21 @@ Deprecated in favor of -std=c99
 
 std=gnu++98
 C++ ObjC++
-Conform to the ISO 1998 C++ standard with GNU extensions
+Conform to the ISO 1998 C++ standard revised by the 2003 technical
+corrigendum with GNU extensions
+
+std=gnu++03
+C++ ObjC++ Alias(std=gnu++98)
+Conform to the ISO 1998 C++ standard revised by the 2003 technical
+corrigendum with GNU extensions
+
+std=gnu++11
+C++ ObjC++
+Conform to the ISO 2011 C++ standard with GNU extensions (experimental and incomplete support)

Re: [libcpp] Correctly define __cplusplus (PR libstdc++-v3/1773)

2011-10-31 Thread Jason Merrill


On 10/21/2011 03:52 PM, Jason Merrill wrote:

On 10/21/2011 03:11 PM, Marc Glisse wrote:

Note that at least clang now defines __cplusplus to its new C++11 value
(in experimental C++0X mode only). Apparently they switched around last
June and say they are not the only ones. So if you want to follow their
lead...


Hmm, between that and the fact that 4.7 will in fact have almost all of
the C++11 features, I think changing the value makes sense.


Thus:

commit f6f3e056eac1f9bcdc2ba0459723665dafd57396
Author: Jason Merrill ja...@redhat.com
Date:   Mon Oct 31 11:26:25 2011 -0400

	PR libstdc++/1773
	* init.c (cpp_init_builtins): Set __cplusplus for C++11.

diff --git a/libcpp/init.c b/libcpp/init.c
index bbaa8ae..9101b34 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -461,7 +461,13 @@ cpp_init_builtins (cpp_reader *pfile, int hosted)
     _cpp_define_builtin (pfile, "__STDC__ 1");
 
   if (CPP_OPTION (pfile, cplusplus))
-    _cpp_define_builtin (pfile, "__cplusplus 199711L");
+    {
+      if (CPP_OPTION (pfile, lang) == CLK_CXX11
+	   || CPP_OPTION (pfile, lang) == CLK_GNUCXX11)
+	_cpp_define_builtin (pfile, "__cplusplus 201103L");
+      else
+	_cpp_define_builtin (pfile, "__cplusplus 199711L");
+    }
   else if (CPP_OPTION (pfile, lang) == CLK_ASM)
     _cpp_define_builtin (pfile, "__ASSEMBLER__ 1");
   else if (CPP_OPTION (pfile, lang) == CLK_STDC94)
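
For illustration only, here is a minimal sketch of how user code might observe the new value. The helper name dialect_name is hypothetical and not part of libcpp; the sketch relies only on the two values that appear in the patch (199711L and 201103L).

```cpp
#include <string>

// Hypothetical helper (not part of GCC): classify a dialect from the
// __cplusplus value the preprocessor defines for this translation unit.
inline std::string dialect_name(long value = __cplusplus)
{
  if (value >= 201103L)
    return "C++11 or later";   // value set for CLK_CXX11 / CLK_GNUCXX11
  return "C++98/03";           // 199711L, the value the removed line set unconditionally
}
```

Calling dialect_name() with no argument reports the dialect of the translation unit being compiled; passing an explicit value makes the check easy to exercise in isolation.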

Re: [RFC PATCH] update to libtool-2.4.2 and regenerate

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 I made a quick comparison of the libtool.m4 in libgo/config with the
 2.4.2 version: the only relevant change seems to be an instance of
 AC_PROG_GO, which also lives in go.m4.  Ian will know why that
 additional copy is necessary.

The version of AC_PROG_GO in libgo/config/go.m4 is there so that I can
rebuild libgo/configure with a version of autoconf that does not have Go
support.

The version of AC_PROG_GO in libgo/config/libtool.m4 is there because
all the languages work that way in libtool.m4.

Ian

Re: [PR50869] don't attempt to expand CFA within cselib

2011-10-31 Thread Jeff Law

On 10/28/11 15:07, Alexandre Oliva wrote:
 An assertion check meant to verify that var loc expansions that
 didn't involve VALUEs (say constants, REGs, etc) didn't push values
 onto the dependency stack failed in an expansion of the argp reg,
 because equivalences for it are preserved at cselib table resets,
 and cselib later tries to expand it to equivalent expressions.
 
 It's not profitable to expand it within var-tracking, and that's
 the only user of the CFA-base special-casing in cselib, so I
 arranged for argp to be preserved in expansions, just like other
 stack base registers.
 
 While debugging it, I noticed it was theoretically possible for
 the expression depth to remain uninitialized, and added an
 initialization and an assertion check to make sure it only remains
 zero when no location is found.
 
 Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to
 install?
OK.
jeff


[Patch,AVR]: Fix PR50910: int/2 leads to libgcc call

2011-10-31 Thread Georg-Johann Lay

This is a fix for an optimization flaw when dividing an int by 2.

There is really no need for a library call. Costs of [U]DIV/[U]MOD are adjusted
to take into account the costs of CONST_INT operands that must be loaded for
division by means of a libgcc call.

There are some new combiner patterns suffixed .lt0 that handle the adjustment
frequently seen when division-by-const is lowered to arithmetic in order to
avoid a more expensive libcall.
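
To illustrate the kind of lowering meant here: a signed division by 2 that rounds toward zero can be computed by adding 1 to negative inputs and then shifting right arithmetically. The sketch below is an assumed C++ model of that idea, not the AVR patterns themselves. The name div2_lowered is hypothetical, and the code relies on >> being an arithmetic shift for negative values (guaranteed since C++20, and what GCC does in practice).

```cpp
#include <cstdint>

// Hypothetical model: x / 2 without a division, as (x + (x < 0)) >> 1.
// The (x < 0) term plays the role of a "less than zero" adjustment that is
// added before the shift so that negative values round toward zero.
inline std::int16_t div2_lowered(std::int16_t x)
{
  int adjusted = x + (x < 0 ? 1 : 0);   // computed in int, so it cannot overflow
  return static_cast<std::int16_t>(adjusted >> 1);
}
```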

Moreover, there are two patterns for adding sign-extended QI to HI. These
patterns are shorter, faster and have lower register pressure than explicitly
sign-extending the QI before adding it.  Example code is:

int add (int a, char b) { return a + b; }
int sub (int a, char b) { return a - b; }

add:
add r24,r22  ;  13  *addhi3.sign_extend1[length = 4]
adc r25,__zero_reg__
sbrc r22,7
dec r25
ret

sub:
sub r24,r22  ;  13  *subhi3.sign_extend2[length = 4]
sbc r25,__zero_reg__
sbrc r22,7
inc r25
ret

The reg_overlap_mentioned case is just for pathological code like, e.g.
   a + (char) a
so that the expected size is 4 instructions.
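
The four-instruction add sequence shown above can be checked against a small byte-level model. This is only a sketch in C++, not AVR code; add_sign_extended is a hypothetical function that mimics add/adc/sbrc/dec on the low and high bytes of a 16-bit value.

```cpp
#include <cstdint>

// Hypothetical byte-level model of:
//   add r24,r22 ; adc r25,__zero_reg__ ; sbrc r22,7 ; dec r25
inline std::uint16_t add_sign_extended(std::uint16_t a, std::uint8_t b)
{
  std::uint8_t lo = static_cast<std::uint8_t>(a & 0xFF);
  std::uint8_t hi = static_cast<std::uint8_t>(a >> 8);

  unsigned sum = static_cast<unsigned>(lo) + b;       // add: low bytes
  lo = static_cast<std::uint8_t>(sum);
  hi = static_cast<std::uint8_t>(hi + (sum >> 8));    // adc: propagate the carry
  if (b & 0x80)                                       // sbrc skips the dec when bit 7 is clear
    hi = static_cast<std::uint8_t>(hi - 1);           // dec: same as adding 0xFF for negative b

  return static_cast<std::uint16_t>((hi << 8) | lo);
}
```

Decrementing the high byte when bit 7 of b is set is equivalent to adding the 0xFF high byte that an explicit sign extension would have produced, which is why no separate extension step is needed.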

Since the beginning of time, BRANCH_COST has been set to 0, so some optimization
passes produce code that happily jumps around. The patch introduces a new command
line option for that, mainly because I don't know the rationale behind setting
BRANCH_COST to 0.

Regression-tested.

Ok for trunk?

Johann

* config/avr/avr.opt (-mbranch-cost=): New option.
* config/avr/avr.h (BRANCH_COST): Define to avr_branch_cost.
* config/avr/avr.c (avr_rtx_costs_1): Adjust [U]DIV/[U]MOD costs.
* config/avr/avr.md (*addqi3.lt0, *addhi3.lt0, *addsi3.lt0): New insns.
(*addhi3_zero_extend1): Remove % in constraint of operand 1.
(*addhi3.sign_extend1, *subhi3.sign_extend2): New insns.
Index: config/avr/avr.md
===
--- config/avr/avr.md	(revision 180654)
+++ config/avr/avr.md	(working copy)
@@ -776,27 +776,36 @@ (define_expand addhi3
 
 
 (define_insn "*addhi3_zero_extend"
-  [(set (match_operand:HI 0 "register_operand" "=r")
-	(plus:HI (zero_extend:HI
-		  (match_operand:QI 1 "register_operand" "r"))
-		 (match_operand:HI 2 "register_operand" "0")))]
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (plus:HI (zero_extend:HI (match_operand:QI 1 "register_operand" "r"))
+                 (match_operand:HI 2 "register_operand" "0")))]
   ""
-  "add %A0,%1
-	adc %B0,__zero_reg__"
+  "add %A0,%1\;adc %B0,__zero_reg__"
   [(set_attr "length" "2")
    (set_attr "cc" "set_n")])
 
 (define_insn "*addhi3_zero_extend1"
-  [(set (match_operand:HI 0 "register_operand" "=r")
-	(plus:HI (match_operand:HI 1 "register_operand" "%0")
-		 (zero_extend:HI
-		  (match_operand:QI 2 "register_operand" "r"))))]
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (plus:HI (match_operand:HI 1 "register_operand" "0")
+                 (zero_extend:HI (match_operand:QI 2 "register_operand" "r"))))]
   ""
-  "add %A0,%2
-	adc %B0,__zero_reg__"
+  "add %A0,%2\;adc %B0,__zero_reg__"
   [(set_attr "length" "2")
    (set_attr "cc" "set_n")])
 
+(define_insn "*addhi3.sign_extend1"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (plus:HI (sign_extend:HI (match_operand:QI 1 "register_operand" "r"))
+                 (match_operand:HI 2 "register_operand" "0")))]
+  ""
+  {
+    return reg_overlap_mentioned_p (operands[0], operands[1])
+      ? "mov __tmp_reg__,%1\;add %A0,%1\;adc %B0,__zero_reg__\;sbrc __tmp_reg__,7\;dec %B0"
+      : "add %A0,%1\;adc %B0,__zero_reg__\;sbrc %1,7\;dec %B0";
+  }
+  [(set_attr "length" "5")
+   (set_attr "cc" "clobber")])
+
 (define_insn "*addhi3_sp"
   [(set (match_operand:HI 1 "stack_register_operand" "=q")
         (plus:HI (match_operand:HI 2 "stack_register_operand" "q")
@@ -956,6 +965,19 @@ (define_insn *subhi3_zero_extend1
   [(set_attr "length" "2")
    (set_attr "cc" "set_czn")])
 
+(define_insn "*subhi3.sign_extend2"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (minus:HI (match_operand:HI 1 "register_operand" "0")
+                  (sign_extend:HI (match_operand:QI 2 "register_operand" "r"))))]
+  ""
+  {
+    return reg_overlap_mentioned_p (operands[0], operands[2])
+      ? "mov __tmp_reg__,%2\;sub %A0,%2\;sbc %B0,__zero_reg__\;sbrc __tmp_reg__,7\;inc %B0"
+      : "sub %A0,%2\;sbc %B0,__zero_reg__\;sbrc %2,7\;inc %B0";
+  }
+  [(set_attr "length" "5")
+   (set_attr "cc" "clobber")])
+
 (define_insn "subsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
         (minus:SI (match_operand:SI 1 "register_operand" "0")
@@ -1054,6 +1076,41 @@ (define_insn *subqi3.ashiftrt7
   [(set_attr "length" "2")
    (set_attr "cc" "clobber")])
 
+(define_insn "*addqi3.lt0"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (plus:QI (lt:QI (match_operand:QI 1 "register_operand" "r")
+

[trans-mem] Fix tm_pure not inlinable in tm_safe

2011-10-31 Thread Patrick Marlier


This fixes the g++ pr45940-4 failure. I think it is due to the latest merge.

Tested on i686. (I cannot test it yet on x86-64, I hope to get access to 
a 64 bit soon...)


Patrick.

2011-10-31  Patrick Marlier  patrick.marl...@gmail.com
* ipa-inline.c: Adjust how cannot_inline is set.

Index: ipa-inline.c
===
--- ipa-inline.c	(revision 180705)
+++ ipa-inline.c	(working copy)
@@ -285,14 +285,14 @@
       inlinable = false;
     }
   /* TM pure functions should not get inlined if the outer function is
-     a TM safe function.  */
+     a TM safe function. ??? TM pure function could be inlined if waiver block
+     is implemented. */
   else if (flag_tm
            && is_tm_pure (callee->decl)
            && is_tm_safe (e->caller->decl))
     {
       e->inline_failed = CIF_UNSPECIFIED;
-      gimple_call_set_cannot_inline (e->call_stmt, true);
-      return false;
+      inlinable = false;
     }
   /* Don't inline if the callee can throw non-call exceptions but the
      caller cannot.

[Patch, libfortran] PR 50016 Slow IO on Windows due to _commit()

2011-10-31 Thread Janne Blomqvist

Hi,

here's an updated version of my patch that gets rid of _commit, together
with a new section in the manual describing data consistency and
durability issues.

See also the thread starting at

http://gcc.gnu.org/ml/fortran/2011-10/msg00079.html

and the latest mail in that thread with my current thinking which
perhaps explains some of the motivations behind this patch:

http://gcc.gnu.org/ml/fortran/2011-10/msg00141.html

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

frontend ChangeLog:

2011-10-31  Janne Blomqvist  j...@gcc.gnu.org

PR libfortran/50016
* gfortran.texi (Data consistency and durability): New section.


testsuite ChangeLog:

2011-10-31  Janne Blomqvist  j...@gcc.gnu.org

PR libfortran/50016
* gfortran.dg/inquire_size.f90: Don't flush the unit.

libgfortran ChangeLog:

2011-10-31  Janne Blomqvist  j...@gcc.gnu.org

PR libfortran/50016
* io/inquire.c (inquire_via_unit): Flush the unit and use ssize.
* io/unix.c (buf_flush): Don't call _commit.


-- 
Janne Blomqvist
diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index f847df3..b45b71a 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1090,6 +1090,7 @@ might in some way or another become visible to the programmer.
 * KIND Type Parameters::
 * Internal representation of LOGICAL variables::
 * Thread-safety of the runtime library::
+* Data consistency and durability::
 @end menu
 
 
@@ -1194,6 +1195,81 @@ Finally, for platforms not supporting thread-safe POSIX functions,
 further functionality might not be thread-safe.  For details, please
 consult the documentation for your operating system.
 
+
+@node Data consistency and durability
+@section Data consistency and durability
+@cindex consistency, durability
+
+This section contains a brief overview of data and metadata
+consistency and durability issues when doing I/O.
+
+With respect to durability, GNU Fortran makes no effort to ensure that
+data is committed to stable storage. If this is required, the GNU
+Fortran programmer can use the intrinsic @code{FNUM} to retrieve the
+low level file descriptor corresponding to an open Fortran unit. Then,
+using e.g. the @code{ISO_C_BINDING} feature, one can call the
+underlying system call to flush dirty data to stable storage, such as
+@code{fsync} on POSIX, @code{_commit} on MinGW, or @code{fcntl(fd,
+F_FULLFSYNC, 0)} on Mac OS X. The following example shows how to call
+fsync:
+
+@smallexample
+  ! Declare the interface for POSIX fsync function
+  interface
+    function fsync (fd) bind(c,name="fsync")
+    use iso_c_binding, only: c_int
+      integer(c_int), value :: fd
+      integer(c_int) :: fsync
+    end function fsync
+  end interface
+
+  ! Variable declaration
+  integer :: ret
+
+  ! Opening unit 10
+  open (10,file="foo")
+
+  ! ...
+  ! Perform I/O on unit 10
+  ! ...
+
+  ! Flush and sync
+  flush(10)
+  ret = fsync(fnum(10))
+
+  ! Handle possible error
+  if (ret /= 0) stop "Error calling FSYNC"
+@end smallexample
+
+With respect to consistency, for regular files GNU Fortran uses
+buffered I/O in order to improve performance. This buffer is flushed
+automatically when full and in some other situations, e.g. when
+closing a unit. It can also be explicitly flushed with the
+@code{FLUSH} statement. Also, the buffering can be turned off with the
+@code{GFORTRAN_UNBUFFERED_ALL} and
+@code{GFORTRAN_UNBUFFERED_PRECONNECTED} environment variables. Special
+files, such as terminals and pipes, are always unbuffered. Sometimes,
+however, further things may need to be done in order to allow other
+processes to see data that GNU Fortran has written, as follows.
+
+The Windows platform supports a relaxed metadata consistency model,
+where file metadata is written to the directory lazily. This means
+that, for instance, the @code{dir} command can show a stale size for a
+file. One can force a directory metadata update by closing the unit,
+or by calling @code{_commit} on the file descriptor. Note, though,
+that @code{_commit} will force all dirty data to stable storage, which
+is often a very slow operation.
+
+The Network File System (NFS) implements a relaxed consistency model
+called open-to-close consistency. Closing a file forces dirty data and
+metadata to be flushed to the server, and opening a file forces the
+client to contact the server in order to revalidate cached
+data. @code{fsync} will also force a flush of dirty data and metadata
+to the server. Similar to @code{open} and @code{close}, acquiring and
+releasing @code{fcntl} file locks, if the server supports them, will
+also force cache validation and flushing dirty data and metadata.
+
+
 @c -
 @c Extensions
 @c -
diff --git a/gcc/testsuite/gfortran.dg/inquire_size.f90 b/gcc/testsuite/gfortran.dg/inquire_size.f90
index 568c3d6..13876cf 100644
---

Re: [trans-mem] Fix tm_pure not inlinable in tm_safe

2011-10-31 Thread Aldy Hernandez


On 10/31/11 13:54, Patrick Marlier wrote:

This fixes the g++ pr45940-4 failure. I think it is due to the latest
merge.

Tested on i686. (I cannot test it yet on x86-64, I hope to get access to
a 64 bit soon...)

Patrick.

2011-10-31 Patrick Marlier patrick.marl...@gmail.com
* ipa-inline.c: Adjust how cannot_inline is set.



Heh, funny... I have the exact same patch on this end.  But it doesn't 
completely fix the pr45940-4, cause now I get a segfault here:


  if (is_gimple_call (stmt))
{
  struct cgraph_edge *edge = cgraph_edge (node, stmt);
  struct inline_edge_summary *es = inline_edge_summary (edge);

  /* Special case: results of BUILT_IN_CONSTANT_P will be always
 resolved as constant.  We however don't want to optimize
 out the cgraph edges.  */

The edge isn't set.  I don't know if this is related or not.  I'm 
investigating.


BTW, are you sure it fixes the regression?  I still get this other 
segfault on both x86-32 and x86-64.

Re: [trans-mem] Fix tm_pure not inlinable in tm_safe

2011-10-31 Thread Patrick Marlier


On 10/31/2011 03:21 PM, Aldy Hernandez wrote:

On 10/31/11 13:54, Patrick Marlier wrote:

This fixes the g++ pr45940-4 failure. I think it is due to the latest
merge.

Tested on i686. (I cannot test it yet on x86-64, I hope to get access to
a 64 bit soon...)

Patrick.

2011-10-31 Patrick Marlier patrick.marl...@gmail.com
* ipa-inline.c: Adjust how cannot_inline is set.



Heh, funny... I have the exact same patch on this end. But it doesn't
completely fix the pr45940-4, cause now I get a segfault here:

if (is_gimple_call (stmt))
{
struct cgraph_edge *edge = cgraph_edge (node, stmt);
struct inline_edge_summary *es = inline_edge_summary (edge);

/* Special case: results of BUILT_IN_CONSTANT_P will be always
resolved as constant. We however don't want to optimize
out the cgraph edges. */

The edge isn't set. I don't know if this is related or not. I'm
investigating.

BTW, are you sure it fixes the regression? I still get this other
segfault on both x86-32 and x86-64.


It does on my side:

=== g++ Summary ===
# of expected passes 122

I have no other change over the source.

Patrick.


Copy/Paste if I run the command line directly in the terminal:

marlier@d01:/localdisk/gcc/tm-build-dbg$ 
/localdisk/gcc/tm-build-dbg/gcc/testsuite/g++/../../g++ 
-B/localdisk/gcc/tm-build-dbg/gcc/testsuite/g++/../../ 
/localdisk/gcc/tm-src/gcc/testsuite/g++.dg/tm/pr45940-4.C -nostdinc++ 
-I/localdisk/gcc/tm-build-dbg/i686-pc-linux-gnu/libstdc++-v3/include/i686-pc-linux-gnu 
-I/localdisk/gcc/tm-build-dbg/i686-pc-linux-gnu/libstdc++-v3/include 
-I/localdisk/gcc/tm-src/libstdc++-v3/libsupc++ 
-I/localdisk/gcc/tm-src/libstdc++-v3/include/backward 
-I/localdisk/gcc/tm-src/libstdc++-v3/testsuite/util -fmessage-length=0 
-fgnu-tm -O1 -S -o pr45940-4.s

marlier@d01:/localdisk/gcc/tm-build-dbg$ echo $?
0

-> no segfault

RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Jason Merrill

I have on occasion been confused by initializer_list silently becoming 
empty in C++98 mode.  OK for trunk?


Jason
commit 3d7ac3e4d8bb54921eb3e1f70b1a42a165ba4f5b
Author: Jason Merrill ja...@redhat.com
Date:   Mon Oct 31 01:21:49 2011 -0400

	* libsupc++/initializer_list: Copy C++0x #error from
	bits/c++0x_warning.h.

diff --git a/libstdc++-v3/include/bits/algorithmfwd.h b/libstdc++-v3/include/bits/algorithmfwd.h
index cc0b98e..fbec55d 100644
--- a/libstdc++-v3/include/bits/algorithmfwd.h
+++ b/libstdc++-v3/include/bits/algorithmfwd.h
@@ -35,7 +35,9 @@
 #include <bits/c++config.h>
 #include <bits/stl_pair.h>
 #include <bits/stl_iterator_base_types.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 5708194..0edb8b2 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -40,7 +40,9 @@
 
 #include <ext/atomicity.h>
 #include <debug/debug.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/forward_list.h b/libstdc++-v3/include/bits/forward_list.h
index c80ee50..0fc8323 100644
--- a/libstdc++-v3/include/bits/forward_list.h
+++ b/libstdc++-v3/include/bits/forward_list.h
@@ -33,7 +33,9 @@
 #pragma GCC system_header
 
 #include <memory>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index bddecb0..8f28640 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -57,7 +57,9 @@
 #ifndef _STL_BVECTOR_H
 #define _STL_BVECTOR_H 1
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_deque.h b/libstdc++-v3/include/bits/stl_deque.h
index 17ea01a..b924917 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -60,7 +60,9 @@
 #include <bits/concept_check.h>
 #include <bits/stl_iterator_base_types.h>
 #include <bits/stl_iterator_base_funcs.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_list.h b/libstdc++-v3/include/bits/stl_list.h
index 56ee2fb..fc1d8f8 100644
--- a/libstdc++-v3/include/bits/stl_list.h
+++ b/libstdc++-v3/include/bits/stl_list.h
@@ -58,7 +58,9 @@
 #define _STL_LIST_H 1
 
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_map.h b/libstdc++-v3/include/bits/stl_map.h
index 889e52b..45824f0 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -59,7 +59,9 @@
 
 #include <bits/functexcept.h>
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_multimap.h b/libstdc++-v3/include/bits/stl_multimap.h
index 6b74558..fd5a5a8 100644
--- a/libstdc++-v3/include/bits/stl_multimap.h
+++ b/libstdc++-v3/include/bits/stl_multimap.h
@@ -58,7 +58,9 @@
 #define _STL_MULTIMAP_H 1
 
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_multiset.h b/libstdc++-v3/include/bits/stl_multiset.h
index 8b25a97..ab467c8 100644
--- a/libstdc++-v3/include/bits/stl_multiset.h
+++ b/libstdc++-v3/include/bits/stl_multiset.h
@@ -58,7 +58,9 @@
 #define _STL_MULTISET_H 1
 
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_set.h b/libstdc++-v3/include/bits/stl_set.h
index b30966a..18fd117 100644
--- a/libstdc++-v3/include/bits/stl_set.h
+++ b/libstdc++-v3/include/bits/stl_set.h
@@ -58,7 +58,9 @@
 #define _STL_SET_H 1
 
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_vector.h b/libstdc++-v3/include/bits/stl_vector.h
index 869bcf7..9b7b698 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -60,7 +60,9 @@
 #include <bits/stl_iterator_base_funcs.h>
 #include <bits/functexcept.h>
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/ext/vstring.h

[i386] Remove TARGET_VECTORIZE_BUILTIN_CONVERSION

2011-10-31 Thread Richard Henderson

I checked in the generic portion of Dmitry Plotnikov's patch
to the vectorizer and optabs that enables this patch.  The ARM
portion of his patch is still outstanding, awaiting approval.

This allows this target hook to be removed from other targets.

Can I talk you into doing a similar patch for rs6000, Mike?
After that I can take care of removing the target hook entirely.

Tested on x86_64-linux.


r~
i386: Remove TARGET_VECTORIZE_BUILTIN_CONVERSION.

Renaming all of the insn patterns as needed to the standard
optab forms.  Sadly, only one of the builtins is unused by
the various header files, so most of them must stay around.

* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
(floatv4siv4sf2): Rename from sse2_cvtdq2ps.
(floatunsv4siv4sf2): Rename from sse2_cvtudq2ps.
(fix_truncv8sfv8si2): Rename from avx_cvttps2dq256.
(fix_truncv4sfv4si2): Rename from sse2_cvttps2dq.
(floatv4siv4df2): Rename from avx_cvtdq2pd256.
(fix_truncv4dfv4si2): Rename from avx_cvttpd2dq256.
(vec_unpacku_float_hi_v8si): Update for insn pattern name changes.
* config/i386/i386.md (splitters for int-float conversion): 
Likewise.
* config/i386/i386.c (ix86_split_convert_uns_si_sse): Likewise.
(bdesc_args): Likewise.
(enum ix86_builtins) [IX86_BUILTIN_CVTUDQ2PS]: Remove.
(ix86_vectorize_builtin_conversion): Remove.
(TARGET_VECTORIZE_BUILTIN_CONVERSION): Remove.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 148fcfb..4e34f25 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -16857,7 +16857,7 @@ ix86_split_convert_uns_si_sse (rtx operands[])
 
   x = gen_rtx_REG (V4SImode, REGNO (value));
   if (vecmode == V4SFmode)
-emit_insn (gen_sse2_cvttps2dq (x, value));
+emit_insn (gen_fix_truncv4sfv4si2 (x, value));
   else
 emit_insn (gen_sse2_cvttpd2dq (x, value));
   value = x;
@@ -25077,8 +25077,6 @@ enum ix86_builtins
   IX86_BUILTIN_CPYSGNPS256,
   IX86_BUILTIN_CPYSGNPD256,
 
-  IX86_BUILTIN_CVTUDQ2PS,
-
   /* FMA4 instructions.  */
   IX86_BUILTIN_VFMADDSS,
   IX86_BUILTIN_VFMADDSD,
@@ -25791,8 +25789,7 @@ static const struct builtin_description bdesc_args[] =
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_pmovmskb, 
__builtin_ia32_pmovmskb128, IX86_BUILTIN_PMOVMSKB128, UNKNOWN, (int) 
INT_FTYPE_V16QI },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sqrtv2df2, __builtin_ia32_sqrtpd, 
IX86_BUILTIN_SQRTPD, UNKNOWN, (int) V2DF_FTYPE_V2DF },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtdq2pd, __builtin_ia32_cvtdq2pd, 
IX86_BUILTIN_CVTDQ2PD, UNKNOWN, (int) V2DF_FTYPE_V4SI },
-  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtdq2ps, __builtin_ia32_cvtdq2ps, 
IX86_BUILTIN_CVTDQ2PS, UNKNOWN, (int) V4SF_FTYPE_V4SI },
-  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtudq2ps, __builtin_ia32_cvtudq2ps, 
IX86_BUILTIN_CVTUDQ2PS, UNKNOWN, (int) V4SF_FTYPE_V4SI },
+  { OPTION_MASK_ISA_SSE2, CODE_FOR_floatv4siv4sf2, __builtin_ia32_cvtdq2ps, 
IX86_BUILTIN_CVTDQ2PS, UNKNOWN, (int) V4SF_FTYPE_V4SI },
 
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtpd2dq, __builtin_ia32_cvtpd2dq, 
IX86_BUILTIN_CVTPD2DQ, UNKNOWN, (int) V4SI_FTYPE_V2DF },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtpd2pi, __builtin_ia32_cvtpd2pi, 
IX86_BUILTIN_CVTPD2PI, UNKNOWN, (int) V2SI_FTYPE_V2DF },
@@ -25809,7 +25806,7 @@ static const struct builtin_description bdesc_args[] =
 
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtps2dq, __builtin_ia32_cvtps2dq, 
IX86_BUILTIN_CVTPS2DQ, UNKNOWN, (int) V4SI_FTYPE_V4SF },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvtps2pd, __builtin_ia32_cvtps2pd, 
IX86_BUILTIN_CVTPS2PD, UNKNOWN, (int) V2DF_FTYPE_V4SF },
-  { OPTION_MASK_ISA_SSE2, CODE_FOR_sse2_cvttps2dq, __builtin_ia32_cvttps2dq, 
IX86_BUILTIN_CVTTPS2DQ, UNKNOWN, (int) V4SI_FTYPE_V4SF },
+  { OPTION_MASK_ISA_SSE2, CODE_FOR_fix_truncv4sfv4si2, 
__builtin_ia32_cvttps2dq, IX86_BUILTIN_CVTTPS2DQ, UNKNOWN, (int) 
V4SI_FTYPE_V4SF },
 
   { OPTION_MASK_ISA_SSE2, CODE_FOR_addv2df3, __builtin_ia32_addpd, 
IX86_BUILTIN_ADDPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF },
   { OPTION_MASK_ISA_SSE2, CODE_FOR_subv2df3, __builtin_ia32_subpd, 
IX86_BUILTIN_SUBPD, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF },
@@ -26147,14 +26144,14 @@ static const struct builtin_description bdesc_args[] =
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vextractf128v4df, 
__builtin_ia32_vextractf128_pd256, IX86_BUILTIN_EXTRACTF128PD256, UNKNOWN, 
(int) V2DF_FTYPE_V4DF_INT },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vextractf128v8sf, 
__builtin_ia32_vextractf128_ps256, IX86_BUILTIN_EXTRACTF128PS256, UNKNOWN, 
(int) V4SF_FTYPE_V8SF_INT },
   { OPTION_MASK_ISA_AVX, CODE_FOR_avx_vextractf128v8si, 
__builtin_ia32_vextractf128_si256, IX86_BUILTIN_EXTRACTF128SI256, UNKNOWN, 
(int) V4SI_FTYPE_V8SI_INT },
-  { OPTION_MASK_ISA_AVX, CODE_FOR_avx_cvtdq2pd256, 
__builtin_ia32_cvtdq2pd256, IX86_BUILTIN_CVTDQ2PD256, UNKNOWN, (int)

Re: [trans-mem] Fix tm_pure not inlinable in tm_safe

2011-10-31 Thread Aldy Hernandez




It does on my side:

=== g++ Summary ===
# of expected passes 122

I have no other change over the source.


Woah I hereby profess my love for Richard and Patrick.  That didn't 
sound right, but whatever...  During the weekend they apparently fixed 
the rest of the bug I've been working on all morning.


Yay!!! With this patch we have ironed out all the C++ regressions.

Thanks and sorry for the duplicate work.

Committing to branch.
* ipa-inline.c (can_inline_edge_p): Do not inline TM safe calling
TM pure functions.

Index: ipa-inline.c
===
--- ipa-inline.c	(revision 180710)
+++ ipa-inline.c	(working copy)
@@ -291,8 +291,7 @@ can_inline_edge_p (struct cgraph_edge *e
            && is_tm_safe (e->caller->decl))
     {
       e->inline_failed = CIF_UNSPECIFIED;
-      gimple_call_set_cannot_inline (e->call_stmt, true);
-      return false;
+      inlinable = false;
     }
   /* Don't inline if the callee can throw non-call exceptions but the
      caller cannot.

Re: [C++-11] User defined literals

2011-10-31 Thread Jason Merrill


On 10/31/2011 12:44 PM, 3dw...@verizon.net wrote:

For string and character literals, we can still just build up a call; we
only need to walk the overload list here for numeric literals.


I found that if you don't walk the overload list for chars, a char could be 
routed to the operator taking wchar_t for example.


Ah, yes, I was overlooking the bit in the standard that says "S shall 
contain a literal operator (13.5.8) whose only parameter has the type of ch".


The paragraph for string literals doesn't have a similar restriction.

Jason
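
To make the character-literal case concrete, here is a minimal sketch with two hypothetical literal operators sharing the suffix _kind. Under the rule quoted above, a plain character literal is expected to select the operator whose parameter type is exactly char rather than being converted to wchar_t.

```cpp
// Hypothetical literal operators, one per character type. The return value
// only records which overload was selected.
constexpr int operator""_kind(char)    { return 1; }
constexpr int operator""_kind(wchar_t) { return 2; }

// 'a' has type char, so it should pick the first operator;
// L'a' has type wchar_t, so it should pick the second.
constexpr int narrow_choice = 'a'_kind;
constexpr int wide_choice   = L'a'_kind;
```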

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Paolo Carlini


On 10/31/2011 08:37 PM, Jason Merrill wrote:
I have on occasion been confused by initializer_list silently 
becoming empty in C++98 mode.  OK for trunk?

For c++98, I think we should use the usual:

#ifndef __GXX_EXPERIMENTAL_CXX0X__
# include bits/c++0x_warning.h
#else

which we have in place elsewhere. Or we have special reasons for not 
doing that?
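
A minimal sketch of the guard idea in user code, assuming the standard __cplusplus value is an acceptable stand-in for the GCC-internal __GXX_EXPERIMENTAL_CXX0X__ macro (that substitution is an assumption, not what the library headers do). The function sum is hypothetical.

```cpp
#include <cassert>

// Assumption: __cplusplus is used here instead of GCC's internal macro.
#if __cplusplus < 201103L
# error "this code requires C++11 support"
#else
# include <initializer_list>

// Sums the elements of a braced list; only meaningful in C++11 or later.
inline int sum(std::initializer_list<int> values)
{
  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
#endif
```

With a guard like this, a C++98 build stops with a diagnostic instead of silently seeing an empty header.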


Paolo.

Re: [C++ preview patch] PR 44277

2011-10-31 Thread Paolo Carlini

... so today I noticed the c_inhibit_evaluation_warnings use in 
cp_convert_and_check and it occurred to me that we could use the existing 
mechanism for this warning too?
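
The counter mechanism can be sketched in isolation as follows. Everything here is a hypothetical stand-in (inhibit_warnings, maybe_warn, inhibit_guard are not GCC declarations), and the RAII wrapper is an assumption made for the example; the patch itself uses explicit ++/-- pairs.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are not GCC's real declarations.
static int inhibit_warnings = 0;  // plays the role of c_inhibit_evaluation_warnings
static int warnings_emitted = 0;  // counts warnings instead of printing them

// Record a warning unless warnings are currently inhibited.
inline void maybe_warn()
{
  if (inhibit_warnings == 0)
    ++warnings_emitted;
}

// Keeps the increment and the decrement of the counter paired.
struct inhibit_guard
{
  inhibit_guard() { ++inhibit_warnings; }
  ~inhibit_guard() { --inhibit_warnings; }
  inhibit_guard(const inhibit_guard&) = delete;
  inhibit_guard& operator=(const inhibit_guard&) = delete;
};

// One user-written zero, one compiler-generated zero, one more user-written
// zero; returns how many warnings were recorded.
inline int demo()
{
  warnings_emitted = 0;
  maybe_warn();          // user-written zero: counted
  {
    inhibit_guard guard;
    maybe_warn();        // compiler-generated zero: suppressed
  }
  maybe_warn();          // counter restored: counted again
  return warnings_emitted;
}
```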


The below still passes checking and my small set of tests...

What do you think?

Thanks,
Paolo.

//
Index: c-family/c.opt
===
--- c-family/c.opt  (revision 180705)
+++ c-family/c.opt  (working copy)
@@ -685,6 +685,9 @@ Wpointer-sign
 C ObjC Var(warn_pointer_sign) Init(-1) Warning
 Warn when a pointer differs in signedness in an assignment
 
+Wzero-as-null-pointer-constant
+C++ ObjC++ Var(warn_zero_as_null_pointer_constant) Warning
+
 ansi
 C ObjC C++ ObjC++
 A synonym for -std=c89 (for C) or -std=c++98 (for C++)
Index: cp/typeck.c
===
--- cp/typeck.c (revision 180705)
+++ cp/typeck.c (working copy)
@@ -4057,8 +4057,13 @@ cp_build_binary_op (location_t location,
}
  else 
{
+ bool inhibit = NULLPTR_TYPE_P (TREE_TYPE (op1));
  op0 = build_ptrmemfunc_access_expr (op0, pfn_identifier);
- op1 = cp_convert (TREE_TYPE (op0), integer_zero_node); 
+ if (inhibit)
+   ++c_inhibit_evaluation_warnings;
+ op1 = cp_convert (TREE_TYPE (op0), integer_zero_node);
+ if (inhibit)
+   --c_inhibit_evaluation_warnings;
}
  result_type = TREE_TYPE (op0);
}
@@ -4666,11 +4671,25 @@ tree
 cp_truthvalue_conversion (tree expr)
 {
   tree type = TREE_TYPE (expr);
+  tree ret;
+
   if (TYPE_PTRMEM_P (type))
-return build_binary_op (EXPR_LOCATION (expr),
-   NE_EXPR, expr, integer_zero_node, 1);
+{
+  ++c_inhibit_evaluation_warnings;
+  ret = build_binary_op (EXPR_LOCATION (expr),
+NE_EXPR, expr, integer_zero_node, 1);
+  --c_inhibit_evaluation_warnings;
+}
+  else if (TYPE_PTR_P (type) || TYPE_PTRMEMFUNC_P (type))
+{
+  ++c_inhibit_evaluation_warnings;
+  ret = c_common_truthvalue_conversion (input_location, expr);
+  --c_inhibit_evaluation_warnings;
+}
   else
-return c_common_truthvalue_conversion (input_location, expr);
+ret = c_common_truthvalue_conversion (input_location, expr);
+
+  return ret;
 }
 
 /* Just like cp_truthvalue_conversion, but we want a CLEANUP_POINT_EXPR.  */
Index: cp/init.c
===
--- cp/init.c   (revision 180705)
+++ cp/init.c   (working copy)
@@ -176,6 +176,12 @@ build_zero_init_1 (tree type, tree nelts, bool sta
items with static storage duration that are not otherwise
initialized are initialized to zero.  */
 ;
+  else if (TYPE_PTR_P (type) || TYPE_PTR_TO_MEMBER_P (type))
+{
+  ++c_inhibit_evaluation_warnings;
+  init = convert (type, integer_zero_node);
+  --c_inhibit_evaluation_warnings;
+}
   else if (SCALAR_TYPE_P (type))
 init = convert (type, integer_zero_node);
   else if (CLASS_TYPE_P (type))
Index: cp/cvt.c
===
--- cp/cvt.c(revision 180705)
+++ cp/cvt.c(working copy)
@@ -198,6 +198,11 @@ cp_convert_to_pointer (tree type, tree expr)
 
   if (null_ptr_cst_p (expr))
 {
+  if (c_inhibit_evaluation_warnings == 0
+  && !NULLPTR_TYPE_P (TREE_TYPE (expr)))
+   warning (OPT_Wzero_as_null_pointer_constant,
+"zero as null pointer constant");
+
   if (TYPE_PTRMEMFUNC_P (type))
return build_ptrmemfunc (TYPE_PTRMEMFUNC_FN_TYPE (type), expr, 0,
 /*c_cast_p=*/false, tf_warning_or_error);

[PATCH] Add floatunsv8siv8sf2 support

2011-10-31 Thread Jakub Jelinek

Hi!

On Mon, Oct 31, 2011 at 12:43:14PM -0700, Richard Henderson wrote:
 Renaming all of the insn patterns as needed to the standard
 optab forms.  Sadly, only one of the builtins is unused by
 the various header files, so most of them must stay around.

Thanks.  Here is a patch that adds floatunsv8siv8sf2 and macroizes
floatv[48]siv[48]sf2.  Ok if bootstrap/regtest passes?

2011-10-31  Jakub Jelinek  ja...@redhat.com

* config/i386/sse.md (sseintvecmode): Remove duplicate modes.
(sseintvecmodelower): New mode iterator.
(floatv8siv8sf2, floatunsv4siv4sf2): Macroize into...
(float<sseintvecmodelower><mode>2): ... this using VF1 iterator.
(floatunsv4siv4sf2): Macroize into...
(floatuns<sseintvecmodelower><mode>2): ... this using VF1 iterator.

--- gcc/config/i386/sse.md.jj   2011-10-31 20:44:13.0 +0100
+++ gcc/config/i386/sse.md  2011-10-31 21:05:21.0 +0100
@@ -233,12 +233,19 @@ (define_mode_attr sseinsnmode
 (define_mode_attr sseintvecmode
   [(V8SF "V8SI") (V4DF "V4DI")
    (V4SF "V4SI") (V2DF "V2DI")
-   (V4DF "V4DI") (V8SF "V8SI")
    (V8SI "V8SI") (V4DI "V4DI")
    (V4SI "V4SI") (V2DI "V2DI")
    (V16HI "V16HI") (V8HI "V8HI")
    (V32QI "V32QI") (V16QI "V16QI")])
 
+(define_mode_attr sseintvecmodelower
+  [(V8SF "v8si") (V4DF "v4di")
+   (V4SF "v4si") (V2DF "v2di")
+   (V8SI "v8si") (V4DI "v4di")
+   (V4SI "v4si") (V2DI "v2di")
+   (V16HI "v16hi") (V8HI "v8hi")
+   (V32QI "v32qi") (V16QI "v16qi")])
+
 ;; Mapping of vector modes to a vector mode of double size
 (define_mode_attr ssedoublevecmode
   [(V32QI "V64QI") (V16HI "V32HI") (V8SI "V16SI") (V4DI "V8DI")
@@ -2224,33 +2231,26 @@ (define_insn "sse_cvttss2siq"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "DI")])
 
-(define_insn "floatv8siv8sf2"
-  [(set (match_operand:V8SF 0 "register_operand" "=x")
-       (float:V8SF (match_operand:V8SI 1 "nonimmediate_operand" "xm")))]
-  "TARGET_AVX"
-  "vcvtdq2ps\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssecvt")
-   (set_attr "prefix" "vex")
-   (set_attr "mode" "V8SF")])
-
-(define_insn "floatv4siv4sf2"
-  [(set (match_operand:V4SF 0 "register_operand" "=x")
-       (float:V4SF (match_operand:V4SI 1 "nonimmediate_operand" "xm")))]
+(define_insn "float<sseintvecmodelower><mode>2"
+  [(set (match_operand:VF1 0 "register_operand" "=x")
+       (float:VF1
+         (match_operand:<sseintvecmode> 1 "nonimmediate_operand" "xm")))]
   "TARGET_SSE2"
   "%vcvtdq2ps\t{%1, %0|%0, %1}"
   [(set_attr "type" "ssecvt")
    (set_attr "prefix" "maybe_vex")
-   (set_attr "mode" "V4SF")])
+   (set_attr "mode" "<sseinsnmode>")])
 
-(define_expand "floatunsv4siv4sf2"
+(define_expand "floatuns<sseintvecmodelower><mode>2"
   [(set (match_dup 5)
-       (float:V4SF (match_operand:V4SI 1 "nonimmediate_operand" "")))
+       (float:VF1
+         (match_operand:<sseintvecmode> 1 "nonimmediate_operand" "")))
    (set (match_dup 6)
-       (lt:V4SF (match_dup 5) (match_dup 3)))
+       (lt:VF1 (match_dup 5) (match_dup 3)))
    (set (match_dup 7)
-       (and:V4SF (match_dup 6) (match_dup 4)))
-   (set (match_operand:V4SF 0 "register_operand" "")
-       (plus:V4SF (match_dup 5) (match_dup 7)))]
+       (and:VF1 (match_dup 6) (match_dup 4)))
+   (set (match_operand:VF1 0 "register_operand" "")
+       (plus:VF1 (match_dup 5) (match_dup 7)))]
   "TARGET_SSE2"
 {
   REAL_VALUE_TYPE TWO32r;
@@ -2260,12 +2260,12 @@ (define_expand "floatunsv4siv4sf2"
   real_ldexp (&TWO32r, &dconst1, 32);
   x = const_double_from_real_value (TWO32r, SFmode);
 
-  operands[3] = force_reg (V4SFmode, CONST0_RTX (V4SFmode));
-  operands[4] = force_reg (V4SFmode,
-  ix86_build_const_vector (V4SFmode, 1, x));
+  operands[3] = force_reg (<MODE>mode, CONST0_RTX (<MODE>mode));
+  operands[4] = force_reg (<MODE>mode,
+  ix86_build_const_vector (<MODE>mode, 1, x));
 
   for (i = 5; i < 8; i++)
-operands[i] = gen_reg_rtx (V4SFmode);
+operands[i] = gen_reg_rtx (<MODE>mode);
 })
 
 (define_insn "avx_cvtps2dq256"

Jakub

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Jason Merrill


On 10/31/2011 04:07 PM, Paolo Carlini wrote:

On 10/31/2011 08:37 PM, Jason Merrill wrote:

I have on occasion been confused by initializer_list silently
becoming empty in C++98 mode. OK for trunk?

For c++98, I think we should use the usual:

#ifndef __GXX_EXPERIMENTAL_CXX0X__
# include <bits/c++0x_warning.h>
#else

which we have in place elsewhere. Or we have special reasons for not
doing that?


I'd rather not make a libsupc++ header dependent on a header from the 
main library.  I guess we could move c++0x_warning.h into libsupc++, 
though...


Jason

Re: [PR50878, PATCH] Fix for verify_dominators in -ftree-tail-merge

2011-10-31 Thread Tom de Vries

On 10/30/2011 10:54 AM, Richard Guenther wrote:
 On Sun, Oct 30, 2011 at 9:27 AM, Tom de Vries tom_devr...@mentor.com wrote:
 On 10/30/2011 09:20 AM, Tom de Vries wrote:
 Richard,

 I have a fix for PR50878.

 Sorry, with patch this time.
 
 Ok for now, but see Davids mail and the complexity issue with iteratively
 updating dominators.

I'm not sure which mail you mean.

 It seems to me that we know exactly what to update
 and how, and we should do that (well, if we need up-to-date dominators,
 re-computing them once in the pass would be ok).
 

Indeed, in this example we know exactly what to update and how. However, PR50908
popped up, and there that's not the case anymore.

Consider the following cfg, where A is the direct dominator of I:

A
   / \
  B   \
 / \   \
C   D
   /|   |\
E   F
|\ /|
| x |
|/ \|
G   H
 \ /
  I

Say E and F are duplicates, and F is removed.  The cfg then looks like
this:

A
   / \
  B   \
 / \   \
C   D
   / \ / \
  E
 / \
G   H
 \ /
  I

E is now the new direct dominator of I.

The patch for PR50878 did not address this example, since it uses the set of bbs
directly dominated by the (single) predecessor of bb1 and bb2.

The new patch calculates the updated dominator info by taking the nearest common
dominator (A) of bb1 (F) and bb2 (E), and getting the set of bbs immediately
dominated by it.  Part of this set is now directly dominated by bb2.

Ideally we would have a means to determine which bbs in the set are now
directly dominated by bb2, and call set_immediate_dominator for those bbs, but
we don't, so instead we let iterate_fix_dominators figure it out.

Additionally, the patch makes sure it updates dominator info before updating the
vuses, this fixes a latent bug.

The patch fixes both PR50908 and PR50878.

Bootstrapped and reg-tested on x86_64 and i686, and built and reg-tested on ARM
and MIPS.

Ok for trunk?

Thanks,
- Tom

 Richard.
 
 Thanks,
 - Tom


 A simplified form of the problem from the test-case of the PR is shown in this
 cfg. Block 12 has as direct dominator block 5.

       5
      / \
     /   \
    *     *
    6     7
    |     |
    |     |
    *     *
    8     9
     \   /
      \ /
       *
      12

 tail_merge_optimize finds that blocks 6 and 7 are duplicates. After replacing
 block 7 by block 6, the cfg looks like this:

       5
       |
       |
       *
       6
      / \
     /   \
    *     *
    8     9
     \   /
      \ /
       *
      12

 The new direct dominator of block 12 is block 6, but the current algorithm only
 recalculates dominator info for blocks 6, 8 and 9.

 The patch fixes this by additionally recalculating the dominator info for blocks
 immediately dominated by bb2 (block 6 in the example), if bb2 has a single
 predecessor after replacement.
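
 The change in block 12's immediate dominator can be checked with a small
 standalone sketch. It uses the textbook iterative dominator computation on
 the two six-block graphs above; it is not GCC's dominance code, and all
 names (Graph, dominators, immediate_dominator, before_merge, after_merge)
 are hypothetical. It assumes every block is reachable from the entry.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <vector>

using Graph = std::map<int, std::vector<int>>;  // block -> successors

// Dominator sets by iteration to a fixed point:
// Dom(entry) = {entry}; Dom(n) = {n} plus the intersection of Dom(p)
// over all predecessors p of n.
inline std::map<int, std::set<int>> dominators(const Graph& g, int entry)
{
  std::set<int> all;
  std::map<int, std::vector<int>> preds;
  for (const auto& kv : g) {
    all.insert(kv.first);
    for (int succ : kv.second) {
      all.insert(succ);
      preds[succ].push_back(kv.first);
    }
  }
  std::map<int, std::set<int>> dom;
  for (int n : all)
    dom[n] = all;
  dom[entry] = {entry};
  for (bool changed = true; changed;) {
    changed = false;
    for (int n : all) {
      if (n == entry)
        continue;
      std::set<int> acc = all;
      for (int p : preds[n]) {
        std::set<int> tmp;
        std::set_intersection(acc.begin(), acc.end(),
                              dom[p].begin(), dom[p].end(),
                              std::inserter(tmp, tmp.begin()));
        acc.swap(tmp);
      }
      acc.insert(n);
      if (acc != dom[n]) {
        dom[n] = acc;
        changed = true;
      }
    }
  }
  return dom;
}

// The immediate dominator is the strict dominator closest to n, i.e. the
// one whose own dominator set is largest. Returns -1 for the entry block.
inline int immediate_dominator(const Graph& g, int entry, int n)
{
  std::map<int, std::set<int>> dom = dominators(g, entry);
  int best = -1;
  std::size_t best_size = 0;
  for (int d : dom[n]) {
    if (d != n && dom[d].size() > best_size) {
      best = d;
      best_size = dom[d].size();
    }
  }
  return best;
}

// The cfg before tail merging: 5 -> {6, 7}, 6 -> 8, 7 -> 9, {8, 9} -> 12.
inline Graph before_merge()
{
  return Graph{{5, {6, 7}}, {6, {8}}, {7, {9}}, {8, {12}}, {9, {12}}};
}

// The cfg after block 7 is replaced by block 6.
inline Graph after_merge()
{
  return Graph{{5, {6}}, {6, {8, 9}}, {8, {12}}, {9, {12}}};
}
```

 Recomputing from scratch like this is the brute-force alternative to the
 targeted update the patch performs.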

 Bootstrapped and reg-tested on x86_64 and i686. Built and reg-tested on MIPS
 and ARM.

 Ok for trunk?

 Thanks,
 - Tom

 2011-10-30  Tom de Vries  t...@codesourcery.com

   PR tree-optimization/50878
   * tree-ssa-tail-merge.c (replace_block_by): Recalculate dominator info
   for blocks immediately dominated by bb2, if bb2 has a single predecessor
   after replacement.



2011-10-31  Tom de Vries  t...@codesourcery.com

PR tree-optimization/50908
* tree-ssa-tail-merge.c (update_vuses): Now that edges are removed
before update_vuses, test for 1 predecessor rather than two.
(delete_block_update_dominator_info): New function, part of it factored
out of ...
(replace_block_by): Use delete_block_update_dominator_info.  Call
update_vuses after deleting bb1 and updating dominator info, instead of
before.
Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c (revision 180521)
+++ gcc/tree-ssa-tail-merge.c (working copy)
@@ -1458,7 +1458,7 @@ update_vuses (bool vuse1_phi_args, tree
 	  if (!dominated_by_p (CDI_DOMINATORS, pred, bb2))
 		continue;
 
-	  if (pred == bb2 && EDGE_COUNT (gimple_bb (stmt)->preds) == 2)
+	  if (pred == bb2 && EDGE_COUNT (gimple_bb (stmt)->preds) == 1)
 		{
 		  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
 		  unlink_virtual_phi (stmt, lhs);
@@ -1526,6 +1526,88 @@ vop_at_entry (basic_block bb)
 	  : NULL_TREE);
 }
 
+/* Given that all incoming edges of BB1 have been redirected to BB2, delete BB1
+   and recompute dominator info.  */
+
+static void
+delete_block_update_dominator_info (basic_block bb1, basic_block bb2)
+{
+  VEC (basic_block,heap) *fix_dom_bb;
+  unsigned int i;
+  basic_block bb, dom;

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Paolo Carlini


On 10/31/2011 09:17 PM, Jason Merrill wrote:

On 10/31/2011 04:07 PM, Paolo Carlini wrote:

On 10/31/2011 08:37 PM, Jason Merrill wrote:

I have on occasion been confused by initializer_list silently
becoming empty in C++98 mode. OK for trunk?

For c++98, I think we should use the usual:

#ifndef __GXX_EXPERIMENTAL_CXX0X__
# include <bits/c++0x_warning.h>
#else

which we have in place elsewhere. Or we have special reasons for not
doing that?


I'd rather not make a libsupc++ header dependent on a header from the 
main library.  I guess we could move c++0x_warning.h into libsupc++, 
though...
Sure. Note anyway, that bits/c++config.h is already included, elsewhere 
too in libsupc++.


Paolo

Re: [PATCH] Add floatunsv8siv8sf2 support

2011-10-31 Thread Richard Henderson

On 10/31/2011 01:15 PM, Jakub Jelinek wrote:
 Hi!
 
 On Mon, Oct 31, 2011 at 12:43:14PM -0700, Richard Henderson wrote:
 Renaming all of the insn patterns as needed to the standard
 optab forms.  Sadly, only one of the builtins is unused by
 the various header files, so most of them must stay around.
 
 Thanks.  Here is a patch that adds floatunsv8siv8sf2 and macroizes
 floatv[48]siv[48]sf2.  Ok if bootstrap/regtest passes?
 
 2011-10-31  Jakub Jelinek  ja...@redhat.com
 
   * config/i386/sse.md (sseintvecmode): Remove duplicate modes.
   (sseintvecmodelower): New mode iterator.
   (floatv8siv8sf2, floatunsv4siv4sf2): Macroize into...
   (float<sseintvecmodelower><mode>2): ... this using VF1 iterator.
   (floatunsv4siv4sf2): Macroize into...
   (floatuns<sseintvecmodelower><mode>2): ... this using VF1 iterator.

Ok.


r~

Re: [C++ preview patch] PR 44277

2011-10-31 Thread Jason Merrill


On 10/31/2011 04:09 PM, Paolo Carlini wrote:

... so today I noticed the c_inhibit_evaluation_warnings use in
cp_convert_and_check and occurred to me that we could use the existing
mechanism for this warning too?

The below still passes checking and my small set of tests...


I notice that this patch only changes the C++ front end, and it seems 
like you already have special cases for pointers/pointers to members, so 
you might as well go ahead and use nullptr_node.


Jason

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Jason Merrill


On 10/31/2011 04:24 PM, Paolo Carlini wrote:

Sure. Note anyway, that bits/c++config.h is already included, elsewhere
too in libsupc++.


I guess the other option would be to add it to install-freestanding-headers.

Jason

RE: AVX generic mode tuning discussion.

2011-10-31 Thread Jagasia, Harsha

   We would like to propose changing AVX generic mode tuning to generate
   128-bit AVX instead of 256-bit AVX.
 
  You indicate a 3% reduction on bulldozer with avx256.
  How does avx128 compare to -mno-avx -msse4.2?
 
 We see these % differences going from SSE42 to AVX128 to AVX256 on
 Bulldozer with -mtune=generic -Ofast.
 (Positive is improvement, negative is degradation)
 
 Bulldozer:
                   AVX128/SSE42   AVX256/AVX-128
 410.bwaves           -1.4%           -1.4%
 416.gamess           -1.1%            0.0%
 433.milc              0.5%           -2.4%
 434.zeusmp            9.7%           -2.1%
 435.gromacs           5.1%            0.5%
 436.cactusADM         8.2%          -23.8%
 437.leslie3d          8.1%            0.4%
 444.namd              3.6%            0.0%
 447.dealII           -1.4%           -0.4%
 450.soplex           -0.4%           -0.4%
 453.povray            0.0%           -1.5%
 454.calculix         15.7%           -8.3%
 459.GemsFDTD          4.9%            1.4%
 465.tonto             1.3%           -0.6%
 470.lbm               0.9%            0.3%
 481.wrf               7.3%           -3.6%
 482.sphinx3           5.0%           -9.8%
 SPECFP                3.8%           -3.2%
 
  Will the next AMD generation have a useable avx256?
  I'm not keen on the idea of generic mode being tune
  for a single processor revision that maybe shouldn't
  actually be using avx at all.
 
 We see a substantial gain in several SPECFP benchmarks going from SSE42
 to AVX128 on Bulldozer.
 IMHO, accomplishing even a 5% gain in an individual benchmark takes a
 hardware company several man months.
 The loss with AVX256 for Bulldozer is much more significant than the
 gain for SandyBridge.
 While the general trend in the industry is a move toward AVX256, for
 now we would be disadvantaging Bulldozer with this choice.
 
 We have several customers who use -mtune=generic and it is default,
 unless a user explicitly overrides it with -mtune=native. They are the
 ones who want to experiment with latest ISA using gcc, but want to keep
 their ISA selection and tuning agnostic on x86/64. IMHO, it is with
 these customers in mind that generic was introduced in the first place.

Since stage 1 closure is around the corner, just wanted to ping to see if the 
maintainers have made up their mind on this one.
AVX-128 is an improvement over SSE42 for Bulldozer and AVX-256 wipes out pretty 
much all of that gain in generic mode.
Until there is a convergence on AVX-256 for x86/64, we would like to propose 
having generic generate avx-128 by default and have a user override to avx-256 
manually when known to benefit performance.
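
To make the "wipes out pretty much all of that gain" point concrete, the two SPECFP percentages can be combined. This is a sketch under the assumption that the two columns are relative score changes that compose multiplicatively; the function name compound is made up for the example.

```cpp
#include <cassert>
#include <cmath>

// Net relative change after two successive relative changes.
// Example: +3.8% followed by -3.2% is (1.038 * 0.968) - 1.
inline double compound(double first, double second)
{
  return (1.0 + first) * (1.0 + second) - 1.0;
}
```

Under that assumption, compound(0.038, -0.032) comes to roughly +0.5% for AVX-256 over SSE42 on the SPECFP aggregate, and compound(0.082, -0.238) to roughly -17.6% for 436.cactusADM.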

Thanks,
Harsha

Re: [C++ preview patch] PR 44277

2011-10-31 Thread Paolo Carlini


On 10/31/2011 09:29 PM, Jason Merrill wrote:

On 10/31/2011 04:09 PM, Paolo Carlini wrote:

... so today I noticed the c_inhibit_evaluation_warnings use in
cp_convert_and_check and occurred to me that we could use the existing
mechanism for this warning too?

The below still passes checking and my small set of tests...


I notice that this patch only changes the C++ front end, and it seems 
like you already have special cases for pointers/pointers to members, 
so you might as well go ahead and use nullptr_node.
Right. Thus essentially a mix of the two recent tries, like the below, 
right? Patch becomes even simpler and more importantly we rely on 
c_inhibit_* only for c code proper.


If you think I'm on the right track, I will add the testcases, 
documentation, etc.


Is the name of the warning ok? It's a bit long...
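
For reference, a small sketch of the kind of user code the proposed warning is aimed at. The type widget and the function names are hypothetical; which exact constructs end up diagnosed depends on the final patch.

```cpp
#include <cassert>

struct widget { int value; };

// Literal zero used as a null pointer constant: the case the proposed
// -Wzero-as-null-pointer-constant is meant to flag.
inline widget* null_from_zero() { return 0; }

// The C++11 spelling, which the patch explicitly excludes from the warning.
inline widget* null_from_nullptr() { return nullptr; }

// Pointers to members can be initialised either way and compare equal.
inline bool member_pointers_agree()
{
  int widget::*from_zero = 0;
  int widget::*from_nullptr = nullptr;
  return from_zero == from_nullptr;
}
```

Both spellings produce the same null value; the warning concerns only how the source is written.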

Thanks,
Paolo.


Index: c-family/c.opt
===
--- c-family/c.opt  (revision 180705)
+++ c-family/c.opt  (working copy)
@@ -685,6 +685,9 @@ Wpointer-sign
 C ObjC Var(warn_pointer_sign) Init(-1) Warning
 Warn when a pointer differs in signedness in an assignment
 
+Wzero-as-null-pointer-constant
+C++ ObjC++ Var(warn_zero_as_null_pointer_constant) Warning
+
 ansi
 C ObjC C++ ObjC++
 A synonym for -std=c89 (for C) or -std=c++98 (for C++)
Index: cp/typeck.c
===
--- cp/typeck.c (revision 180705)
+++ cp/typeck.c (working copy)
@@ -4058,7 +4058,9 @@ cp_build_binary_op (location_t location,
  else 
{
  op0 = build_ptrmemfunc_access_expr (op0, pfn_identifier);
- op1 = cp_convert (TREE_TYPE (op0), integer_zero_node); 
+ op1 = cp_convert (TREE_TYPE (op0),
+   NULLPTR_TYPE_P (TREE_TYPE (op1))
+   ? nullptr_node : integer_zero_node);
}
  result_type = TREE_TYPE (op0);
}
@@ -4666,11 +4668,21 @@ tree
 cp_truthvalue_conversion (tree expr)
 {
   tree type = TREE_TYPE (expr);
+  tree ret;
+
   if (TYPE_PTRMEM_P (type))
-return build_binary_op (EXPR_LOCATION (expr),
-   NE_EXPR, expr, integer_zero_node, 1);
+ret = build_binary_op (EXPR_LOCATION (expr),
+  NE_EXPR, expr, nullptr_node, 1);
+  else if (TYPE_PTR_P (type) || TYPE_PTRMEMFUNC_P (type))
+{
+  ++c_inhibit_evaluation_warnings;
+  ret = c_common_truthvalue_conversion (input_location, expr);
+  --c_inhibit_evaluation_warnings;
+}
   else
-return c_common_truthvalue_conversion (input_location, expr);
+ret = c_common_truthvalue_conversion (input_location, expr);
+
+  return ret;
 }
 
 /* Just like cp_truthvalue_conversion, but we want a CLEANUP_POINT_EXPR.  */
Index: cp/init.c
===
--- cp/init.c   (revision 180705)
+++ cp/init.c   (working copy)
@@ -176,6 +176,8 @@ build_zero_init_1 (tree type, tree nelts, bool sta
items with static storage duration that are not otherwise
initialized are initialized to zero.  */
 ;
+  else if (TYPE_PTR_P (type) || TYPE_PTR_TO_MEMBER_P (type))
+init = convert (type, nullptr_node);
   else if (SCALAR_TYPE_P (type))
 init = convert (type, integer_zero_node);
   else if (CLASS_TYPE_P (type))
Index: cp/cvt.c
===
--- cp/cvt.c(revision 180705)
+++ cp/cvt.c(working copy)
@@ -198,6 +198,11 @@ cp_convert_to_pointer (tree type, tree expr)
 
   if (null_ptr_cst_p (expr))
 {
+  if (c_inhibit_evaluation_warnings == 0
+  && !NULLPTR_TYPE_P (TREE_TYPE (expr)))
+   warning (OPT_Wzero_as_null_pointer_constant,
+"zero as null pointer constant");
+
   if (TYPE_PTRMEMFUNC_P (type))
return build_ptrmemfunc (TYPE_PTRMEMFUNC_FN_TYPE (type), expr, 0,
 /*c_cast_p=*/false, tf_warning_or_error);

Re: v2[PATCH] update to libtool-2.4.2 and regenerate

2011-10-31 Thread Markus Trippelsdorf

This is an updated version of the libtool update patch. It fixes the
--with-sysroot clash by reverting commit 3334f7ed5851ef1 in libtool.
I've also included Rainer's 64bit Solaris patch.

http://trippelsdorf.de/update-to-libtool-2.4.2-and-regenerate.patch.bz2

---
 boehm-gc/Makefile.in   |3 +
 boehm-gc/configure | 1857 ++--
 boehm-gc/include/Makefile.in   |3 +
 boehm-gc/include/gc_config.h.in|6 -
 boehm-gc/testsuite/Makefile.in |3 +
 fixincludes/configure  |   95 +-
 gcc/configure  | 1383 +--
 intl/config.h.in   |  232 +-
 intl/configure | 4764 +++-
 libffi/Makefile.in |3 +
 libffi/configure   | 1430 +--
 libffi/include/Makefile.in |3 +
 libffi/man/Makefile.in |3 +
 libffi/testsuite/Makefile.in   |3 +
 libgfortran/Makefile.in|3 +
 libgfortran/configure  | 1884 ++---
 libgomp/Makefile.in|3 +
 libgomp/configure  | 1886 ++---
 libgomp/testsuite/Makefile.in  |3 +
 libjava/Makefile.in|2 +
 libjava/classpath/Makefile.in  |3 +
 libjava/classpath/configure| 1869 ++---
 libjava/classpath/doc/Makefile.in  |3 +
 libjava/classpath/doc/api/Makefile.in  |3 +
 libjava/classpath/examples/Makefile.in |3 +
 libjava/classpath/external/Makefile.in |3 +
 libjava/classpath/external/jsr166/Makefile.in  |3 +
 .../classpath/external/relaxngDatatype/Makefile.in |3 +
 libjava/classpath/external/sax/Makefile.in |3 +
 libjava/classpath/external/w3c_dom/Makefile.in |3 +
 libjava/classpath/include/Makefile.in  |3 +
 libjava/classpath/lib/Makefile.in  |3 +
 libjava/classpath/native/Makefile.in   |3 +
 libjava/classpath/native/fdlibm/Makefile.in|3 +
 libjava/classpath/native/jawt/Makefile.in  |3 +
 libjava/classpath/native/jni/Makefile.in   |3 +
 libjava/classpath/native/jni/classpath/Makefile.in |3 +
 .../classpath/native/jni/gconf-peer/Makefile.in|3 +
 .../native/jni/gstreamer-peer/Makefile.in  |3 +
 libjava/classpath/native/jni/gtk-peer/Makefile.in  |3 +
 libjava/classpath/native/jni/java-io/Makefile.in   |3 +
 libjava/classpath/native/jni/java-lang/Makefile.in |3 +
 libjava/classpath/native/jni/java-math/Makefile.in |3 +
 libjava/classpath/native/jni/java-net/Makefile.in  |3 +
 libjava/classpath/native/jni/java-nio/Makefile.in  |3 +
 libjava/classpath/native/jni/java-util/Makefile.in |3 +
 libjava/classpath/native/jni/midi-alsa/Makefile.in |3 +
 libjava/classpath/native/jni/midi-dssi/Makefile.in |3 +
 .../classpath/native/jni/native-lib/Makefile.in|3 +
 libjava/classpath/native/jni/qt-peer/Makefile.in   |3 +
 libjava/classpath/native/jni/xmlj/Makefile.in  |3 +
 libjava/classpath/native/plugin/Makefile.in|3 +
 libjava/classpath/resource/Makefile.in |3 +
 libjava/classpath/scripts/Makefile.in  |3 +
 libjava/classpath/tools/Makefile.in|3 +
 libjava/configure  | 2164 +++---
 libjava/gcj/Makefile.in|4 +-
 libjava/include/Makefile.in|4 +-
 libjava/testsuite/Makefile.in  |4 +-
 libmudflap/Makefile.in |3 +
 libmudflap/configure   | 1430 +--
 libmudflap/testsuite/Makefile.in   |3 +
 libobjc/configure  | 1457 +--
 libquadmath/Makefile.in|3 +
 libquadmath/configure  | 1430 +--
 libssp/Makefile.in |3 +
 libssp/configure   | 1430 +--
 libstdc++-v3/Makefile.in   |3 +
 libstdc++-v3/configure | 1873 ++---
 libstdc++-v3/doc/Makefile.in   |3 +
 libstdc++-v3/include/Makefile.in   |3 +
 libstdc++-v3/libsupc++/Makefile.in |3 +
 libstdc++-v3/po/Makefile.in|3 +
 libstdc++-v3/python/Makefile.in|3 +
 libstdc++-v3/src/Makefile.in   |3 +
 libstdc++-v3/testsuite/Makefile.in

implementation of std::thread::hardware_concurrency()

2011-10-31 Thread niXman

Hi all.

This patch implements std::thread::hardware_concurrency().
Tested on pthreads-win32/winpthreads on windows OS, and on Linux/FreeBSD.




diff --git a/libstdc++-v3/src/thread.cc b/libstdc++-v3/src/thread.cc
index 09e7fc5..3eacb06 100644
--- a/libstdc++-v3/src/thread.cc
+++ b/libstdc++-v3/src/thread.cc
@@ -112,10 +112,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   unsigned int
   thread::hardware_concurrency() noexcept
   {
-int __n = _GLIBCXX_NPROCS;
-if (__n < 0)
-  __n = 0;
-return __n;
+int count=0;
+#if defined(PTW32_VERSION) || \
+   (defined(__MINGW64_VERSION_MAJOR) && defined(_POSIX_THREADS)) || \
+   defined(__hpux)
+count=pthread_num_processors_np();
+#elif defined(__APPLE__) || defined(__FreeBSD__)
+size_t size=sizeof(count);
+sysctlbyname("hw.ncpu", &count, &size, NULL, 0);
+#elif defined(_SC_NPROCESSORS_ONLN)
+count=sysconf(_SC_NPROCESSORS_ONLN);
+#elif definen(_GLIBCXX_USE_GET_NPROCS)
+count=_GLIBCXX_NPROCS;
+#endif
+return (count>0)?count:0;
   }

 _GLIBCXX_END_NAMESPACE_VERSION
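
The sysconf branch of the patch can be tried outside the library with a sketch like the one below. It covers only that one branch, the helper names clamp_processor_count and online_processors are hypothetical, and it is not the libstdc++ implementation.

```cpp
#include <cassert>
#if defined(__unix__) || defined(__APPLE__)
# include <unistd.h>
#endif

// Map a raw processor count (possibly negative on error) to the value
// hardware_concurrency() should report: the count if positive, else 0.
inline unsigned clamp_processor_count(long raw)
{
  return raw > 0 ? static_cast<unsigned>(raw) : 0u;
}

// Ask sysconf for the number of online processors where the name is
// available; report 0 ("not computable") otherwise.
inline unsigned online_processors()
{
#if defined(_SC_NPROCESSORS_ONLN)
  return clamp_processor_count(sysconf(_SC_NPROCESSORS_ONLN));
#else
  return 0u;
#endif
}
```

Returning 0 when nothing is known matches the standard's allowance for a value that is not computable.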

Re: [PATCH] Use gcc's libtool in libgo

2011-10-31 Thread Markus Trippelsdorf

Here is a patch that updates libgo to use gcc's internal libtool
version.  I've only retained config/go.m4 for now.

http://trippelsdorf.de/Use-gcc-s-libtool-in-libgo.patch.bz2

---
 libgo/Makefile.in   |9 +-
 libgo/aclocal.m4|   10 +-
 libgo/config/libtool.m4 | 7516 -
 libgo/config/ltmain.sh  | 8636 ---
 libgo/config/ltoptions.m4   |  369 --
 libgo/config/ltsugar.m4 |  123 -
 libgo/config/ltversion.m4   |   23 -
 libgo/config/lt~obsolete.m4 |   98 -
 libgo/configure | 1747 +++---
 libgo/testsuite/Makefile.in |9 +-
 10 files changed, 1276 insertions(+), 17264 deletions(-)
 delete mode 100644 libgo/config/libtool.m4
 delete mode 100644 libgo/config/ltmain.sh
 delete mode 100644 libgo/config/ltoptions.m4
 delete mode 100644 libgo/config/ltsugar.m4
 delete mode 100644 libgo/config/ltversion.m4
 delete mode 100644 libgo/config/lt~obsolete.m4

diff --git a/libgo/Makefile.in b/libgo/Makefile.in
index 05223a6..bad8ed3 100644
--- a/libgo/Makefile.in
+++ b/libgo/Makefile.in
@@ -58,11 +58,7 @@ am__aclocal_m4_deps = $(top_srcdir)/../config/depstand.m4 \
$(top_srcdir)/../config/multi.m4 \
$(top_srcdir)/../config/override.m4 \
$(top_srcdir)/../config/unwind_ipinfo.m4 \
-   $(top_srcdir)/config/go.m4 $(top_srcdir)/config/libtool.m4 \
-   $(top_srcdir)/config/ltoptions.m4 \
-   $(top_srcdir)/config/ltsugar.m4 \
-   $(top_srcdir)/config/ltversion.m4 \
-   $(top_srcdir)/config/lt~obsolete.m4 $(top_srcdir)/configure.ac
+   $(top_srcdir)/config/go.m4 $(top_srcdir)/configure.ac
 am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
 am__CONFIG_DISTCLEAN_FILES = config.status config.cache config.log \
@@ -365,6 +361,7 @@ CPPFLAGS = @CPPFLAGS@
 CYGPATH_W = @CYGPATH_W@
 DEFS = @DEFS@
 DEPDIR = @DEPDIR@
+DLLTOOL = @DLLTOOL@
 DSYMUTIL = @DSYMUTIL@
 DUMPBIN = @DUMPBIN@
 ECHO_C = @ECHO_C@
@@ -399,6 +396,7 @@ LN_S = @LN_S@
 LTLIBOBJS = @LTLIBOBJS@
 MAINT = @MAINT@
 MAKEINFO = @MAKEINFO@
+MANIFEST_TOOL = @MANIFEST_TOOL@
 MATH_LIBS = @MATH_LIBS@
 MKDIR_P = @MKDIR_P@
 NET_LIBS = @NET_LIBS@
@@ -434,6 +432,7 @@ abs_builddir = @abs_builddir@
 abs_srcdir = @abs_srcdir@
 abs_top_builddir = @abs_top_builddir@
 abs_top_srcdir = @abs_top_srcdir@
+ac_ct_AR = @ac_ct_AR@
 ac_ct_CC = @ac_ct_CC@
 ac_ct_DUMPBIN = @ac_ct_DUMPBIN@
 am__include = @am__include@
diff --git a/libgo/aclocal.m4 b/libgo/aclocal.m4
index ca453c6..9d4f58c 100644
--- a/libgo/aclocal.m4
+++ b/libgo/aclocal.m4
@@ -973,9 +973,9 @@ m4_include([../config/lead-dot.m4])
 m4_include([../config/multi.m4])
 m4_include([../config/override.m4])
 m4_include([../config/unwind_ipinfo.m4])
+m4_include([../config/libtool.m4])
+m4_include([../config/ltoptions.m4])
+m4_include([../config/ltsugar.m4])
+m4_include([../config/ltversion.m4])
+m4_include([../config/lt~obsolete.m4])
 m4_include([config/go.m4])
-m4_include([config/libtool.m4])
-m4_include([config/ltoptions.m4])
-m4_include([config/ltsugar.m4])
-m4_include([config/ltversion.m4])
-m4_include([config/lt~obsolete.m4])
diff --git a/libgo/config/libtool.m4 b/libgo/config/libtool.m4
deleted file mode 100644
index 1a667d3..000
--- a/libgo/config/libtool.m4
+++ /dev/null
...

-- 
Markus

Re: implementation of std::thread::hardware_concurrency()

2011-10-31 Thread Paolo Carlini

Hi,

 This is patch is implement the std::thread::hardware_concurrency().
 Tested on pthreads-win32/winpthreads on windows OS, and on Linux/FreeBSD.

Please send library patches to the library mailing list too. Also, always patch 
mainline first: the function is actually already implemented there, so 
maybe something is missing for win32; please check, rediff, and resend.

Thanks
Paolo

Re: implementation of std::thread::hardware_concurrency()

2011-10-31 Thread Richard Henderson

On 10/31/2011 02:10 PM, niXman wrote:
 +#elif definen(_GLIBCXX_USE_GET_NPROCS)

Typo.


r~

Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-31 Thread Peter Bergner

On Fri, 2011-10-28 at 15:37 -0400, David Edelsohn wrote:
 On Fri, Oct 28, 2011 at 12:36 PM, Peter Bergner berg...@vnet.ibm.com wrote:
 
  So David, do we even want to bother trying to support this on -m64
  given the only cpu that needs this is a 32-bit only cpu?  If so, I
  can try and work with Alan to figure out how we can merge the
  function descriptors for the thunk routines when using -m64.
 
 I barely want to bother with this ;-).  So, no, I don't want to bother
 with -m64 support.

Ok, attached below is the updated patch that passes bootstrap and regtesting
that only enables the new link stack code for 32-bit compiles.  However,
talking with Alan, he mentioned we just have to mark the opd entry weak
and that will fix my link problem (confirmed it does).  It seems we might
want to allow this on 64-bit too, since it actually makes the code cleaner
wrt where we set TARGET_LINK_STACK.  To get 64-bit working, we only need the
following patch on top of the 32-bit only patch below:

--- gcc/config/rs6000/rs6000.c.old  2011-10-31 16:16:04.0 -0500
+++ gcc/config/rs6000/rs6000.c  2011-10-31 16:16:37.0 -0500
@@ -3245,13 +3245,7 @@
 
   /* If not explicitly specified via option, decide whether to generate the
      extra blr's required to preserve the link stack on some cpus (eg, 476).  */
-  if (TARGET_POWERPC64)
-{
-  if (TARGET_LINK_STACK > 0)
-   warning (0, "-m64 disables -mpreserve-ppc476-link-stack");
-  SET_TARGET_LINK_STACK (0);
-}
-  else if (TARGET_LINK_STACK == -1)
+  if (TARGET_LINK_STACK == -1)
 SET_TARGET_LINK_STACK (rs6000_cpu == PROCESSOR_PPC476 && flag_pic);
 
   return ret;
@@ -27960,6 +27954,8 @@
   DECL_COMDAT_GROUP (decl) = DECL_ASSEMBLER_NAME (decl);
   targetm.asm_out.unique_section (decl, 0);
   switch_to_section (get_named_section (decl, NULL, 0));
+  DECL_WEAK (decl) = 1;
+  ASM_WEAKEN_DECL (asm_out_file, decl, name, 0);
   targetm.asm_out.globalize_label (asm_out_file, name);
   targetm.asm_out.assemble_visibility (decl, VISIBILITY_HIDDEN);
   ASM_DECLARE_FUNCTION_NAME (asm_out_file, name, decl);

It's up to you David whether we should stick with the 32-bit only
patch or go ahead and allow 64-bit too.  What do you think?
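
The difference between the two variants comes down to how a tri-state option is resolved. A minimal model, with a hypothetical options struct and function names standing in for the real TARGET_LINK_STACK macros:

```cpp
#include <cassert>

// Hypothetical model: link_stack is -1 when the option was not given,
// 0 when explicitly disabled and 1 when explicitly enabled.
struct options
{
  int link_stack;
  bool is_64bit;
  bool cpu_is_476;
  bool pic;
};

// 32-bit-only variant: -m64 always disables the feature; otherwise an
// unset option defaults to "476 cpu and PIC".
inline int resolve_32bit_only(const options& o)
{
  if (o.is_64bit)
    return 0;
  if (o.link_stack == -1)
    return (o.cpu_is_476 && o.pic) ? 1 : 0;
  return o.link_stack;
}

// Variant that also allows 64-bit: only the default computation remains.
inline int resolve_any_width(const options& o)
{
  if (o.link_stack == -1)
    return (o.cpu_is_476 && o.pic) ? 1 : 0;
  return o.link_stack;
}
```

The second function is the first with its 64-bit special case removed, which is the cleanup the small patch above achieves.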

Peter



* config.gcc (powerpc*-*-linux*): Add powerpc*-*-linux*ppc476* variant.
* config/rs6000/476.h: New file.
* config/rs6000/476.opt: Likewise.
* config/rs6000/rs6000.h (TARGET_LINK_STACK): New define.
(SET_TARGET_LINK_STACK): Likewise.
(TARGET_ASM_CODE_END): Define.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Enable
TARGET_LINK_STACK for -mtune=476 and -mtune=476fp.
(rs6000_legitimize_tls_address): Emit the link stack preserving GOT
code if TARGET_LINK_STACK.
(rs6000_emit_load_toc_table): Likewise.
(output_function_profiler): Likewise
(macho_branch_islands): Likewise
(machopic_output_stub): Likewise
(get_ppc476_thunk_name): New function.
(rs6000_code_end): Likewise.
* config/rs6000/rs6000.md (load_toc_v4_PIC_1, load_toc_v4_PIC_1b):
Convert to a define_expand.
(load_toc_v4_PIC_1_normal): New define_insn.
(load_toc_v4_PIC_1_476): Likewise.
(load_toc_v4_PIC_1b_normal): Likewise.
(load_toc_v4_PIC_1b_476): Likewise.

Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 179091)
+++ gcc/config.gcc  (working copy)
@@ -2133,6 +2133,9 @@ powerpc-*-linux* | powerpc64-*-linux*)
         esac
         tmake_file="${tmake_file} t-slibgcc-libgcc"
         case ${target} in
+            powerpc*-*-linux*ppc476*)
+                tm_file="${tm_file} rs6000/476.h"
+                extra_options="${extra_options} rs6000/476.opt" ;;
             powerpc*-*-linux*altivec*)
                 tm_file="${tm_file} rs6000/linuxaltivec.h" ;;
             powerpc*-*-linux*spe*)
Index: gcc/config/rs6000/476.h
===
--- gcc/config/rs6000/476.h (revision 0)
+++ gcc/config/rs6000/476.h (revision 0)
@@ -0,0 +1,32 @@
+/* Enable IBM PowerPC 476 support.
+   Copyright (C) 2011 Free Software Foundation, Inc.
+   Contributed by Peter Bergner (berg...@vnet.ibm.com)
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Paolo Carlini


On 10/31/2011 09:32 PM, Jason Merrill wrote:
> On 10/31/2011 04:24 PM, Paolo Carlini wrote:
>> Sure. Note anyway, that bits/c++config.h is already included, elsewhere
>> too in libsupc++.
>
> I guess the other option would be to add it to
> install-freestanding-headers.

Of course. I think Benjamin followed in better detail the issues having
to do with libsupc++ vs the C++ runtime proper, freestanding, making sure
there aren't overly annoying links, etc. For now you can of course
commit the patch as-is, only please make sure that the error message is
by and large consistent with the one we provide via c++0x_warning.h.


Paolo.

Re: Go patch committed: Implement new syscall package

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> Ian,
>
>> I committed this patch which should fix this problem.  Bootstrapped and
>> ran Go testsuite on x86_64-unknown-linux-gnu.
>
> thanks, but this is not enough:
>
> nawk: syntax error at source line 173
>  context is
>  ([^   ]*)$,   cparam) == 0) {
> nawk: illegal statement at source line 173
> nawk: syntax error at source line 179
>
> and there is another instance on l.210.  I haven't tried fixing this
> myself since I'm fighting with other issues.

Whoops, my patch was incomplete.  Sorry about that.  Fixed by this
patch.  Bootstrapped on x86_64-unknown-linux-gnu and saw that it did not
change the code generated by the script.  Committed to mainline.

Ian

diff -r a880b911554e libgo/go/syscall/mksyscall.awk
--- a/libgo/go/syscall/mksyscall.awk	Fri Oct 28 15:05:12 2011 -0700
+++ b/libgo/go/syscall/mksyscall.awk	Mon Oct 31 14:41:12 2011 -0700
@@ -170,7 +170,7 @@
 	printf("\t}\n")
 
 	++carg
-	if (match(cargs[carg], "^([^ ]*) ([^ ]*)$", cparam) == 0) {
+	if (split(cargs[carg], cparam) != 2) {
 		print loc, "bad C parameter:", cargs[carg] | "cat 1>&2"
 		status = 1
 		next
@@ -207,7 +207,7 @@
 	}
 	usedr = 0
 	for (goresult = 1; goresults[goresult] != ""; goresult++) {
-	if (match(goresults[goresult], "^([^ ]*) ([^ ]*)$", goparam) == 0) {
+	if (split(goresults[goresult], goparam) != 2) {
 		print loc, "bad result:", goresults[goresult] | "cat 1>&2"
 		status = 1
 		next

Re: implementation of std::thread::hardware_concurrency()

2011-10-31 Thread Mike Stump

On Oct 31, 2011, at 2:10 PM, niXman wrote:
> This is patch is implement the std::thread::hardware_concurrency().

[ general comment ] Ick, this isn't what I'd call clean.  Maybe include a
porting header that defines a static inline pthread_num_processors_np on
those systems that don't have it.  With that, this routine could simply call
pthread_num_processors_np after including the porting header.  Having
dozens of files with cascades of #if went out of fashion back in the 1990s.
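
A minimal sketch of the kind of porting header suggested above, in C. The
names port_num_processors_np and HAVE_PTHREAD_NUM_PROCESSORS_NP are
hypothetical and not part of the posted patch; the fallback assumes a
system whose sysconf() supports the widely available (but non-POSIX)
_SC_NPROCESSORS_ONLN query.

```c
#include <unistd.h>

#ifdef HAVE_PTHREAD_NUM_PROCESSORS_NP   /* hypothetical configure macro */
# include <pthread.h>
#endif

/* Hypothetical shim: returns the number of online processors, or 0 when
   the value cannot be determined (hardware_concurrency() is allowed to
   return 0 in that case).  */
static inline int
port_num_processors_np (void)
{
#if defined (HAVE_PTHREAD_NUM_PROCESSORS_NP)
  /* Native implementation is available; just forward to it.  */
  return pthread_num_processors_np ();
#elif defined (_SC_NPROCESSORS_ONLN)
  /* Common fallback on systems that expose the count through sysconf.  */
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  return n > 0 ? (int) n : 0;
#else
  /* Nothing known about this system.  */
  return 0;
#endif
}
```

With a header like this, the caller needs a single unconditional call
instead of its own #if cascade.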

Re: [PATCH] Handle many consecutive location notes more efficiently in dwarf2.

2011-10-31 Thread David Miller

From: Jakub Jelinek ja...@redhat.com
Date: Mon, 31 Oct 2011 11:26:40 +0100

> Or alternatively you could remove the whole if (! !next_note ...)
> next_note = NULL_RTX; stmt and move your cache to a global var and
> clear it when reaching end of function (like
> e.g. last_var_location_insn is cleared in dwarf2out_end_epilogue).

This solution sounds the best, thanks Jakub!

If I get some time I'll see if I can more strongly integrate my changes
with the existing last_label sharing code there, as all of these tests
are checking for essentially the same thing.
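
To illustrate the idea in isolation, here is a toy sketch in C of a
file-scope cache that is reset by an end-of-function hook. It is not the
dwarf2out.c code: struct insn, scan_next_real, next_real_for_note,
end_of_function and scan_steps are all made-up names, and the validity
check is simplified to "the note we were told to expect next".

```c
#include <stddef.h>

/* Toy instruction: either a location note or a "real" insn.  */
struct insn
{
  int is_note;
  struct insn *next;
};

/* File scope (not function-local statics), so that the end-of-function
   hook below can invalidate them.  */
static struct insn *expected_next_note;
static struct insn *cached_next_real;

/* Counts list steps, only to make the effect of the cache observable.  */
static unsigned long scan_steps;

/* Linear scan for the next non-note insn after I.  */
static struct insn *
scan_next_real (struct insn *i)
{
  for (i = i->next; i != NULL; i = i->next)
    {
      scan_steps++;
      if (!i->is_note)
        return i;
    }
  return NULL;
}

/* Return the next real insn after NOTE, reusing the cached answer when
   NOTE is exactly the note we predicted would come next.  */
static struct insn *
next_real_for_note (struct insn *note)
{
  struct insn *real;

  if (cached_next_real != NULL && expected_next_note == note)
    real = cached_next_real;
  else
    real = scan_next_real (note);

  if (note->next != NULL && note->next->is_note)
    {
      /* Another note follows directly: it shares the same next real insn.  */
      expected_next_note = note->next;
      cached_next_real = real;
    }
  else
    {
      expected_next_note = NULL;
      cached_next_real = NULL;
    }
  return real;
}

/* Called once per function, so no stale pointer survives into the next one.  */
static void
end_of_function (void)
{
  expected_next_note = NULL;
  cached_next_real = NULL;
}
```

For a run of N consecutive notes the scan happens once instead of N times,
and clearing the cache at the end of each function keeps it from pointing
into a previous function's insn chain.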


Invalidate cached next real insn in dwarf2out_end_epilogue().

* dwarf2out.c (cached_next_real_insn): New.
(dwarf2out_end_epilogue): Set it to NULL_RTX.
(dwarf2out_var_location): Remove cached_next_real_insn local static.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180713 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog   |6 ++
 gcc/dwarf2out.c |3 ++-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index caed12e..4848147 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2011-10-31  David S. Miller  da...@davemloft.net
+
+   * dwarf2out.c (cached_next_real_insn): New.
+   (dwarf2out_end_epilogue): Set it to NULL_RTX.
+   (dwarf2out_var_location): Remove cached_next_real_insn local static.
+
 2011-10-31  Richard Henderson  r...@redhat.com
 
* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 478952f..e6f86a4 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -98,6 +98,7 @@ along with GCC; see the file COPYING3.  If not see
 
 static void dwarf2out_source_line (unsigned int, const char *, int, bool);
 static rtx last_var_location_insn;
+static rtx cached_next_real_insn;
 
 #ifdef VMS_DEBUGGING_INFO
 int vms_file_stats_name (const char *, long long *, long *, char *, int *);
@@ -1090,6 +1091,7 @@ dwarf2out_end_epilogue (unsigned int line ATTRIBUTE_UNUSED,
   char label[MAX_ARTIFICIAL_LABEL_BYTES];
 
   last_var_location_insn = NULL_RTX;
+  cached_next_real_insn = NULL_RTX;
 
   if (dwarf2out_do_cfi_asm ())
     fprintf (asm_out_file, "\t.cfi_endproc\n");
@@ -20132,7 +20134,6 @@ dwarf2out_var_location (rtx loc_note)
   static const char *last_postcall_label;
   static bool last_in_cold_section_p;
   static rtx expected_next_loc_note;
-  static rtx cached_next_real_insn;
   tree decl;
   bool var_loc_p;
 
-- 
1.7.6.401.g6a319

Re: Go patch committed: Implement new syscall package

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> /vol/gcc/src/hg/trunk/local/libgo/go/syscall/errstr_nor.go:22:8: error: reference to undefined name 'libc_strerror'
> make[4]: *** [syscall/syscall.lo] Error 1

Sorry about that.  I thought I had tested that, but evidently not.

Fixed like so.  Committed to mainline.

Ian

diff -r 56a1bd1d907a libgo/go/syscall/errstr_nor.go
--- a/libgo/go/syscall/errstr_nor.go	Mon Oct 31 14:41:55 2011 -0700
+++ b/libgo/go/syscall/errstr_nor.go	Mon Oct 31 14:53:22 2011 -0700
@@ -11,7 +11,7 @@
 	"unsafe"
 )
 
-//sysnb	strerror(errnum int) *byte
+//sysnb	strerror(errnum int) (buf *byte)
 //strerror(errnum int) *byte
 
 var errstr_lock sync.Mutex
@@ -19,7 +19,7 @@
 func Errstr(errno int) string {
 	errstr_lock.Lock()
 
-	bp := libc_strerror(errno)
+	bp := strerror(errno)
 	b := (*[1000]byte)(unsafe.Pointer(bp))
 	i := 0
 	for b[i] != 0 {

Re: Go patch committed: Update Go library

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> the only issue I've found on Solaris is the use of pthread_yield, which
> doesn't exist even on Solaris 11.  The following patch checks for this,
> and falls back to thr_yield if available.

Rather than that patch, I changed the code to use sched_yield rather
than pthread_yield.  I realized that libgo is already using sched_yield,
in runtime/go-sched.c.  There shouldn't be any portability penalty to
also using it in yield.c.
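
As a small illustration of why this is the portable choice: sched_yield is
specified by POSIX and declared in <sched.h>, whereas pthread_yield is a
non-standard extension. The helper below is a hypothetical sketch, not code
from libgo.

```c
#include <sched.h>

/* Hypothetical helper: give up the processor N times and report how many
   of the calls succeeded.  sched_yield returns 0 on success.  */
static int
yield_n_times (int n)
{
  int ok = 0;
  int i;

  for (i = 0; i < n; i++)
    if (sched_yield () == 0)
      ok++;
  return ok;
}
```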

Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian

diff -r 7135ea46b116 libgo/runtime/yield.c
--- a/libgo/runtime/yield.c	Mon Oct 31 14:53:56 2011 -0700
+++ b/libgo/runtime/yield.c	Mon Oct 31 14:58:19 2011 -0700
@@ -9,7 +9,7 @@
 #include <stddef.h>
 #include <sys/types.h>
 #include <sys/time.h>
-#include <pthread.h>
+#include <sched.h>
 #include <unistd.h>
 
 #ifdef HAVE_SYS_SELECT_H
@@ -38,7 +38,7 @@
 void
 runtime_osyield (void)
 {
-  pthread_yield ();
+  sched_yield ();
 }
 
 /* Sleep for some number of microseconds.  */

[PATCH] Allow zero operand in sparc VIS3 cmask patterns.

2011-10-31 Thread David Miller


I noticed this while working on vcond patterns for sparc.

Committed to trunk.

gcc/

* config/sparc/sparc.md (cmask patterns): Allow zero operand.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@180715 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog |2 ++
 gcc/config/sparc/sparc.md |6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 4848147..ebf8cdc 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,7 @@
 2011-10-31  David S. Miller  da...@davemloft.net
 
+   * config/sparc/sparc.md (cmask patterns): Allow zero operand.
+
* dwarf2out.c (cached_next_real_insn): New.
(dwarf2out_end_epilogue): Set it to NULL_RTX.
(dwarf2out_var_location): Remove cached_next_real_insn local static.
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 6dd3909..fbd1a87 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -8452,7 +8452,7 @@
 ;; Conditional moves are possible via fcmpX --> cmaskX -> bshuffle
 (define_insn "cmask8<P:mode>_vis"
   [(set (reg:DI GSR_REG)
-        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+        (unspec:DI [(match_operand:P 0 "register_or_zero_operand" "rJ")
                     (reg:DI GSR_REG)]
                    UNSPEC_CMASK8))]
   "TARGET_VIS3"
@@ -8460,7 +8460,7 @@
 
 (define_insn "cmask16<P:mode>_vis"
   [(set (reg:DI GSR_REG)
-        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+        (unspec:DI [(match_operand:P 0 "register_or_zero_operand" "rJ")
                     (reg:DI GSR_REG)]
                    UNSPEC_CMASK16))]
   "TARGET_VIS3"
@@ -8468,7 +8468,7 @@
 
 (define_insn "cmask32<P:mode>_vis"
   [(set (reg:DI GSR_REG)
-        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+        (unspec:DI [(match_operand:P 0 "register_or_zero_operand" "rJ")
                     (reg:DI GSR_REG)]
                    UNSPEC_CMASK32))]
   "TARGET_VIS3"
-- 
1.7.6.401.g6a319

PATCH: Move f16c intrinsics into f16cintrin.h

2011-10-31 Thread Quentin Neill

Hi,

This patch moves f16c intrinsics out of immintrin.h into their own
header f16cintrin.h

Interested parties should view these threads from three years ago:
http://gcc.gnu.org/ml/gcc-patches/2008-11/threads.html#00145
http://gcc.gnu.org/ml/gcc-patches/2008-12/threads.html#00174

Testing on x86_64, okay to commit if no regressions?
-- 
Quentin Neill
-- 

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index caed12e..5af1c78 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2011-10-31  Quentin Neill  quentin.ne...@amd.com
+
+   Piledriver f16cintrin.h fix.
+   * config/i386/f16cintrin.h: Contents moved from immintrin.h.
+
 2011-10-31  Richard Henderson  r...@redhat.com

* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
diff --git a/gcc/config/i386/f16cintrin.h b/gcc/config/i386/f16cintrin.h
new file mode 100644
index 000..5ff836b
--- /dev/null
+++ b/gcc/config/i386/f16cintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2011
+   Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   http://www.gnu.org/licenses/.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+#if (!defined(_X86INTRIN_H_INCLUDED) && !defined(_IMMINTRIN_H_INCLUDED))
+# error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead."
+#endif
+
+#ifndef __F16C__
+# error "F16C instruction set not enabled"
+#else
+
+#ifndef _F16CINTRIN_H_INCLUDED
+#define _F16CINTRIN_H_INCLUDED
+
+extern __inline float __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_cvtsh_ss (unsigned short __S)
+{
+  __v8hi __H = __extension__ (__v8hi){ __S, 0, 0, 0, 0, 0, 0, 0 };
+  __v4sf __A = __builtin_ia32_vcvtph2ps (__H);
+  return __builtin_ia32_vec_ext_v4sf (__A, 0);
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm_cvtph_ps (__m128i __A)
+{
+  return (__m128) __builtin_ia32_vcvtph2ps ((__v8hi) __A);
+}
+
+extern __inline __m256 __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm256_cvtph_ps (__m128i __A)
+{
+  return (__m256) __builtin_ia32_vcvtph2ps256 ((__v8hi) __A);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline unsigned short __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_cvtss_sh (float __F, const int __I)
+{
+  __v4sf __A =  __extension__ (__v4sf){ __F, 0, 0, 0 };
+  __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I);
+  return (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm_cvtps_ph (__m128 __A, const int __I)
+{
+  return (__m128i) __builtin_ia32_vcvtps2ph ((__v4sf) __A, __I);
+}
+
+extern __inline __m128i __attribute__((__gnu_inline__,
__always_inline__, __artificial__))
+_mm256_cvtps_ph (__m256 __A, const int __I)
+{
+  return (__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf) __A, __I);
+}
+#else
+#define _cvtss_sh(__F, __I)\
+  (__extension__   \
+   ({  \
+  __v4sf __A =  __extension__ (__v4sf){ __F, 0, 0, 0 };\
+  __v8hi __H = __builtin_ia32_vcvtps2ph (__A, __I);                    \
+  (unsigned short) __builtin_ia32_vec_ext_v8hi (__H, 0);   \
+}))
+
+#define _mm_cvtps_ph(A, I) \
+  ((__m128i) __builtin_ia32_vcvtps2ph ((__v4sf)(__m128) A, (int) (I)))
+
+#define _mm256_cvtps_ph(A, I) \
+  ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I)))
+#endif
+
+#endif /* __F16C__ */
+#endif
diff --git a/gcc/config/i386/immintrin.h b/gcc/config/i386/immintrin.h
index 102814e..986a573 100644
--- a/gcc/config/i386/immintrin.h
+++ b/gcc/config/i386/immintrin.h
@@ -76,6 +76,10 @@
 #include <fmaintrin.h>
 #endif

+#ifdef __F16C__
+#include <f16cintrin.h>
+#endif
+
 #ifdef __RDRND__
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -161,63 +165,4 @@ _rdrand64_step (unsigned long long *__P)
 #endif /* __RDRND__ */
 #endif /* __x86_64__  */

-#ifdef __F16C__
-extern __inline

[PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2

2011-10-31 Thread Jakub Jelinek

Hi!

This allows vectorizing float -> uint conversion.
To convert V{4,8}SFmode op0 to V{4,8}SImode target, it emits:
  V{4,8}SFmode mask = op0 >= { INT_MAX + 1U + .0f, INT_MAX + 1U + .0f, ... }
    // non-signalling GE
  V{4,8}SFmode tmp1 = mask & { 2.0f * INT_MIN, 2.0f * INT_MIN, ... }
  V{4,8}SFmode tmp2 = op0 + tmp1
  V{4,8}SImode target = (V{4,8}SImode) tmp2
TARGET_AVX is needed because pre-AVX cmpps has no non-signalling GE, and
we don't want to raise exceptions if op0 is a QNaN (scalar code uses
vucomiss).
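
The sequence above (compare against 2^31, conditionally add -2^32, then do
a signed conversion) can be checked one lane at a time with a scalar
sketch in C. The function name float_to_u32_sketch is made up for
illustration, the ternary stands in for the vector compare-and-mask, and
inputs are assumed to lie in [0, 2^32); NaN and out-of-range handling is
not modelled.

```c
#include <stdint.h>

/* Scalar model of one vector lane; hypothetical helper, not GCC code.  */
static uint32_t
float_to_u32_sketch (float op0)
{
  const float two31 = 2147483648.0f;     /* INT_MAX + 1U as a float */
  const float mtwo32 = -4294967296.0f;   /* 2.0f * INT_MIN */

  /* mask = op0 >= 2^31; tmp1 = mask ? -2^32 : 0.  */
  float tmp1 = (op0 >= two31) ? mtwo32 : 0.0f;

  /* tmp2 now lies in [-2^31, 2^31), so a signed conversion is safe.  */
  float tmp2 = op0 + tmp1;

  /* The signed result reinterpreted modulo 2^32 is the unsigned value.  */
  return (uint32_t) (int32_t) tmp2;
}
```

For example, 3000000000.0f becomes -1294967296 after the adjustment, which
is 3000000000 again when read as an unsigned 32-bit value.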

Ok for trunk?

2011-10-31  Jakub Jelinek  ja...@redhat.com

	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): New
	expander.

--- gcc/config/i386/sse.md.jj   2011-10-31 21:05:21.0 +0100
+++ gcc/config/i386/sse.md  2011-10-31 22:53:13.0 +0100
@@ -2322,6 +2322,35 @@ (define_insn "fix_truncv4sfv4si2"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
+  [(set (match_dup 4)
+        (unspec:VF1
+          [(match_operand:VF1 1 "register_operand" "")
+           (match_dup 2)
+           (const_int 29)] UNSPEC_PCMP))
+   (set (match_dup 5)
+        (and:VF1 (match_dup 4) (match_dup 3)))
+   (set (match_dup 6)
+        (plus:VF1 (match_dup 1) (match_dup 5)))
+   (set (match_operand:<sseintvecmode> 0 "register_operand" "")
+        (fix:<sseintvecmode> (match_dup 6)))]
+  "TARGET_AVX"
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  int i;
+
+  real_ldexp (&TWO31r, &dconst1, 31);
+  operands[2] = const_double_from_real_value (TWO31r, SFmode);
+  operands[2] = ix86_build_const_vector (<MODE>mode, 1, operands[2]);
+  operands[2] = force_reg (<MODE>mode, operands[2]);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  operands[3] = const_double_from_real_value (MTWO32r, SFmode);
+  operands[3] = ix86_build_const_vector (<MODE>mode, 1, operands[3]);
+  operands[3] = force_reg (<MODE>mode, operands[3]);
+  for (i = 4; i < 7; i++)
+    operands[i] = gen_reg_rtx (<MODE>mode);
+})
+
 ;
 ;;
 ;; Parallel double-precision floating point conversion operations

Jakub

Re: PATCH: Move f16c intrinsics into f16cintrin.h

2011-10-31 Thread Jakub Jelinek

On Mon, Oct 31, 2011 at 05:23:58PM -0500, Quentin Neill wrote:
> Interested parties should view these threads from three years ago:
> http://gcc.gnu.org/ml/gcc-patches/2008-11/threads.html#00145
> http://gcc.gnu.org/ml/gcc-patches/2008-12/threads.html#00174
>
> Testing on x86_64, okay to commit if no regressions?

You aren't installing the header, so it will cause regressions.
config.gcc needs to be adjusted for it.
 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index caed12e..5af1c78 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2011-10-31  Quentin Neill  quentin.ne...@amd.com
> +
> +	Piledriver f16cintrin.h fix.
> +	* config/i386/f16cintrin.h: Contents moved from immintrin.h.
> +
>  2011-10-31  Richard Henderson  r...@redhat.com
>
>  	* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.

Jakub

Re: [PATCH, rs6000] Preserve link stack for 476 cpus

2011-10-31 Thread David Edelsohn

On Mon, Oct 31, 2011 at 5:32 PM, Peter Bergner berg...@vnet.ibm.com wrote:

> Ok, attached below is the updated patch that passes bootstrap and regtesting
> that only enables the new link stack code for 32-bit compiles.  However,
> talking with Alan, he mentioned we just have to mark the opd entry weak
> and that will fix my link problem (confirmed it does).  It seems we might
> want to allow this on 64-bit too, since it actually makes the code cleaner
> wrt where we set TARGET_LINK_STACK.  To get 64-bit working, we only need the
> following patch on top of the 32-bit only patch below:

Okay, go ahead with PPC64 support as well.  Hopefully no one ever will
have to use it.  That implies the option should not explicitly
reference ppc476.

- David

Re: PATCH: Move f16c intrinsics into f16cintrin.h

2011-10-31 Thread Quentin Neill

On Mon, Oct 31, 2011 at 5:31 PM, Jakub Jelinek ja...@redhat.com wrote:
> On Mon, Oct 31, 2011 at 05:23:58PM -0500, Quentin Neill wrote:
>> Interested parties should view these threads from three years ago:
>> http://gcc.gnu.org/ml/gcc-patches/2008-11/threads.html#00145
>> http://gcc.gnu.org/ml/gcc-patches/2008-12/threads.html#00174
>>
>> Testing on x86_64, okay to commit if no regressions?
>
> You aren't installing the header, so it will cause regressions.
> config.gcc needs to be adjusted for it.

Arggh.  Thanks, my tests found that too.

Reposting, okay to commit after testing on x86_64 if no regressions?
--
Quentin Neill
From c0379bf7dacbe457813893cdaf381ae7206566c7 Mon Sep 17 00:00:00 2001
From: Quentin Neill quentin.ne...@amd.com
Date: Mon, 31 Oct 2011 16:54:18 -0500
Subject: [PATCH] 2011-10-31  Quentin Neill  quentin.ne...@amd.com

Piledriver f16cintrin.h fix.
* config/i386/f16cintrin.h: Contents moved from immintrin.h.
* config/config.gcc: Add f16cintrin.h.
---
 gcc/ChangeLog|6 +++
 gcc/config.gcc   |4 +-
 gcc/config/i386/f16cintrin.h |   94 ++
 gcc/config/i386/immintrin.h  |   63 ++--
 4 files changed, 106 insertions(+), 61 deletions(-)
 create mode 100644 gcc/config/i386/f16cintrin.h

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index caed12e..14a4392 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2011-10-31  Quentin Neill  quentin.ne...@amd.com
+
+   Piledriver f16cintrin.h fix.
+   * config/i386/f16cintrin.h: Contents moved from immintrin.h.
+   * config/config.gcc: Add f16cintrin.h.
+
 2011-10-31  Richard Henderson  r...@redhat.com
 
* config/i386/sse.md (floatv8siv8sf2): Rename from avx_cvtdq2ps256.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2c18655..2b60e77 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -361,7 +361,7 @@ i[34567]86-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h bmi2intrin.h tbmintrin.h
-  avx2intrin.h fmaintrin.h
+  avx2intrin.h fmaintrin.h f16cintrin.h
;;
 x86_64-*-*)
cpu_type=i386
@@ -374,7 +374,7 @@ x86_64-*-*)
   immintrin.h x86intrin.h avxintrin.h xopintrin.h
   ia32intrin.h cross-stdarg.h lwpintrin.h popcntintrin.h
   lzcntintrin.h bmiintrin.h tbmintrin.h bmi2intrin.h
-  avx2intrin.h fmaintrin.h
+  avx2intrin.h fmaintrin.h f16cintrin.h
need_64bit_hwint=yes
;;
 ia64-*-*)
diff --git a/gcc/config/i386/f16cintrin.h b/gcc/config/i386/f16cintrin.h
new file mode 100644
index 000..5ff836b
--- /dev/null
+++ b/gcc/config/i386/f16cintrin.h
@@ -0,0 +1,94 @@
+/* Copyright (C) 2011
+   Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   http://www.gnu.org/licenses/.  */
+
+#ifndef _X86INTRIN_H_INCLUDED
+#if (!defined(_X86INTRIN_H_INCLUDED) && !defined(_IMMINTRIN_H_INCLUDED))
+# error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead."
+#endif
+
+#ifndef __F16C__
+# error "F16C instruction set not enabled"
+#else
+
+#ifndef _F16CINTRIN_H_INCLUDED
+#define _F16CINTRIN_H_INCLUDED
+
+extern __inline float __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_cvtsh_ss (unsigned short __S)
+{
+  __v8hi __H = __extension__ (__v8hi){ __S, 0, 0, 0, 0, 0, 0, 0 };
+  __v4sf __A = __builtin_ia32_vcvtph2ps (__H);
+  return __builtin_ia32_vec_ext_v4sf (__A, 0);
+}
+
+extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm_cvtph_ps (__m128i __A)
+{
+  return (__m128) __builtin_ia32_vcvtph2ps ((__v8hi) __A);
+}
+
+extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
+_mm256_cvtph_ps (__m128i __A)
+{
+  return (__m256) __builtin_ia32_vcvtph2ps256 ((__v8hi) __A);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline unsigned

Re: [PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2

2011-10-31 Thread Richard Henderson

On 10/31/2011 03:29 PM, Jakub Jelinek wrote:
> 	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): New
> 	expander.

Ok.


r~

[google] Enable loop unroll/peel notes under -fopt-info

2011-10-31 Thread Teresa Johnson

This patch is for google-main only.

Tested with bootstrap and regression tests.

Print unroll and peel factors along with loop source position under -fopt-info.
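
For readers who want to see the shape of the note without reading the
diff, the following C sketch formats a message the same way the
report_unroll_peel function in this patch appears to. The format string is
reconstructed from the patch text; format_unroll_note and enum note_kind
are hypothetical stand-ins, and the real code calls GCC's inform with the
loop's location instead of writing into a buffer.

```c
#include <stdio.h>

/* Hypothetical stand-in for the loop's lpt_decision.decision value.  */
enum note_kind { NOTE_PEEL_COMPLETELY, NOTE_PEEL_SIMPLE, NOTE_UNROLL };

/* Write the note text into BUF (at most LEN bytes) and return the
   snprintf result.  */
static int
format_unroll_note (char *buf, size_t len, enum note_kind kind,
                    int times, int exec_count, int const_iter, int niters)
{
  return snprintf (buf, len,
                   "%s%s loop by %d (execution count %d, %s iterations %d)",
                   kind == NOTE_PEEL_COMPLETELY ? "Completely " : "",
                   kind == NOTE_PEEL_SIMPLE ? "Peel" : "Unroll",
                   times, exec_count,
                   const_iter ? "const" : "average",
                   niters);
}
```

Note that, as written in the patch, a complete peel is reported with the
word "Unroll" after the "Completely " prefix.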

Teresa

2011-10-31   Teresa Johnson  tejohn...@google.com

* common.opt (fopt-info): Disable -fopt-info by default.
* loop-unroll.c (report_unroll_peel): New function.
(unroll_and_peel_loops): Call record_loop_exits for later use.
(peel_loops_completely): Print the loop source position in dump
info and emit note under -fopt-info.
(decide_unroll_and_peeling): Ditto.
(decide_peel_once_rolling): Record peel factor for use in note
emission.
(decide_peel_completely): Ditto.
* cfgloop.c (get_loop_location): New function.
* cfgloop.h (get_loop_location): Ditto.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Emit note
under -fopt-info.

Index: tree-ssa-loop-ivcanon.c
===
--- tree-ssa-loop-ivcanon.c (revision 180437)
+++ tree-ssa-loop-ivcanon.c (working copy)
@@ -52,6 +52,7 @@
 #include "flags.h"
 #include "tree-inline.h"
 #include "target.h"
+#include "diagnostic.h"

 /* Specifies types of loops that may be unrolled.  */

@@ -443,6 +444,17 @@
     fprintf (dump_file, "Unrolled loop %d completely by factor %d.\n",
              loop->num, (int) n_unroll);

+  if (flag_opt_info >= OPT_INFO_MIN)
+    {
+      location_t locus;
+      locus = gimple_location (cond);
+
+      inform (locus, "Completely Unroll loop by %d (execution count %d, const iterations %d)",
+              (int) n_unroll,
+              (int) loop->header->count,
+              (int) TREE_INT_CST_LOW(niter));
+    }
+
   return true;
 }

Index: loop-unroll.c
===
--- loop-unroll.c   (revision 180437)
+++ loop-unroll.c   (working copy)
@@ -34,6 +34,7 @@
 #include "hashtab.h"
 #include "recog.h"
 #include "target.h"
+#include "diagnostic.h"

 /* This pass performs loop unrolling and peeling.  We only perform these
optimizations on innermost loops (with single exception) because
@@ -152,6 +153,30 @@
 basic_block);
 static rtx get_expansion (struct var_to_expand *);

+static void
+report_unroll_peel(struct loop *loop, location_t locus)
+{
+  struct niter_desc *desc;
+  int niters = 0;
+
+  desc = get_simple_loop_desc (loop);
+
+  if (desc->const_iter)
+    niters = desc->niter;
+  else if (loop->header->count)
+    niters = expected_loop_iterations (loop);
+
+  inform (locus, "%s%s loop by %d (execution count %d, %s iterations %d)",
+          loop->lpt_decision.decision == LPT_PEEL_COMPLETELY ?
+            "Completely " : "",
+          loop->lpt_decision.decision == LPT_PEEL_SIMPLE ?
+            "Peel" : "Unroll",
+          loop->lpt_decision.times,
+          (int)loop->header->count,
+          desc->const_iter?"const":"average",
+          niters);
+}
+
 /* Unroll and/or peel (depending on FLAGS) LOOPS.  */
 void
 unroll_and_peel_loops (int flags)
@@ -160,6 +185,8 @@
   bool check;
   loop_iterator li;

+  record_loop_exits();
+
   /* First perform complete loop peeling (it is almost surely a win,
  and affects parameters for further decision a lot).  */
   peel_loops_completely (flags);
@@ -234,16 +261,18 @@
 {
   struct loop *loop;
   loop_iterator li;
+  location_t locus;

   /* Scan the loops, the inner ones first.  */
   FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
 {
       loop->lpt_decision.decision = LPT_NONE;
+      locus = get_loop_location(loop);

       if (dump_file)
-        fprintf (dump_file,
-                 "\n;; *** Considering loop %d for complete peeling ***\n",
-                 loop->num);
+        fprintf (dump_file, "\n;; *** Considering loop %d for complete peeling at BB %d from %s:%d ***\n",
+                 loop->num, loop->header->index, LOCATION_FILE(locus),
+                 LOCATION_LINE(locus));

       loop->ninsns = num_loop_insns (loop);

@@ -253,6 +282,11 @@

       if (loop->lpt_decision.decision == LPT_PEEL_COMPLETELY)
         {
+          if (flag_opt_info >= OPT_INFO_MIN)
+            {
+              report_unroll_peel(loop, locus);
+            }
+
  peel_loop_completely (loop);
 #ifdef ENABLE_CHECKING
  verify_dominators (CDI_DOMINATORS);
@@ -268,14 +302,18 @@
 {
   struct loop *loop;
   loop_iterator li;
+  location_t locus;

   /* Scan the loops, inner ones first.  */
   FOR_EACH_LOOP (li, loop, LI_FROM_INNERMOST)
 {
       loop->lpt_decision.decision = LPT_NONE;
+      locus = get_loop_location(loop);

       if (dump_file)
-        fprintf (dump_file, "\n;; *** Considering loop %d ***\n", loop->num);
+        fprintf (dump_file, "\n;; *** Considering loop %d at BB %d from %s:%d ***\n",
+                 loop->num, loop->header->index, LOCATION_FILE(locus),
+                 LOCATION_LINE(locus));

   /* Do not peel cold areas.  */
   if (optimize_loop_for_size_p (loop))
@@

Re: C++ PATCH to add -std=c++11 ??

2011-10-31 Thread Jason Merrill


On 10/31/2011 01:57 PM, Jason Merrill wrote:
> On 10/31/2011 06:39 AM, Paolo Carlini wrote:
>> Great. When you commit it, you can as well add 'PR c++/50920' to the
>> ChangeLog!
>
> OK, here's what I'm checking in. There are a lot more instances of C++0x
> in comments and cxx_dialect checks, but I'm not going to worry about
> those now.


And some doc changes:

commit 611f8e25ffb46a089716187f84414ed4ae56fde4
Author: Jason Merrill ja...@redhat.com
Date:   Mon Oct 31 22:02:42 2011 -0400

	* doc/invoke.texi: Update for -std=c++11.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1aa0541..0c97453 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -235,7 +235,7 @@ Objective-C and Objective-C++ Dialects}.
 -pedantic-errors @gol
 -w  -Wextra  -Wall  -Waddress  -Waggregate-return  -Warray-bounds @gol
 -Wno-attributes -Wno-builtin-macro-redefined @gol
--Wc++-compat -Wc++0x-compat -Wcast-align  -Wcast-qual  @gol
+-Wc++-compat -Wc++11-compat -Wcast-align  -Wcast-qual  @gol
 -Wchar-subscripts -Wclobbered  -Wcomment @gol
 -Wconversion  -Wcoverage-mismatch  -Wno-cpp  -Wno-deprecated  @gol
 -Wno-deprecated-declarations -Wdisabled-optimization  @gol
@@ -1574,16 +1574,13 @@ C++ code.
 GNU dialect of @option{-std=c++98}.  This is the default for
 C++ code.
 
-@item c++0x
-The working draft of the upcoming ISO C++0x standard. This option
-enables experimental features that are likely to be included in
-C++0x. The working draft is constantly changing, and any feature that is
-enabled by this flag may be removed from future versions of GCC if it is
-not part of the C++0x standard.
+@item c++11
+The 2011 ISO C++ standard plus amendments.  Support for C++11 is still
+experimental, and may change in incompatible ways in future releases.
 
-@item gnu++0x
-GNU dialect of @option{-std=c++0x}. This option enables
-experimental features that may be removed in future versions of GCC.
+@item gnu++11
+GNU dialect of @option{-std=c++11}. Support for C++11 is still
+experimental, and may change in incompatible ways in future releases.
 @end table
 
 @item -fgnu89-inline
@@ -1870,7 +1867,7 @@ Version 5 corrects the mangling of attribute const/volatile on
 function pointer types, decltype of a plain decl, and use of a
 function parameter in the declaration of another parameter.
 
-Version 6 corrects the promotion behavior of C++0x scoped enums.
+Version 6 corrects the promotion behavior of C++11 scoped enums.
 
 See also @option{-Wabi}.
 
@@ -1905,7 +1902,7 @@ been added for putting variables into BSS without making them common.
 
 @item -fconstexpr-depth=@var{n}
 @opindex fconstexpr-depth
-Set the maximum nested evaluation depth for C++0x constexpr functions
+Set the maximum nested evaluation depth for C++11 constexpr functions
 to @var{n}.  A limit is needed to detect endless recursion during
 constant expression evaluation.  The minimum specified by the standard
 is 512.
@@ -2093,7 +2090,7 @@ Set the maximum instantiation depth for template classes to @var{n}.
 A limit on the template instantiation depth is needed to detect
 endless recursions during template class instantiation.  ANSI/ISO C++
 conforming programs must not rely on a maximum depth greater than 17
-(changed to 1024 in C++0x).  The default value is 900, as the compiler
+(changed to 1024 in C++11).  The default value is 900, as the compiler
 can run out of stack space before hitting 1024 in some situations.
 
 @item -fno-threadsafe-statics
@@ -2368,14 +2365,14 @@ by @option{-Wall}.
 @item -Wno-narrowing @r{(C++ and Objective-C++ only)}
 @opindex Wnarrowing
 @opindex Wno-narrowing
-With -std=c++0x, suppress the diagnostic required by the standard for
+With -std=c++11, suppress the diagnostic required by the standard for
 narrowing conversions within @samp{@{ @}}, e.g.
 
 @smallexample
 int i = @{ 2.2 @}; // error: narrowing from double to int
 @end smallexample
 
-This flag can be useful for compiling valid C++98 code in C++0x mode
+This flag can be useful for compiling valid C++98 code in C++11 mode.
 
 @item -Wnoexcept @r{(C++ and Objective-C++ only)}
 @opindex Wnoexcept
@@ -2993,7 +2990,7 @@ Options} and @ref{Objective-C and Objective-C++ Dialect Options}.
 
 @gccoptlist{-Waddress   @gol
 -Warray-bounds @r{(only with} @option{-O2}@r{)}  @gol
--Wc++0x-compat  @gol
+-Wc++11-compat  @gol
 -Wchar-subscripts  @gol
 -Wenum-compare @r{(in C/Objc; this is on by default in C++)} @gol
 -Wimplicit-int @r{(C and Objective-C only)} @gol
@@ -4063,10 +4060,10 @@ Warn about ISO C constructs that are outside of the common subset of
 ISO C and ISO C++, e.g.@: request for implicit conversion from
 @code{void *} to a pointer to non-@code{void} type.
 
-@item -Wc++0x-compat @r{(C++ and Objective-C++ only)}
+@item -Wc++11-compat @r{(C++ and Objective-C++ only)}
 Warn about C++ constructs whose meaning differs between ISO C++ 1998 and
-ISO C++ 200x, e.g., identifiers in ISO C++ 1998 that will become keywords
-in ISO C++ 200x.  This warning is enabled by @option{-Wall}.
+ISO C++

Re: RFA: libstdc++ PATCH to initializer_list to #error in C++98 mode

2011-10-31 Thread Jason Merrill


Here's what I'm checking in:
commit 6e82dfcf49c92195b5d4bc4b522207b92bad554f
Author: Jason Merrill ja...@redhat.com
Date:   Mon Oct 31 01:21:49 2011 -0400

	* include/Makefile.am (install-freestanding-headers): Install
	c++0x_warning.h.
	* libsupc++/initializer_list: Include it.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index 02deefc..74acbf0 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1201,8 +1201,10 @@ endif
 # are installed by libsupc++, so only the first four and the sub-includes
 # are copied here.
 install-freestanding-headers:
-	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}
+	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/bits
 	$(mkinstalldirs) $(DESTDIR)${host_installdir}
+	$(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/c++0x_warning.h \
+	  $(DESTDIR)${gxx_include_dir}/bits
 	for file in ${host_srcdir}/os_defines.h ${host_builddir}/c++config.h \
 	  ${glibcxx_srcdir}/$(ABI_TWEAKS_SRCDIR)/cxxabi_tweaks.h \
 	  ${glibcxx_srcdir}/$(CPU_DEFINES_SRCDIR)/cpu_defines.h; do \
diff --git a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in
index e3e75a0..8b805ba 100644
--- a/libstdc++-v3/include/Makefile.in
+++ b/libstdc++-v3/include/Makefile.in
@@ -1586,8 +1586,10 @@ ${pch3_output}: ${pch3_source} ${pch2_output}
 # are installed by libsupc++, so only the first four and the sub-includes
 # are copied here.
 install-freestanding-headers:
-	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}
+	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/bits
 	$(mkinstalldirs) $(DESTDIR)${host_installdir}
+	$(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/c++0x_warning.h \
+	  $(DESTDIR)${gxx_include_dir}/bits
 	for file in ${host_srcdir}/os_defines.h ${host_builddir}/c++config.h \
 	  ${glibcxx_srcdir}/$(ABI_TWEAKS_SRCDIR)/cxxabi_tweaks.h \
 	  ${glibcxx_srcdir}/$(CPU_DEFINES_SRCDIR)/cpu_defines.h; do \
diff --git a/libstdc++-v3/include/bits/algorithmfwd.h b/libstdc++-v3/include/bits/algorithmfwd.h
index cc0b98e..fbec55d 100644
--- a/libstdc++-v3/include/bits/algorithmfwd.h
+++ b/libstdc++-v3/include/bits/algorithmfwd.h
@@ -35,7 +35,9 @@
 #include <bits/c++config.h>
 #include <bits/stl_pair.h>
 #include <bits/stl_iterator_base_types.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 5708194..0edb8b2 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -40,7 +40,9 @@
 
 #include <ext/atomicity.h>
 #include <debug/debug.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/forward_list.h b/libstdc++-v3/include/bits/forward_list.h
index c80ee50..0fc8323 100644
--- a/libstdc++-v3/include/bits/forward_list.h
+++ b/libstdc++-v3/include/bits/forward_list.h
@@ -33,7 +33,9 @@
 #pragma GCC system_header
 
 #include <memory>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index bddecb0..8f28640 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -57,7 +57,9 @@
 #ifndef _STL_BVECTOR_H
 #define _STL_BVECTOR_H 1
 
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_deque.h b/libstdc++-v3/include/bits/stl_deque.h
index 17ea01a..b924917 100644
--- a/libstdc++-v3/include/bits/stl_deque.h
+++ b/libstdc++-v3/include/bits/stl_deque.h
@@ -60,7 +60,9 @@
 #include <bits/concept_check.h>
 #include <bits/stl_iterator_base_types.h>
 #include <bits/stl_iterator_base_funcs.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_list.h b/libstdc++-v3/include/bits/stl_list.h
index 56ee2fb..fc1d8f8 100644
--- a/libstdc++-v3/include/bits/stl_list.h
+++ b/libstdc++-v3/include/bits/stl_list.h
@@ -58,7 +58,9 @@
 #define _STL_LIST_H 1
 
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_map.h b/libstdc++-v3/include/bits/stl_map.h
index 889e52b..45824f0 100644
--- a/libstdc++-v3/include/bits/stl_map.h
+++ b/libstdc++-v3/include/bits/stl_map.h
@@ -59,7 +59,9 @@
 
 #include <bits/functexcept.h>
 #include <bits/concept_check.h>
+#ifdef __GXX_EXPERIMENTAL_CXX0X__
 #include <initializer_list>
+#endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
diff --git a/libstdc++-v3/include/bits/stl_multimap.h b/libstdc++-v3/include/bits/stl_multimap.h
index 6b74558..fd5a5a8 100644
---

Re: [go]: Port to ALPHA arch - epoll problems

2011-10-31 Thread Ian Lance Taylor

Uros Bizjak ubiz...@gmail.com writes:

> It turned out that the EpollEvent definition in
> libgo/syscalls/epoll/socket_epoll.go is non-portable (if not outright
> dangerous...). The definition does have a FIXME comment, but does not
> take into account the effects of __attribute__((__packed__)) from
> system headers. Contrary to alpha header, x86 has
> __attribute__((__packed__)) added to struct epoll_event definition in
> sys/epoll.h header.

I couldn't work out a way to handle this correctly in mksysinfo.sh or
-fdump-go-spec, so I did it in configure instead.  Bootstrapped and
tested on x86_64-unknown-linux-gnu.  Committed to mainline.  Let me know
if it seems to do the right sort of thing on Alpha GNU/Linux--see if the
generated file TARGET/libgo/epoll.h looks OK.
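
For anyone checking another target by hand, here is a minimal C sketch of the two values the configure checks compute, plus a helper that mirrors the case statement in the Makefile.am rule of the patch. The names epoll_size, epoll_fd_offset and epoll_go_fields are hypothetical and are not part of the patch; the sketch assumes a GNU/Linux system with sys/epoll.h and a compiler that accepts a nested member in offsetof, as GCC does.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Size and data.fd offset of struct epoll_event on this system: the same
   two values the configure checks compute.  */
static const size_t epoll_size = sizeof (struct epoll_event);
static const size_t epoll_fd_offset = offsetof (struct epoll_event, data.fd);

/* Hypothetical helper mirroring the case statement in the Makefile rule:
   return the Go fields that follow "Events uint32", or NULL when the
   size/offset pair is not one of the supported combinations.  */
static const char *
epoll_go_fields (size_t size, size_t fd_offset)
{
  if (size == 8 && fd_offset == 4)
    return "Fd int32";
  if (size == 12 && fd_offset == 4)
    return "Fd int32; Pad [4]byte";
  if (size == 12 && fd_offset == 8)
    return "Pad [4]byte; Fd int32";
  if (size == 16 && fd_offset == 8)
    return "Pad [4]byte; Fd int32; Pad2 [4]byte";
  return NULL;
}
```

Calling epoll_go_fields (epoll_size, epoll_fd_offset) on the build machine shows which branch of the generated struct would apply there; a NULL result corresponds to the "unsupported" error path.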

Ian

Index: libgo/configure.ac
===
--- libgo/configure.ac	(revision 180345)
+++ libgo/configure.ac	(working copy)
@@ -505,6 +505,28 @@ CFLAGS=$CFLAGS -D_LARGEFILE_SOURCE -D_L
 AC_CHECK_TYPES(off64_t)
 CFLAGS=$CFLAGS_hold
 
+dnl Work out the size of the epoll_events struct on GNU/Linux.
+AC_CACHE_CHECK([epoll_event size],
+[libgo_cv_c_epoll_event_size],
+[AC_COMPUTE_INT(libgo_cv_c_epoll_event_size,
+[sizeof (struct epoll_event)],
+[#include <sys/epoll.h>],
+[libgo_cv_c_epoll_event_size=0])])
+SIZEOF_STRUCT_EPOLL_EVENT=${libgo_cv_c_epoll_event_size}
+AC_SUBST(SIZEOF_STRUCT_EPOLL_EVENT)
+
+dnl Work out the offset of the fd field in the epoll_events struct on
+dnl GNU/Linux.
+AC_CACHE_CHECK([epoll_event data.fd offset],
+[libgo_cv_c_epoll_event_fd_offset],
+[AC_COMPUTE_INT(libgo_cv_c_epoll_event_fd_offset,
+[offsetof (struct epoll_event, data.fd)],
+[#include <stddef.h>
+#include <sys/epoll.h>],
+[libgo_cv_c_epoll_event_fd_offset=0])])
+STRUCT_EPOLL_EVENT_FD_OFFSET=${libgo_cv_c_epoll_event_fd_offset}
+AC_SUBST(STRUCT_EPOLL_EVENT_FD_OFFSET)
+
 AC_CACHE_SAVE
 
 if test ${multilib} = yes; then
Index: libgo/go/syscall/socket_linux.go
===
--- libgo/go/syscall/socket_linux.go	(revision 180552)
+++ libgo/go/syscall/socket_linux.go	(working copy)
@@ -164,15 +164,6 @@ func anyToSockaddrOS(rsa *RawSockaddrAny
 	return nil, EAFNOSUPPORT
 }
 
-// We don't take this type directly from the header file because it
-// uses a union.  FIXME.
-
-type EpollEvent struct {
-	Events uint32
-	Fd int32
-	Pad int32
-}
-
 //sysnb	EpollCreate(size int) (fd int, errno int)
 //epoll_create(size int) int
 
Index: libgo/Makefile.am
===
--- libgo/Makefile.am	(revision 180552)
+++ libgo/Makefile.am	(working copy)
@@ -1498,7 +1498,7 @@ endif # !LIBGO_IS_LINUX
 
 # Define socket sizes and types.
 if LIBGO_IS_LINUX
-syscall_socket_file = go/syscall/socket_linux.go
+syscall_socket_file = go/syscall/socket_linux.go epoll.go
 else
 if LIBGO_IS_SOLARIS
 syscall_socket_file = go/syscall/socket_solaris.go
@@ -1582,6 +1582,34 @@ s-sysinfo: $(srcdir)/mksysinfo.sh config
 	$(SHELL) $(srcdir)/../move-if-change tmp-sysinfo.go sysinfo.go
 	$(STAMP) $@
 
+# The epoll struct has an embedded union and is packed on x86_64,
+# which is too complicated for mksysinfo.sh.  We find the offset of
+# the only field we care about in configure.ac, and generate the
+# struct here.
+epoll.go: s-epoll; @true
+s-epoll: Makefile
+	rm -f epoll.go.tmp
+	echo 'package syscall' > epoll.go.tmp
+	echo 'type EpollEvent struct {' >> epoll.go.tmp
+	echo '	Events uint32' >> epoll.go.tmp
+	case "$(SIZEOF_STRUCT_EPOLL_EVENT),$(STRUCT_EPOLL_EVENT_FD_OFFSET)" in \
+	0,0) echo 1>&2 "*** struct epoll_event data.fd offset unknown"; \
+	   exit 1; ;; \
+	8,4) echo '	Fd int32' >> epoll.go.tmp; ;; \
+	12,4) echo '	Fd int32' >> epoll.go.tmp; \
+	   echo '	Pad [4]byte' >> epoll.go.tmp; ;; \
+	12,8) echo '	Pad [4]byte' >> epoll.go.tmp; \
+	   echo '	Fd int32' >> epoll.go.tmp; ;; \
+	16,8) echo '	Pad [4]byte' >> epoll.go.tmp; \
+	   echo '	Fd int32' >> epoll.go.tmp; \
+	   echo '	Pad2 [4]byte' >> epoll.go.tmp; ;; \
+	*) echo 1>&2 "*** struct epoll_event unsupported"; \
+	   exit 1; ;; \
+	esac
+	echo '}' >> epoll.go.tmp
+	$(SHELL) $(srcdir)/../move-if-change epoll.go.tmp epoll.go
+	$(STAMP) $@
+
 if LIBGO_IS_LINUX
 # os_lib_inotify_lo = os/inotify.lo
 os_lib_inotify_lo =

[PATCH] Fix errors in expand_atomic_store.

2011-10-31 Thread Richard Henderson

* optabs.c (expand_atomic_store): Use create_fixed_operand for
atomic_store optab.  Don't try to fall back to sync_lock_release.

---
The create_fixed_operand thinko is obvious.  The sync_lock_release is
more subtle.  The target is allowed to support only storing 0/1 with
the test_and_set/lock_release pair, and it's allowed to support that
in non-obvious ways.  We don't want to get involved in that.
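
To illustrate the distinction at the source level, here is a small C11 sketch. It is only an analogy, not the optab interface: the flag operations stand in for the test_and_set/lock_release pair, which only ever deals with the set and clear states, while a general atomic store must accept any value. The names lock, value, try_lock, unlock and publish are made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;
static atomic_int value;

/* test_and_set: returns true if the flag was previously clear, that is,
   if the caller now holds the lock.  */
static bool
try_lock (void)
{
  return !atomic_flag_test_and_set_explicit (&lock, memory_order_acquire);
}

/* lock release: can only put the flag back into the clear state, and is
   only a release barrier.  */
static void
unlock (void)
{
  atomic_flag_clear_explicit (&lock, memory_order_release);
}

/* General atomic store: arbitrary value, here sequentially consistent.  */
static void
publish (int v)
{
  atomic_store_explicit (&value, v, memory_order_seq_cst);
}
```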


r~

---
 gcc/ChangeLog.mm |5 +
 gcc/optabs.c |   21 +
 2 files changed, 6 insertions(+), 20 deletions(-)

diff --git a/gcc/optabs.c b/gcc/optabs.c
index 1ecab53..d8ab97e 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -7118,32 +7118,13 @@ expand_atomic_store (rtx mem, rtx val, enum memmodel model)
   icode = direct_optab_handler (atomic_store_optab, mode);
   if (icode != CODE_FOR_nothing)
 {
-
-  create_output_operand (ops[0], mem, mode);
+  create_fixed_operand (ops[0], mem);
   create_input_operand (ops[1], val, mode);
   create_integer_operand (ops[2], model);
   if (maybe_expand_insn (icode, 3, ops))
return const0_rtx;
 }
 
-  /* A store of 0 is the same as __sync_lock_release, try that.  */
-  if (CONST_INT_P (val) && INTVAL (val) == 0)
-{
-  icode = direct_optab_handler (sync_lock_release_optab, mode);
-  if (icode != CODE_FOR_nothing)
-   {
- create_fixed_operand (ops[0], mem);
- create_input_operand (ops[1], const0_rtx, mode);
- if (maybe_expand_insn (icode, 2, ops))
-   {
- /* lock_release is only a release barrier.  */
- if (model == MEMMODEL_SEQ_CST)
-   expand_builtin_mem_thread_fence (model);
- return const0_rtx;
-   }
-   }
-}
-
   /* If the size of the object is greater than word size on this target,
  a default store will not be atomic, Try a mem_exchange and throw away
  the result.  If that doesn't work, don't do anything.  */
-- 
1.7.6.4

[cxx-mem-model] i386 atomic load/store

2011-10-31 Thread Richard Henderson

I'm considering the following.  Does anyone believe this i386/i486 decision
re DImode is a mistake?  Should I limit that to Pentium by checking cmpxchg?


r~
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 7ce57d8..7d28e43 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -248,6 +248,9 @@
   ;; For BMI2 support
   UNSPEC_PDEP
   UNSPEC_PEXT
+
+  ;; For __atomic support
+  UNSPEC_MOVA
 ])
 
 (define_c_enum unspecv [
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index e5579b1..da08e92 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -46,6 +46,88 @@
   "lock{%;} or{l}\t{$0, (%%esp)|DWORD PTR [esp], 0}"
   [(set_attr "memory" "unknown")])
 
+;; ??? From volume 3 section 7.1.1 Guaranteed Atomic Operations,
+;; Only beginning at Pentium family processors do we get any guarantee of
+;; atomicity in aligned 64-bit quantities.  Beginning at P6, we get a
+;; guarantee for 64-bit accesses that do not cross a cacheline boundary.
+;; This distinction is ignored below, since I *suspect* that FSTLL will
+;; appear atomic from the point of view of user-level threads even back
+;; on the 80386; I suspect that the non-atomicity can only be seen from
+;; other bus-level devices.
+;;
+;; Importantly, *no* processor makes atomicity guarantees for larger
+;; accesses.  In particular, there's no way to perform an atomic TImode
+;; move, despite the apparent applicability of MOVDQA et al.
+
+(define_mode_iterator ATOMIC
+   [QI HI SI (DI "TARGET_64BIT || TARGET_80387 || TARGET_SSE")])
+
+(define_expand "atomic_load<mode>"
+  [(set (match_operand:ATOMIC 0 "register_operand" "")
+        (unspec:ATOMIC [(match_operand:ATOMIC 1 "memory_operand" "")
+                        (match_operand:SI 2 "const_int_operand" "")]
+                       UNSPEC_MOVA))]
+  ""
+{
+  /* For DImode on 32-bit, we can use the FPU to perform the load.  */
+  if (<MODE>mode == DImode && !TARGET_64BIT)
+    emit_insn (gen_atomic_loaddi_fpu (operands[1], operands[2]));
+  else
+    emit_move_insn (operands[0], operands[1]);
+  DONE;
+})
+
+(define_insn_and_split "atomic_loaddi_fpu"
+  [(set (match_operand:DI 0 "register_operand" "=fx")
+        (unspec:DI [(match_operand:DI 1 "memory_operand" "m")]
+                   UNSPEC_MOVA))]
+  "!TARGET_64BIT && (TARGET_80387 || TARGET_SSE)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 1))])
+
+(define_expand "atomic_store<mode>"
+  [(set (match_operand:ATOMIC 0 "memory_operand" "")
+        (unspec:ATOMIC [(match_operand:ATOMIC 1 "register_operand" "")
+                        (match_operand:SI 2 "const_int_operand" "")]
+                       UNSPEC_MOVA))]
+  ""
+{
+  enum memmodel model = (enum memmodel) INTVAL (operands[2]);
+
+  if (<MODE>mode == DImode && !TARGET_64BIT)
+    {
+      /* For DImode on 32-bit, we can use the FPU to perform the store.  */
+      emit_insn (gen_atomic_storedi_fpu (operands[1], operands[2]));
+      if (model == MEMMODEL_SEQ_CST)
+        emit_insn (gen_mem_thread_fence (operands[2]));
+    }
+  else
+    {
+      /* For non-seq-cst stores, we can simply just perform the store.  */
+      if (model != MEMMODEL_SEQ_CST)
+        {
+          emit_move_insn (operands[0], operands[1]);
+          DONE;
+        }
+
+      /* For sub-word-size, sequentially-consistent stores, use xchg.  */
+      emit_insn (gen_atomic_exchange<mode> (gen_reg_rtx (<MODE>mode),
+                                            operands[0], operands[1],
+                                            operands[2]));
+    }
+  DONE;
+})
+
+(define_insn_and_split "atomic_storedi_fpu"
+  [(set (match_operand:DI 0 "memory_operand" "=m")
+        (unspec:DI [(match_operand:DI 1 "register_operand" "fx")]
+                   UNSPEC_MOVA))]
+  "!TARGET_64BIT && (TARGET_80387 || TARGET_SSE)"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0) (match_dup 1))])
+
 (define_expand "atomic_compare_and_swap<mode>"
   [(match_operand:QI 0 "register_operand" "")  ;; bool success output
    (match_operand:SWI124 1 "register_operand" "")  ;; oldval output

Re: Go patch committed: Update Go library

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> * The message points to the wrong line due to a broken test: malloc.goc
>   has:
>
>   p = runtime_SysReserve((void*)(0x00f8ULL<<32), bitmap_size + arena_size);
>   if(p == nil)
>           runtime_throw("runtime: cannot reserve arena virtual address space");
>
>   On failure, p will be MAP_FAILED ((void *)-1), not nil, so the wrong
>   assertion is thrown.

I fixed this particular issue as follows, copying the code from the
other Go library.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu.  Committed to mainline.
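
A minimal standalone sketch of the check, reading both comparisons as less-than tests against 4096. The helper name reserve_result is hypothetical; the real code operates on the value returned by runtime_mmap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: treat a pointer in the first or last 4096 bytes of
   the address space as a failed reservation.  MAP_FAILED, (void *) -1,
   lands in the last 4096 bytes; a null or very small value lands in the
   first.  Unsigned negation wraps, so -u is the distance from the top.  */
static void *
reserve_result (void *p)
{
  uintptr_t u = (uintptr_t) p;
  if (u < 4096 || -u < 4096)
    return NULL;
  return p;
}
```

With this mapping the caller's existing test against nil sees the failure, rather than treating (void *)-1 as a valid reservation.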

Ian

diff -r 1bc825e20b21 libgo/runtime/mem.c
--- a/libgo/runtime/mem.c	Mon Oct 31 21:07:36 2011 -0700
+++ b/libgo/runtime/mem.c	Mon Oct 31 21:53:12 2011 -0700
@@ -85,6 +85,7 @@
 runtime_SysReserve(void *v, uintptr n)
 {
 	int fd = -1;
+	void *p;
 
 	// On 64-bit, people with ulimit -v set complain if we reserve too
 	// much address space.  Instead, assume that the reservation is okay
@@ -103,7 +104,11 @@
 	fd = dev_zero;
 #endif
 
-	return runtime_mmap(v, n, PROT_NONE, MAP_ANON|MAP_PRIVATE, fd, 0);
+	p = runtime_mmap(v, n, PROT_NONE, MAP_ANON|MAP_PRIVATE, fd, 0);
+	if((uintptr)p < 4096 || -(uintptr)p < 4096) {
+		return nil;
+	}
+	return p;
 }
 
 void

Re: [RFC PATCH] Gather vectorization (PR tree-optimization/50789)

2011-10-31 Thread Toon Moene


On 10/31/2011 03:23 PM, Jakub Jelinek wrote:

> On Sat, Oct 29, 2011 at 03:53:37PM +0200, Toon Moene wrote:
>
>> I wonder whether it will work with the attached Fortran routine - it
>> sure would mean a boost to the 18%+ heaviest CPU user in our code.
>
> Would be nice to cut down slightly this testcase into just one or two loops
> that are vectorized and turn it into a runtime testcase which verifies
> the vectorization was correct.


This is not a verifiable routine yet, but as the linear interpolation 
part already has all the juicy indirection necessary to test this 
vectorization, most of the routine can be thrown away, to leave the 
attached as essential.
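
To make the access pattern concrete, here is a much simplified one-dimensional C analogue of the loop: each result blends two neighbouring elements of the argument array selected through an index array, which is the indirection a gather-capable vectorizer has to handle. The function name gather_interp and its parameters are invented for the sketch; it is not a translation of the attached routine.

```c
#include <stddef.h>

/* Simplified 1-D analogue of the linear interpolation loop: each result
   blends the two neighbouring elements arg[idx[i] - 1] and arg[idx[i]].
   Every idx[i] must be at least 1 and a valid index into arg.  */
static void
gather_interp (size_t n, const int *idx, const float *w0, const float *w1,
               const float *arg, float *res)
{
  for (size_t i = 0; i < n; i++)
    res[i] = w0[i] * arg[idx[i] - 1] + w1[i] * arg[idx[i]];
}
```

A runtime test along the lines suggested above could fill the inputs with exactly representable values and compare the output against results computed by hand.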


--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290  | 4 more
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands   | 4 44
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/wiki/GFortran#news
      SUBROUTINE VERINT (
     I   KLON   , KLAT   , KLEV   , KINT  , KHALO
     I , KLON1  , KLON2  , KLAT1  , KLAT2
     I , KP     , KQ     , KR
     R , PARG   , PRES
     R , PALFH  , PBETH
     R , PALFA  , PBETA  , PGAMA   )
C
C***
C
C  VERINT - THREE DIMENSIONAL INTERPOLATION
C
C  PURPOSE:
C
C  THREE DIMENSIONAL INTERPOLATION
C
C  INPUT PARAMETERS:
C
C  KLON      NUMBER OF GRIDPOINTS IN X-DIRECTION
C  KLAT      NUMBER OF GRIDPOINTS IN Y-DIRECTION
C  KLEV      NUMBER OF VERTICAL LEVELS
C  KINT      TYPE OF INTERPOLATION
C            = 1 - LINEAR
C            = 2 - QUADRATIC
C            = 3 - CUBIC
C            = 4 - MIXED CUBIC/LINEAR
C  KLON1     FIRST GRIDPOINT IN X-DIRECTION
C  KLON2     LAST  GRIDPOINT IN X-DIRECTION
C  KLAT1     FIRST GRIDPOINT IN Y-DIRECTION
C  KLAT2     LAST  GRIDPOINT IN Y-DIRECTION
C  KP        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KQ        ARRAY OF INDEXES FOR HORIZONTAL DISPLACEMENTS
C  KR        ARRAY OF INDEXES FOR VERTICAL   DISPLACEMENTS
C  PARG      ARRAY OF ARGUMENTS
C  PALFH     ALFA HAT
C  PBETH     BETA HAT
C  PALFA     ARRAY OF WEIGHTS IN X-DIRECTION
C  PBETA     ARRAY OF WEIGHTS IN Y-DIRECTION
C  PGAMA     ARRAY OF WEIGHTS IN VERTICAL DIRECTION
C
C  OUTPUT PARAMETERS:
C
C  PRES      INTERPOLATED FIELD
C
C  HISTORY:
C
C  J.E. HAUGEN   1  1992
C
C***
C
      IMPLICIT NONE
C
      INTEGER KLON   , KLAT   , KLEV   , KINT   , KHALO,
     I        KLON1  , KLON2  , KLAT1  , KLAT2
C
      INTEGER   KP(KLON,KLAT), KQ(KLON,KLAT), KR(KLON,KLAT)
      REAL    PARG(2-KHALO:KLON+KHALO-1,2-KHALO:KLAT+KHALO-1,KLEV)  ,
     R        PRES(KLON,KLAT) ,
     R       PALFH(KLON,KLAT) ,  PBETH(KLON,KLAT)  ,
     R       PALFA(KLON,KLAT,4)   ,  PBETA(KLON,KLAT,4),
     R       PGAMA(KLON,KLAT,4)
C
      INTEGER JX, JY, IDX, IDY, ILEV
      REAL Z1MAH, Z1MBH
C
C  LINEAR INTERPOLATION
C
      DO JY = KLAT1,KLAT2
      DO JX = KLON1,KLON2
         IDX  = KP(JX,JY)
         IDY  = KQ(JX,JY)
         ILEV = KR(JX,JY)
C
         PRES(JX,JY) = PGAMA(JX,JY,1)*(
C
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV-1) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV-1)
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV-1) ) )
C    +
     +   + PGAMA(JX,JY,2)*(
C    +
     +   PBETA(JX,JY,1)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY-1,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY-1,ILEV  ) )
     + + PBETA(JX,JY,2)*( PALFA(JX,JY,1)*PARG(IDX-1,IDY  ,ILEV  )
     +                  + PALFA(JX,JY,2)*PARG(IDX  ,IDY  ,ILEV  ) ) )
      ENDDO
      ENDDO
C
      RETURN
      END

Re: Go patch committed: Update Go library

2011-10-31 Thread Ian Lance Taylor

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

> After this change, I'm seeing another issue: most 32-bit go execution
> tests fail like this on Solaris 11/x86:
>
> /vol/gcc/src/hg/trunk/local/libgo/runtime/malloc.goc:366: libgo assertion failure
> FAIL: go.go-torture/execute/array-1.go execution,  -O0
>
> Running the test under truss, I find:
>
> 14261:mmap(0xFF00, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0) Err#12 ENOMEM
>
> With truss -u (user function tracing), I see:
>
> 14285/1@1:-> libgo:runtime_mallocinit()
> 14285/1@1:  -> libgo:runtime_InitSizes()
> 14285/1@1:  <- libgo:runtime_InitSizes() = 2
> 14285/1@1:  -> libgo:runtime_SysReserve()
> 14285/1:  mmap(0xFF00, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0) Err#12 ENOMEM
> 14285/1@1:  <- libgo:runtime_SysReserve() = -1
> 14285/1@1:  -> libgo:__go_assert_fail()
>
> If I remove the adjustment in runtime/malloc.goc (runtime_mallocinit),
> the test passes:
>
> 14445/1:  mmap(0xFEF78114, 805306368, PROT_NONE, MAP_PRIVATE|MAP_ANON, -1, 0) = 0xCE00
>
> This stuff seems incredibly fragile, and I don't exactly understand
> why.

I don't understand why one case passes and the other fails.  In an
attempt to make this work better, I committed the appended patch.  It
will at least avoid asking for impossible situations, such as the one in
this example.  Bootstrapped and ran Go testsuite on
x86_64-unknown-linux-gnu (including the 32-bit tests, which I always run
anyhow).  Committed to mainline.
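
As a sketch of the arithmetic involved, assuming a 32-bit address space and reading the shifts in the patch as 1<<18 and 1<<20 and the bound as 0xffffffff: the hint is the end of the binary image moved up by 256 KB and rounded up to a 1 MB boundary, and it is dropped when too little room is left above it. The helper names arena_hint and checked_hint are hypothetical.

```c
#include <stdint.h>

/* Move END up by 256 KB and round up to the next 1 MB boundary, in
   32-bit arithmetic.  */
static uint32_t
arena_hint (uint32_t end)
{
  return (end + (1u << 18) + (1u << 20) - 1) & ~((1u << 20) - 1);
}

/* Return the hint, or 0 (no hint) when NEED bytes cannot fit between
   the hint and the top of the 32-bit address space.  */
static uint32_t
checked_hint (uint32_t end, uint32_t need)
{
  uint32_t want = arena_hint (end);
  if (0xffffffffu - want <= need)
    want = 0;
  return want;
}
```

With the address and size from the truss trace (0xFEF78114 and 805306368), far fewer than 805306368 bytes remain above the rounded-up hint, so the sketch returns 0 and no fixed address would be requested.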

Ian

diff -r 250b34075533 libgo/runtime/malloc.goc
--- a/libgo/runtime/malloc.goc	Mon Oct 31 21:54:06 2011 -0700
+++ b/libgo/runtime/malloc.goc	Mon Oct 31 22:18:21 2011 -0700
@@ -358,6 +358,8 @@
 		// away from the running binary image and then round up
 		// to a MB boundary.
 		want = (byte*)(((uintptr)end + (1<<18) + (1<<20) - 1)&~((1<<20)-1));
+		if(0xffffffff - (uintptr)want <= bitmap_size + arena_size)
+		  want = 0;
 		p = runtime_SysReserve(want, bitmap_size + arena_size);
 		if(p == nil)
 			runtime_throw("runtime: cannot reserve arena virtual address space");
