Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-14 Thread Tom de Vries
On 27/01/12 21:37, Tom de Vries wrote:
 On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
a = bar (7);
  else
a = bar (7);
  baz (a);
 }
 ...

 after compiling at -O2, the representation looks like this before
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE <.MEMD.1714_9>
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

 The following cannot really work:

 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
result = vn_reference_lookup_1 (vr1, NULL);
if (result)
  {
 -  changed = set_ssa_val_to (lhs, result);
 +  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +  if (vdef)
 +   changed |= set_ssa_val_to (vdef, result_vdef);
 +  changed |= set_ssa_val_to (lhs, result);

 because 'result' may not be an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)->needs_insertion
 is true).  So you at least need to guard things properly.

 
 Right. And that also doesn't work if the call has no lhs, such as in the
 new test-case pr51879-6.c.
 
 I fixed this by storing both lhs and vdef, such that I don't have to derive
 the vdef from the lhs.
 
 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)

 
 Doing so will hurt performance of tail-merging in its current form.
 OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
 value numbering as used in tail-merging has its limitations too.
 Do you have any ideas how to address that one?
 
 @@ -3359,8 +3366,10 @@ visit_use (tree use)
   /* ???  We should handle stores from calls.  */
   else if (TREE_CODE (lhs) == SSA_NAME)
 {
 +	  tree vuse = gimple_vuse (stmt);
 	  if (!gimple_call_internal_p (stmt)
 -	      && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
 +	      && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
 +		  || (vuse && SSA_VAL (vuse) != VN_TOP)))
 changed = visit_reference_op_call (lhs, stmt);
   else
 changed = defs_to_varying (stmt);

 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?

 
 Removed now, that was a workaround for a bug in an earlier version of the 
 patch,
 that I didn't need anymore.
 
 Bootstrapped and reg-tested on x86_64.
 
 OK for stage1?
 

Richard,

quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
...
I think these fixes hint that we should
use structural equality as a fallback if value-numbering doesn't equate
two stmt effects.  Thus, treat two stmts with exactly the same operands
and flags as equal, using value-numbering to canonicalize operands
(when they are SSA names) for that comparison, or use VN entirely
if there are no side-effects on the stmt.

Changing value-numbering of virtual operands, even if it looks correct in the
simple cases you change, doesn't look like a general solution for the missed
tail merging opportunities.
...

The test-case pr51879-6.c shows a case where improving value numbering will help
tail-merging, but structural equality comparison will not:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF <.MEMD.1717_7>
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto <bb 5>;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_10 = VDEF <.MEMD.1717_9>
  # USE = nonlocal
  # CLB = nonlocal
  

Re: [PATCH] Remove dead labels to increase superblock scope

2012-04-14 Thread Tom de Vries
On 09/12/11 14:59, Eric Botcazou wrote:
 2011-12-07  Tom de Vries  t...@codesourcery.com

  * cfgcleanup.c (try_optimize_cfg): Replace call to delete_isns_chain by
 
 delete_insn_chain
 
  call to delete_insn.  Remove code to reorder BASIC_BLOCK note and
  DELETED_LABEL note, and move it to ...
  * cfg_rtl.c (delete_insn):  Change return type to void.
 
 The file is cfgrtl.c, without underscore.  When you have a "to ..." somewhere,
 you need to have a "... here" somewhere else:
 
   * cfgrtl.c (delete_insn): ...here.  Change return type to void.
 
 The hunk for delete_insn contains a line with trailing TABs and spaces.
 
  (delete_insn_and_edges): Likewise.
  (delete_insn_chain): Handle new return type of delete_insn.  Delete
  chain backwards rather than forwards.
  * rtl.h (delete_insn, delete_insn_and_edges): Change return type to
  void.
  * cfglayout.c (fixup_reorder_chain): Delete unused label.

  * gcc.dg/superblock.c: New test.
 
 OK for stage1 when it re-opens (modulo the above nits).  Nice cleanup!
 

nits fixed and committed.

Thanks,
- Tom

2012-04-14  Tom de Vries  t...@codesourcery.com

* cfgcleanup.c (try_optimize_cfg): Replace call to delete_insn_chain by
call to delete_insn.  Remove code to reorder BASIC_BLOCK note and
DELETED_LABEL note, and move it to ...
* cfgrtl.c (delete_insn): ... here.  Change return type to void.
(delete_insn_and_edges): Likewise.
(delete_insn_chain): Handle new return type of delete_insn.  Delete
chain backwards rather than forwards.
* rtl.h (delete_insn, delete_insn_and_edges): Change return type to
void.
* cfglayout.c (fixup_reorder_chain): Delete unused label.

* gcc.dg/superblock.c: New test.
Index: gcc/cfgcleanup.c
===
--- gcc/cfgcleanup.c (revision 181652)
+++ gcc/cfgcleanup.c (working copy)
@@ -2632,20 +2632,7 @@ try_optimize_cfg (int mode)
 		  || ! label_is_jump_target_p (BB_HEAD (b),
 					       BB_END (single_pred (b))))
 		{
-		  rtx label = BB_HEAD (b);
-
-		  delete_insn_chain (label, label, false);
-		  /* If the case label is undeletable, move it after the
-		 BASIC_BLOCK note.  */
-		  if (NOTE_KIND (BB_HEAD (b)) == NOTE_INSN_DELETED_LABEL)
-		{
-		  rtx bb_note = NEXT_INSN (BB_HEAD (b));
-
-		  reorder_insns_nobb (label, label, bb_note);
-		  BB_HEAD (b) = bb_note;
-		  if (BB_END (b) == bb_note)
-			BB_END (b) = label;
-		}
+		  delete_insn (BB_HEAD (b));
 		  if (dump_file)
 		    fprintf (dump_file, "Deleted label in block %i.\n",
 			     b->index);
Index: gcc/cfglayout.c
===
--- gcc/cfglayout.c (revision 181652)
+++ gcc/cfglayout.c (working copy)
@@ -857,6 +857,9 @@ fixup_reorder_chain (void)
 					   (e_taken->src, e_taken->dest));
 		  e_taken->flags |= EDGE_FALLTHRU;
 		  update_br_prob_note (bb);
+		  if (LABEL_NUSES (ret_label) == 0
+		      && single_pred_p (e_taken->dest))
+		    delete_insn (ret_label);
 		  continue;
 		}
 	}
Index: gcc/rtl.h
===
--- gcc/rtl.h (revision 181652)
+++ gcc/rtl.h (working copy)
@@ -2460,12 +2460,12 @@ extern void add_insn_before (rtx, rtx, s
 extern void add_insn_after (rtx, rtx, struct basic_block_def *);
 extern void remove_insn (rtx);
 extern rtx emit (rtx);
-extern rtx delete_insn (rtx);
+extern void delete_insn (rtx);
 extern rtx entry_of_function (void);
 extern void emit_insn_at_entry (rtx);
 extern void delete_insn_chain (rtx, rtx, bool);
 extern rtx unlink_insn_chain (rtx, rtx);
-extern rtx delete_insn_and_edges (rtx);
+extern void delete_insn_and_edges (rtx);
 extern rtx gen_lowpart_SUBREG (enum machine_mode, rtx);
 extern rtx gen_const_mem (enum machine_mode, rtx);
 extern rtx gen_frame_mem (enum machine_mode, rtx);
Index: gcc/cfgrtl.c
===
--- gcc/cfgrtl.c (revision 181652)
+++ gcc/cfgrtl.c (working copy)
@@ -111,12 +111,11 @@ can_delete_label_p (const_rtx label)
 	  && !in_expr_list_p (forced_labels, label));
 }
 
-/* Delete INSN by patching it out.  Return the next insn.  */
+/* Delete INSN by patching it out.  */
 
-rtx
+void
 delete_insn (rtx insn)
 {
-  rtx next = NEXT_INSN (insn);
   rtx note;
   bool really_delete = true;
 
@@ -128,11 +127,22 @@ delete_insn (rtx insn)
   if (! can_delete_label_p (insn))
 	{
 	  const char *name = LABEL_NAME (insn);
+	  basic_block bb = BLOCK_FOR_INSN (insn);
+	  rtx bb_note = NEXT_INSN (insn);
 
 	  really_delete = false;
 	  PUT_CODE (insn, NOTE);
 	  NOTE_KIND (insn) = NOTE_INSN_DELETED_LABEL;
 	  NOTE_DELETED_LABEL_NAME (insn) = name;
+
+	  if (bb_note != NULL_RTX && NOTE_INSN_BASIC_BLOCK_P (bb_note)
+	      && BLOCK_FOR_INSN (bb_note) == bb)
+	{
+	  reorder_insns_nobb (insn, insn, bb_note);
+	  BB_HEAD (bb) = bb_note;
+	  if 

Re: [PATCH][ARM] NEON DImode neg

2012-04-14 Thread Andrew Stubbs

On 12/04/12 16:48, Richard Earnshaw wrote:

If negation in Neon needs a scratch register, it seems to me to be
somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was
early-clobber on operand 0, but loaded immediate zero into that value first:

vmov	Dd, #0
vsub	Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you
want three.


This patch implements the changes you suggested.

I've done a full bootstrap and test and found no regressions.

OK?

Andrew

P.S. This patch can't actually be committed until my NEON DImode 
immediate constants patch is approved and committed. (Without that the 
load #0 needs a constant pool, and loading constants this late has a bug 
at -O0.)


Re: [PATCH][ARM] NEON DImode neg

2012-04-14 Thread Andrew Stubbs

And now with the patch. :(

On 14/04/12 13:48, Andrew Stubbs wrote:

On 12/04/12 16:48, Richard Earnshaw wrote:

If negation in Neon needs a scratch register, it seems to me to be
somewhat odd that we're disparaging the ARM version.

Also, wouldn't it be sensible to support a variant that was
early-clobber on operand 0, but loaded immediate zero into that value
first:

vmov Dd, #0
vsub Dd, Dd, Dm

That way you'll never need more than two registers, whereas today you
want three.


This patch implements the changes you suggested.

I've done a full bootstrap and test and found no regressions.

OK?

Andrew

P.S. This patch can't actually be committed until my NEON DImode
immediate constants patch is approved and committed. (Without that the
load #0 needs a constant pool, and loading constants this late has a bug
at -O0.)


2012-04-12  Andrew Stubbs  a...@codesourcery.com

	gcc/
	* config/arm/arm.md (negdi2): Use gen_negdi2_neon.
	* config/arm/neon.md (negdi2_neon): New insn.
	Also add splitters for core and NEON registers.

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 751997f..f1dbbf7 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -4048,7 +4048,13 @@
 	 (neg:DI (match_operand:DI 1 "s_register_operand" "")))
 	      (clobber (reg:CC CC_REGNUM))])]
   "TARGET_EITHER"
-  ""
+  {
+    if (TARGET_NEON)
+      {
+        emit_insn (gen_negdi2_neon (operands[0], operands[1]));
+	DONE;
+      }
+  }
 )
 
 ;; The constraints here are to prevent a *partial* overlap (where %Q0 == %R1).
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 3c88568..8c8b02d 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -922,6 +922,45 @@
 				  (const_string "neon_int_3")))]
 )
 
+(define_insn "negdi2_neon"
+  [(set (match_operand:DI 0 "s_register_operand"	 "=w, w,r,r")
+	(neg:DI (match_operand:DI 1 "s_register_operand" " w, w,0, r")))
+   (clobber (match_scratch:DI 2				 "= X,w,X, X"))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON"
+  "#"
+  [(set_attr "length" "8")]
+)
+
+; Split negdi2_neon for vfp registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_NEON && reload_completed && IS_VFP_REGNUM (REGNO (operands[0]))"
+  [(set (match_dup 2) (const_int 0))
+   (parallel [(set (match_dup 0) (minus:DI (match_dup 2) (match_dup 1)))
+	  (clobber (reg:CC CC_REGNUM))])]
+  {
+if (!REG_P (operands[2]))
+  operands[2] = operands[0];
+  }
+)
+
+; Split negdi2_neon for core registers
+(define_split
+  [(set (match_operand:DI 0 "s_register_operand" "")
+	(neg:DI (match_operand:DI 1 "s_register_operand" "")))
+   (clobber (match_scratch:DI 2 ""))
+   (clobber (reg:CC CC_REGNUM))]
+  "TARGET_32BIT && reload_completed
+   && arm_general_register_operand (operands[0], DImode)"
+  [(parallel [(set (match_dup 0) (neg:DI (match_dup 1)))
+	  (clobber (reg:CC CC_REGNUM))])]
+  
+)
+
 (define_insn "*umin<mode>3_neon"
   [(set (match_operand:VDQIW 0 "s_register_operand" "=w")
 	(umin:VDQIW (match_operand:VDQIW 1 "s_register_operand" "w")


Re: _GLIBCXX_ATOMIC_BUILTINS too coarse

2012-04-14 Thread Alan Modra
This patch partially reverts the change made on 2012-02-10 that
partially disabled builtin atomics on powerpc, resulting in
inconsistent locking (mix of atomics and pthread mutexes) and an ABI
incompatibility with previous versions of libstdc++.  See the PR for
all the gory details.  Applying to mainline with Jonathan's approval,
and in a few days to the 4.7 branch assuming no problems appear due to
this change on mainline.

PR libstdc++/52839
* acinclude.m4 (_GLIBCXX_ATOMIC_BUILTINS): Do not depend on
glibcxx_cv_atomic_long_long.
* configure: Regenerate.

Index: libstdc++-v3/acinclude.m4
===
--- libstdc++-v3/acinclude.m4   (revision 186130)
+++ libstdc++-v3/acinclude.m4   (working copy)
@@ -2861,11 +2861,10 @@
   CXXFLAGS="$old_CXXFLAGS"
   AC_LANG_RESTORE
 
-  # Set atomicity_dir to builtins if all of above tests pass.
+  # Set atomicity_dir to builtins if all but the long long test above passes.
   if test "$glibcxx_cv_atomic_bool" = yes \
      && test "$glibcxx_cv_atomic_short" = yes \
-     && test "$glibcxx_cv_atomic_int" = yes \
-     && test "$glibcxx_cv_atomic_long_long" = yes ; then
+     && test "$glibcxx_cv_atomic_int" = yes; then
 AC_DEFINE(_GLIBCXX_ATOMIC_BUILTINS, 1,
 [Define if the compiler supports C++11 atomics.])
 atomicity_dir=cpu/generic/atomicity_builtins

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Fix PRs c/52283/37985

2012-04-14 Thread Manuel López-Ibáñez
As far as I know, this patch hasn't been reviewed:

http://patchwork.ozlabs.org/patch/150636/

Cheers,

Manuel.


On 4 April 2012 10:17, Christian Bruel christian.br...@st.com wrote:
 Hello,

 Is it OK to push the cleaning of TREE_NO_WARNING to fix the constant
 expressions errors discrepancies, as discussed in bugzilla #52283, now that
 the trunk is open ?

 Many thanks,





[PATCH] Fix PR52976

2012-04-14 Thread William J. Schmidt
This patch corrects two errors in reassociating expressions with
repeated factors.  First, undistribution needs to recognize repeated
factors.  For now, repeated factors will be ineligible for this
optimization.  In the future, this can be improved.  Second, when a
__builtin_powi call is introduced, its target SSA name must be given a
rank higher than other operands in the operand list.  Otherwise, uses of
the call result may be introduced prior to the call.

Bootstrapped and regression tested on powerpc64-linux.  Confirmed that
cpu2000 and cpu2006 SPEC tests build cleanly.  OK for trunk?

Thanks,
Bill


2012-04-14  Bill Schmidt  wschm...@linux.vnet.ibm.com

PR tree-optimization/52976
* tree-ssa-reassoc.c (add_to_ops_vec_max_rank): New function.
(undistribute_ops_list): Ops with repeat counts aren't eligible for
undistribution.
(attempt_builtin_powi): Call add_to_ops_vec_max_rank.


Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 186393)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -544,6 +544,28 @@ add_repeat_to_ops_vec (VEC(operand_entry_t, heap)
   reassociate_stats.pows_encountered++;
 }
 
+/* Add an operand entry to *OPS for the tree operand OP, giving the
+   new entry a larger rank than any other operand already in *OPS.  */
+
+static void
+add_to_ops_vec_max_rank (VEC(operand_entry_t, heap) **ops, tree op)
+{
+  operand_entry_t oe = (operand_entry_t) pool_alloc (operand_entry_pool);
+  operand_entry_t oe1;
+  unsigned i;
+  unsigned max_rank = 0;
+
+  FOR_EACH_VEC_ELT (operand_entry_t, *ops, i, oe1)
+    if (oe1->rank > max_rank)
+      max_rank = oe1->rank;
+
+  oe->op = op;
+  oe->rank = max_rank + 1;
+  oe->id = next_operand_entry_id++;
+  oe->count = 1;
+  VEC_safe_push (operand_entry_t, heap, *ops, oe);
+}
+
 /* Return true if STMT is reassociable operation containing a binary
operation with tree code CODE, and is inside LOOP.  */
 
@@ -1200,6 +1222,7 @@ undistribute_ops_list (enum tree_code opcode,
   dcode = gimple_assign_rhs_code (oe1def);
       if ((dcode != MULT_EXPR
 	   && dcode != RDIV_EXPR)
+	  || oe1->count != 1
 	  || !is_reassociable_op (oe1def, dcode, loop))
continue;
 
@@ -1243,6 +1266,8 @@ undistribute_ops_list (enum tree_code opcode,
  oecount c;
  void **slot;
  size_t idx;
+	  if (oe1->count != 1)
+   continue;
  c.oecode = oecode;
  c.cnt = 1;
  c.id = next_oecount_id++;
@@ -1311,7 +1336,7 @@ undistribute_ops_list (enum tree_code opcode,
 
  FOR_EACH_VEC_ELT (operand_entry_t, subops[i], j, oe1)
{
-	      if (oe1->op == c->op)
+	      if (oe1->op == c->op && oe1->count == 1)
{
  SET_BIT (candidates2, i);
  ++nr_candidates2;
@@ -3275,8 +3300,10 @@ attempt_builtin_powi (gimple stmt, VEC(operand_ent
  gsi_insert_before (gsi, pow_stmt, GSI_SAME_STMT);
}
 
-  /* Append the result of this iteration to the ops vector.  */
-  add_to_ops_vec (ops, iter_result);
+  /* Append the result of this iteration to the ops vector.
+ Give it a rank higher than all other ranks in the ops vector
+ so that all uses of it will be forced to come after it.  */
+  add_to_ops_vec_max_rank (ops, iter_result);
 
   /* Decrement the occurrence count of each element in the product
 by the count found above, and remove this many copies of each




Re: [PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion

2012-04-14 Thread Dodji Seketeli
Jason Merrill ja...@redhat.com writes:

 On 04/10/2012 10:55 AM, Dodji Seketeli wrote:
 +  if (CPP_OPTION (pfile, track_macro_expansion))

  I think this should check context->tokens_kind rather than the
 compiler flag.

OK.

Below is the updated patch that does that.

Tested and bootstrapped on x86_64-unknown-linux-gnu against trunk.

libcpp/

* macro.c (cpp_sys_macro_p):  Support -ftrack-macro-expansion.
---
 libcpp/macro.c |7 ++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/libcpp/macro.c b/libcpp/macro.c
index 54de3e3..4f8e52f 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -2436,7 +2436,12 @@ cpp_get_token_with_location (cpp_reader *pfile, 
source_location *loc)
 int
 cpp_sys_macro_p (cpp_reader *pfile)
 {
-  cpp_hashnode *node = pfile->context->c.macro;
+  cpp_hashnode *node = NULL;
+
+  if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED)
+    node = pfile->context->c.mc->macro_node;
+  else
+    node = pfile->context->c.macro;
 
   return node && node->value.macro && node->value.macro->syshdr;
 }
-- 
Dodji


Re: [Patch, Fortran] PR52916 - fix TREE_PUBLIC() = 0 for module procedures

2012-04-14 Thread Tobias Burnus

* PING *

It is a rather serious rejects-valid regression. It also affects SPEC 
CPU 2006 and the patch has been confirmed (cf. PR) to fix the regression.


Tobias

Tobias Burnus wrote:

Dear all,

my recent patch for setting PRIVATE module variables and procedures to 
TREE_PUBLIC()=0 had a flaw: I completely forgot about generic 
interfaces. Even if the specific name is PRIVATE, the specific 
function is still callable through a (public) generic name.


Thanks to HJ for the report. (The bug causes failures in SPEC CPU 2006.)


I think the handling of type-bound procedures is correct. However, I 
wouldn't mind if someone could confirm it. I only check for the 
specific entries as GENERIC, OPERATOR and ASSIGNMENT use a 
type-bound-proc name, which is already handled. I also didn't try to 
optimize for private DT, private generics etc. First, I think it is 
not needed. And secondly, through inheritance, it can get extremely 
complicated.


Build and regtested on x86-64-linux.
OK for the trunk?

Tobias




Re: [Patch, Fortran] PR52916 - fix TREE_PUBLIC() = 0 for module procedures

2012-04-14 Thread Thomas Koenig

Hi Tobias,


* PING *

It is a rather serious rejects-valid regression. It also affects SPEC
CPU 2006 and the patch has been confirmed (cf. PR) to fix the regression.


OK for trunk.  Thanks for the patch!

Thomas


Re: [patch, fortran] Trim spaces on list-directed reads

2012-04-14 Thread Jerry DeLisle

On 04/10/2012 08:32 AM, Thomas Koenig wrote:

Hello world,

this patch effectively trims the spaces from the string on
list-directed reads. This avoids the large overhead on
processing these spaces when reading from long lines.

I didn't do this for internal units which are arrays because
this would need a separate calculation for each record, and
would mean a major change to the way the code is structured.

The running time for the test case from PR 50673 is reduced
from ~8 s to 0.1 s by this patch.

Regression-tested. OK for trunk?



yes, very good, OK for trunk.

Sorry for delay, time is squeezed badly here.

Regards,

Jerry


Re: [PATCH] Prevent 'control reaches end of non-void function' warning for DO_WHILE

2012-04-14 Thread Tom de Vries
Jason,

On 18/02/12 09:33, Jason Merrill wrote:
 On 01/22/2012 03:38 AM, Tom de Vries wrote:
 
 Sorry I didn't notice this patch until now; please CC me on C++ patches, 
 or at least mention C++ in the subject line.
 

OK, will do.

 +  tree expr = NULL;
 +  append_to_statement_list (*block, expr);
 +  *block = expr;
 
 Rather than doing this dance here, I think it would be better to enhance 
 append_to_statement_list to handle the case of the list argument being a 
 non-list.
 

Added return value to append_to_statement_list, so now it's:

*block = append_to_statement_list (*block, NULL);

 +  cp_walk_tree (incr, cp_genericize_r, data, NULL);

 +  if (incr  EXPR_P (incr))
 +SET_EXPR_LOCATION (incr, start_locus);
 
 It might be better to set the location on incr before genericizing, so 
 that the location can trickle down.
 

I see. I reordered the code.

 + t = build1 (GOTO_EXPR, void_type_node, get_bc_label (bc_break));
 + SET_EXPR_LOCATION (t, start_locus);
 
 Here you can use build1_loc instead of two separate statements.
 

Done.

 -  /* If we use a LOOP_EXPR here, we have to feed the whole thing
 -back through the main gimplifier to lower it.  Given that we
 -have to gimplify the loop body NOW so that we can resolve
 -break/continue stmts, seems easier to just expand to gotos.  */
 
 Since we're now lowering the loop at genericize time rather than 
 gimplify, what's the rationale for not using LOOP_EXPR/EXIT_EXPR?  We 
 should still say something here.  I suppose the rationale is that the C 
 front end currently goes straight to gotos.
 

I think we could indeed use LOOP_EXPR/EXIT_EXPR. I'm not sure what the gain
would be, since at least for C it would be a construct alive only between
genericize and gimplify. But I guess it makes sense from an orthogonality point
of view.

Added a comment with todo.

   if (cond != error_mark_node)
 {
 -	  gimplify_expr (&cond, &exit_seq, NULL, is_gimple_val,
 			 fb_rvalue);
 -	  stmt = gimple_build_cond (NE_EXPR, cond,
 -				    build_int_cst (TREE_TYPE (cond), 0),
 -				    gimple_label_label (top),
 -				    get_bc_label (bc_break));
 -	  gimple_seq_add_stmt (&exit_seq, stmt);
 + cond = build2 (NE_EXPR, boolean_type_node, cond,
 +build_int_cst (TREE_TYPE (cond), 0));
 
 I don't think we still need this extra comparison to 0, that seems like 
 a gimple-specific thing.
 

OK, removed.

if (FOR_INIT_STMT (stmt))
 +append_to_statement_list (FOR_INIT_STMT (stmt), expr);

 +  genericize_cp_loop (loop, EXPR_LOCATION (stmt), FOR_COND (stmt),
 + FOR_BODY (stmt), FOR_EXPR (stmt), 1, walk_subtrees, 
 data);
 
 The call to genericize_cp_loop will clear *walk_subtrees, which means we 
 don't genericize the FOR_INIT_STMT.
 

I see. Added a call to cp_genericize_r.

 +  tree jump = build_and_jump (label);
 
 Again, let's use build1_loc.
 

Done.

 +  *stmt_p = build_and_jump (label);
 +  SET_EXPR_LOCATION (*stmt_p, location);
 
 And here.
 

Done.

 +  stmt = make_node (OMP_FOR);
 
 Why make a new OMP_FOR rather than repurpose the one we already have? 
 We've already modified its operands.
 

Done.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

 Jason

2012-04-14  Tom de Vries  t...@codesourcery.com

* tree-iterator.c (append_to_statement_list_1, append_to_statement_list)
(append_to_statement_list_force): Return resulting list.  Handle
list_p == NULL.
* tree-iterator.h (append_to_statement_list)
(append_to_statement_list_force): Add tree return type.

* cp-gimplify.c (begin_bc_block): Add location parameter and use as
location argument to create_artificial_label.
(finish_bc_block): Change return type to void.  Remove body_seq
parameter, and add block parameter.  Append label to STMT_LIST and
return in block.
(gimplify_cp_loop, gimplify_for_stmt, gimplify_while_stmt)
(gimplify_do_stmt, gimplify_switch_stmt): Remove function.
(genericize_cp_loop, genericize_for_stmt, genericize_while_stmt)
(genericize_do_stmt, genericize_switch_stmt, genericize_continue_stmt)
(genericize_break_stmt, genericize_omp_for_stmt): New function.
(cp_gimplify_omp_for): Remove bc_continue processing.
(cp_gimplify_expr): Genericize VEC_INIT_EXPR.
(cp_gimplify_expr): Mark FOR_STMT, WHILE_STMT, DO_STMT, SWITCH_STMT,
CONTINUE_STMT, and BREAK_STMT as unreachable.
(cp_genericize_r): Genericize FOR_STMT, WHILE_STMT, DO_STMT,
SWITCH_STMT, CONTINUE_STMT, BREAK_STMT and OMP_FOR.
(cp_genericize_tree): New function, factored out of ...
(cp_genericize): ... this function.

* g++.dg/pr51264-4.C: New test.
Index: gcc/tree-iterator.c

[v3] libstdc++/52699

2012-04-14 Thread Paolo Carlini

Hi,

this is what I'm going to commit to mainline (and likely 4.7.1 too) to 
solve a number of issues affecting the implementation of 
independent_bits_engine::operator()() as originally contributed: 
essentially we want to be very careful with overflows (+ other lesser 
issues and small optimizations). Many thanks to Marc Glisse for off-line 
reviewing and helping over the last day or so!!


Tested x86_64-linux multilib.

Paolo.


2012-04-14  Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/52699
* include/bits/random.tcc (independent_bits_engine::operator()())
Avoid various overflows; use common_type on result_type and
_RandomNumberEngine::result_type; avoid floating point computations;
other smaller tweaks.

* include/bits/random.tcc (uniform_int_distribution::operator())
Use common_type; assume _UniformRandomNumberGenerator::result_type
unsigned; tidy.

* include/bits/stl_algobase.h (__lg(unsigned), __lg(unsigned long),
__lg(unsigned long long)): Add.
Index: include/bits/stl_algobase.h
===
--- include/bits/stl_algobase.h (revision 186448)
+++ include/bits/stl_algobase.h (working copy)
@@ -989,14 +989,26 @@
   __lg(int __n)
   { return sizeof(int) * __CHAR_BIT__  - 1 - __builtin_clz(__n); }
 
+  inline unsigned
+  __lg(unsigned __n)
+  { return sizeof(int) * __CHAR_BIT__  - 1 - __builtin_clz(__n); }
+
   inline long
   __lg(long __n)
   { return sizeof(long) * __CHAR_BIT__ - 1 - __builtin_clzl(__n); }
 
+  inline unsigned long
+  __lg(unsigned long __n)
+  { return sizeof(long) * __CHAR_BIT__ - 1 - __builtin_clzl(__n); }
+
   inline long long
   __lg(long long __n)
   { return sizeof(long long) * __CHAR_BIT__ - 1 - __builtin_clzll(__n); }
 
+  inline unsigned long long
+  __lg(unsigned long long __n)
+  { return sizeof(long long) * __CHAR_BIT__ - 1 - __builtin_clzll(__n); }
+
 _GLIBCXX_END_NAMESPACE_VERSION
 
 _GLIBCXX_BEGIN_NAMESPACE_ALGO
Index: include/bits/random.tcc
===
--- include/bits/random.tcc (revision 186448)
+++ include/bits/random.tcc (working copy)
@@ -730,40 +730,65 @@
    independent_bits_engine<_RandomNumberEngine, __w, _UIntType>::
 operator()()
 {
-      const long double __r = static_cast<long double>(_M_b.max())
-	- static_cast<long double>(_M_b.min()) + 1.0L;
-      const result_type __m = std::log(__r) / std::log(2.0L);
-      result_type __n, __n0, __y0, __y1, __s0, __s1;
+      typedef typename _RandomNumberEngine::result_type _Eresult_type;
+      const _Eresult_type __r
+	= (_M_b.max() - _M_b.min() < std::numeric_limits<_Eresult_type>::max()
+	   ? _M_b.max() - _M_b.min() + 1 : 0);
+      const unsigned __edig = std::numeric_limits<_Eresult_type>::digits;
+      const unsigned __m = __r ? std::__lg(__r) : __edig;
+
+      typedef typename std::common_type<_Eresult_type, result_type>::type
+	__ctype;
+      const unsigned __cdig = std::numeric_limits<__ctype>::digits;
+
+      unsigned __n, __n0;
+      __ctype __s0, __s1, __y0, __y1;
+
   for (size_t __i = 0; __i  2; ++__i)
{
  __n = (__w + __m - 1) / __m + __i;
  __n0 = __n - __w % __n;
-	  const result_type __w0 = __w / __n;
-	  const result_type __w1 = __w0 + 1;
-	  __s0 = result_type(1) << __w0;
-	  __s1 = result_type(1) << __w1;
-	  __y0 = __s0 * (__r / __s0);
-	  __y1 = __s1 * (__r / __s1);
-	  if (__r - __y0 <= __y0 / __n)
+	  const unsigned __w0 = __w / __n;  // __w0 <= __m
+
+	  __s0 = 0;
+	  __s1 = 0;
+	  if (__w0 < __cdig)
+	    {
+	      __s0 = __ctype(1) << __w0;
+	      __s1 = __s0 << 1;
+	    }
+
+	  __y0 = 0;
+	  __y1 = 0;
+	  if (__r)
+	    {
+	      __y0 = __s0 * (__r / __s0);
+	      if (__s1)
+		__y1 = __s1 * (__r / __s1);
+
+	      if (__r - __y0 <= __y0 / __n)
+		break;
+	    }
+	  else
break;
}
 
   result_type __sum = 0;
   for (size_t __k = 0; __k  __n0; ++__k)
{
-	  result_type __u;
+	  __ctype __u;
 	  do
 	    __u = _M_b() - _M_b.min();
-	  while (__u >= __y0);
-	  __sum = __s0 * __sum + __u % __s0;
+	  while (__y0 && __u >= __y0);
+	  __sum = __s0 * __sum + (__s0 ? __u % __s0 : __u);
}
   for (size_t __k = __n0; __k  __n; ++__k)
{
-	  result_type __u;
+	  __ctype __u;
 	  do
 	    __u = _M_b() - _M_b.min();
-	  while (__u >= __y1);
-	  __sum = __s1 * __sum + __u % __s1;
+	  while (__y1 && __u >= __y1);
+	  __sum = __s1 * __sum + (__s1 ? __u % __s1 : __u);
}
   return __sum;
 }
@@ -840,12 +865,11 @@
       operator()(_UniformRandomNumberGenerator& __urng,
 const 

[cxx-conversion] COMPILER_FOR_BUILD in C++ mode

2012-04-14 Thread Pedro Lamarão

2012-04-14   Pedro Lamarão  pedro.lama...@gmail.com

* Makefile.in: Define COMPILER_FOR_BUILD etc. conditionally
according to ENABLE_BUILD_WITH_CXX.
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -731,12 +731,22 @@ CC_FOR_BUILD = @CC_FOR_BUILD@
 BUILD_CFLAGS= @BUILD_CFLAGS@ -DGENERATOR_FILE
 
 # Native compiler that we use.  This may be C++ some day.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 COMPILER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_COMPILERFLAGS = $(BUILD_CFLAGS)
+else
+COMPILER_FOR_BUILD = $(COMPILER)
+BUILD_COMPILERFLAGS = $(COMPILER_FLAGS) -DIN_GCC -DGENERATOR_FILE
+endif
 
 # Native linker that we use.
+ifneq ($(ENABLE_BUILD_WITH_CXX),yes)
 LINKER_FOR_BUILD = $(CC_FOR_BUILD)
 BUILD_LINKERFLAGS = $(BUILD_CFLAGS)
+else
+LINKER_FOR_BUILD = $(LINKER)
+BUILD_LINKERFLAGS = $(LINKER_FLAGS)
+endif
 
 # Native linker and preprocessor flags.  For x-fragment overrides.
 BUILD_LDFLAGS=@BUILD_LDFLAGS@



Re: [PATCH 01/11] Fix cpp_sys_macro_p with -ftrack-macro-expansion

2012-04-14 Thread Jason Merrill

OK.

Jason